March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Early bird discount ends December 31.
Register NowBe one of the first to start using Fabric Databases. View on-demand sessions with database experts and the Microsoft product team to learn just how easy it is to get started. Watch now
I have a data set that contains columns for counties and states in the US, and COVID-19 case counts and deaths in each county each day. The numbers are cumulative...if there was one case on day one, and no new cases on day two, there will be a 1 for day one and a 1 for day two. If on day three there is an additional case, then the number will change to 2. I am trying to find an expression in either M or Dax (if I can't do it in Power Query) that will let me calculate the daily change in case numbers. So, for instance in the above example, I would have had a 1 day 1, a 0 for day two, and a 1 for day three. This would give me the total of new cases reported each day.
I realize that some sort of moving sum taking the difference between yesterday and today would work. One problem, though, is that I need the expression to let me also group by county as well as date, so I get the number of new cases in each county for each day. I would also have to put in some sort of "if the result is less than 0, then return 0" feature (because sometimes the counties mess up and report fewer cases instead of the cumulative number)...probably better to set it to return a truncated 7-day average, so I don't end up with spurious zeros in my time series. I have tried using the "Group by" option in Power Query, but it doesn't have a "moving" calculation. Anyone have any ideas?
Solved! Go to Solution.
Hi @ripstaur ,
First create an index column;
Then create a column as below:
Daily Cases =
VAR _maxvalue =
CALCULATE (
MAX ( 'Table'[Cases] ),
FILTER (
'Table',
'Table'[County] = EARLIER ( 'Table'[County] )
&& 'Table'[Index] < EARLIER ( 'Table'[Index] )
)
)
RETURN
'Table'[Cases] - _maxvalue
And you will see:
For the related .pbix file,pls see attached.
Best Regards,
Kelly
Did I answer your question? Mark my reply as a solution!
Power Query only alternative:
let
Source = YourSource,
#"Added Index" = Table.AddIndexColumn(Source, "Index", 0, 1, Int64.Type),
#"Added Custom" = Table.AddColumn(
#"Added Index",
"Custom",
each
let
Previous = try #"Added Index"[Cases]{[Index] - 1} otherwise 0,
Result = if Previous > [Cases] then 0 else [Cases] - Previous
in
Result
)
in
#"Added Custom"
which adds a 0-based Index column and then a Custom column which gives your required results. Amend the line Source = YourSource accordingly to fit in with your previous steps.
Regards
Thanks, Jos,
I tried this, copying your code into the Advanced Editor, and I ended up with this:
If I expand all the columns in the "Data" column, I get the following:
Not sure where I'm going wrong....
Oh...I didn't get it all. Beyond Data.Column6 I have:
It's probably worth noting that the index column contains nothing but zeros, and the Custom column is all Errors.
Another note: My original query included steps to promote the first column as headers, the "change type" commands that always follow promoting headers, and a command to remove column 5 (a blank column in my source). When I added your code after those lines, I got exactly the same result as I did when I just wrote your code in after the source lines.
"Another note: My original query included steps to promote the first column as headers, the "change type" commands that always follow promoting headers, and a command to remove column 5 (a blank column in my source). When I added your code after those lines, I got exactly the same result as I did when I just wrote your code in after the source lines."
Yes, you should keep your original steps in. And when you added my code in, did you make sure that the (previous) step being referenced in my first line was the last of your initial steps (and not the first)?
Regards
Thank you , Edhans. This problem is like calculating moving ranges, but I need to be able to do it per county across the date range. I put together a sample (with an extra column showing the desired result of the calculation. When shifting from one county to another, the date is going to drop back to the starting date for the time period. In the result cell for that record, the value should equal the "cases" value for that day.
State | County | Date | Cases | Desired Result (Daily Cases) | |
Alabama | Abbot | 1/1/2020 | 1 | 1 | |
Alabama | Abbot | 1/2/2020 | 1 | 0 | |
Alabama | Abbot | 1/3/2020 | 2 | 1 | |
Alabama | Abbot | 1/4/2020 | 2 | 0 | |
Alabama | Abbot | 1/5/2020 | 3 | 1 | |
Alabama | Abbot | 1/6/2020 | 3 | 0 | |
Alabama | Abbot | 1/7/2020 | 3 | 0 | |
Alabama | Abbot | 1/8/2020 | 4 | 1 | |
Alabama | Abbot | 1/9/2020 | 4 | 0 | |
Alabama | Abbot | 1/10/2020 | 5 | 1 | |
Alabama | Billings | 1/1/2020 | 0 | 0 | |
Alabama | Billings | 1/2/2020 | 0 | 0 | |
Alabama | Billings | 1/3/2020 | 2 | 2 | |
Alabama | Billings | 1/4/2020 | 2 | 0 | |
Alabama | Billings | 1/5/2020 | 3 | 1 | |
Alabama | Billings | 1/6/2020 | 4 | 1 | |
Alabama | Billings | 1/7/2020 | 4 | 0 | |
Alabama | Billings | 1/8/2020 | 4 | 0 | |
Alabama | Billings | 1/9/2020 | 5 | 1 | |
Alabama | Billings | 1/10/2020 | 5 | 0 | |
Delaware | DeForbes | 1/1/2020 | 0 | 0 | |
Delaware | DeForbes | 1/2/2020 | 1 | 1 | |
Delaware | DeForbes | 1/3/2020 | 1 | 0 | |
Delaware | DeForbes | 1/4/2020 | 1 | 0 | |
Delaware | DeForbes | 1/5/2020 | 3 | 2 | |
Delaware | DeForbes | 1/6/2020 | 3 | 0 | |
Delaware | DeForbes | 1/7/2020 | 5 | 2 | |
Delaware | DeForbes | 1/8/2020 | 7 | 2 | |
Delaware | DeForbes | 1/9/2020 | 7 | 0 | |
Delaware | DeForbes | 1/10/2020 | 7 | 0 |
Hi @ripstaur ,
First create an index column;
Then create a column as below:
Daily Cases =
VAR _maxvalue =
CALCULATE (
MAX ( 'Table'[Cases] ),
FILTER (
'Table',
'Table'[County] = EARLIER ( 'Table'[County] )
&& 'Table'[Index] < EARLIER ( 'Table'[Index] )
)
)
RETURN
'Table'[Cases] - _maxvalue
And you will see:
For the related .pbix file,pls see attached.
Best Regards,
Kelly
Did I answer your question? Mark my reply as a solution!
Thanks, Kelly! This worked like a charm. I'm still trying to make a PowerQuery solution work, but this DAX solution is absolutely one great solution. My question now becomes - which is a more efficient solution--the DAX modeling or a PowerQuery one, if we can figure it out? I am going to replicate this for the number of deaths per county as well. Since there are about 3400 counties in the US, and the daily report also include extras (e.g., US territory counts, counts of cases that were reported without a county affiliation), I add 3400+ new records every day , with a State column, a county column, a lat and long column, a FIPS column, cumulative cases, cumulative deaths, daily cases and daily deaths. So would it be more efficient to do this in modeling, or querying?
Best regards,
Rip
Hello, this would be btter tackled with DAX but you can do it in power query.
1. Sort the data by say country, region and date.
2. add index starting at 0
3. duplicate table and alter the index to starting at 1
4. merge tables together with a full outer join using country and index as joins
5. you will now have the cumulative column and the cumulative column offset by one day in the same row. Subtract one from the other and you have your daily movement. Obviously for the first and last value for each country/region you will need to add some logic into the subtraction
Thanks, samdthompson! This is a good hint...if I can make it work, I will post the code as a solution.
How to get good help fast. Help us help you.
How To Ask A Technical Question If you Really Want An Answer
How to Get Your Question Answered Quickly - Give us a good and concise explanation
How to provide sample data in the Power BI Forum - Provide data in a table format per the link, or share an Excel/CSV file via OneDrive, Dropbox, etc.. Provide expected output using a screenshot of Excel or other image. Do not provide a screenshot of the source data. I cannot paste an image into Power BI tables.
DAX is for Analysis. Power Query is for Data Modeling
Proud to be a Super User!
MCSA: BI ReportingMarch 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount!
Your insights matter. That’s why we created a quick survey to learn about your experience finding answers to technical questions.
Arun Ulag shares exciting details about the Microsoft Fabric Conference 2025, which will be held in Las Vegas, NV.
User | Count |
---|---|
21 | |
16 | |
13 | |
13 | |
9 |
User | Count |
---|---|
36 | |
31 | |
20 | |
19 | |
17 |