The ultimate Microsoft Fabric, Power BI, Azure AI, and SQL learning event: Join us in Stockholm, September 24-27, 2024.
Save €200 with code MSCUST on top of early bird pricing!
Find everything you need to get certified on Fabric—skills challenges, live sessions, exam prep, role guidance, and more. Get started
Hi all,
I need a query that summarizes and generates a row for every date and all 99 MSOAs (geographical) area from my source table. For context I have a large dataset of records with a date and geographical data (in this case MSOA). I want to work out an incidence rate by day and by geography but to do this I need a row for each date by each geographical area. I.E (2 years) 730 rows X 99 geographical areas = around 70,000 rows.
I previously thought a calculated table using SUMMARIZECOLUMNS could solve this issue but I realised that the calculated table did not generate rows for the dates when there were no counts in particular geographical areas so I was missing rows that had a 0 count. I tried to summarise this issue below. This meant the incidence rate skipped to the previous date there is a count rather than counting the dates with a 0.
What I thought I need to do is to somehow summarize columns in the query editor before merging with my calendar table so I can generate a row for each date. Any ideas?
The alternative is to create a new table each time and filter my original table by the geographical area 99 times and then do a merge with the calendar table but this would take a long time.
Someone else suggested dates between function but I don't have a start or end date just a singular record date.
I need something simple like summarize columns but with the ability to generate dates between my records that have a 0 count. Any help is greatly appreciated.
Thanks.
Solved! Go to Solution.
Hi @Anonymous
Ok, it is all about the dates, so let's extract the min and max date in the original data, then generate every single day in between, have a try
let
Source = Table.Group(Table.SelectColumns(Pos_Case_DPH_2,{"Preferred MSOA Name", "Specimen Date", "Count"}), {"Preferred MSOA Name", "Specimen Date"}, {{"Count", each List.Sum([Count]), type nullable number}}),
Date = List.Transform( { Number.From(List.Min(Source[Specimen Date]))..Number.From(List.Max(Source[Specimen Date]))}, Date.From),
MSOA = Table.FromList( List.Distinct(Source[Preferred MSOA Name])),
#"Renamed Columns" = Table.RenameColumns(MSOA,{{"Column1", "Preferred MSOA Name"}}),
#"Added Custom" = Table.AddColumn(#"Renamed Columns", "Specimen Date", each Date),
#"Expanded Date" = Table.ExpandListColumn(#"Added Custom", "Specimen Date"),
#"Merged Queries" = Table.NestedJoin(#"Expanded Date", {"Preferred MSOA Name", "Specimen Date"}, Source, {"Preferred MSOA Name", "Specimen Date"}, "Expanded Date", JoinKind.LeftOuter),
#"Expanded Expanded Date" = Table.ExpandTableColumn(#"Merged Queries", "Expanded Date", {"Count"}, {"Count"}),
#"Replaced Value" = Table.ReplaceValue(#"Expanded Expanded Date",null,"0",Replacer.ReplaceValue,{"Count"})
in
#"Replaced Value"
try this (all steps done by GUI)
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjDUByIjAyNDJR0l32B/RwUQw1ApVgebnBGQYQyRM0KXM8YjB2IYQeSMsekzUYqNBQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Date = _t, MSOA = _t, Case = _t]),
#"Colonna trasformata tramite Pivot" = Table.Pivot(Source, List.Distinct(Source[Date]), "Date", "Case"),
#"Sostituito valore" = Table.ReplaceValue(#"Colonna trasformata tramite Pivot",null,"0",Replacer.ReplaceValue,{"01/01/2021", "02/01/2021", "03/01/2021"}),
#"Trasformate altre colonne tramite UnPivot" = Table.UnpivotOtherColumns(#"Sostituito valore", {"MSOA"}, "Attributo", "Valore"),
#"Ordinate righe" = Table.Sort(#"Trasformate altre colonne tramite UnPivot",{{"Attributo", Order.Ascending}})
in
#"Ordinate righe"
Could you use a DAX measure like:
Answer = Calculate(SUM(Table[Column], ALL(Dates[Date], DATESBETWEEN(MIN(Dates[Date]), MAX(Dates[Date])), Values(GeoTable[Area]))
Sorry, you said NOT DAX. Isn't this just a simple left join with full date table and your other table?
Hi there, thanks for your reply but it is probably a bit beyond my PBI skills to actually implement it. I'm not sure how to adapt the query to my tables, especially the source part.
I've created a dummy pbix in the hope that you wouldn't mind looking at it and providing a solution based on the tables and sources I am using please.
Thanks again
https://www.mediafire.com/file/zq52qekh0tznagh/Test+MSOA+dashboard.pbix/file
Hi @Anonymous
Right click to start a new blank query to paste it in Advanced Editor, let me know if it does not work as I am modifying the code without accessing your source folder...
let
Source = Table.SelectColumns(Pos_Case_DPH_2,{"Preferred MSOA Name", "Specimen Date", "Count"}),
Date = List.Distinct( Source[Specimen Date]),
MSOA = Table.FromList( List.Distinct(Source[Preferred MSOA Name])),
#"Renamed Columns" = Table.RenameColumns(MSOA,{{"Column1", "Preferred MSOA Name"}}),
#"Added Custom" = Table.AddColumn(#"Renamed Columns", "Specimen Date", each Date),
#"Expanded Date" = Table.ExpandListColumn(#"Added Custom", "Specimen Date"),
#"Merged Queries" = Table.NestedJoin(#"Expanded Date", {"Preferred MSOA Name", "Specimen Date"}, Source, {"Preferred MSOA Name", "Specimen Date"}, "Expanded Date", JoinKind.LeftOuter),
#"Expanded Expanded Date" = Table.ExpandTableColumn(#"Merged Queries", "Expanded Date", {"Count"}, {"Count"}),
#"Replaced Value" = Table.ReplaceValue(#"Expanded Expanded Date",null,"0",Replacer.ReplaceValue,{"Count"})
in
#"Replaced Value"
And one suggestion on reading from folder, please do not use Sample query generated by PBI for you, try to add a custom column to combine your files. You can search for it, there are many posts about it.
Hi @Vera_33 thanks again for your reply.
That has almost worked. I have a slight issue in that the query takes so long to process that it crashes before it finishes. I created a new dashboard and removed all other tables that it references and removed the columns except the count, the date and the MSOA. This then worked up to a point.
A further issue then was that your query did not sum the 'count' column and so I had to use summarizecolumns and use the sum function to do that, ending up with a calculated table again. This then did not create a row for every date from the calendar table. Instead, it created a row for each MSOA where there was a date that had a case. This is OK for now as there are cases every day in at least one MSOA but in time it is likely that there might be a day with no cases at all in any MSOA and so the query would not generate rows that contained 0 counts.
Does that make sense? Thanks again so far, I really appreciated it!
Hi @Anonymous
Understood, so sum up the Count first, try it
let
Source = Table.Group(Table.SelectColumns(Pos_Case_DPH_2,{"Preferred MSOA Name", "Specimen Date", "Count"}), {"Preferred MSOA Name", "Specimen Date"}, {{"Count", each List.Sum([Count]), type nullable number}}),
Date = List.Distinct( Source[Specimen Date]),
MSOA = Table.FromList( List.Distinct(Source[Preferred MSOA Name])),
#"Renamed Columns" = Table.RenameColumns(MSOA,{{"Column1", "Preferred MSOA Name"}}),
#"Added Custom" = Table.AddColumn(#"Renamed Columns", "Specimen Date", each Date),
#"Expanded Date" = Table.ExpandListColumn(#"Added Custom", "Specimen Date"),
#"Merged Queries" = Table.NestedJoin(#"Expanded Date", {"Preferred MSOA Name", "Specimen Date"}, Source, {"Preferred MSOA Name", "Specimen Date"}, "Expanded Date", JoinKind.LeftOuter),
#"Expanded Expanded Date" = Table.ExpandTableColumn(#"Merged Queries", "Expanded Date", {"Count"}, {"Count"}),
#"Replaced Value" = Table.ReplaceValue(#"Expanded Expanded Date",null,"0",Replacer.ReplaceValue,{"Count"})
in
#"Replaced Value"
Hi @Vera_33 that has worked to get the count column thank you. The only thing missing is still the lack of rows for every single date. I've taken a snip of the table when sorted in acending order by date. You can see there are rows for the 3rd March 2020 and the 5th March 2020 but the query hasn't generated rows for the 4th March and other dates when there isn't a case in any single MSOA on that date. The 4th March ideally should have 99 rows that have a count of 0. Is that possible? Thanks.
Hi @Anonymous
Ok, it is all about the dates, so let's extract the min and max date in the original data, then generate every single day in between, have a try
let
Source = Table.Group(Table.SelectColumns(Pos_Case_DPH_2,{"Preferred MSOA Name", "Specimen Date", "Count"}), {"Preferred MSOA Name", "Specimen Date"}, {{"Count", each List.Sum([Count]), type nullable number}}),
Date = List.Transform( { Number.From(List.Min(Source[Specimen Date]))..Number.From(List.Max(Source[Specimen Date]))}, Date.From),
MSOA = Table.FromList( List.Distinct(Source[Preferred MSOA Name])),
#"Renamed Columns" = Table.RenameColumns(MSOA,{{"Column1", "Preferred MSOA Name"}}),
#"Added Custom" = Table.AddColumn(#"Renamed Columns", "Specimen Date", each Date),
#"Expanded Date" = Table.ExpandListColumn(#"Added Custom", "Specimen Date"),
#"Merged Queries" = Table.NestedJoin(#"Expanded Date", {"Preferred MSOA Name", "Specimen Date"}, Source, {"Preferred MSOA Name", "Specimen Date"}, "Expanded Date", JoinKind.LeftOuter),
#"Expanded Expanded Date" = Table.ExpandTableColumn(#"Merged Queries", "Expanded Date", {"Count"}, {"Count"}),
#"Replaced Value" = Table.ReplaceValue(#"Expanded Expanded Date",null,"0",Replacer.ReplaceValue,{"Count"})
in
#"Replaced Value"
Perfect, this worked exactly as I wanted now. The only issue is speed. My system takes forever to run the query still. Wondering if this is the VPN I work on as I have 16gb ram to play with. I've noticed the CPU usage heading up to 85-90% when trying to run it. Probably need to contact IT.
Thanks again though!
@Anonymous
have you, by chance, tried the solution proposed by me?
Since it uses GUI functions it should be pretty fast, if it does what you ask (in fact, I don't remember what the problem was anymore)
Hi @Anonymous
In M, paste it in Advanced Editor. If you want to use DAX measure, I believe you can try CROSSJOIN
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjDUByIjAyNDJR0l32B/RwUQw1ApVgebnBGQYQyRM0KXM8YjB2IYQeSMsekzUYqNBQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Date = _t, MSOA = _t, Case = _t]),
Date = List.Distinct( Source[Date]),
MSOA = Table.FromList( List.Distinct(Source[MSOA])),
#"Renamed Columns" = Table.RenameColumns(MSOA,{{"Column1", "MSOA"}}),
#"Added Custom" = Table.AddColumn(#"Renamed Columns", "Date", each Date),
#"Expanded Date" = Table.ExpandListColumn(#"Added Custom", "Date"),
#"Merged Queries" = Table.NestedJoin(#"Expanded Date", {"MSOA", "Date"}, Source, {"MSOA", "Date"}, "Expanded Date", JoinKind.LeftOuter),
#"Expanded Expanded Date" = Table.ExpandTableColumn(#"Merged Queries", "Expanded Date", {"Case"}, {"Case"}),
#"Replaced Value" = Table.ReplaceValue(#"Expanded Expanded Date",null,"0",Replacer.ReplaceValue,{"Case"})
in
#"Replaced Value"
Join the community in Stockholm for expert Microsoft Fabric learning including a very exciting keynote from Arun Ulag, Corporate Vice President, Azure Data.
Check out the August 2024 Power BI update to learn about new features.
Learn from experts, get hands-on experience, and win awesome prizes.
User | Count |
---|---|
180 | |
52 | |
38 | |
28 | |
25 |