Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Get certified in Microsoft Fabric—for free! For a limited time, get a free DP-600 exam voucher to use by the end of 2024. Register now

Reply
Anonymous
Not applicable

How to summarize columns using Power Query Editor NOT DAX

Hi all, 

I need a query that summarizes and generates a row for every date and all 99 MSOAs (geographical) area from my source table. For context I have a large dataset of records with a date and geographical data (in this case MSOA). I want to work out an incidence rate by day and by geography but to do this I need a row for each date by each geographical area. I.E (2 years) 730 rows X 99 geographical areas = around 70,000 rows.

I previously thought a calculated table using SUMMARIZECOLUMNS could solve this issue but I realised that the calculated table did not generate rows for the dates when there were no counts in particular geographical areas so I was missing rows that had a 0 count. I tried to summarise this issue below. This meant the incidence rate skipped to the previous date there is a count rather than counting the dates with a 0.

josselot1_0-1615910072197.jpeg

 

What I thought I need to do is to somehow summarize columns in the query editor before merging with my calendar table so I can generate a row for each date. Any ideas?

 The alternative is to create a new table each time and filter my original table by the geographical area 99 times and then do a merge with the calendar table but this would take a long time.

Someone else suggested dates between function but I don't have a start or end date just a singular record date.

I need something simple like summarize columns but with the ability to generate dates between my records that have a 0 count. Any help is greatly appreciated. 

Thanks. 


1 ACCEPTED SOLUTION

Hi @Anonymous 

 

Ok, it is all about the dates, so let's extract the min and max date in the original data, then generate every single day in between, have a try

 

let
    Source = Table.Group(Table.SelectColumns(Pos_Case_DPH_2,{"Preferred MSOA Name", "Specimen Date", "Count"}), {"Preferred MSOA Name", "Specimen Date"}, {{"Count", each List.Sum([Count]), type nullable number}}),
    Date = List.Transform( { Number.From(List.Min(Source[Specimen Date]))..Number.From(List.Max(Source[Specimen Date]))}, Date.From),
    MSOA = Table.FromList( List.Distinct(Source[Preferred MSOA Name])),
    #"Renamed Columns" = Table.RenameColumns(MSOA,{{"Column1", "Preferred MSOA Name"}}),
    #"Added Custom" = Table.AddColumn(#"Renamed Columns", "Specimen Date", each Date),
    #"Expanded Date" = Table.ExpandListColumn(#"Added Custom", "Specimen Date"),
    #"Merged Queries" = Table.NestedJoin(#"Expanded Date", {"Preferred MSOA Name", "Specimen Date"}, Source, {"Preferred MSOA Name", "Specimen Date"}, "Expanded Date", JoinKind.LeftOuter),
    #"Expanded Expanded Date" = Table.ExpandTableColumn(#"Merged Queries", "Expanded Date", {"Count"}, {"Count"}),
    #"Replaced Value" = Table.ReplaceValue(#"Expanded Expanded Date",null,"0",Replacer.ReplaceValue,{"Count"})
in
    #"Replaced Value"

 

View solution in original post

12 REPLIES 12
Anonymous
Not applicable

try this (all steps done by GUI)

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjDUByIjAyNDJR0l32B/RwUQw1ApVgebnBGQYQyRM0KXM8YjB2IYQeSMsekzUYqNBQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Date = _t, MSOA = _t, Case = _t]),
    #"Colonna trasformata tramite Pivot" = Table.Pivot(Source, List.Distinct(Source[Date]), "Date", "Case"),
    #"Sostituito valore" = Table.ReplaceValue(#"Colonna trasformata tramite Pivot",null,"0",Replacer.ReplaceValue,{"01/01/2021", "02/01/2021", "03/01/2021"}),
    #"Trasformate altre colonne tramite UnPivot" = Table.UnpivotOtherColumns(#"Sostituito valore", {"MSOA"}, "Attributo", "Valore"),
    #"Ordinate righe" = Table.Sort(#"Trasformate altre colonne tramite UnPivot",{{"Attributo", Order.Ascending}})
in
    #"Ordinate righe"

 

watkinnc
Super User
Super User

Could you use a DAX measure like:

Answer = Calculate(SUM(Table[Column], ALL(Dates[Date], DATESBETWEEN(MIN(Dates[Date]), MAX(Dates[Date])), Values(GeoTable[Area]))


I’m usually answering from my phone, which means the results are visualized only in my mind. You’ll need to use my answer to know that it works—but it will work!!

Sorry, you said NOT DAX. Isn't this just a simple left join with full date table and your other table?


I’m usually answering from my phone, which means the results are visualized only in my mind. You’ll need to use my answer to know that it works—but it will work!!
Anonymous
Not applicable

Hi there, thanks for your reply but it is probably a bit beyond my PBI skills to actually implement it. I'm not sure how to adapt the query to my tables, especially the source part. 

I've created a dummy pbix in the hope that you wouldn't mind looking at it and providing a solution based on the tables and sources I am using please. 

Thanks again

https://www.mediafire.com/file/zq52qekh0tznagh/Test+MSOA+dashboard.pbix/file

Hi @Anonymous 

 

Right click to start a new blank query to paste it in Advanced Editor, let me know if it does not work as I am modifying the code without accessing your source folder...

 

Vera_33_1-1616036117968.png

 

 

let
    Source = Table.SelectColumns(Pos_Case_DPH_2,{"Preferred MSOA Name", "Specimen Date", "Count"}),
    Date = List.Distinct( Source[Specimen Date]),
    MSOA = Table.FromList( List.Distinct(Source[Preferred MSOA Name])),
    #"Renamed Columns" = Table.RenameColumns(MSOA,{{"Column1", "Preferred MSOA Name"}}),
    #"Added Custom" = Table.AddColumn(#"Renamed Columns", "Specimen Date", each Date),
    #"Expanded Date" = Table.ExpandListColumn(#"Added Custom", "Specimen Date"),
    #"Merged Queries" = Table.NestedJoin(#"Expanded Date", {"Preferred MSOA Name", "Specimen Date"}, Source, {"Preferred MSOA Name", "Specimen Date"}, "Expanded Date", JoinKind.LeftOuter),
    #"Expanded Expanded Date" = Table.ExpandTableColumn(#"Merged Queries", "Expanded Date", {"Count"}, {"Count"}),
    #"Replaced Value" = Table.ReplaceValue(#"Expanded Expanded Date",null,"0",Replacer.ReplaceValue,{"Count"})
in
    #"Replaced Value"

 

 

Vera_33_0-1616036087803.png

 

And one suggestion on reading from folder, please do not use Sample query generated by PBI for you, try to add a custom column to combine your files. You can search for it, there are many posts about it.

Anonymous
Not applicable

Hi @Vera_33 thanks again for your reply. 

That has almost worked. I have a slight issue in that the query takes so long to process that it crashes before it finishes. I created a new dashboard and removed all other tables that it references and removed the columns except the count, the date and the MSOA. This then worked up to a point. 

A further issue then was that your query did not sum the 'count' column and so I had to use summarizecolumns and use the sum function to do that, ending up with a calculated table again. This then did not create a row for every date from the calendar table. Instead, it created a row for each MSOA where there was a date that had a case. This is OK for now as there are cases every day in at least one MSOA but in time it is likely that there might be a day with no cases at all in any MSOA and so the query would not generate rows that contained 0 counts. 

Does that make sense? Thanks again so far, I really appreciated it! 

Hi @Anonymous 

 

Understood, so sum up the Count first, try it

let
    Source = Table.Group(Table.SelectColumns(Pos_Case_DPH_2,{"Preferred MSOA Name", "Specimen Date", "Count"}), {"Preferred MSOA Name", "Specimen Date"}, {{"Count", each List.Sum([Count]), type nullable number}}),
    Date = List.Distinct( Source[Specimen Date]),
    MSOA = Table.FromList( List.Distinct(Source[Preferred MSOA Name])),
    #"Renamed Columns" = Table.RenameColumns(MSOA,{{"Column1", "Preferred MSOA Name"}}),
    #"Added Custom" = Table.AddColumn(#"Renamed Columns", "Specimen Date", each Date),
    #"Expanded Date" = Table.ExpandListColumn(#"Added Custom", "Specimen Date"),
    #"Merged Queries" = Table.NestedJoin(#"Expanded Date", {"Preferred MSOA Name", "Specimen Date"}, Source, {"Preferred MSOA Name", "Specimen Date"}, "Expanded Date", JoinKind.LeftOuter),
    #"Expanded Expanded Date" = Table.ExpandTableColumn(#"Merged Queries", "Expanded Date", {"Count"}, {"Count"}),
    #"Replaced Value" = Table.ReplaceValue(#"Expanded Expanded Date",null,"0",Replacer.ReplaceValue,{"Count"})
in
    #"Replaced Value"
Anonymous
Not applicable

Hi @Vera_33 that has worked to get the count column thank you. The only thing missing is still the lack of rows for every single date. I've taken a snip of the table when sorted in acending order by date. You can see there are rows for the 3rd March 2020 and the 5th March 2020 but the query hasn't generated  rows for the 4th March and other dates when there isn't a case in any single MSOA on that date. The 4th March ideally should have 99 rows that have a count of 0. Is that possible? Thanks.


josselot1_0-1616502790457.png



Hi @Anonymous 

 

Ok, it is all about the dates, so let's extract the min and max date in the original data, then generate every single day in between, have a try

 

let
    Source = Table.Group(Table.SelectColumns(Pos_Case_DPH_2,{"Preferred MSOA Name", "Specimen Date", "Count"}), {"Preferred MSOA Name", "Specimen Date"}, {{"Count", each List.Sum([Count]), type nullable number}}),
    Date = List.Transform( { Number.From(List.Min(Source[Specimen Date]))..Number.From(List.Max(Source[Specimen Date]))}, Date.From),
    MSOA = Table.FromList( List.Distinct(Source[Preferred MSOA Name])),
    #"Renamed Columns" = Table.RenameColumns(MSOA,{{"Column1", "Preferred MSOA Name"}}),
    #"Added Custom" = Table.AddColumn(#"Renamed Columns", "Specimen Date", each Date),
    #"Expanded Date" = Table.ExpandListColumn(#"Added Custom", "Specimen Date"),
    #"Merged Queries" = Table.NestedJoin(#"Expanded Date", {"Preferred MSOA Name", "Specimen Date"}, Source, {"Preferred MSOA Name", "Specimen Date"}, "Expanded Date", JoinKind.LeftOuter),
    #"Expanded Expanded Date" = Table.ExpandTableColumn(#"Merged Queries", "Expanded Date", {"Count"}, {"Count"}),
    #"Replaced Value" = Table.ReplaceValue(#"Expanded Expanded Date",null,"0",Replacer.ReplaceValue,{"Count"})
in
    #"Replaced Value"

 

Anonymous
Not applicable

Perfect, this worked exactly as I wanted now. The only issue is speed. My system takes forever to run the query still. Wondering if this is the VPN I work on as I have 16gb ram to play with. I've noticed the CPU usage heading up to 85-90% when trying to run it. Probably need to contact IT. 

Thanks again though! 

Anonymous
Not applicable

@Anonymous 

have you, by chance, tried the solution proposed by me?

Since it uses GUI functions it should be pretty fast, if it does what you ask (in fact, I don't remember what the problem was anymore)

Vera_33
Resident Rockstar
Resident Rockstar

Hi @Anonymous 

 

In M, paste it in Advanced Editor. If you want to use DAX measure, I believe you can try CROSSJOIN

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjDUByIjAyNDJR0l32B/RwUQw1ApVgebnBGQYQyRM0KXM8YjB2IYQeSMsekzUYqNBQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Date = _t, MSOA = _t, Case = _t]),
    Date = List.Distinct( Source[Date]),
    MSOA = Table.FromList( List.Distinct(Source[MSOA])),
    #"Renamed Columns" = Table.RenameColumns(MSOA,{{"Column1", "MSOA"}}),
    #"Added Custom" = Table.AddColumn(#"Renamed Columns", "Date", each Date),
    #"Expanded Date" = Table.ExpandListColumn(#"Added Custom", "Date"),
    #"Merged Queries" = Table.NestedJoin(#"Expanded Date", {"MSOA", "Date"}, Source, {"MSOA", "Date"}, "Expanded Date", JoinKind.LeftOuter),
    #"Expanded Expanded Date" = Table.ExpandTableColumn(#"Merged Queries", "Expanded Date", {"Case"}, {"Case"}),
    #"Replaced Value" = Table.ReplaceValue(#"Expanded Expanded Date",null,"0",Replacer.ReplaceValue,{"Case"})
in
    #"Replaced Value"
 

 

Helpful resources

Announcements
November Carousel

Fabric Community Update - November 2024

Find out what's new and trending in the Fabric Community.

Live Sessions with Fabric DB

Be one of the first to start using Fabric Databases

Starting December 3, join live sessions with database experts and the Fabric product team to learn just how easy it is to get started.

Las Vegas 2025

Join us at the Microsoft Fabric Community Conference

March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Early Bird pricing ends December 9th.

Nov PBI Update Carousel

Power BI Monthly Update - November 2024

Check out the November 2024 Power BI update to learn about new features.