Solved: How to group/summarize table in Power BI based on ...

Fahdalyk · ‎02-29-2024

I have been struggling with a problem for quite a while now. I have mainly been using Excel to go around this every time but I'm sure it can be done and automated in Power BI which is what I need help with. I will highlight my problem and step by step process I do to achieve my desired result and hopefully someone can help translate this to Power BI.

Image 1 is what my starting table looks like. The objective is to return only 1 row per 'Name'. First rule should be the Name, Segment 1, Segment 2 with the max number of 'Account ID' under it. The second is if incase there's a tie, we should get any or the row with non blank Segment 1 and Segment 2 (whatever is easier).

In the second step I take a count of each Account under said Name, Segment 1, Segment 2.

In the third step, I take max of count of Account under said Name.

This is the final desired output. Continuing from the third step, I applied a lookup on 'Name & Max of Acc ID' (Create a new ID column) against Image 2 (where I also create 'Name & Max of Acc ID' field, and return Segment 1 and Segment 2. The only manual adjustment I have made here is input Segment 1 and Segment 2 for 'Qwerty' as it had a tie on all 3 segments and I wanted the non blank ones. If this is too difficult, I am fine with returning first match or whatever here.

Would be grateful if someone can help out!

I have tried solving this problem using a DAX measure or Power Query but my Power BI skills are nowhere like my Excel skills. I managed to make some progress but still getting stuck midway.

dufoq3 · ‎03-02-2024

Hi @Fahdalyk, different approach here.

For Zebra - Segment 1 column there is no rule which one to choose, but it could be i.e. first or last from alphabetical order etc...

Result

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("pZBLDsIgEIbvwroeog80jZooxRBtukCcVGKBBhuNt3eKTSxbDSz+mfnmWdck7fsOSEI4qKt1nWtfaFA7gB+ktgbsgHaa5QVpkt/pDpQzqCrwD63gjnIn2DpQmbT40IPipm2LaqnRpcbUTcnpf9SsJVIsYCc4exkmBx9WGIdkooiDi9wZAz50wC8oi4FZ5XFrQT/h/ROvMRUNdQ957J/lfS/Dj4zGGPfyMrXelquKNM0b", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Name = _t, #"Segment 1" = _t, #"Segment 2" = _t, #"Account ID" = _t]),
    GroupedRows1 = Table.Group(Source, {"Name", "Segment 1", "Segment 2"}, {{"Count of Account ID", each Table.RowCount(_), Int64.Type}}),
    GroupedRows2 = Table.Group(GroupedRows1, {"Name"}, {{"Max of Count of Account ID", each List.Max([Count of Account ID]), type number}}),
    MergedQueryItself = Table.NestedJoin(GroupedRows2, {"Name", "Max of Count of Account ID"}, GroupedRows1, {"Name", "Count of Account ID"}, "GroupedRows1", JoinKind.LeftOuter),
    Ad_GropuedRows1Record = Table.TransformColumns(MergedQueryItself, {{"GroupedRows1", each 
     Table.Max(
        Table.AddColumn(_, "Count", (x)=> List.Count(List.Select(Record.ToList(Record.SelectFields(x, {"Segment 1", "Segment 2"}) ), (y)=> y <> null and y <> "" ))),  //Add Column which counts not blank values in 'Segment 1' and 'Segment 2'
        "Count" ) }} ),
    #"Expanded GroupedRows1" = Table.ExpandRecordColumn(Ad_GropuedRows1Record, "GroupedRows1", {"Segment 1", "Segment 2"}, {"Segment 1", "Segment 2"})
in
    #"Expanded GroupedRows1"

Note: Check this link to learn how to use my query.
Check this link if you don't know how to provide sample data.

View solution in original post

dufoq3 · ‎03-02-2024

Hi @Fahdalyk, different approach here.

For Zebra - Segment 1 column there is no rule which one to choose, but it could be i.e. first or last from alphabetical order etc...

Result

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("pZBLDsIgEIbvwroeog80jZooxRBtukCcVGKBBhuNt3eKTSxbDSz+mfnmWdck7fsOSEI4qKt1nWtfaFA7gB+ktgbsgHaa5QVpkt/pDpQzqCrwD63gjnIn2DpQmbT40IPipm2LaqnRpcbUTcnpf9SsJVIsYCc4exkmBx9WGIdkooiDi9wZAz50wC8oi4FZ5XFrQT/h/ROvMRUNdQ957J/lfS/Dj4zGGPfyMrXelquKNM0b", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Name = _t, #"Segment 1" = _t, #"Segment 2" = _t, #"Account ID" = _t]),
    GroupedRows1 = Table.Group(Source, {"Name", "Segment 1", "Segment 2"}, {{"Count of Account ID", each Table.RowCount(_), Int64.Type}}),
    GroupedRows2 = Table.Group(GroupedRows1, {"Name"}, {{"Max of Count of Account ID", each List.Max([Count of Account ID]), type number}}),
    MergedQueryItself = Table.NestedJoin(GroupedRows2, {"Name", "Max of Count of Account ID"}, GroupedRows1, {"Name", "Count of Account ID"}, "GroupedRows1", JoinKind.LeftOuter),
    Ad_GropuedRows1Record = Table.TransformColumns(MergedQueryItself, {{"GroupedRows1", each 
     Table.Max(
        Table.AddColumn(_, "Count", (x)=> List.Count(List.Select(Record.ToList(Record.SelectFields(x, {"Segment 1", "Segment 2"}) ), (y)=> y <> null and y <> "" ))),  //Add Column which counts not blank values in 'Segment 1' and 'Segment 2'
        "Count" ) }} ),
    #"Expanded GroupedRows1" = Table.ExpandRecordColumn(Ad_GropuedRows1Record, "GroupedRows1", {"Segment 1", "Segment 2"}, {"Segment 1", "Segment 2"})
in
    #"Expanded GroupedRows1"

Note: Check this link to learn how to use my query.
Check this link if you don't know how to provide sample data.

Fahdalyk · ‎03-06-2024

Thank you so much! This worked perfectly. You are my hero ❤️

A follow up question: In case the Account IDs were not the same in the starting dataset for each Name, Segment 1, Segment 2. So in Apple, Technology, Entertainment case we had ABCD for 1st row and DCBR for 2nd row- and we would need a distinct count instead of a row count. Can this code be easily updated or it would require a new approach? My bad for not asking this in my original post, but I actually need a distinct count of Account ID's for each Name, Segment1, Segment2 instead of a row count.

Ignore the follow up question, on further testing the code is giving me results as intended and luckily in my case row count works too ^^ continue having a good day sir!

lbendlin · ‎02-29-2024

Your third step is non-deterministic as you noted. Power Query can handle this via Table.Distinct over a single column but that again ends up being non-deterministic.

let

    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("pZBLDsIgEIbvwroeog80jZooxRBtukCcVGKBBhuNt3eKTSxbDSz+mfnmWdck7fsOSEI4qKt1nWtfaFA7gB+ktgbsgHaa5QVpkt/pDpQzqCrwD63gjnIn2DpQmbT40IPipm2LaqnRpcbUTcnpf9SsJVIsYCc4exkmBx9WGIdkooiDi9wZAz50wC8oi4FZ5XFrQT/h/ROvMRUNdQ957J/lfS/Dj4zGGPfyMrXelquKNM0b", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Name = _t, #"Segment 1" = _t, #"Segment 2" = _t, #"Account ID" = _t]),

    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Name", type text}, {"Segment 1", type text}, {"Segment 2", type text}, {"Account ID", type text}}),
    #"Grouped Rows" = Table.Group(#"Changed Type", {"Name", "Segment 1", "Segment 2"}, {{"Count", each Table.RowCount(_), Int64.Type}}),
    #"Grouped Rows1" = Table.Group(#"Grouped Rows", {"Name", "Count"}, {{"Rows", each _, type table [Name=nullable text, Segment 1=nullable text, Segment 2=nullable text, Count=number]}}),
    #"Sorted Rows" = Table.Sort(#"Grouped Rows1",{{"Name", Order.Ascending}, {"Count", Order.Descending}}),
    #"Removed Duplicates" = Table.Distinct(#"Sorted Rows", {"Name"}),
    #"Expanded Rows" = Table.ExpandTableColumn(#"Removed Duplicates", "Rows", {"Segment 1", "Segment 2"}, {"Segment 1", "Segment 2"}),
    #"Filtered Rows" = Table.SelectRows(#"Expanded Rows", each ([Segment 1] <> "")),
    #"Removed Duplicates1" = Table.Distinct(#"Filtered Rows", {"Name"})
in
    #"Removed Duplicates1"

How to use this code: Create a new Blank Query. Click on "Advanced Editor". Replace the code in the window with the code provided here. Click "Done". Once you examined the code, replace the Source step with your own source.

How to group/summarize table in Power BI based on criteria

Helpful resources

Join our Fabric User Panel

Power BI Monthly Update - June 2025

Fabric Community Update - June 2025

Join the #PBI10 DataViz contest

How to group/summarize table in Power BI based on criteria

Helpful resources

Join our Fabric User Panel

Power BI Monthly Update - June 2025

Fabric Community Update - June 2025