Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Join us for an expert-led overview of the tools and concepts you'll need to become a Certified Power BI Data Analyst and pass exam PL-300. Register now.

Reply
Fahdalyk
Frequent Visitor

How to group/summarize table in Power BI based on criteria

I have been struggling with a problem for quite a while now. I have mainly been using Excel to go around this every time but I'm sure it can be done and automated in Power BI which is what I need help with. I will highlight my problem and step by step process I do to achieve my desired result and hopefully someone can help translate this to Power BI.

1.png
Image 1 is what my starting table looks like. The objective is to return only 1 row per 'Name'. First rule should be the Name, Segment 1, Segment 2 with the max number of 'Account ID' under it. The second is if incase there's a tie, we should get any or the row with non blank Segment 1 and Segment 2 (whatever is easier).

2.png
In the second step I take a count of each Account under said Name, Segment 1, Segment 2.

3.png
In the third step, I take max of count of Account under said Name.

4.png

This is the final desired output. Continuing from the third step, I applied a lookup on 'Name & Max of Acc ID' (Create a new ID column) against Image 2 (where I also create 'Name & Max of Acc ID' field, and return Segment 1 and Segment 2. The only manual adjustment I have made here is input Segment 1 and Segment 2 for 'Qwerty' as it had a tie on all 3 segments and I wanted the non blank ones. If this is too difficult, I am fine with returning first match or whatever here.


Would be grateful if someone can help out!


I have tried solving this problem using a DAX measure or Power Query but my Power BI skills are nowhere like my Excel skills. I managed to make some progress but still getting stuck midway.

1 ACCEPTED SOLUTION
dufoq3
Super User
Super User

Hi @Fahdalyk, different approach here.

For Zebra - Segment 1 column there is no rule which one to choose, but it could be i.e. first or last from alphabetical order etc...

 

Result

dufoq3_0-1709379656028.png

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("pZBLDsIgEIbvwroeog80jZooxRBtukCcVGKBBhuNt3eKTSxbDSz+mfnmWdck7fsOSEI4qKt1nWtfaFA7gB+ktgbsgHaa5QVpkt/pDpQzqCrwD63gjnIn2DpQmbT40IPipm2LaqnRpcbUTcnpf9SsJVIsYCc4exkmBx9WGIdkooiDi9wZAz50wC8oi4FZ5XFrQT/h/ROvMRUNdQ957J/lfS/Dj4zGGPfyMrXelquKNM0b", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Name = _t, #"Segment 1" = _t, #"Segment 2" = _t, #"Account ID" = _t]),
    GroupedRows1 = Table.Group(Source, {"Name", "Segment 1", "Segment 2"}, {{"Count of Account ID", each Table.RowCount(_), Int64.Type}}),
    GroupedRows2 = Table.Group(GroupedRows1, {"Name"}, {{"Max of Count of Account ID", each List.Max([Count of Account ID]), type number}}),
    MergedQueryItself = Table.NestedJoin(GroupedRows2, {"Name", "Max of Count of Account ID"}, GroupedRows1, {"Name", "Count of Account ID"}, "GroupedRows1", JoinKind.LeftOuter),
    Ad_GropuedRows1Record = Table.TransformColumns(MergedQueryItself, {{"GroupedRows1", each 
     Table.Max(
        Table.AddColumn(_, "Count", (x)=> List.Count(List.Select(Record.ToList(Record.SelectFields(x, {"Segment 1", "Segment 2"}) ), (y)=> y <> null and y <> "" ))),  //Add Column which counts not blank values in 'Segment 1' and 'Segment 2'
        "Count" ) }} ),
    #"Expanded GroupedRows1" = Table.ExpandRecordColumn(Ad_GropuedRows1Record, "GroupedRows1", {"Segment 1", "Segment 2"}, {"Segment 1", "Segment 2"})
in
    #"Expanded GroupedRows1"

 


Note: Check this link to learn how to use my query.
Check this link if you don't know how to provide sample data.

View solution in original post

3 REPLIES 3
dufoq3
Super User
Super User

Hi @Fahdalyk, different approach here.

For Zebra - Segment 1 column there is no rule which one to choose, but it could be i.e. first or last from alphabetical order etc...

 

Result

dufoq3_0-1709379656028.png

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("pZBLDsIgEIbvwroeog80jZooxRBtukCcVGKBBhuNt3eKTSxbDSz+mfnmWdck7fsOSEI4qKt1nWtfaFA7gB+ktgbsgHaa5QVpkt/pDpQzqCrwD63gjnIn2DpQmbT40IPipm2LaqnRpcbUTcnpf9SsJVIsYCc4exkmBx9WGIdkooiDi9wZAz50wC8oi4FZ5XFrQT/h/ROvMRUNdQ957J/lfS/Dj4zGGPfyMrXelquKNM0b", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Name = _t, #"Segment 1" = _t, #"Segment 2" = _t, #"Account ID" = _t]),
    GroupedRows1 = Table.Group(Source, {"Name", "Segment 1", "Segment 2"}, {{"Count of Account ID", each Table.RowCount(_), Int64.Type}}),
    GroupedRows2 = Table.Group(GroupedRows1, {"Name"}, {{"Max of Count of Account ID", each List.Max([Count of Account ID]), type number}}),
    MergedQueryItself = Table.NestedJoin(GroupedRows2, {"Name", "Max of Count of Account ID"}, GroupedRows1, {"Name", "Count of Account ID"}, "GroupedRows1", JoinKind.LeftOuter),
    Ad_GropuedRows1Record = Table.TransformColumns(MergedQueryItself, {{"GroupedRows1", each 
     Table.Max(
        Table.AddColumn(_, "Count", (x)=> List.Count(List.Select(Record.ToList(Record.SelectFields(x, {"Segment 1", "Segment 2"}) ), (y)=> y <> null and y <> "" ))),  //Add Column which counts not blank values in 'Segment 1' and 'Segment 2'
        "Count" ) }} ),
    #"Expanded GroupedRows1" = Table.ExpandRecordColumn(Ad_GropuedRows1Record, "GroupedRows1", {"Segment 1", "Segment 2"}, {"Segment 1", "Segment 2"})
in
    #"Expanded GroupedRows1"

 


Note: Check this link to learn how to use my query.
Check this link if you don't know how to provide sample data.

Thank you so much! This worked perfectly. You are my hero ❤️

A follow up question: In case the Account IDs were not the same in the starting dataset for each Name, Segment 1, Segment 2. So in Apple, Technology, Entertainment case we had ABCD for 1st row and DCBR for 2nd row- and we would need a distinct count instead of a row count. Can this code be easily updated or it would require a new approach? My bad for not asking this in my original post, but I actually need a distinct count of Account ID's for each Name, Segment1, Segment2 instead of a row count.


Ignore the follow up question, on further testing the code is giving me results as intended and luckily in my case row count works too ^^ continue having a good day sir!

lbendlin
Super User
Super User

Your third step is non-deterministic as you noted.  Power Query can handle this via Table.Distinct over a single column but that again ends up being non-deterministic.

 

lbendlin_0-1709253845495.png

 

let

    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("pZBLDsIgEIbvwroeog80jZooxRBtukCcVGKBBhuNt3eKTSxbDSz+mfnmWdck7fsOSEI4qKt1nWtfaFA7gB+ktgbsgHaa5QVpkt/pDpQzqCrwD63gjnIn2DpQmbT40IPipm2LaqnRpcbUTcnpf9SsJVIsYCc4exkmBx9WGIdkooiDi9wZAz50wC8oi4FZ5XFrQT/h/ROvMRUNdQ957J/lfS/Dj4zGGPfyMrXelquKNM0b", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Name = _t, #"Segment 1" = _t, #"Segment 2" = _t, #"Account ID" = _t]),

    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Name", type text}, {"Segment 1", type text}, {"Segment 2", type text}, {"Account ID", type text}}),
    #"Grouped Rows" = Table.Group(#"Changed Type", {"Name", "Segment 1", "Segment 2"}, {{"Count", each Table.RowCount(_), Int64.Type}}),
    #"Grouped Rows1" = Table.Group(#"Grouped Rows", {"Name", "Count"}, {{"Rows", each _, type table [Name=nullable text, Segment 1=nullable text, Segment 2=nullable text, Count=number]}}),
    #"Sorted Rows" = Table.Sort(#"Grouped Rows1",{{"Name", Order.Ascending}, {"Count", Order.Descending}}),
    #"Removed Duplicates" = Table.Distinct(#"Sorted Rows", {"Name"}),
    #"Expanded Rows" = Table.ExpandTableColumn(#"Removed Duplicates", "Rows", {"Segment 1", "Segment 2"}, {"Segment 1", "Segment 2"}),
    #"Filtered Rows" = Table.SelectRows(#"Expanded Rows", each ([Segment 1] <> "")),
    #"Removed Duplicates1" = Table.Distinct(#"Filtered Rows", {"Name"})
in
    #"Removed Duplicates1"

 

How to use this code: Create a new Blank Query. Click on "Advanced Editor". Replace the code in the window with the code provided here. Click "Done". Once you examined the code, replace the Source step with your own source.

 

 

 

Helpful resources

Announcements
Join our Fabric User Panel

Join our Fabric User Panel

This is your chance to engage directly with the engineering team behind Fabric and Power BI. Share your experiences and shape the future.

June 2025 Power BI Update Carousel

Power BI Monthly Update - June 2025

Check out the June 2025 Power BI update to learn about new features.

June 2025 community update carousel

Fabric Community Update - June 2025

Find out what's new and trending in the Fabric community.