Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

The ultimate Microsoft Fabric, Power BI, Azure AI & SQL learning event! Join us in Las Vegas from March 26-28, 2024. Use code MSCUST for a $100 discount. Register Now

Reply
Nilselmano
Frequent Visitor

Power query: Group by Date Column and keep Max for all other columns - Dynamic approach

Hi,
I am looking for help with the group by function in Power Query.
I am trying to consolidate files from a folder where historical data is placed in separate CSV files with a common key which is the date/time stamp, however, some periods might be overlapping when a new file is added.
I cannot simply remove duplicate date/time values since some "measuement" are taken at the exact same time and is thus not a duplicate, but others are.
My idea is to group by the date column and then keep the maximum value of the other columns and this works fine.

But since new data is added regularly and sometimes with new columns i want to do this without hardcoding column names.
This is the formula generated from the built in function in PQ:

 

= Table.Group(Source, {"Date"}, {{"Column 1", each List.Max([Column 1]), type nullable number}, {"Column 2", each List.Max([Column 2]), type number}, {"Column n", each List.Max([Column n]), type nullable number}})

 

As you can see Column name 1 to n is hardcoded.

So in short i want to group all other columns by the "Date" column and keep the max values.

Any pointers, thanks in advance.

1 ACCEPTED SOLUTION
ronrsnfld
Community Champion
Community Champion

You can create a dynamic List of functions, based on the column names, for both setting the data types, and also performing the aggregations in the Grouping.

 

In the aggregation transformation, C represents the column name; the underscore ( _ ) represents the subtable returned in the Table.Group function

 

 

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("TY7RDQAhCEN34dtExTv1ZjHuv4ZQMMdHTekjxbWoZs5cKRH0yPOa2UmhDqxBuyvDSZcBIV9TAtEi1A7RFwDa1UzIgC51v9Tg/9wK/BcRTIcFrXsf", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Date = _t, Column1 = _t, Column2 = _t, Column3 = _t, Column4 = _t]),
    
//List of column names EXCEPT for "Date"
    maxCols= List.RemoveItems(Table.ColumnNames(Source),{"Date"}),

//set data types
    #"Changed Type" = Table.TransformColumnTypes(Source,
        {{"Date", type date}} & List.Transform(maxCols, each {_, Int64.Type})),

//Group and create list of aggregation functions
    group = Table.Group(#"Changed Type",
            "Date",
            List.Transform(maxCols, (C) => {C, each List.Max(Table.Column(_, C)), type number})
)

in
    group

 

Source Data

ronrsnfld_0-1633047437559.png

 

Typed

ronrsnfld_1-1633047480539.png

Results

ronrsnfld_2-1633047520143.png

 

 

View solution in original post

2 REPLIES 2
ronrsnfld
Community Champion
Community Champion

You can create a dynamic List of functions, based on the column names, for both setting the data types, and also performing the aggregations in the Grouping.

 

In the aggregation transformation, C represents the column name; the underscore ( _ ) represents the subtable returned in the Table.Group function

 

 

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("TY7RDQAhCEN34dtExTv1ZjHuv4ZQMMdHTekjxbWoZs5cKRH0yPOa2UmhDqxBuyvDSZcBIV9TAtEi1A7RFwDa1UzIgC51v9Tg/9wK/BcRTIcFrXsf", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Date = _t, Column1 = _t, Column2 = _t, Column3 = _t, Column4 = _t]),
    
//List of column names EXCEPT for "Date"
    maxCols= List.RemoveItems(Table.ColumnNames(Source),{"Date"}),

//set data types
    #"Changed Type" = Table.TransformColumnTypes(Source,
        {{"Date", type date}} & List.Transform(maxCols, each {_, Int64.Type})),

//Group and create list of aggregation functions
    group = Table.Group(#"Changed Type",
            "Date",
            List.Transform(maxCols, (C) => {C, each List.Max(Table.Column(_, C)), type number})
)

in
    group

 

Source Data

ronrsnfld_0-1633047437559.png

 

Typed

ronrsnfld_1-1633047480539.png

Results

ronrsnfld_2-1633047520143.png

 

 

Hi ronrsnfld,

I tried this out today and it worked as intended, and i learned a lot from your comment.

Thank you!

Helpful resources

Announcements
Fabric Community Conference

Microsoft Fabric Community Conference

Join us at our first-ever Microsoft Fabric Community Conference, March 26-28, 2024 in Las Vegas with 100+ sessions by community experts and Microsoft engineering.

February 2024 Update Carousel

Power BI Monthly Update - February 2024

Check out the February 2024 Power BI update to learn about new features.

Fabric Career Hub

Microsoft Fabric Career Hub

Explore career paths and learn resources in Fabric.

Fabric Partner Community

Microsoft Fabric Partner Community

Engage with the Fabric engineering team, hear of product updates, business opportunities, and resources in the Fabric Partner Community.

Top Solution Authors
Top Kudoed Authors