Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Find everything you need to get certified on Fabric—skills challenges, live sessions, exam prep, role guidance, and more. Get started

Reply
Nilselmano
Frequent Visitor

Power query: Group by Date Column and keep Max for all other columns - Dynamic approach

Hi,
I am looking for help with the group by function in Power Query.
I am trying to consolidate files from a folder where historical data is placed in separate CSV files with a common key which is the date/time stamp, however, some periods might be overlapping when a new file is added.
I cannot simply remove duplicate date/time values since some "measuement" are taken at the exact same time and is thus not a duplicate, but others are.
My idea is to group by the date column and then keep the maximum value of the other columns and this works fine.

But since new data is added regularly and sometimes with new columns i want to do this without hardcoding column names.
This is the formula generated from the built in function in PQ:

 

= Table.Group(Source, {"Date"}, {{"Column 1", each List.Max([Column 1]), type nullable number}, {"Column 2", each List.Max([Column 2]), type number}, {"Column n", each List.Max([Column n]), type nullable number}})

 

As you can see Column name 1 to n is hardcoded.

So in short i want to group all other columns by the "Date" column and keep the max values.

Any pointers, thanks in advance.

1 ACCEPTED SOLUTION
ronrsnfld
Super User
Super User

You can create a dynamic List of functions, based on the column names, for both setting the data types, and also performing the aggregations in the Grouping.

 

In the aggregation transformation, C represents the column name; the underscore ( _ ) represents the subtable returned in the Table.Group function

 

 

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("TY7RDQAhCEN34dtExTv1ZjHuv4ZQMMdHTekjxbWoZs5cKRH0yPOa2UmhDqxBuyvDSZcBIV9TAtEi1A7RFwDa1UzIgC51v9Tg/9wK/BcRTIcFrXsf", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Date = _t, Column1 = _t, Column2 = _t, Column3 = _t, Column4 = _t]),
    
//List of column names EXCEPT for "Date"
    maxCols= List.RemoveItems(Table.ColumnNames(Source),{"Date"}),

//set data types
    #"Changed Type" = Table.TransformColumnTypes(Source,
        {{"Date", type date}} & List.Transform(maxCols, each {_, Int64.Type})),

//Group and create list of aggregation functions
    group = Table.Group(#"Changed Type",
            "Date",
            List.Transform(maxCols, (C) => {C, each List.Max(Table.Column(_, C)), type number})
)

in
    group

 

Source Data

ronrsnfld_0-1633047437559.png

 

Typed

ronrsnfld_1-1633047480539.png

Results

ronrsnfld_2-1633047520143.png

 

 

View solution in original post

2 REPLIES 2
ronrsnfld
Super User
Super User

You can create a dynamic List of functions, based on the column names, for both setting the data types, and also performing the aggregations in the Grouping.

 

In the aggregation transformation, C represents the column name; the underscore ( _ ) represents the subtable returned in the Table.Group function

 

 

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("TY7RDQAhCEN34dtExTv1ZjHuv4ZQMMdHTekjxbWoZs5cKRH0yPOa2UmhDqxBuyvDSZcBIV9TAtEi1A7RFwDa1UzIgC51v9Tg/9wK/BcRTIcFrXsf", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Date = _t, Column1 = _t, Column2 = _t, Column3 = _t, Column4 = _t]),
    
//List of column names EXCEPT for "Date"
    maxCols= List.RemoveItems(Table.ColumnNames(Source),{"Date"}),

//set data types
    #"Changed Type" = Table.TransformColumnTypes(Source,
        {{"Date", type date}} & List.Transform(maxCols, each {_, Int64.Type})),

//Group and create list of aggregation functions
    group = Table.Group(#"Changed Type",
            "Date",
            List.Transform(maxCols, (C) => {C, each List.Max(Table.Column(_, C)), type number})
)

in
    group

 

Source Data

ronrsnfld_0-1633047437559.png

 

Typed

ronrsnfld_1-1633047480539.png

Results

ronrsnfld_2-1633047520143.png

 

 

Hi ronrsnfld,

I tried this out today and it worked as intended, and i learned a lot from your comment.

Thank you!

Helpful resources

Announcements
Fabcon_Europe_Social_Bogo

Europe’s largest Microsoft Fabric Community Conference

Join the community in Stockholm for expert Microsoft Fabric learning including a very exciting keynote from Arun Ulag, Corporate Vice President, Azure Data.

Power BI Carousel June 2024

Power BI Monthly Update - June 2024

Check out the June 2024 Power BI update to learn about new features.

PBI_Carousel_NL_June

Fabric Community Update - June 2024

Get the latest Fabric updates from Build 2024, key Skills Challenge voucher deadlines, top blogs, forum posts, and product ideas.

RTI Forums Carousel3

New forum boards available in Real-Time Intelligence.

Ask questions in Eventhouse and KQL, Eventstream, and Reflex.

Top Solution Authors
Top Kudoed Authors