Join us at FabCon Atlanta from March 16 - 20, 2026, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.
Register now!The Power BI Data Visualization World Championships is back! Get ahead of the game and start preparing now! Learn more
Hi Everybody,
I am struggling with a problem that i can't find the solution for. Hopefully you guys have some advice!
I'm analyzing Public transport data extracted from several busses. Everytime a bus sends out a signal, a new row is created in our database. This generates a lot of redundant data. Below you'll find an example of the data. As you see there are multple rows generated for one JourneyNumber.
The final result that i'm aiming for are to only keep the orange rows and to filter them out by creating a formula that extracts one row that contains a unique combination of the columns "LinePlanningNumber", "Public name", JourneyStopType" and "JourneyNumber". I want to base this filter on the last known "LastUpdateTimeStamp" (as shown in the picture).
Does someone has experience on this matter?
Do these replies help solve your problem? If so, kindly Accept an appropriate reply as the solution. More people will benefit from it. If you are still confused about it, please provide more details about the problem so other community users can help it further.
Regards,
Community Support Team _ Jing
It does appear that the JourneyNumber does have the same values for the other columns you want to include in the unique combination, so using those other columns in the transformation step would not be required to keep the max date for each set. However, this is how you can achieve the result.
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjUwNVDSUXLJLyoAUv4FCiWZWSlAliEQe/qFuAb5urp4Ooa4QoWMLHSNdY0MjIwUDA2sDMyswJpNjSxMjZVidSg0zRyHaV75RSmJeZ6JiXkUuNDSxIhaZpqjmBkLAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [LinePlanningNumber = _t, PublicName = _t, #"Klasse Arrival" = _t, LineDirection = _t, JourneyStopType = _t, #"Count of Klasse Arrival" = _t, LastUpdateTimeStamp = _t, JourneyNumber = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"LinePlanningNumber", Int64.Type}, {"PublicName", type text}, {"Klasse Arrival", type text}, {"LineDirection", Int64.Type}, {"JourneyStopType", type text}, {"Count of Klasse Arrival", Int64.Type}, {"LastUpdateTimeStamp", type text}, {"JourneyNumber", Int64.Type}}),
#"Merged Columns" = Table.AddColumn(#"Changed Type", "UniqueCombination", each Text.Combine ( { Number.ToText ( [LinePlanningNumber] ), [PublicName], [JourneyStopType], Number.ToText ( [JourneyNumber] ) }, "|" ), type text ),
#"Grouped Rows" = Table.Group(#"Merged Columns", {"UniqueCombination", "LinePlanningNumber", "PublicName", "Klasse Arrival", "LineDirection", "JourneyStopType", "Count of Klasse Arrival", "JourneyNumber"}, {{"MaxUpdateTimeStamp", each List.Max([LastUpdateTimeStamp]), type nullable text}, {"Data", each _, type table [LinePlanningNumber=nullable number, PublicName=nullable text, Klasse Arrival=nullable text, LineDirection=nullable number, JourneyStopType=nullable text, Count of Klasse Arrival=nullable number, LastUpdateTimeStamp=nullable text, JourneyNumber=nullable number, UniqueCombination=text]}})
in
#"Grouped Rows"
If Query folding is not very important (as Table.Distinct in the last step will break query folding), then you can use below solution.
See the working here - Open a blank query - Home - Advanced Editor - Remove everything from there and paste the below code to test (later on when you use the query on your dataset, you will have to change the source appropriately. If you have columns other than these, then delete Changed type step and do a Changed type for complete table from UI again)
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjUwNVDSUXLJLyoAUo5AbAjEnnklQNJY38hC38jAyEjB0MDKwAwoYmqkFKtDiiZzojVZQjQZWBmB9ZjRWI8lNj2JScl4/APUYgQUsLAgSYsJyVqMDaFaYgE=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [LinePlanningNumber = _t, PublicName = _t, #"Klasse Arrival" = _t, LineDirection = _t, JourneyStopType = _t, LastUpdateTimeStamp = _t, JourneyNumber = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"LinePlanningNumber", Int64.Type}, {"PublicName", type text}, {"Klasse Arrival", type text}, {"LineDirection", Int64.Type}, {"JourneyStopType", type text}, {"LastUpdateTimeStamp", type datetime}, {"JourneyNumber", Int64.Type}}),
#"Sorted Rows" = Table.Buffer(Table.Sort(#"Changed Type",{"LinePlanningNumber", "PublicName","JourneyStopType", "JourneyNumber", {"LastUpdateTimeStamp", Order.Descending}})),
#"Removed Duplicates" = Table.Distinct(#"Sorted Rows", {"LinePlanningNumber", "PublicName", "JourneyStopType", "JourneyNumber"})
in
#"Removed Duplicates"
Hi @Annonymous3991 ,
1) Multi-select (Ctrl+click) [LinePlanningNumber], [PublicName], [JourneyStopType], and [JourneyNumber].
2) Go to Home tab > Group By and add MAX [LastUpdateTimeStamp] as the aggregate column.
If you want to keep all the other columns as well, you will need to add an 'All Rows' aggregated column instead, then add a new custom column and use Table.Max to pick out the row with the maximum LastUpdate value, then expand the 'All Rows' column back out. *NB* This will break query folding if you're working on a foldabe source.
Pete
Proud to be a Datanaut!
Examining your data, it seems (in Power Query) you could Group by Journey Number and Target Arrival Time, and then retain last row of each sub table.
Or, possibly simpler, Group only by Journey Number but use the GroupKind.Local parameter
eg: (will need to adapt to your actual data)
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjUwNVDSUXLJLyoAUv4FCiWZWSlAliEQe/qFuAb5urp4Ooa4QoWMLHSNdY0MjIwUDA2sDMyswJpNjSxMjZVidSg0zRyHaV75RSmJeZ6JiXkUuNDSxIhaZpqjmBkLAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [LinePlanningNumber = _t, PublicName = _t, #"Klasse Arrival" = _t, LineDirection = _t, JourneyStopType = _t, #"Count of Klasse Arrival" = _t, LastUpdateTimeStamp = _t, JourneyNumber = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"LinePlanningNumber", Int64.Type}, {"PublicName", type text}, {"Klasse Arrival", type text}, {"LineDirection", Int64.Type}, {"JourneyStopType", type text}, {"Count of Klasse Arrival", Int64.Type}, {"LastUpdateTimeStamp", type text}, {"JourneyNumber", Int64.Type}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"JourneyNumber"}, {
{"Count", each Table.Last(_)}
}, GroupKind.Local),
#"Removed Columns" = Table.RemoveColumns(#"Grouped Rows",{"JourneyNumber"}),
#"Expanded Count" = Table.ExpandRecordColumn(#"Removed Columns", "Count", {"LinePlanningNumber", "PublicName", "Klasse Arrival", "LineDirection", "JourneyStopType", "Count of Klasse Arrival", "LastUpdateTimeStamp", "JourneyNumber"}, {"LinePlanningNumber", "PublicName", "Klasse Arrival", "LineDirection", "JourneyStopType", "Count of Klasse Arrival", "LastUpdateTimeStamp", "JourneyNumber"})
in
#"Expanded Count"
The Power BI Data Visualization World Championships is back! Get ahead of the game and start preparing now!