Applicable88
Impactful Individual

How to group rows that belong together without a common primary key?

Hello,

 

I have the following sample table, which I already sorted in Power Query by wagon number and start timestamp. This is only possible because Power Query can sort on multiple columns, while Excel would mix it up a bit:

 

MaterialNo.  WagonNo  TimestampStart
345434       111      13.01.2023 17:10:33
534534       111      13.01.2023 17:10:33
534533       111      13.01.2023 17:11:55
645343       555      13.01.2023 21:10:49
453453       555      13.01.2023 21:12:49
345434       555      13.01.2023 21:13:49
578975       111      13.01.2023 22:10:33
435434       111      13.01.2023 22:10:33
453455       222      13.01.2023 22:10:33
734543       222      13.01.2023 22:10:49

 

The table is a record of the transports with the different wagons we use to move items between A and B. The quantity of items per transport can differ each time. Every time we load items onto a wagon, it transports them from A to B and comes back empty to be reloaded. The MaterialNo. is random, so there is no common ID per transport. The only thing I can do is sort by WagonNo. and TimestampStart in ascending order to see which items were transported together on that wagon.

 

Is there a way to produce an artificial index? For example, the first transport with WagonNo. "111", which happened around 17h, is not the same as the one happening at 22h on the same day. Both should get a different IndexID.

For example:

IndexID  MaterialNo.  WagonNo  TimestampStart
1        345434       111      13.01.2023 17:10:33
1        534534       111      13.01.2023 17:10:33
1        534533       111      13.01.2023 17:11:55
2        645343       555      13.01.2023 21:10:49
2        453453       555      13.01.2023 21:12:49
2        345434       555      13.01.2023 21:13:49
3        578975       111      13.01.2023 22:10:33
3        435434       111      13.01.2023 22:10:33
4        453455       222      13.01.2023 22:10:33
4        734543       222      13.01.2023 22:10:49

 

With that I could do all sorts of average and count calculations, because I can differentiate the individual transports.
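
Once such an IndexID exists, a grouping step along the following lines would give the per-transport figures (only a sketch; it assumes the query is named Transports and that the IndexID and TimestampStart columns exist under exactly those names, with TimestampStart typed as datetime):

let
    Source = Transports,
    // one summary row per transport: how many materials were loaded and the first/last loading time
    #"Grouped Transports" = Table.Group(Source, {"IndexID"}, {
        {"ItemCount", each Table.RowCount(_), Int64.Type},
        {"FirstStart", each List.Min([TimestampStart]), type datetime},
        {"LastStart", each List.Max([TimestampStart]), type datetime}
    })
in
    #"Grouped Transports"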

Thank you very much in advance. 

Best. 

1 ACCEPTED SOLUTION
Daryl-Lynch-Bzy
Resident Rockstar

Hi @Applicable88 - please try the following:

let
    // sample data from the thread, serialized as compressed JSON
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("ldDRCcMwDATQVYq+A/GdpLrxKib7rxHLkJIUIuiPweJxnNS7qLmpySIAxluwgisLdXxeqA2lqcq+dPFB/5CaSDT3Kd8RGWMfg19JRKZtU9oMTSRP+d3oSeopvX626g89yctGpsmV7jJ6RibJXNZZNJXRcz8A", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"TransportNo/" = _t, WagonNo = _t, TimeStampStartDate = _t, TimeStampStartTime = _t]),
    #"Parsed Time" = Table.TransformColumns(Source,{{"TimeStampStartTime", each Time.From(DateTimeZone.From(_)), type time}}),
    #"Parsed Date" = Table.TransformColumns(#"Parsed Time",{{"TimeStampStartDate", each Date.From(DateTimeZone.From(_)), type date}}),
    #"Changed Type" = Table.TransformColumnTypes(#"Parsed Date",{{"WagonNo", Int64.Type}, {"TransportNo/", Int64.Type}}),
    // 0-based row position, used to look up the previous row
    #"Added Index" = Table.AddIndexColumn(#"Changed Type", "Index", 0, 1, Int64.Type),
    // true when the previous row has a different WagonNo (or when there is no previous row)
    #"Added Custom" = Table.AddColumn(#"Added Index", "IndexTest", each try #"Added Index"{[Index=([Index]-1)]}[WagonNo] <> [WagonNo] otherwise true, type logical),
    // keep the row position only where a new load starts ...
    #"Added Custom1" = Table.AddColumn(#"Added Custom", "Load Group ID", each if [IndexTest] then [Index] else null, Int64.Type),
    // ... and copy it down to the remaining rows of the same load
    #"Filled Down" = Table.FillDown(#"Added Custom1",{"Load Group ID"}),
    #"Removed Columns" = Table.RemoveColumns(#"Filled Down",{"Index", "IndexTest"})
in
    #"Removed Columns"


4 REPLIES

@Daryl-Lynch-Bzy that's great, thank you very much! That solved my problem. You are using the index number as the new unique identifier.

I just changed one thing: since my data has many rows, it took a very long time to evaluate with the "try ... [Index]-1" step. I guess it takes a long time to check whether the previous row has the same value.

Instead I made two index columns, one starting at zero and one starting at 1, and did a left outer join of the table with itself to bring the previous row's WagonNo. up next to the current row. Then I continued with the same procedure you used. The left join saved a ton of time.
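
That merge-based variant might look roughly like the following (only a sketch; SortedSource stands for the already sorted table with a WagonNo column, and all helper column names are made up):

let
    Source = SortedSource,
    // position of the current row: 0, 1, 2, ...
    #"Added Index0" = Table.AddIndexColumn(Source, "Index0", 0, 1, Int64.Type),
    // position + 1, so that row n carries the key that row n+1 will look for
    #"Added Index1" = Table.AddIndexColumn(#"Added Index0", "Index1", 1, 1, Int64.Type),
    // left outer join of the table with itself: current row's Index0 = previous row's Index1
    #"Merged Previous" = Table.NestedJoin(#"Added Index1", {"Index0"}, #"Added Index1", {"Index1"}, "Previous", JoinKind.LeftOuter),
    #"Expanded Previous" = Table.ExpandTableColumn(#"Merged Previous", "Previous", {"WagonNo"}, {"PrevWagonNo"}),
    // merges do not guarantee row order, so restore it before filling down
    #"Restored Order" = Table.Sort(#"Expanded Previous", {{"Index0", Order.Ascending}}),
    // a new transport starts whenever the wagon differs from the previous row (the first row has no previous row)
    #"Added IsNewTransport" = Table.AddColumn(#"Restored Order", "IsNewTransport", each ([PrevWagonNo] = null) or ([PrevWagonNo] <> [WagonNo]), type logical),
    #"Added IndexID" = Table.AddColumn(#"Added IsNewTransport", "IndexID", each if [IsNewTransport] then [Index0] else null, Int64.Type),
    #"Filled Down IndexID" = Table.FillDown(#"Added IndexID", {"IndexID"}),
    #"Removed Helpers" = Table.RemoveColumns(#"Filled Down IndexID", {"Index0", "Index1", "PrevWagonNo", "IsNewTransport"})
in
    #"Removed Helpers"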

 

Thank you very much !

Best. 

adudani
Super User

@Applicable88 ,

 

I have grouped by wagon number, start of hour, and end of hour.

 

Additionally, you can group by the start date as well, if required.

 

Paste the code below into a blank query for my solution:

 

let
    // sample data from the thread, serialized as compressed JSON
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("ldDRCcMwDATQVYq+A/GdpLrxKib7rxHLkJIUIuiPweJxnNS7qLmpySIAxluwgisLdXxeqA2lqcq+dPFB/5CaSDT3Kd8RGWMfg19JRKZtU9oMTSRP+d3oSeopvX626g89yctGpsmV7jJ6RibJXNZZNJXRcz8A", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"TransportNo/" = _t, WagonNo = _t, TimeStampStartDate = _t, TimeStampStartTime = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"TransportNo/", Int64.Type}, {"WagonNo", Int64.Type}, {"TimeStampStartDate", type date}, {"TimeStampStartTime", type time}}),
    // derive the hour window each loading falls into
    #"Inserted Start of Hour" = Table.AddColumn(#"Changed Type", "Start of Hour", each Time.StartOfHour([TimeStampStartTime]), type time),
    #"Inserted End of Hour" = Table.AddColumn(#"Inserted Start of Hour", "End of Hour", each Time.EndOfHour([Start of Hour]), type time),
    // group the rows that share a wagon and an hour window, then number the groups
    #"Grouped Rows" = Table.Group(#"Inserted End of Hour", {"WagonNo", "Start of Hour", "End of Hour"}, {{"Count", each _, type table [#"TransportNo/"=nullable number, WagonNo=nullable number, TimeStampStartDate=nullable date, TimeStampStartTime=nullable time, #"Start of Hour"=time, #"End of Hour"=time]}}),
    #"Added Index" = Table.AddIndexColumn(#"Grouped Rows", "Index", 1, 1, Int64.Type),
    #"Expanded Count" = Table.ExpandTableColumn(#"Added Index", "Count", {"TransportNo/", "WagonNo", "TimeStampStartDate", "TimeStampStartTime"}, {"Count.TransportNo/", "Count.WagonNo", "Count.TimeStampStartDate", "Count.TimeStampStartTime"}),
    #"Renamed Columns" = Table.RenameColumns(#"Expanded Count",{{"Count.TimeStampStartDate", "TimeStampStartDate"}, {"Count.TimeStampStartTime", "TimeStampStartTime"}})
in
    #"Renamed Columns"


Appreciate a thumbs up if this is helpful.

 

Please accept as the solution if it resolves your query

 

 

Kind Regards,
Avinash

Hello @adudani ,

 

thank you so far, but I think this won't work. 

 

Maybe my sample table didn't cover all the possibilities. Here is a small change:

IndexID  MaterialNo.  WagonNo  TimestampStart
1        345434       111      13.01.2023 17:10:33
1        534534       111      13.01.2023 17:10:33
1        534533       111      13.01.2023 17:11:55
2        645343       555      13.01.2023 21:10:49
2        453453       555      13.01.2023 21:12:49
2        345434       555      13.01.2023 22:13:49
3        578975       111      13.01.2023 22:10:33
3        435434       111      13.01.2023 23:12:33
4        453455       222      13.01.2023 22:10:33
4        734543       222      13.01.2023 22:10:49

 

You simply added the extracted hour as start-of-hour and end-of-hour columns. But in reality the transports are very frequent, or can also lie an hour apart. For example, the second "111" transport has two items, MaterialNo. 578975 and 435434. Both belong to the same transport, but the second item was prepared much later before it got onto the same wagon. If I now use MaterialNo., WagonNo., the start and end hours, and the timestamp for grouping, it won't count the number of materials of the same transport correctly.

I hope you have another idea. The concept is right, but your workaround won't cover items that are not prepared within the same hour. It can also happen that one item is finished at minute 59 and the next one already falls into minute 1 of the next hour.
