Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Power BI is turning 10! Let’s celebrate together with dataviz contests, interactive sessions, and giveaways. Register now.

Reply
Anonymous
Not applicable

Group different rows into one (same data)

Hi all.

 

I need to be able to group different rows, which correspond to the same data, into one. In the image you can see that SHIPMENTID and HUUID is the same for the 3 columns, the only difference is the extraction date. I need the data to look like desidered image.
The idea is to be able to transform 3 rows into 1, by ordering waypoits_actual as 1, 2, 3 in order of extraction date by naming 1frst , 2second and so on.... keeping always Wapoints at first. In case there are more just need to align the same...

Actual data: 

JoseBernardo_2-1660717382088.png

Source.NameShipmentIdHUUIDshipdateWaypointsWaypoints_actualExtraction_date__index_level_0__
DailyExtract_2022-08-01_part1.parquet130159218015faca5b3-ef53-347f-baf2-adc781156c3b30-Jul-22 KNUA-DE8-DEL-DELT8-CDGH-ETZA-XGBA1-Aug-2249253
DailyExtract_2022-07-30_part2.parquet130159218015faca5b3-ef53-347f-baf2-adc781156c3b30-Jul-22KNUA-DELRT-CDGH-ETZA-XGBA 30-Jul-221006581
DailyExtract_2022-07-31_part1.parquet130159218015faca5b3-ef53-347f-baf2-adc781156c3b30-Jul-22 KNUA-DE8-DEL-CDGH-ETZA-XGBA31-Jul-22292748

 

Desired data: 

JoseBernardo_0-1660717503263.png

Source.NameShipmentIdHUUIDshipdateWaypointsWaypoints_actual1frs REROUTE2nd rerouteExtraction_date__index_level_0__
 130159218015faca5b3-ef53-347f-baf2-adc781156c3b30-Jul-22KNUA-DELRT-CDGH-ETZA-XGBAKNUA-DE8-DEL-DELT8-CDGH-ETZA-XGBAKNUA-DE8-DEL-CDGH-ETZA-XGBAKNUA-DE8-DEL-DELT8-CDGH-ETZA-XGBA  

 

I have other columns that can be used to obtain what I need, such as Extraction date / index level (orders my dimesioal date table) etc..

Thanks!

@rohit_singh @Ashish_Mathur 

1 ACCEPTED SOLUTION
tamerj1
Super User
Super User

Hi @Anonymous 
I though I've had replied back to you with updated solution but for some reason I caanot see my reply. Heree is the sample file https://www.dropbox.com/t/mrD8taPTCHSWO8P2

Please try this solution if it works. However, this is not a perfect situation for a 17M rows data.

1.png

Shipments Summary 2 = 
VAR T1 = SUMMARIZE ( Shipments, Shipments[ShipmentId], Shipments[HUUID], "Waypoint", MAX ( Shipments[Waypoints] ) )
VAR T2 = 
    GENERATE ( 
        T1, 
        VAR T3 = CALCULATETABLE ( Shipments )
        VAR T4 = 
            ADDCOLUMNS ( 
                T3, 
                "@Rank", RANKX ( FILTER ( T3, [Waypoints_Actual] <> BLANK ( ) ), [Extraction_date] ) 
            )
        VAR FirstReroute = MAXX ( FILTER ( T4, [@Rank] = 2 ), [Waypoints_actual] )
        VAR SecondReroute = MAXX ( FILTER ( T4, [@Rank] = 1 ), [Waypoints_actual] )
        RETURN  
            ROW ( "1st Route", FirstReroute, "2nd Route", SecondReroute,"3rd Route", FirstReroute )
    )
RETURN
    T2

View solution in original post

36 REPLIES 36
Jeanxyz
Impactful Individual
Impactful Individual

Try use "remove duplicates" in Power Query. 

 

https://docs.microsoft.com/en-us/power-query/working-with-duplicates

tamerj1
Super User
Super User

Hi @Anonymous 
I though I've had replied back to you with updated solution but for some reason I caanot see my reply. Heree is the sample file https://www.dropbox.com/t/mrD8taPTCHSWO8P2

Please try this solution if it works. However, this is not a perfect situation for a 17M rows data.

1.png

Shipments Summary 2 = 
VAR T1 = SUMMARIZE ( Shipments, Shipments[ShipmentId], Shipments[HUUID], "Waypoint", MAX ( Shipments[Waypoints] ) )
VAR T2 = 
    GENERATE ( 
        T1, 
        VAR T3 = CALCULATETABLE ( Shipments )
        VAR T4 = 
            ADDCOLUMNS ( 
                T3, 
                "@Rank", RANKX ( FILTER ( T3, [Waypoints_Actual] <> BLANK ( ) ), [Extraction_date] ) 
            )
        VAR FirstReroute = MAXX ( FILTER ( T4, [@Rank] = 2 ), [Waypoints_actual] )
        VAR SecondReroute = MAXX ( FILTER ( T4, [@Rank] = 1 ), [Waypoints_actual] )
        RETURN  
            ROW ( "1st Route", FirstReroute, "2nd Route", SecondReroute,"3rd Route", FirstReroute )
    )
RETURN
    T2
Anonymous
Not applicable

Thanks!

I tried again , i replace the " shipment" for my data original data table 'Fact - IR_IM Raw data Azure' but i received the following sign "

JoseBernardo_0-1660741228889.png

"The expression refers to multiple columns. Multiple columns cannot be converted to a scalar value" any clue?

@Anonymous 

Would you please double check that you are creating a calculated table not a measure?

Anonymous
Not applicable

Last point would be to do exactly as you showed me, but this time i need only a column to show the lastedes  " waypoit actual". insted of having 1 reroute , 2 reorute . Just another "Waypotin" and nex to this a  column with LASTEST.. i tried different this with out succes..:(

Hi @Anonymous 
Would you like to have a quick meeting perhaps on zoom or teams?

Anonymous
Not applicable

Yes of course! Teams or  zoom for me is fine! please let me know your email and when it works for you? dont know your time zone...

Anonymous
Not applicable

Good news, it woorked fine!

I just see a small thing that dont make sense.I see 1 reroute also posted in 3 rerouted column. 
The point is IF, new "actual waypoits" shows up , another column will creat .. even if there are 4 or 5 rerouted.  In case of the example below, there are only 2 reroute , no 3 on the original data.. how can i modify that? 

JoseBernardo_0-1660745849558.png

 

Anonymous
Not applicable

Yes of course. here you ahve a screen shoot of it.

JoseBernardo_0-1660745182968.png

Is there anything i can do to avoid that sign? the table looks great if we could get the rest of the columns 

 

tamerj1
Super User
Super User

Hi @Anonymous 
If you are interested in a DAX solution then I have two options for you:
Option 1: Dynamic number of columns using a measure. This is the solution that I prefer. It is simple and completely dynamic.

You need to have a filter table that contains the Reroute numbers up to the maximum possible. You can just create it in excel and import it. Here is a simple DAX for up to 10 nos

2.png

Reroute Order = 
SELECTCOLUMNS ( 
    { 
        ( "1st Reroute", 1 ), 
        ( "2nd Reroute", 2 ), 
        ( "3rd Reroute", 3 ), 
        ( "4th Reroute", 4 ), 
        ( "5th Reroute", 5 ), 
        ( "6th Reroute", 6 ), 
        ( "7th Reroute", 7 ), 
        ( "8th Reroute", 8 ), 
        ( "9th Reroute", 9 ), 
        ( "10th Reroute", 10 ) 
    },
    "Reroute Order", [Value1], "# Reroute Order", [Value2] 
)

This will be just a disconnected table that will be used for slicing the matrix (will be placed on the columns of the matrix)

Then you can create your measure as follows. Place ShipmentId and HUUID in the Rows of the matrix and place the measure in the values. Then place the Reroute Order column from the new table in the columns.

1.png

Actual Waypoint = 
VAR T1 = FILTER ( Shipments, [Waypoints_Actual] <> BLANK ( ) )
VAR T2 = ADDCOLUMNS ( T1, "@Rank", RANKX ( T1, [Extraction_date] ) )
VAR CurrentRank = SELECTEDVALUE ( 'Reroute Order'[# Reroute Order] )
VAR T3 = FILTER ( T2, [@Rank] = CurrentRank )
RETURN
    MAXX ( T3, [Waypoints_actual] )

Option 2: Create a new calculated table as follows

3.png

Shipments Summary = 
VAR T1 = SUMMARIZE ( Shipments, Shipments[ShipmentId], Shipments[HUUID] )
VAR T2 = 
    GENERATE ( 
        T1, 
        VAR T3 = CALCULATETABLE ( Shipments )
        VAR T4 = 
            ADDCOLUMNS ( 
                T3, 
                "@Length1", LEN ( [Waypoints] ), 
                "@Lenth2", LEN ( [Waypoints_Actual] ), 
                "@Rank", RANKX ( FILTER ( T3, [Waypoints_Actual] <> BLANK ( ) ), [Extraction_date] ) 
            )
        VAR LongestWaypoint1 = MAXX ( TOPN ( 1, T4, [@Length1] ), [Waypoint] )
        VAR LongestWaypoint2 = MAXX ( TOPN ( 1, T4, [@Length1] ), [Waypoint_Actual] )
        VAR FirstReroute = MAXX ( FILTER ( T4, [@Rank] = 1 ), [Waypoints_actual] )
        VAR SecondReroute = MAXX ( FILTER ( T4, [@Rank] = 2 ), [Waypoints_actual] )
        RETURN  
            ROW ( "Waypoint", LongestWaypoint1, "Waypoint_Actual", LongestWaypoint2, "1st Route", FirstReroute, "2nd Route", SecondReroute )
    )
RETURN
    T2

 This solution is not dynamic you you need to hard-code the number of columns and the names of those columns.

Anonymous
Not applicable

Thanks! looks great both options, I think i am more interested into the 2 option since it would help to have a table which i cna use to set new mesurementes if its need it. 

So i tried the second one. My source data is  'Fact-IR_IM Raw data Azure' so i change it...

JoseBernardo_0-1660725716741.png

Seems to be very havy to deal with right? running out of memory? 

@Anonymous 

Do you want the original Wapoint and Waypoint-Actual columns? Actually I just assumed that you want to keep the one with the longest sting. Maybe this has no meaning. We can skip these unnecessary calculations and save alot of formula engine time. What do you think? Shall I update accordingly? On the other hand I would really recommend using the measure. It takes advantage of the existing filter context this it's more efficient and yet complicately dynamic. It depends on what other measures you are going to create and what is the ultimate report you are looking for but most probably we can solve this out. 

Anonymous
Not applicable

Yes thansk! 
Well i need to keep always the  column " Waypoint" and next to this as many new "waypoint actual " can came.. so if now the row has on 1 waypoint actual.. that would be 1 reroute... if this has 2 so it would have 1 re route and then 2 reroute... and so on.. NEW ACTUAL WAYPOINT= new column based on extraction date of course ..
Like i show it here
Acltual data

JoseBernardo_0-1660727316662.png

Desired data

JoseBernardo_1-1660727577277.png

 

 

 

 

@Anonymous 
Does the original Waypoints columns allways has onlt one value and the rest are blanks?

Tn this example you have 2 Waypoints_Actual. The one that you chose to present in the new table was chosen based on what?

2.png

Anonymous
Not applicable

Each shipment always has only 1 "Waypoints" and can have multiple "Waypoints_actuals", as the latter appear (based on "extraction date" they must be aligned in new columns ( 1 reroute / 2 reroute / 3 reroute...etc)
Let me clarify the picture becasu i made a mistake on one of the headears...

JoseBernardo_0-1660728584780.png

Hope this helps !

@Anonymous 
Now you have two "2nd Reroute"!! 

Anonymous
Not applicable

There it is...

 

JoseBernardo_0-1660729017976.png

Sorry i just copy paste to fast..

 

danextian
Super User
Super User

Hi @Anonymous ,

 

Enter this  M code in a blank query. This might be what you need.

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("vZBNC4JAEIb/y56bmA8316OlFBUdwiAKkdUyAg8lCvXv2z4gqOgUwbzDDDwDD7Neq8juq3N8ampbNBkjM6ABpOxg64a6rh/bbaM6igRJB0wGyW26tIXVucC21ALi+SXktmSwm8I3RLpXSO4wQRi3FTC72dVktgghio3L9JrEwCAajiBOViEsh/3QMUgQtrv7iRewFpV2Plr6IHiz5F9aPhSn8+Rd7QUlxJ429MXvD198kxR6whyw7xmVphc=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Source.Name = _t, ShipmentId = _t, HUUID = _t, shipdate = _t, Waypoints = _t, Waypoints_actual = _t, Extraction_date = _t, __index_level_0__ = _t]),
    #"Replaced Value" = Table.ReplaceValue(Source,"",null,Replacer.ReplaceValue,{"Waypoints", "Waypoints_actual"}),
    #"Changed Type" = Table.TransformColumnTypes(#"Replaced Value",{{"Source.Name", type text}, {"ShipmentId", Int64.Type}, {"HUUID", type text}, {"shipdate", type date}, {"Waypoints", type text}, {"Waypoints_actual", type text}, {"Extraction_date", type date}, {"__index_level_0__", Int64.Type}}),
    #"Added Custom" = Table.AddColumn(#"Changed Type", "Waypoints2", each if [Waypoints_actual] <> null then [Waypoints_actual] else [Waypoints], type text),
    #"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Waypoints", "Waypoints_actual", "__index_level_0__", "Source.Name"}),
    #"Renamed Columns" = Table.RenameColumns(#"Removed Columns",{{"Waypoints2", "Waypoints_actual"}}),
    #"Sorted Rows" = Table.Sort(#"Renamed Columns",{{"Extraction_date", Order.Ascending}}),
    #"Grouped Rows" = Table.Group(#"Sorted Rows", {"ShipmentId", "HUUID"}, {{"Grouped", each _, type table [ShipmentId=nullable number, HUUID=nullable text, shipdate=nullable date, Extraction_date=nullable date, Waypoints_actual=text]}}),
    #"Added Index" = Table.AddIndexColumn(#"Grouped Rows", "Index", 0, 1, Int64.Type),
    #"Added Custom1" = Table.AddColumn(#"Added Index", "Grouped2", each let x = Table.AddIndexColumn([Grouped], "Index", 0, 1, Int64.Type),

AddCol = Table.AddColumn(x, "Route", each if [Index] = 0 then "Waypoints Actual"else "Reroute " & Text.From([Index] + 1) )
in  AddCol,  type table),
    #"Removed Columns1" = Table.RemoveColumns(#"Added Custom1",{"Grouped", "Index"}),
    #"Expanded Grouped2" = Table.ExpandTableColumn(#"Removed Columns1", "Grouped2", {"shipdate", "Waypoints_actual", "Route"}, {"shipdate", "Waypoints_actual", "Route"}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Expanded Grouped2",{{"shipdate", type date}, {"Waypoints_actual", type text},{"Route", type text}}),
    #"Pivoted Column" = Table.Pivot(#"Changed Type1", List.Distinct(#"Changed Type1"[Route]), "Route", "Waypoints_actual")
in
    #"Pivoted Column"

danextian_0-1660718164628.png

 





Dane Belarmino | Microsoft MVP | Proud to be a Super User!

Did I answer your question? Mark my post as a solution!


"Tell me and I’ll forget; show me and I may remember; involve me and I’ll understand."
Need Power BI consultation, get in touch with me on LinkedIn or hire me on UpWork.
Learn with me on YouTube @DAXJutsu or follow my page on Facebook @DAXJutsuPBI.
Anonymous
Not applicable

Thanks!! do i need to replace the source on that?  every time that a new row with same ShimpentID show up wil be psoted as new column following the extractiond ate order? 

Hi @Anonymous ,

 

I used Enter Data feature so the source  was a decompressed binary code.

The Pivoted  Column step will automatically add  the columns but the names will be something like "
Reroute" and a number. You can change the word reroute into something else but naming it first, second, etc will be very complex. It is simpler that way.

If you want to connect to an excel file, replace the code in Source step with

= Excel.Workbook(File.Contents("full path to an excel file"), null, true){[Item="tab or sheet name",Kind="Sheet"]}[Data]

 





Dane Belarmino | Microsoft MVP | Proud to be a Super User!

Did I answer your question? Mark my post as a solution!


"Tell me and I’ll forget; show me and I may remember; involve me and I’ll understand."
Need Power BI consultation, get in touch with me on LinkedIn or hire me on UpWork.
Learn with me on YouTube @DAXJutsu or follow my page on Facebook @DAXJutsuPBI.

Helpful resources

Announcements
June 2025 Power BI Update Carousel

Power BI Monthly Update - June 2025

Check out the June 2025 Power BI update to learn about new features.

June 2025 community update carousel

Fabric Community Update - June 2025

Find out what's new and trending in the Fabric community.