Join us for an expert-led overview of the tools and concepts you'll need to pass exam PL-300. The first session starts on June 11th. See you there!
Get registeredPower BI is turning 10! Let’s celebrate together with dataviz contests, interactive sessions, and giveaways. Register now.
Hi all.
I need to be able to group different rows, which correspond to the same data, into one. In the image you can see that SHIPMENTID and HUUID is the same for the 3 columns, the only difference is the extraction date. I need the data to look like desidered image.
The idea is to be able to transform 3 rows into 1, by ordering waypoits_actual as 1, 2, 3 in order of extraction date by naming 1frst , 2second and so on.... keeping always Wapoints at first. In case there are more just need to align the same...
Actual data:
Source.Name | ShipmentId | HUUID | shipdate | Waypoints | Waypoints_actual | Extraction_date | __index_level_0__ |
DailyExtract_2022-08-01_part1.parquet | 13015921801 | 5faca5b3-ef53-347f-baf2-adc781156c3b | 30-Jul-22 | KNUA-DE8-DEL-DELT8-CDGH-ETZA-XGBA | 1-Aug-22 | 49253 | |
DailyExtract_2022-07-30_part2.parquet | 13015921801 | 5faca5b3-ef53-347f-baf2-adc781156c3b | 30-Jul-22 | KNUA-DELRT-CDGH-ETZA-XGBA | 30-Jul-22 | 1006581 | |
DailyExtract_2022-07-31_part1.parquet | 13015921801 | 5faca5b3-ef53-347f-baf2-adc781156c3b | 30-Jul-22 | KNUA-DE8-DEL-CDGH-ETZA-XGBA | 31-Jul-22 | 292748 |
Desired data:
Source.Name | ShipmentId | HUUID | shipdate | Waypoints | Waypoints_actual | 1frs REROUTE | 2nd reroute | Extraction_date | __index_level_0__ |
13015921801 | 5faca5b3-ef53-347f-baf2-adc781156c3b | 30-Jul-22 | KNUA-DELRT-CDGH-ETZA-XGBA | KNUA-DE8-DEL-DELT8-CDGH-ETZA-XGBA | KNUA-DE8-DEL-CDGH-ETZA-XGBA | KNUA-DE8-DEL-DELT8-CDGH-ETZA-XGBA |
I have other columns that can be used to obtain what I need, such as Extraction date / index level (orders my dimesioal date table) etc..
Thanks!
Solved! Go to Solution.
Hi @Anonymous
I though I've had replied back to you with updated solution but for some reason I caanot see my reply. Heree is the sample file https://www.dropbox.com/t/mrD8taPTCHSWO8P2
Please try this solution if it works. However, this is not a perfect situation for a 17M rows data.
Shipments Summary 2 =
VAR T1 = SUMMARIZE ( Shipments, Shipments[ShipmentId], Shipments[HUUID], "Waypoint", MAX ( Shipments[Waypoints] ) )
VAR T2 =
GENERATE (
T1,
VAR T3 = CALCULATETABLE ( Shipments )
VAR T4 =
ADDCOLUMNS (
T3,
"@Rank", RANKX ( FILTER ( T3, [Waypoints_Actual] <> BLANK ( ) ), [Extraction_date] )
)
VAR FirstReroute = MAXX ( FILTER ( T4, [@Rank] = 2 ), [Waypoints_actual] )
VAR SecondReroute = MAXX ( FILTER ( T4, [@Rank] = 1 ), [Waypoints_actual] )
RETURN
ROW ( "1st Route", FirstReroute, "2nd Route", SecondReroute,"3rd Route", FirstReroute )
)
RETURN
T2
Try use "remove duplicates" in Power Query.
https://docs.microsoft.com/en-us/power-query/working-with-duplicates
Hi @Anonymous
I though I've had replied back to you with updated solution but for some reason I caanot see my reply. Heree is the sample file https://www.dropbox.com/t/mrD8taPTCHSWO8P2
Please try this solution if it works. However, this is not a perfect situation for a 17M rows data.
Shipments Summary 2 =
VAR T1 = SUMMARIZE ( Shipments, Shipments[ShipmentId], Shipments[HUUID], "Waypoint", MAX ( Shipments[Waypoints] ) )
VAR T2 =
GENERATE (
T1,
VAR T3 = CALCULATETABLE ( Shipments )
VAR T4 =
ADDCOLUMNS (
T3,
"@Rank", RANKX ( FILTER ( T3, [Waypoints_Actual] <> BLANK ( ) ), [Extraction_date] )
)
VAR FirstReroute = MAXX ( FILTER ( T4, [@Rank] = 2 ), [Waypoints_actual] )
VAR SecondReroute = MAXX ( FILTER ( T4, [@Rank] = 1 ), [Waypoints_actual] )
RETURN
ROW ( "1st Route", FirstReroute, "2nd Route", SecondReroute,"3rd Route", FirstReroute )
)
RETURN
T2
Thanks!
I tried again , i replace the " shipment" for my data original data table 'Fact - IR_IM Raw data Azure' but i received the following sign "
"The expression refers to multiple columns. Multiple columns cannot be converted to a scalar value" any clue?
@Anonymous
Would you please double check that you are creating a calculated table not a measure?
Last point would be to do exactly as you showed me, but this time i need only a column to show the lastedes " waypoit actual". insted of having 1 reroute , 2 reorute . Just another "Waypotin" and nex to this a column with LASTEST.. i tried different this with out succes..:(
Hi @Anonymous
Would you like to have a quick meeting perhaps on zoom or teams?
Yes of course! Teams or zoom for me is fine! please let me know your email and when it works for you? dont know your time zone...
Good news, it woorked fine!
I just see a small thing that dont make sense.I see 1 reroute also posted in 3 rerouted column.
The point is IF, new "actual waypoits" shows up , another column will creat .. even if there are 4 or 5 rerouted. In case of the example below, there are only 2 reroute , no 3 on the original data.. how can i modify that?
Yes of course. here you ahve a screen shoot of it.
Is there anything i can do to avoid that sign? the table looks great if we could get the rest of the columns
Hi @Anonymous
If you are interested in a DAX solution then I have two options for you:
Option 1: Dynamic number of columns using a measure. This is the solution that I prefer. It is simple and completely dynamic.
You need to have a filter table that contains the Reroute numbers up to the maximum possible. You can just create it in excel and import it. Here is a simple DAX for up to 10 nos
Reroute Order =
SELECTCOLUMNS (
{
( "1st Reroute", 1 ),
( "2nd Reroute", 2 ),
( "3rd Reroute", 3 ),
( "4th Reroute", 4 ),
( "5th Reroute", 5 ),
( "6th Reroute", 6 ),
( "7th Reroute", 7 ),
( "8th Reroute", 8 ),
( "9th Reroute", 9 ),
( "10th Reroute", 10 )
},
"Reroute Order", [Value1], "# Reroute Order", [Value2]
)
This will be just a disconnected table that will be used for slicing the matrix (will be placed on the columns of the matrix)
Then you can create your measure as follows. Place ShipmentId and HUUID in the Rows of the matrix and place the measure in the values. Then place the Reroute Order column from the new table in the columns.
Actual Waypoint =
VAR T1 = FILTER ( Shipments, [Waypoints_Actual] <> BLANK ( ) )
VAR T2 = ADDCOLUMNS ( T1, "@Rank", RANKX ( T1, [Extraction_date] ) )
VAR CurrentRank = SELECTEDVALUE ( 'Reroute Order'[# Reroute Order] )
VAR T3 = FILTER ( T2, [@Rank] = CurrentRank )
RETURN
MAXX ( T3, [Waypoints_actual] )
Option 2: Create a new calculated table as follows
Shipments Summary =
VAR T1 = SUMMARIZE ( Shipments, Shipments[ShipmentId], Shipments[HUUID] )
VAR T2 =
GENERATE (
T1,
VAR T3 = CALCULATETABLE ( Shipments )
VAR T4 =
ADDCOLUMNS (
T3,
"@Length1", LEN ( [Waypoints] ),
"@Lenth2", LEN ( [Waypoints_Actual] ),
"@Rank", RANKX ( FILTER ( T3, [Waypoints_Actual] <> BLANK ( ) ), [Extraction_date] )
)
VAR LongestWaypoint1 = MAXX ( TOPN ( 1, T4, [@Length1] ), [Waypoint] )
VAR LongestWaypoint2 = MAXX ( TOPN ( 1, T4, [@Length1] ), [Waypoint_Actual] )
VAR FirstReroute = MAXX ( FILTER ( T4, [@Rank] = 1 ), [Waypoints_actual] )
VAR SecondReroute = MAXX ( FILTER ( T4, [@Rank] = 2 ), [Waypoints_actual] )
RETURN
ROW ( "Waypoint", LongestWaypoint1, "Waypoint_Actual", LongestWaypoint2, "1st Route", FirstReroute, "2nd Route", SecondReroute )
)
RETURN
T2
This solution is not dynamic you you need to hard-code the number of columns and the names of those columns.
Thanks! looks great both options, I think i am more interested into the 2 option since it would help to have a table which i cna use to set new mesurementes if its need it.
So i tried the second one. My source data is 'Fact-IR_IM Raw data Azure' so i change it...
Seems to be very havy to deal with right? running out of memory?
@Anonymous
Do you want the original Wapoint and Waypoint-Actual columns? Actually I just assumed that you want to keep the one with the longest sting. Maybe this has no meaning. We can skip these unnecessary calculations and save alot of formula engine time. What do you think? Shall I update accordingly? On the other hand I would really recommend using the measure. It takes advantage of the existing filter context this it's more efficient and yet complicately dynamic. It depends on what other measures you are going to create and what is the ultimate report you are looking for but most probably we can solve this out.
Yes thansk!
Well i need to keep always the column " Waypoint" and next to this as many new "waypoint actual " can came.. so if now the row has on 1 waypoint actual.. that would be 1 reroute... if this has 2 so it would have 1 re route and then 2 reroute... and so on.. NEW ACTUAL WAYPOINT= new column based on extraction date of course ..
Like i show it here
Acltual data
Desired data
@Anonymous
Does the original Waypoints columns allways has onlt one value and the rest are blanks?
Tn this example you have 2 Waypoints_Actual. The one that you chose to present in the new table was chosen based on what?
Each shipment always has only 1 "Waypoints" and can have multiple "Waypoints_actuals", as the latter appear (based on "extraction date" they must be aligned in new columns ( 1 reroute / 2 reroute / 3 reroute...etc)
Let me clarify the picture becasu i made a mistake on one of the headears...
Hope this helps !
@Anonymous
Now you have two "2nd Reroute"!!
There it is...
Sorry i just copy paste to fast..
Hi @Anonymous ,
Enter this M code in a blank query. This might be what you need.
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("vZBNC4JAEIb/y56bmA8316OlFBUdwiAKkdUyAg8lCvXv2z4gqOgUwbzDDDwDD7Neq8juq3N8ampbNBkjM6ABpOxg64a6rh/bbaM6igRJB0wGyW26tIXVucC21ALi+SXktmSwm8I3RLpXSO4wQRi3FTC72dVktgghio3L9JrEwCAajiBOViEsh/3QMUgQtrv7iRewFpV2Plr6IHiz5F9aPhSn8+Rd7QUlxJ429MXvD198kxR6whyw7xmVphc=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Source.Name = _t, ShipmentId = _t, HUUID = _t, shipdate = _t, Waypoints = _t, Waypoints_actual = _t, Extraction_date = _t, __index_level_0__ = _t]),
#"Replaced Value" = Table.ReplaceValue(Source,"",null,Replacer.ReplaceValue,{"Waypoints", "Waypoints_actual"}),
#"Changed Type" = Table.TransformColumnTypes(#"Replaced Value",{{"Source.Name", type text}, {"ShipmentId", Int64.Type}, {"HUUID", type text}, {"shipdate", type date}, {"Waypoints", type text}, {"Waypoints_actual", type text}, {"Extraction_date", type date}, {"__index_level_0__", Int64.Type}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "Waypoints2", each if [Waypoints_actual] <> null then [Waypoints_actual] else [Waypoints], type text),
#"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Waypoints", "Waypoints_actual", "__index_level_0__", "Source.Name"}),
#"Renamed Columns" = Table.RenameColumns(#"Removed Columns",{{"Waypoints2", "Waypoints_actual"}}),
#"Sorted Rows" = Table.Sort(#"Renamed Columns",{{"Extraction_date", Order.Ascending}}),
#"Grouped Rows" = Table.Group(#"Sorted Rows", {"ShipmentId", "HUUID"}, {{"Grouped", each _, type table [ShipmentId=nullable number, HUUID=nullable text, shipdate=nullable date, Extraction_date=nullable date, Waypoints_actual=text]}}),
#"Added Index" = Table.AddIndexColumn(#"Grouped Rows", "Index", 0, 1, Int64.Type),
#"Added Custom1" = Table.AddColumn(#"Added Index", "Grouped2", each let x = Table.AddIndexColumn([Grouped], "Index", 0, 1, Int64.Type),
AddCol = Table.AddColumn(x, "Route", each if [Index] = 0 then "Waypoints Actual"else "Reroute " & Text.From([Index] + 1) )
in AddCol, type table),
#"Removed Columns1" = Table.RemoveColumns(#"Added Custom1",{"Grouped", "Index"}),
#"Expanded Grouped2" = Table.ExpandTableColumn(#"Removed Columns1", "Grouped2", {"shipdate", "Waypoints_actual", "Route"}, {"shipdate", "Waypoints_actual", "Route"}),
#"Changed Type1" = Table.TransformColumnTypes(#"Expanded Grouped2",{{"shipdate", type date}, {"Waypoints_actual", type text},{"Route", type text}}),
#"Pivoted Column" = Table.Pivot(#"Changed Type1", List.Distinct(#"Changed Type1"[Route]), "Route", "Waypoints_actual")
in
#"Pivoted Column"
Thanks!! do i need to replace the source on that? every time that a new row with same ShimpentID show up wil be psoted as new column following the extractiond ate order?
Hi @Anonymous ,
I used Enter Data feature so the source was a decompressed binary code.
The Pivoted Column step will automatically add the columns but the names will be something like "
Reroute" and a number. You can change the word reroute into something else but naming it first, second, etc will be very complex. It is simpler that way.
If you want to connect to an excel file, replace the code in Source step with
= Excel.Workbook(File.Contents("full path to an excel file"), null, true){[Item="tab or sheet name",Kind="Sheet"]}[Data]
User | Count |
---|---|
84 | |
80 | |
70 | |
47 | |
43 |
User | Count |
---|---|
108 | |
54 | |
50 | |
40 | |
40 |