Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Find everything you need to get certified on Fabric—skills challenges, live sessions, exam prep, role guidance, and more. Get started

Reply
Anonymous
Not applicable

Help Merging duplicate data

Hey guys, 

 

Not sure if this is possible but... I'm trying to cleanse data (See link below). 

 

https://imgur.com/ygo7jhy 

 

What I need is to condence the Customer Acc column down so there are no duplicates, however Stage 1, Stage 2, Stage 3 needs to be merdged in a hierarchy type of system. With completed taking priority, then terminated, then abandoned, then blank/null. 

 

For example 

 

Acc       Stage 1           Stage 2            Stage 3

1           Completed     Abandoned    terminated 

1           Completed     Completed     Terminated 

1           Completed     Terminated    Completed 

2           Completed     Abandoned    terminated 

2           Completed     Completed     Terminated 

2           Completed     Terminated    Abandoned    

 

would brake down to 

 

Acc       Stage 1           Stage 2            Stage 3

1           Completed    Completed    Completed     

2           Completed     Completed     Terminated 

 

 

Sorry, don't think I've explained it partically well. Let me know your thoughts. 

 

1 ACCEPTED SOLUTION
Zubair_Muhammad
Community Champion
Community Champion

@Anonymous

 

Try this.

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUXLOzy3ISS1JTQGyHZMS81Ly88DskNSi3My8RJBErA6mUmQ2AaVI0sgSIKVGxDsAXSkeB6ArRXEAworYWAA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Acc = _t, #"Stage 1" = _t, #"Stage 2" = _t, #"Stage 3" = _t]),
    mytable = #table({"Stage","Priority"},{{"Completed",1},{"Terminated",2},{"Abandoned",3}}),
    ChangedType = Table.TransformColumnTypes(Source,{{"Acc", Int64.Type}, {"Stage 1", type text},
 {"Stage 2", type text}, {"Stage 3", type text}}),
    #"Unpivoted Columns" = Table.UnpivotOtherColumns(ChangedType, {"Acc"}, "Attribute", "Value"),
    #"Merged Queries" = Table.NestedJoin(#"Unpivoted Columns",{"Value"},mytable,{"Stage"},"Unpivoted Columns",JoinKind.LeftOuter),
    #"Expanded Unpivoted Columns" = Table.ExpandTableColumn(#"Merged Queries", "Unpivoted Columns", {"Priority"}, {"Priority"}),
    #"Grouped Rows" = Table.Group(#"Expanded Unpivoted Columns", {"Acc", "Attribute"}, {{"Priority", each List.Min([Priority]), type number}}),
    Custom1 = Table.NestedJoin(#"Grouped Rows",{"Priority"},mytable,{"Priority"},"Unpivoted Columns",JoinKind.LeftOuter),
    #"Expanded Unpivoted Columns1" = Table.ExpandTableColumn(Custom1, "Unpivoted Columns", {"Stage"}, {"Stage"}),
    #"Removed Columns" = Table.RemoveColumns(#"Expanded Unpivoted Columns1",{"Priority"}),
    #"Pivoted Column" = Table.Pivot(#"Removed Columns", List.Distinct(#"Removed Columns"[Attribute]), "Attribute", "Stage")
in
    #"Pivoted Column"

Regards
Zubair

Please try my custom visuals

View solution in original post

5 REPLIES 5
Zubair_Muhammad
Community Champion
Community Champion

@Anonymous

 

Try this.

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUXLOzy3ISS1JTQGyHZMS81Ly88DskNSi3My8RJBErA6mUmQ2AaVI0sgSIKVGxDsAXSkeB6ArRXEAworYWAA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Acc = _t, #"Stage 1" = _t, #"Stage 2" = _t, #"Stage 3" = _t]),
    mytable = #table({"Stage","Priority"},{{"Completed",1},{"Terminated",2},{"Abandoned",3}}),
    ChangedType = Table.TransformColumnTypes(Source,{{"Acc", Int64.Type}, {"Stage 1", type text},
 {"Stage 2", type text}, {"Stage 3", type text}}),
    #"Unpivoted Columns" = Table.UnpivotOtherColumns(ChangedType, {"Acc"}, "Attribute", "Value"),
    #"Merged Queries" = Table.NestedJoin(#"Unpivoted Columns",{"Value"},mytable,{"Stage"},"Unpivoted Columns",JoinKind.LeftOuter),
    #"Expanded Unpivoted Columns" = Table.ExpandTableColumn(#"Merged Queries", "Unpivoted Columns", {"Priority"}, {"Priority"}),
    #"Grouped Rows" = Table.Group(#"Expanded Unpivoted Columns", {"Acc", "Attribute"}, {{"Priority", each List.Min([Priority]), type number}}),
    Custom1 = Table.NestedJoin(#"Grouped Rows",{"Priority"},mytable,{"Priority"},"Unpivoted Columns",JoinKind.LeftOuter),
    #"Expanded Unpivoted Columns1" = Table.ExpandTableColumn(Custom1, "Unpivoted Columns", {"Stage"}, {"Stage"}),
    #"Removed Columns" = Table.RemoveColumns(#"Expanded Unpivoted Columns1",{"Priority"}),
    #"Pivoted Column" = Table.Pivot(#"Removed Columns", List.Distinct(#"Removed Columns"[Attribute]), "Attribute", "Stage")
in
    #"Pivoted Column"

Regards
Zubair

Please try my custom visuals

@Anonymous

 

See file attached as well

 

Actually within the query we create a small table assigning priority to each stage value

 

staging.png

 

 


Regards
Zubair

Please try my custom visuals
Anonymous
Not applicable

Hey Zubair Muhammad, 

 

 

Sorry to be a pain, I'm very early on with MSBI still. Where do I paste the code into? 

HI @Anonymous

 

Please make a backup of your pbix file first.

 

Then copy paste the code as follows

 

copypaste Powerquerycode.png


Regards
Zubair

Please try my custom visuals
Anonymous
Not applicable

Hey Zubair Muhammad, 

 

 

The code you sent me seems to be along the right lines however it's sourcing the data from the example table on this thread. When I change the peramitors various errors arrive. 

 

Would it be eaiser if I link you to the excel sheet I'm sourcing the data from? I've created a download directory below if that's any help. Alternativly, happy to email it over?

 

http://www.filetolink.com/c43be72db3 

Helpful resources

Announcements
July 2024 Power BI Update

Power BI Monthly Update - July 2024

Check out the July 2024 Power BI update to learn about new features.

July Newsletter

Fabric Community Update - July 2024

Find out what's new and trending in the Fabric Community.