Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
Hello, I am working with some data in the following type of format where column 3 is a list of values:
Column 1 | Column 2 | Column 3 |
John | Smith | A,B |
Sarah | Jones | B,C |
Jane | Wood | A |
Kyle | Ford | A,B,C |
John | Smith | B,A |
I would like to remove duplicates in this table. In column 3, the first and last rows are duplicated, but the values in the list are in different orders. Is there a way to identify and remove these duplicates? Ideally I would like the final result to look like this:
Column 1 | Column 2 | Column 3 |
John | Smith | A,B |
Sarah | Jones | B,C |
Jane | Wood | A |
Kyle | Ford | A,B,C |
Any help would be much appreciated.
Solved! Go to Solution.
NewStep=Table.Distinct(Table.TransformColumns(PreviousStepName,{"Column 3",each Text.Combine(List.Sort(Text.Split(_,",")),",")}))
You check
let
Source = Excel.CurrentWorkbook(){[Name="Table3"]}[Content],
#"Promoted Headers" = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
#"Sorted Rows" = Table.Sort(#"Promoted Headers",{{"Col1", Order.Ascending}}),
#"Added Index" = Table.AddIndexColumn(#"Sorted Rows", "Index", 1, 1, Int64.Type),
#"Added Index1" = Table.AddIndexColumn(#"Added Index", "Index.1", 0, 1, Int64.Type),
#"Merged Queries" = Table.NestedJoin(#"Added Index1", {"Index"}, #"Added Index1", {"Index.1"}, "Added Index1", JoinKind.FullOuter),
#"Sorted Rows1" = Table.Sort(#"Merged Queries",{{"Index", Order.Ascending}}),
#"Expanded Added Index1" = Table.ExpandTableColumn(#"Sorted Rows1", "Added Index1", {"Col1"}, {"Added Index1.Col1"}),
#"Added Custom" = Table.AddColumn(#"Expanded Added Index1", "Custom", each if [Col1]=[Added Index1.Col1] then null else [Col1]),
#"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Index", "Index.1", "Added Index1.Col1", "Col1"}),
#"Reordered Columns" = Table.ReorderColumns(#"Removed Columns",{"Custom", "Col2", "Col3"}),
#"Filtered Rows" = Table.SelectRows(#"Reordered Columns", each ([Custom] <> null))
in
#"Filtered Rows"
NewStep=Table.Distinct(Table.TransformColumns(PreviousStepName,{"Column 3",each Text.Combine(List.Sort(Text.Split(_,",")),",")}))
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.