Hi, guys!
Please help me. I couldn't find out a solution for this.
I have a table for the product TAG and different properties:
I need to find product TAGs that are contained in the another, and then identify which property (column) is differente from the previous one, adding a new column.
The result should look like this:
In this example it's comparing the row with the last one, but ideally it would lookfor duplicates in the whole column to find TAGs which have duplicated parts.
Thanks in advance!
Solved! Go to Solution.
Hi @marcelmunk ,
You can try the following method. It's a bit complicated, and maybe someone else might have a better solution.
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WcnRS0lFKyilNBVLFGYlFBUC6JCMzDySaX5SSWqQUqwNS5YypLMTD09kbVZ2ziytQoDI1Jye/HKQyNz+/JAPdxJzU4mKoamTFwR6OQQHY1cYCAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [TAG = _t, Prop1 = _t, Prop2 = _t, Prop3 = _t, Prop4 = _t]),
#"Reversed Rows" = Table.ReverseRows(Source),
Custom1 = Table.AddFuzzyClusterColumn(#"Reversed Rows","TAG","New TAG",[IgnoreCase = true, IgnoreSpace = true]),
#"Reversed Rows1" = Table.ReverseRows(Custom1),
#"Removed Columns" = Table.RemoveColumns(#"Reversed Rows1",{"TAG"}),
#"Grouped Rows" = Table.Group(#"Removed Columns", {"New TAG"}, {{"ALL", each _, type table [Prop1=nullable text, Prop2=nullable text, Prop3=nullable text, Prop4=nullable text, New TAG=text]}}),
#"Added Custom" = Table.AddColumn(#"Grouped Rows", "Custom", each Table.AddIndexColumn([ALL],"Index",0)),
#"Removed Other Columns" = Table.SelectColumns(#"Added Custom",{"Custom"}),
#"Expanded Custom" = Table.ExpandTableColumn(#"Removed Other Columns", "Custom", {"Prop1", "Prop2", "Prop3", "Prop4", "New TAG", "Index"}, {"Prop1", "Prop2", "Prop3", "Prop4", "New TAG", "Index"}),
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Expanded Custom", {"New TAG", "Index"}, "Attribute", "Value"),
#"Grouped Rows1" = Table.Group(#"Unpivoted Other Columns", {"New TAG", "Attribute"}, {{"ALL", each _, type table [New TAG=text, Index=number, Attribute=text, Value=text, Custom=any]}}),
#"Added Custom1" = Table.AddColumn(#"Grouped Rows1", "Custom", each let
AllDataTable = [ALL],
PreRowValue =
Table.AddColumn(
AllDataTable, "PreValue",
each try AllDataTable [Value] { [Index] - 1} otherwise [Value]
)
in
PreRowValue),
#"Removed Other Columns1" = Table.SelectColumns(#"Added Custom1",{"Custom"}),
#"Expanded Custom1" = Table.ExpandTableColumn(#"Removed Other Columns1", "Custom", {"New TAG", "Index", "Attribute", "Value", "PreValue"}, {"New TAG", "Index", "Attribute", "Value", "PreValue"}),
#"Filtered Rows1" = Table.SelectRows(#"Expanded Custom1", each ([Index] = 1)),
#"Added Conditional Column" = Table.AddColumn(#"Filtered Rows1", "Duplicate", each if [Value] <> [PreValue] then [Attribute]&" - "&[PreValue] else null),
#"Removed Columns1" = Table.RemoveColumns(#"Added Conditional Column",{"Index", "PreValue"}),
#"Grouped Rows2" = Table.Group(#"Removed Columns1", {"New TAG"}, {{"ALL", each _, type table [New TAG=text, Attribute=text, Value=text, Duplicate=nullable text]}}),
#"Added Custom2" = Table.AddColumn(#"Grouped Rows2", "Custom", each let
AllDataTable = [ALL],
ModifiedDuplicate =
Table.AddColumn(
AllDataTable, "ModifiedDuplicate",
each List.RemoveNulls(AllDataTable [Duplicate])
)
in
ModifiedDuplicate),
#"Removed Other Columns2" = Table.SelectColumns(#"Added Custom2",{"Custom"}),
#"Expanded Custom2" = Table.ExpandTableColumn(#"Removed Other Columns2", "Custom", {"New TAG", "Attribute", "Value", "Duplicate", "ModifiedDuplicate"}, {"New TAG", "Attribute", "Value", "Duplicate", "ModifiedDuplicate"}),
#"Extracted Values" = Table.TransformColumns(#"Expanded Custom2", {"ModifiedDuplicate", each Text.Combine(List.Transform(_, Text.From)), type text}),
#"Removed Columns2" = Table.RemoveColumns(#"Extracted Values",{"Duplicate"}),
#"Pivoted Column" = Table.Pivot(#"Removed Columns2", List.Distinct(#"Removed Columns2"[Attribute]), "Attribute", "Value"),
#"Reordered Columns" = Table.ReorderColumns(#"Pivoted Column",{"New TAG", "Prop1", "Prop2", "Prop3", "Prop4", "ModifiedDuplicate"}),
#"Changed Type" = Table.TransformColumnTypes(#"Reordered Columns",{{"New TAG", type text}, {"Prop1", type text}, {"Prop2", type text}, {"Prop3", type text}, {"Prop4", type text}, {"ModifiedDuplicate", type text}})
in
#"Changed Type"
Best Regards,
Icey
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Hi @marcelmunk ,
You can try the following method. It's a bit complicated, and maybe someone else might have a better solution.
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WcnRS0lFKyilNBVLFGYlFBUC6JCMzDySaX5SSWqQUqwNS5YypLMTD09kbVZ2ziytQoDI1Jye/HKQyNz+/JAPdxJzU4mKoamTFwR6OQQHY1cYCAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [TAG = _t, Prop1 = _t, Prop2 = _t, Prop3 = _t, Prop4 = _t]),
#"Reversed Rows" = Table.ReverseRows(Source),
Custom1 = Table.AddFuzzyClusterColumn(#"Reversed Rows","TAG","New TAG",[IgnoreCase = true, IgnoreSpace = true]),
#"Reversed Rows1" = Table.ReverseRows(Custom1),
#"Removed Columns" = Table.RemoveColumns(#"Reversed Rows1",{"TAG"}),
#"Grouped Rows" = Table.Group(#"Removed Columns", {"New TAG"}, {{"ALL", each _, type table [Prop1=nullable text, Prop2=nullable text, Prop3=nullable text, Prop4=nullable text, New TAG=text]}}),
#"Added Custom" = Table.AddColumn(#"Grouped Rows", "Custom", each Table.AddIndexColumn([ALL],"Index",0)),
#"Removed Other Columns" = Table.SelectColumns(#"Added Custom",{"Custom"}),
#"Expanded Custom" = Table.ExpandTableColumn(#"Removed Other Columns", "Custom", {"Prop1", "Prop2", "Prop3", "Prop4", "New TAG", "Index"}, {"Prop1", "Prop2", "Prop3", "Prop4", "New TAG", "Index"}),
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Expanded Custom", {"New TAG", "Index"}, "Attribute", "Value"),
#"Grouped Rows1" = Table.Group(#"Unpivoted Other Columns", {"New TAG", "Attribute"}, {{"ALL", each _, type table [New TAG=text, Index=number, Attribute=text, Value=text, Custom=any]}}),
#"Added Custom1" = Table.AddColumn(#"Grouped Rows1", "Custom", each let
AllDataTable = [ALL],
PreRowValue =
Table.AddColumn(
AllDataTable, "PreValue",
each try AllDataTable [Value] { [Index] - 1} otherwise [Value]
)
in
PreRowValue),
#"Removed Other Columns1" = Table.SelectColumns(#"Added Custom1",{"Custom"}),
#"Expanded Custom1" = Table.ExpandTableColumn(#"Removed Other Columns1", "Custom", {"New TAG", "Index", "Attribute", "Value", "PreValue"}, {"New TAG", "Index", "Attribute", "Value", "PreValue"}),
#"Filtered Rows1" = Table.SelectRows(#"Expanded Custom1", each ([Index] = 1)),
#"Added Conditional Column" = Table.AddColumn(#"Filtered Rows1", "Duplicate", each if [Value] <> [PreValue] then [Attribute]&" - "&[PreValue] else null),
#"Removed Columns1" = Table.RemoveColumns(#"Added Conditional Column",{"Index", "PreValue"}),
#"Grouped Rows2" = Table.Group(#"Removed Columns1", {"New TAG"}, {{"ALL", each _, type table [New TAG=text, Attribute=text, Value=text, Duplicate=nullable text]}}),
#"Added Custom2" = Table.AddColumn(#"Grouped Rows2", "Custom", each let
AllDataTable = [ALL],
ModifiedDuplicate =
Table.AddColumn(
AllDataTable, "ModifiedDuplicate",
each List.RemoveNulls(AllDataTable [Duplicate])
)
in
ModifiedDuplicate),
#"Removed Other Columns2" = Table.SelectColumns(#"Added Custom2",{"Custom"}),
#"Expanded Custom2" = Table.ExpandTableColumn(#"Removed Other Columns2", "Custom", {"New TAG", "Attribute", "Value", "Duplicate", "ModifiedDuplicate"}, {"New TAG", "Attribute", "Value", "Duplicate", "ModifiedDuplicate"}),
#"Extracted Values" = Table.TransformColumns(#"Expanded Custom2", {"ModifiedDuplicate", each Text.Combine(List.Transform(_, Text.From)), type text}),
#"Removed Columns2" = Table.RemoveColumns(#"Extracted Values",{"Duplicate"}),
#"Pivoted Column" = Table.Pivot(#"Removed Columns2", List.Distinct(#"Removed Columns2"[Attribute]), "Attribute", "Value"),
#"Reordered Columns" = Table.ReorderColumns(#"Pivoted Column",{"New TAG", "Prop1", "Prop2", "Prop3", "Prop4", "ModifiedDuplicate"}),
#"Changed Type" = Table.TransformColumnTypes(#"Reordered Columns",{{"New TAG", type text}, {"Prop1", type text}, {"Prop2", type text}, {"Prop3", type text}, {"Prop4", type text}, {"ModifiedDuplicate", type text}})
in
#"Changed Type"
Best Regards,
Icey
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Thanks for your answer, @Icey
As you said, a bit complex and it took me some time to 'digest' it.
Though, I plugged in some real data and got some good and some weird results:
These are prints of the 'New TAG' reorderd after the AddFuzzyClusterColumn.
Many times it works in finding the duplicates, but other times it gives weird values, as in the second print.
Any hint of what may be wrong?
TIA!
Hi @marcelmunk ,
There's a paramter in Table.AddFuzzyClusterColumn function named "Threshold". It's a number between 0.00 and 1.00 that specifies the similarity score at which two values will be grouped. For example, "Grapes" and "Graes" (missing "p") are grouped together only if this option is set to less than 0.90. A threshold of 1.00 is the same as specifying an exact match criteria while grouping. The default value is 0.80.
Please try to adjust it manually and check if it works.
Best Regards,
Icey
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Hi, @Icey
Late response, but as I was getting the wrong results I supposed it was not a solution.
But it was my fault, as I had to adapt the column names and some parameters to my real data. And I made a mistake in the 'New TAG' column, so I was not getting the desired results.
With a little more time now, I 'readapted' the solution and it's working just fine. I marked your answer as 'solution'.
Thanks!
Just to make it more clear, 'AB' is contained in 'ABC', and 'CD' is contained in 'CDE'. It can be both ways.
Also I have more than 4 properties, but I can extend the expression to accomodate it.
User | Count |
---|---|
134 | |
82 | |
64 | |
57 | |
55 |
User | Count |
---|---|
212 | |
109 | |
88 | |
82 | |
76 |