Earn a 50% discount on the DP-600 certification exam by completing the Fabric 30 Days to Learn It challenge.
So I have searched the forums on duplicates, and I have found some good solutions on grouping duplicates, but not exactly what I need.
So let us say that I have the following table (below is fake data I created that more or less shows how my real data is, couldn't put real data in for privacy/security issues).
The EmployeeID field does contain duplicate values.
RequestID | EmployeeID | RequestType |
1 | 333333 | Access |
2 | 111111 | Access |
3 | 222222 | Service |
4 | 333333 | Access |
5 | 222222 | Access |
6 | 444444 | Service |
7 | 333333 | Service |
I want to create a new column with a value of either original or duplicate.
If an ID is unique it will have the value of Original.
If an ID appears more than once, only the first duplicate ID will the value of Original and the rest of the duplicate with have a value of Duplicate.
The table below shows what I am trying to do a lot better.
RequestID | EmployeeID | RequestType | Flag |
1 | 333333 | Access | Original |
2 | 111111 | Access | Original |
3 | 222222 | Service | Original |
4 | 333333 | Access | Duplicate |
5 | 222222 | Access | Duplicate |
6 | 444444 | Service | Original |
7 | 333333 | Service | Duplicate |
Now the reason I want to do this is because I need to create 4 reports
I know this might be confusing, or it seems unnecessary, however this is how they are asking me to do it at my work
Any help is greatly appreciated.
Solved! Go to Solution.
Hi moonpie,
It seems that you want to create a column based on condition in Table, right? If so, you could try to use below M code to achieve this goal
let Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUTIGAyDDMTk5tbhYKVYnWskIyDUEA1RxkDIjMAAyglOLyjKTU8ESJjgMMkXWgCRuBuSagAGaQebIBsElYgE=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [RequestID = _t, EmployeeID = _t, RequestType = _t]), #"Changed Type" = Table.TransformColumnTypes(Source,{{"RequestID", Int64.Type}, {"EmployeeID", Int64.Type}, {"RequestType", type text}}), #"Sorted Rows" = Table.Sort(#"Changed Type",{{"RequestID", Order.Ascending}}), #"Added Index1" = Table.AddIndexColumn(#"Sorted Rows", "Index", 1, 1), #"Grouped Rows" = Table.Group(#"Added Index1", {"EmployeeID"}, {{"group", each _, type table [RequestID=number, EmployeeID=number, RequestType=text]}}), #"Added Custom" = Table.AddColumn(#"Grouped Rows", "Custom", each Table.AddIndexColumn([group],"index",1,1)), #"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom", "Custom", {"RequestID", "RequestType", "index"}, {"Custom.RequestID", "Custom.RequestType", "Custom.index"}), #"Removed Columns" = Table.RemoveColumns(#"Expanded Custom",{"group"}), #"Added Custom1" = Table.AddColumn(#"Removed Columns", "Custom", each if [Custom.index]=1 then "Original" else "Duplicate"), #"Removed Columns1" = Table.RemoveColumns(#"Added Custom1",{"Custom.index"}) in #"Removed Columns1"
Best Regards,
Zoe Zhi
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Hi @dax ,
I am really struggling to adapt this for my scenario.
I have created a concatenated colum to identify duplicates and I have sorted the data based on the concatenated row. Now I want to create a duplicate flag (1 if duplicate and 0 if not duplicate). Here is an example of my data, the table is called api_data program_4378 and the column is called Concatenate (text format)
Warm regards,
Katie
Hi moonpie,
It seems that you want to create a column based on condition in Table, right? If so, you could try to use below M code to achieve this goal
let Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUTIGAyDDMTk5tbhYKVYnWskIyDUEA1RxkDIjMAAyglOLyjKTU8ESJjgMMkXWgCRuBuSagAGaQebIBsElYgE=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [RequestID = _t, EmployeeID = _t, RequestType = _t]), #"Changed Type" = Table.TransformColumnTypes(Source,{{"RequestID", Int64.Type}, {"EmployeeID", Int64.Type}, {"RequestType", type text}}), #"Sorted Rows" = Table.Sort(#"Changed Type",{{"RequestID", Order.Ascending}}), #"Added Index1" = Table.AddIndexColumn(#"Sorted Rows", "Index", 1, 1), #"Grouped Rows" = Table.Group(#"Added Index1", {"EmployeeID"}, {{"group", each _, type table [RequestID=number, EmployeeID=number, RequestType=text]}}), #"Added Custom" = Table.AddColumn(#"Grouped Rows", "Custom", each Table.AddIndexColumn([group],"index",1,1)), #"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom", "Custom", {"RequestID", "RequestType", "index"}, {"Custom.RequestID", "Custom.RequestType", "Custom.index"}), #"Removed Columns" = Table.RemoveColumns(#"Expanded Custom",{"group"}), #"Added Custom1" = Table.AddColumn(#"Removed Columns", "Custom", each if [Custom.index]=1 then "Original" else "Duplicate"), #"Removed Columns1" = Table.RemoveColumns(#"Added Custom1",{"Custom.index"}) in #"Removed Columns1"
Best Regards,
Zoe Zhi
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Hi Zoe,
Thank you very much for your help! It worked.
-moonpie
User | Count |
---|---|
96 | |
85 | |
77 | |
66 | |
63 |
User | Count |
---|---|
110 | |
96 | |
96 | |
67 | |
59 |