This is best Fabric, Power BI, SQL and AI community event. How do we know? The last event sold out! Save €200 with code FABCMTY200.
Register nowA new Data Days event is coming soon! This time we’re going bigger than ever. Fabric, Power BI, SQL, AI and more. Don't miss out.
I'm attempting to identify duplicate groups and assign identifiers based on 2 columns, Name and SSN:
| Group ID | Name | SSN |
| Jane Doe | 111111111 | |
| Jane Doe | 222222222 | |
| Jane Doe | 111111111 | |
| John Doe | 333333333 | |
| John Doe |
Match criteria is both Name and SSN match or Name matches and SSN is blank:
| Group ID | Name | SSN |
| 1 | Jane Doe | 111111111 |
| 2 | Jane Doe | 222222222 |
| 1 | Jane Doe | 111111111 |
| 3 | John Doe | 333333333 |
| 3 | John Doe |
Any help is much appreciated!
Solved! Go to Solution.
Hello @Anonymous
you can Fill-down the SSN-column, Group by Name and SSN and add a index column. Expand the grouped table afterwards. Here the M-code
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45W8krMS1VwyU9V0lEyhAGlWB0UCSMYQJdA05GfkQeVMIYBdAkFpdhYAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Name = _t, SSN = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Name", type text}, {"SSN", Int64.Type}}),
#"Filled Down" = Table.FillDown(#"Changed Type",{"SSN"}),
#"Grouped Rows" = Table.Group(#"Filled Down", {"Name", "SSN"}, {{"AllRows", each _, type table [Name=text, SSN=number]}}),
#"Removed Other Columns" = Table.SelectColumns(#"Grouped Rows",{"AllRows"}),
#"Added Index" = Table.AddIndexColumn(#"Removed Other Columns", "Index", 1, 1),
#"Expanded AllRows" = Table.ExpandTableColumn(#"Added Index", "AllRows", {"Name", "SSN"}, {"Name", "SSN"})
in
#"Expanded AllRows"
Copy paste this code to the advanced editor in a new blank query to see how the solution works.
If this post helps or solves your problem, please mark it as solution (to help other users find useful content and to acknowledge the work of users that helped you)
Kudoes are nice too
Have fun
Jimmy
Hello @Anonymous
you can Fill-down the SSN-column, Group by Name and SSN and add a index column. Expand the grouped table afterwards. Here the M-code
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45W8krMS1VwyU9V0lEyhAGlWB0UCSMYQJdA05GfkQeVMIYBdAkFpdhYAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Name = _t, SSN = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Name", type text}, {"SSN", Int64.Type}}),
#"Filled Down" = Table.FillDown(#"Changed Type",{"SSN"}),
#"Grouped Rows" = Table.Group(#"Filled Down", {"Name", "SSN"}, {{"AllRows", each _, type table [Name=text, SSN=number]}}),
#"Removed Other Columns" = Table.SelectColumns(#"Grouped Rows",{"AllRows"}),
#"Added Index" = Table.AddIndexColumn(#"Removed Other Columns", "Index", 1, 1),
#"Expanded AllRows" = Table.ExpandTableColumn(#"Added Index", "AllRows", {"Name", "SSN"}, {"Name", "SSN"})
in
#"Expanded AllRows"
Copy paste this code to the advanced editor in a new blank query to see how the solution works.
If this post helps or solves your problem, please mark it as solution (to help other users find useful content and to acknowledge the work of users that helped you)
Kudoes are nice too
Have fun
Jimmy
Sign up to receive a private message when registration opens and key events begin.
If you have recently started exploring Fabric, we'd love to hear how it's going. Your feedback can help with product improvements.