Advance your Data & AI career with 50 days of live learning, dataviz contests, hands-on challenges, study groups & certifications and more!
Get registeredGet Fabric Certified for FREE during Fabric Data Days. Don't miss your chance! Learn more
Hi,
I am building a check for my database to eliminate duplicates. Straight up duplicates are easy to remove but sometimes people make typos or something, and I need to remove those too.
This is a rough example of my data:
| Reference | Date | Description |
| ref1 | 1-1-2020 | Blub |
| ref2 | 3-1-2020 | Blob |
| ref3 | 4-1-2020 | Blab |
| ref3-1 | 4-1-2020 | Blab |
| ref1 | 1-1-2020 | Blib |
Now I fuzzy matched it on itself to create this:
| Reference | Date | Description | Dup.Reference | Dup.Date | Dup.Description |
| ref1 | 1-1-2020 | Blub | ref1 | 1-1-2020 | Blib |
| ref2 | 3-1-2020 | Blob | ref2 | 3-1-2020 | Blob |
| ref3 | 4-1-2020 | Blab | ref3-1 | 4-1-2020 | Blab |
| ref3-1 | 4-1-2020 | Blab | ref3 | 4-1-2020 | Blab |
| ref1 | 1-1-2020 | Blib | ref1 | 1-1-2020 | Blub |
As you can see the same record effectively is still double, appearing once on the left as x-y and once on the right as y-x.
Now I need to find a way to eliminate these cross column duplicates and end up with this:
| Reference | Date | Description | Dup.Reference | Dup.Date | Dup.Description |
| ref1 | 1-1-2020 | Blub | ref1 | 1-1-2020 | Blib |
| ref3 | 4-1-2020 | Blab | ref3-1 | 4-1-2020 | Blab |
I need both sides to check whether the double record really is double or intended that way.
Thank you!
Power Query or DAX?
In DAX you can remove duplicates by using SELECTCOLUMNS to select each column or columns into a table VAR. Then UNION and then DISTINCT.
Hej,
No M in Power Query.
And I'm not sure how that would work...
Thanks though 🙂
Advance your Data & AI career with 50 days of live learning, contests, hands-on challenges, study groups & certifications and more!
Check out the October 2025 Power BI update to learn about new features.