This is best Fabric, Power BI, SQL and AI community event. How do we know? The last event sold out! Save €200 with code FABCMTY200.
Register nowA new Data Days event is coming soon! This time we’re going bigger than ever. Fabric, Power BI, SQL, AI and more. Don't miss out.
Hi everyone, I'm hoping someone might be able to help. I'm looking to create a column that shows groupings of similarly-named strings.
The column I need should show unique numbers. These groupings represent 'buckets' of similarly-named strings from a 'File Name' column.
A grouping should be a collection of similar file names - similar to a fuzzy matching threshold of 0.9.
If there is a string that doesn't have any other matches, it should have a Grouping number of 0.
Here's a table with some sample data, and the 'Grouping' column I need:
Note: The groupings I've suggested above might not exactly align to a similarity threshold of 0.9, I've used these groupings just as examples.
Any help would really be appreciated! 🙂
Solved! Go to Solution.
Hi @MichaelHutchens ,
Create a copy of the table by duplicating the table in Power query.
Perform Fuzzy Merge - Self-Join
In the merged table, remove duplicates so each pair appears only once by using Table.Distinct
Create a group ID table using DAX to assign a group ID to every file participating in a fuzzy match. You can then join it back to your original file list, and if any file name is not present in the Grouping table, assign that group to 0.
If this post helps, please give us Kudos and consider marking it Accept as solution to assist other members in finding it more easily.
Regards,
Chaithra
Hi @MichaelHutchens ,
Create a copy of the table by duplicating the table in Power query.
Perform Fuzzy Merge - Self-Join
In the merged table, remove duplicates so each pair appears only once by using Table.Distinct
Create a group ID table using DAX to assign a group ID to every file participating in a fuzzy match. You can then join it back to your original file list, and if any file name is not present in the Grouping table, assign that group to 0.
If this post helps, please give us Kudos and consider marking it Accept as solution to assist other members in finding it more easily.
Regards,
Chaithra
Thank you so much @v-echaithra , that worked perfectly 🙂 I really appreciate your time 🙂
Hi @MichaelHutchens - DAX does not support fuzzy matching natively, and more importantly, it cannot create calculated columns based on similarity clustering across rows.
suggest, You must use Power Query (M) or external processing (e.g., Python or R script in Power BI) for fuzzy clustering
some reference links:
Create a fuzzy match (Power Query) - Microsoft Support
Fuzzy match / merging in Power BI Desktop (October 2018)
Proud to be a Super User! | |
Thanks for that @rajendraongole1 , very useful. I've amended my original post to remove the DAX requirement. I'll take what solutions I can get.
Hi @MichaelHutchens ,
May I ask if you have gotten this issue resolved?
If it is solved, please mark the helpful reply or share your solution and accept it as solution, it will be helpful for other members of the community who have similar problems as yours to solve it faster.
Regards,
Chaithra.
Check out the May 2026 Power BI update to learn about new features.
Sign up to receive a private message when registration opens and key events begin.
If you have recently started exploring Fabric, we'd love to hear how it's going. Your feedback can help with product improvements.
| User | Count |
|---|---|
| 30 | |
| 28 | |
| 23 | |
| 19 | |
| 17 |
| User | Count |
|---|---|
| 49 | |
| 47 | |
| 41 | |
| 21 | |
| 19 |