Solved: Column to show string groupings

MichaelHutchens · ‎05-11-2025

Hi everyone, I'm hoping someone might be able to help. I'm looking to create a column that shows groupings of similarly-named strings.

The column I need should show unique numbers. These groupings represent 'buckets' of similarly-named strings from a 'File Name' column.

A grouping should be a collection of similar file names - similar to a fuzzy matching threshold of 0.9.

If there is a string that doesn't have any other matches, it should have a Grouping number of 0.

Here's a table with some sample data, and the 'Grouping' column I need:

File Path	File Name	Grouping
https://sharepointsite.com/sites/review_library/filename_content_reviewAug2021.pdf	filename_content_reviewAug2021.pdf	1
https://sharepointsite.com/sites/review_library/filename_content_reviewAug2022.pdf	filename_content_reviewAug2022.pdf	1
https://sharepointsite.com/sites/review_library/filename_content_reviewAug2023.pdf	filename_content_reviewAug2023.pdf	1
https://sharepointsite.com/sites/policy_library/mypolicy.doc	mypolicy.doc	2
https://sharepointsite.com/sites/content_library/mypolicy.doc	mypolicy.doc	2
https://sharepointsite.com/sites/backup_library/mypolicy.doc	mypolicy.doc	2
https://sharepointsite.com/sites/backup_library/strategy_document.pdf	strategy_document.pdf	0
https://sharepointsite.com/sites/policy_library/community_review_plan.xlsx	community_review_plan.xlsx	3
https://sharepointsite.com/sites/content_library/review plan for communities.doc	review plan for communities.doc	3
https://sharepointsite.com/sites/review_library/detailed plan for community review.xlsx	detailed plan for community review.xlsx	3
https://sharepointsite.com/sites/review_library/office-layout-guide.doc	office-layout-guide.doc	0

Note: The groupings I've suggested above might not exactly align to a similarity threshold of 0.9; I've used them just as examples.

Any help would really be appreciated! 🙂

v-echaithra · ‎05-27-2025

Hi @MichaelHutchens ,

Create a copy of the table by duplicating the query in Power Query.

Perform a fuzzy merge as a self-join:

Go to the Files table.
Home > Merge Queries > Merge Queries as New
Primary table: the original table
Secondary table: the duplicated table
Join on File Name using fuzzy matching:
Check Use fuzzy matching
Click Fuzzy matching options:
Similarity threshold: tweak as needed (the question suggests 0.9)
Check Ignore case
Max matches: leave blank
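
To see what a fuzzy self-join produces before building it in Power Query, here is a minimal Python sketch. It uses `difflib.SequenceMatcher` as a stand-in scorer, which is an assumption: Power Query's fuzzy merge has its own similarity measure, so scores will not match exactly. The function names `similarity` and `fuzzy_self_join` are made up for illustration.

```python
from difflib import SequenceMatcher
from itertools import product


def similarity(a: str, b: str, ignore_case: bool = True) -> float:
    """Score two strings from 0.0 to 1.0 (difflib's ratio, NOT Power Query's scorer)."""
    if ignore_case:
        a, b = a.lower(), b.lower()
    return SequenceMatcher(None, a, b).ratio()


def fuzzy_self_join(names, threshold=0.9):
    """Return every (left, right) pair of row indexes scoring >= threshold.

    Like merging a table with its own duplicate, the result contains each
    row matched with itself and both orderings of every matching pair.
    """
    return [
        (i, j)
        for (i, a), (j, b) in product(enumerate(names), repeat=2)
        if similarity(a, b) >= threshold
    ]


names = [
    "filename_content_reviewAug2021.pdf",
    "filename_content_reviewAug2022.pdf",
    "strategy_document.pdf",
]
pairs = fuzzy_self_join(names)
print(pairs)  # [(0, 0), (0, 1), (1, 0), (1, 1), (2, 2)]
```

Note that the output includes self-matches such as (0, 0) and mirrored pairs such as (0, 1) and (1, 0), which is why the merged table needs cleaning up afterwards.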

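In the merged table, remove duplicates with Table.Distinct so that each pair appears only once. Because this is a self-join, every row also matches itself; filter out those self-pairs too, otherwise no file will ever show up as unmatched.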
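
One thing to watch: a row-level distinct treats (A, B) and (B, A) as different rows, so mirrored pairs from a self-join survive it. The Python sketch below shows one way to normalise the pairs first; it illustrates the logic only and is not M code. The name `unique_pairs` is hypothetical, and the row keys could be row indexes or file paths.

```python
def unique_pairs(pairs):
    """Keep one copy of each unordered pair and drop self-matches.

    `pairs` holds (left, right) row keys from a self-join.
    (a, b) and (b, a) are treated as the same pair.
    """
    seen = set()
    result = []
    for left, right in pairs:
        if left == right:
            continue  # a row matched with itself says nothing about grouping
        key = (left, right) if left <= right else (right, left)
        if key not in seen:
            seen.add(key)
            result.append(key)
    return result


merged = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 2)]
print(unique_pairs(merged))  # [(0, 1)]
```

In Power Query the same idea would mean sorting the two key columns within each row before removing duplicates.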

Create a group ID table using DAX that assigns a group ID to every file participating in a fuzzy match. Then join it back to your original file list; any file name that is not present in the grouping table gets group 0.
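
The group ID step amounts to finding connected sets of matched rows. Here is a sketch of that logic in Python using a small union-find; it is not the DAX itself, just an illustration of the assignment rule. The function name `assign_groups` and the list of matched pairs are assumptions, chosen so the result lines up with the 11 sample rows in the question.

```python
def assign_groups(n_rows, pairs):
    """Give each row a group number; rows with no match get 0.

    `pairs` are (i, j) row indexes that matched. Rows linked directly or
    through a chain of matches share a group. Groups are numbered
    1, 2, ... in order of the first row that belongs to them.
    """
    parent = list(range(n_rows))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for i, j in pairs:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[max(ri, rj)] = min(ri, rj)

    roots = [find(i) for i in range(n_rows)]
    sizes = {}
    for r in roots:
        sizes[r] = sizes.get(r, 0) + 1

    group_of_root = {}
    groups = []
    for r in roots:
        if sizes[r] == 1:
            groups.append(0)  # no other row matched this one
        else:
            if r not in group_of_root:
                group_of_root[r] = len(group_of_root) + 1
            groups.append(group_of_root[r])
    return groups


# Hypothetical matched pairs for the 11 sample rows (0-based row indexes)
pairs = [(0, 1), (1, 2), (3, 4), (4, 5), (7, 8), (8, 9)]
print(assign_groups(11, pairs))
# [1, 1, 1, 2, 2, 2, 0, 3, 3, 3, 0]
```

Chained matches end up in one group here: if A matches B and B matches C, all three share an ID even when A and C score below the threshold against each other.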

If this post helps, please give us Kudos and consider marking it Accept as solution to assist other members in finding it more easily.

Regards,
Chaithra

MichaelHutchens · ‎05-28-2025

Thank you so much @v-echaithra , that worked perfectly 🙂 I really appreciate your time 🙂

rajendraongole1 · ‎05-12-2025

Hi @MichaelHutchens - DAX does not support fuzzy matching natively, and more importantly, it cannot create calculated columns based on similarity clustering across rows.

I suggest using Power Query (M) or external processing (e.g., a Python or R script in Power BI) for fuzzy clustering.
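
As a rough idea of what the Python route could look like, here is a small sketch that clusters file names in one pass. It uses a simple greedy approach (each name is compared with the first member of every existing cluster) and `difflib.SequenceMatcher` as the scorer; both are assumptions for illustration, not what Power BI does internally. The function name `cluster_names` is made up.

```python
from difflib import SequenceMatcher


def cluster_names(names, threshold=0.9):
    """Greedy clustering of names; returns one group number per name.

    A name joins the first cluster whose first member it matches at or
    above `threshold`, otherwise it starts a new cluster. Clusters with a
    single member are reported as group 0.
    """
    leaders = []      # first name seen in each cluster, lower-cased
    cluster_ids = []  # cluster index for each input name
    for name in names:
        key = name.lower()
        for idx, leader in enumerate(leaders):
            if SequenceMatcher(None, key, leader).ratio() >= threshold:
                cluster_ids.append(idx)
                break
        else:
            leaders.append(key)
            cluster_ids.append(len(leaders) - 1)

    counts = {}
    for c in cluster_ids:
        counts[c] = counts.get(c, 0) + 1

    numbering = {}
    groups = []
    for c in cluster_ids:
        if counts[c] == 1:
            groups.append(0)
        else:
            groups.append(numbering.setdefault(c, len(numbering) + 1))
    return groups


names = [
    "filename_content_reviewAug2021.pdf",
    "filename_content_reviewAug2022.pdf",
    "filename_content_reviewAug2023.pdf",
    "mypolicy.doc",
    "mypolicy.doc",
    "mypolicy.doc",
    "strategy_document.pdf",
]
groups = cluster_names(names)
print(groups)  # [1, 1, 1, 2, 2, 2, 0]
```

In a Power Query "Run Python script" step the input table arrives as a pandas DataFrame named `dataset`, so the File Name column could be passed to a function like this and the result written back as a new column. Greedy clustering depends on row order, so results can differ from a full pairwise comparison.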

Some reference links:

Create a fuzzy match (Power Query) - Microsoft Support

Fuzzy match / merging in Power BI Desktop (October 2018)

MichaelHutchens · ‎05-12-2025

Thanks for that @rajendraongole1 , very useful. I've amended my original post to remove the DAX requirement. I'll take what solutions I can get.

v-echaithra · ‎05-13-2025

Hi @MichaelHutchens ,

May I ask if you have gotten this issue resolved?

If it is solved, please mark the helpful reply as the solution, or share your own solution and accept it. This will help other community members with similar problems solve them faster.

Regards,
Chaithra.

MichaelHutchens · ‎05-13-2025

Thanks for checking in, @v-echaithra. No, it's not resolved yet.
