Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

We've captured the moments from FabCon & SQLCon that everyone is talking about, and we are bringing them to the community, live and on-demand. Starts on April 14th. Register now

Reply

Column to show string groupings

Hi everyone, I'm hoping someone might be able to help. I'm looking to create a column that shows groupings of similarly-named strings.

 

The column I need should show unique numbers. These groupings represent 'buckets' of similarly-named strings from a 'File Name' column.

 

A grouping should be a collection of similar file names - similar to a fuzzy matching threshold of 0.9.

 

If there is a string that doesn't have any other matches, it should have a Grouping number of 0.

 

Here's a table with some sample data, and the 'Grouping' column I need:

File PathFile NameGrouping
https://sharepointsite.com/sites/review_library/filename_content_reviewAug2021.pdffilename_content_reviewAug2021.pdf1
https://sharepointsite.com/sites/review_library/filename_content_reviewAug2022.pdffilename_content_reviewAug2022.pdf1
https://sharepointsite.com/sites/review_library/filename_content_reviewAug2023.pdffilename_content_reviewAug2023.pdf1
https://sharepointsite.com/sites/policy_library/mypolicy.docmypolicy.doc2
https://sharepointsite.com/sites/content_library/mypolicy.docmypolicy.doc2
https://sharepointsite.com/sites/backup_library/mypolicy.docmypolicy.doc2
https://sharepointsite.com/sites/backup_library/strategy_document.pdfstrategy_document.pdf0
https://sharepointsite.com/sites/policy_library/community_review_plan.xlsxcommunity_review_plan.xlsx3
https://sharepointsite.com/sites/content_library/review plan for communities.docreview plan for communities.doc3
https://sharepointsite.com/sites/review_library/detailed plan for community review.xlsxdetailed plan for community review.xlsx3
https://sharepointsite.com/sites/review_library/office-layout-guide.docoffice-layout-guide.doc0

 

Note: The groupings I've suggested above might not exactly align to a similarity threshold of 0.9, I've used these groupings just as examples.

Any help would really be appreciated! 🙂

1 ACCEPTED SOLUTION
v-echaithra
Community Support
Community Support

Hi @MichaelHutchens ,

Create a copy of the table by duplicating the table in Power query.

 

Perform Fuzzy Merge - Self-Join

 

  • Go to Files table.
  • Home > Merge Queries > Merge Queries as New
  • Primary table: Table
  • Secondary table: Duplicated Table
  • Join on File Name using Fuzzy Matching:
  • Check Use fuzzy matching
  • Click Fuzzy matching options:
  • Similarity threshold = tweak as needed
  • check ignore case
  • Max matches: blank

 

In the merged table, remove duplicates so each pair appears only once by using Table.Distinct

 

Create a group ID table using DAX to assign a group ID to every file participating in a fuzzy match. You can then join it back to your original file list, and if any file name is not present in the Grouping table, assign that group to 0.

If this post helps, please give us Kudos and consider marking it Accept as solution to assist other members in finding it more easily.

Regards,
Chaithra

View solution in original post

6 REPLIES 6
v-echaithra
Community Support
Community Support

Hi @MichaelHutchens ,

Create a copy of the table by duplicating the table in Power query.

 

Perform Fuzzy Merge - Self-Join

 

  • Go to Files table.
  • Home > Merge Queries > Merge Queries as New
  • Primary table: Table
  • Secondary table: Duplicated Table
  • Join on File Name using Fuzzy Matching:
  • Check Use fuzzy matching
  • Click Fuzzy matching options:
  • Similarity threshold = tweak as needed
  • check ignore case
  • Max matches: blank

 

In the merged table, remove duplicates so each pair appears only once by using Table.Distinct

 

Create a group ID table using DAX to assign a group ID to every file participating in a fuzzy match. You can then join it back to your original file list, and if any file name is not present in the Grouping table, assign that group to 0.

If this post helps, please give us Kudos and consider marking it Accept as solution to assist other members in finding it more easily.

Regards,
Chaithra

Thank you so much @v-echaithra , that worked perfectly 🙂 I really appreciate your time 🙂

rajendraongole1
Super User
Super User

Hi @MichaelHutchens  - DAX does not support fuzzy matching natively, and more importantly, it cannot create calculated columns based on similarity clustering across rows.

suggest, You must use Power Query (M) or external processing (e.g., Python or R script in Power BI) for fuzzy clustering

some reference links:

Create a fuzzy match (Power Query) - Microsoft Support

Fuzzy match / merging in Power BI Desktop (October 2018)

 





Did I answer your question? Mark my post as a solution!

Proud to be a Super User!





Thanks for that @rajendraongole1 , very useful. I've amended my original post to remove the DAX requirement. I'll take what solutions I can get.

Hi @MichaelHutchens ,

May I ask if you have gotten this issue resolved?

 

If it is solved, please mark the helpful reply or share your solution and accept it as solution, it will be helpful for other members of the community who have similar problems as yours to solve it faster.

 

Regards,
Chaithra.

Thanks for checking in @v-echaithra, no it's not resolved yet

Helpful resources

Announcements
New to Fabric survey Carousel

New to Fabric Survey

If you have recently started exploring Fabric, we'd love to hear how it's going. Your feedback can help with product improvements.

Power BI DataViz World Championships carousel

Power BI DataViz World Championships - June 2026

A new Power BI DataViz World Championship is coming this June! Don't miss out on submitting your entry.

Join our Fabric User Panel

Join our Fabric User Panel

Share feedback directly with Fabric product managers, participate in targeted research studies and influence the Fabric roadmap.

March Power BI Update Carousel

Power BI Community Update - March 2026

Check out the March 2026 Power BI update to learn about new features.