Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
Hercules78
New Member

Check similarity percentage between names using DAX

Hi all 
Can Power BI's DAX language help determine how similar or different the names of thousands of customers are in a single column? The names are entered in various formats with different spacing or punctuation.

By calculating a similarity percentage, I want to identify which names are completely unique and which ones are similar to others.

This will allow me to manually investigate and correct the similar names.
I need to add , that this column exist in an Excel file , so maybe we could implement this out Power Query , if some one know another way , please share it her 🙂 
Thanks in advance 
Regards 

4 REPLIES 4
v-yangliu-msft
Community Support
Community Support

Hi  @Hercules78 ,

I created some data:

vyangliumsft_0-1689659102716.png

In DAX/Power Query, we can't check the result of each character in two columns in comparison, for example, ABC and ACB, if you don't consider this character order, you can try the following:

1. Add Column.

vyangliumsft_1-1689659102720.png

 

 

=Text.ToList([Group2])

 

 

vyangliumsft_4-1689659149270.png

Extends the value to the current table.

vyangliumsft_5-1689659163853.png

 

 

if Text.Contains([Group],[Name2 List]) then 1 else 0

 

 

vyangliumsft_0-1689659300637.png

Group BY:

vyangliumsft_1-1689659316260.png

 

 

Text.Length([Group])

 

 

vyangliumsft_2-1689659316262.png

[Match]/[Custom]

vyangliumsft_3-1689659344072.png

Set [Custom.1]  -- Change Type -- Percentage

2. Result:

vyangliumsft_4-1689659344073.png

 

This is the related document, you can view this content:

Fuzzy Clustering in Power BI using Power Query: Finding similar values - RADACAD

 

Best Regards,

Liu Yang

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly

Thanks a lot 
I just want to add that I dont have 2 columns , I have only 1 column and I want to check similarity betweens rows !! 
Thansk again 

There are scripts to implement Levenshtein in SQL. Feel free to adapt these for Power Query.

 

You can also do a Pearson correlation of a column against itself.

lbendlin
Super User
Super User

There is no official SOUNDEX or Levenshtein etc. implementation in DAX. You can consider running Python or R scripts inside of Power BI for that.

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.