Reply
P_Allert
Frequent Visitor
Partially syndicated - Outbound

Comparing list values to find similarities

I have a list of values, each 2-4 characters long. This is an example of such a list:

7R
31A
1AB
3ZB
1EM
8Z
9UZ
7FH
42AB

 

I would like to find out if any of the values in the list is similar to any other value in the list.

 

By similar I mean for example do they share more than half of their characters, so that in the example above 1AB and 42AB would have a Similarity, whereas 1AB and 3ZB wouldn't be similar as they only share the B.

 

I came up with this, but the Step "Similarity" only creates Lists that contain Errors and I have not figured out so far how to fix that.

let

Source = {"7R", "31A", "1AB", "3ZB", "1EM", "8Z", "9UZ", "7FH", "42AB"},
Table = Table.FromList(Source, Splitter.SplitByNothing(), {"Value"}),

// Add a custom column to compare each value with every other value
Similarity = Table.AddColumn(Table, "Similarity", each List.Select(Table[Value], each Text.Length(Text.Combine(Text.ToList([Value] & _), "")) > Text.Length([Value]) * 0.5)),

// Remove duplicate rows and expand the Similarity column to display the similar values
RemoveDuplicates = Table.Distinct(Similarity),
ExpandSimilarity = Table.ExpandListColumn(RemoveDuplicates, "Similarity"),

// Remove rows where the value is similar to itself
FilterOutSameValue = Table.SelectRows(ExpandSimilarity, each [Value] <> [Similarity])

in

FilterOutSameValue

Can anybody help me with this?

Thanks a lot.

1 ACCEPTED SOLUTION
Greg_Deckler
Super User
Super User

Syndicated - Outbound

@P_Allert Could you just use Fuzzy matching in a join/merge query? Table.FuzzyJoin and Table.FuzzyNestedJoin. If that does not provide enough flexibility, I created a DAX version that you can tweak and tune as you see fit.

Fuzzy - Microsoft Fabric Community



Follow on LinkedIn
@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
Power BI Cookbook Third Edition (Color)

DAX is easy, CALCULATE makes DAX hard...

View solution in original post

2 REPLIES 2
Greg_Deckler
Super User
Super User

Syndicated - Outbound

@P_Allert Could you just use Fuzzy matching in a join/merge query? Table.FuzzyJoin and Table.FuzzyNestedJoin. If that does not provide enough flexibility, I created a DAX version that you can tweak and tune as you see fit.

Fuzzy - Microsoft Fabric Community



Follow on LinkedIn
@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
Power BI Cookbook Third Edition (Color)

DAX is easy, CALCULATE makes DAX hard...

Syndicated - Outbound

Hi Greg,

thanks for your reply.

 

The FuzzyNestedJoin-Hint did it. I had already played around with it a few hours ago, but didn't succeed then.

 

The Fuzzy Solution in DAX looks very impressive and I'll definately try to figure out how it works, even though may limited DAX skills will make this an "interesting" journey... 😃

 

Thanks again.

 

Cheers
Patrick

 

 

avatar user

Helpful resources

Announcements
Las Vegas 2025

Join us at the Microsoft Fabric Community Conference

March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount!

FebPBI_Carousel

Power BI Monthly Update - February 2025

Check out the February 2025 Power BI update to learn about new features.

Feb2025 NL Carousel

Fabric Community Update - February 2025

Find out what's new and trending in the Fabric community.

Top Solution Authors (Last Month)
Top Kudoed Authors (Last Month)