cancel
Showing results for
Did you mean:

The ultimate Microsoft Fabric, Power BI, Azure AI & SQL learning event! Join us in Las Vegas from March 26-28, 2024. Use code MSCUST for a \$100 discount. Register Now

Anonymous
Not applicable

## Measure equavalent for string similarity formula

I'm currently using the formula below to compare strings from two columns and give me a line-by-line % match:

let
Source = Text.ToList([Name1]),
Reference = Text.ToList([Name2]),
SourceCount = List.Count(Source),
SourceNotInReferenceCount = List.Count(List.Difference(Source, Reference, Comparer.OrdinalIgnoreCase)),
PercentSourceInReference = 1 - (SourceNotInReferenceCount / SourceCount)
in
PercentSourceInReference

However, I now want to apply the same formula to a calculated table, but I can't, because I'm not able to use Power Query. Is there a way to achieve the same outcome with a Measure?

Any help that can be provided would really be appreciated.

1 ACCEPTED SOLUTION
Super User

@Anonymous

It is remarkable how much more adept than DAX M is at cases like this. What in the query editor looks relatively straightforward becomes rather irksome here. You can create your calculated column in the table you show (Table1):

Edited: Removed var _Word2 which was no longer necessary

```Similarity =
VAR _String1 =
LOWER ( Table1[Name1] )
VAR _String2 =
LOWER ( Table1[Name2] )
VAR _Word1 =
GENERATESERIES ( 1; LEN ( _String1 ) );
"Letter"; MID ( _String1; [Value]; 1 )
)
VAR _SourceNotInReferenceCount =
SUMX (
SUMMARIZE (
_Word1;
[Letter];
"Occurrences"; MAX (
( LEN ( _String1 ) - LEN ( SUBSTITUTE ( _String1; [Letter]; "" ) ) )
- ( LEN ( _String2 ) - LEN ( SUBSTITUTE ( _String2; [Letter]; "" ) ) );
0
)
);
[Occurrences]
)
VAR _SourceCount =
LEN ( _String1 )
VAR _PercentSourceInReference =
1 - DIVIDE ( _SourceNotInReferenceCount; _SourceCount )
RETURN
_PercentSourceInReference```

Code formatted with

6 REPLIES 6
Super User

Hi @Anonymous

It would help immensly if you showed an example based on your data with the expected result. So that what you need is clear.

Additionally, please always show your sample data in text-tabular format in addition to (or instead of) the screen captures. That allows people trying to help to readily copy the data and run a quick test, plus it increases the likelihood of your question being answered. Just use 'Copy table' in Power BI and paste it here.

That said, and if I understand your requirement correctly, try this, where Table1 is the table that contains the two columns [Name1] and [Name2]

```Measure =
VAR SourceCount =
COUNTROWS ( Table1 )
VAR SourceNotInReferenceCount =
COUNTROWS ( FILTER ( Table1, Table1[Name1] = Table1[Name2] ) )
VAR PercentSourceInReference =
1 - DIVIDE ( SourceNotInReferenceCount, SourceCount )
RETURN
PercentSourceInReference```

or, in another interpretation, since duplicate values are supported in List.Difference:

```Measure =
VAR SourceCount =
COUNTROWS ( Table1 )
VAR SourceNotInReferenceCount =
SUMX (
DISTINCT ( Table1[Name1] ),
MIN (
CALCULATE ( COUNT ( Table1[Name1] ) )
- CALCULATE (
COUNT ( Table1[Name2] ),
FILTER ( Table1, Table1[Name2] = EARLIER ( Table1[Name1] ) )
),
0
)
)
VAR PercentSourceInReference =
1 - DIVIDE ( SourceNotInReferenceCount, SourceCount )
RETURN
PercentSourceInReference```
Anonymous
Not applicable

Hi @AlB, thanks or your response, and your advice. Apologies for not being as clear as I could have been. I've mocked up some example results using the initial formula, the results of which are below:

Name1                                Name2                                Similarity

 Cat Cat 1 Dog Dog 1 Tree Trees 1 Marked Marking 0.666667 Hello Goodbye 0.4 Hello how are you Hello how are you 1 Goodbye how are you Goodbye how are you 1 Stop to smell the flowers I don't care about flowers 0.64 Knock on the door Rattle the window latch 0.647059

I've had a play around with your measures (thanks again), but unfortunately all I'm getting is a similarity rating of '1' for each line. I think I wasn't as clear as I should have been with my requirements.

Hopefully this helps to clear things up a bit?

New Member

Yo quiero revisar similitudes pero en una misma columna como podria hacer eso?}

Super User

@Anonymous

It is remarkable how much more adept than DAX M is at cases like this. What in the query editor looks relatively straightforward becomes rather irksome here. You can create your calculated column in the table you show (Table1):

Edited: Removed var _Word2 which was no longer necessary

```Similarity =
VAR _String1 =
LOWER ( Table1[Name1] )
VAR _String2 =
LOWER ( Table1[Name2] )
VAR _Word1 =
GENERATESERIES ( 1; LEN ( _String1 ) );
"Letter"; MID ( _String1; [Value]; 1 )
)
VAR _SourceNotInReferenceCount =
SUMX (
SUMMARIZE (
_Word1;
[Letter];
"Occurrences"; MAX (
( LEN ( _String1 ) - LEN ( SUBSTITUTE ( _String1; [Letter]; "" ) ) )
- ( LEN ( _String2 ) - LEN ( SUBSTITUTE ( _String2; [Letter]; "" ) ) );
0
)
);
[Occurrences]
)
VAR _SourceCount =
LEN ( _String1 )
VAR _PercentSourceInReference =
1 - DIVIDE ( _SourceNotInReferenceCount; _SourceCount )
RETURN
_PercentSourceInReference```

Code formatted with

The code seems to work great. For those who are going to use it, please make sure you replace semicolons with comas.

Anonymous
Not applicable

"It is glaring how much more adept than DAX M is for cases like this. What in the query editor looks relatively straightforward becomes rather irksome here."

Looking at your solution, the statement above appears to be a bit of an understatement! And that's not meant to sound like a criticism - the solution works perfectly, thank you so much @AlB 🙂

Announcements

#### Microsoft Fabric Community Conference

Join us at our first-ever Microsoft Fabric Community Conference, March 26-28, 2024 in Las Vegas with 100+ sessions by community experts and Microsoft engineering.

#### Power BI Monthly Update - February 2024

Check out the February 2024 Power BI update to learn about new features.

#### Microsoft Fabric Career Hub

Explore career paths and learn resources in Fabric.

#### Microsoft Fabric Partner Community

Engage with the Fabric engineering team, hear of product updates, business opportunities, and resources in the Fabric Partner Community.

Top Solution Authors
Top Kudoed Authors