Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

The Power BI DataViz World Championships are on! With four chances to enter, you could win a spot in the LIVE Grand Finale in Las Vegas. Show off your skills.

Reply
TheOriginalDeeb
Frequent Visitor

Find & Count Matching Words in Lists

I'm having troble understanding List functions, which I think will be necessary to find a solution to my problem.

 

So I have basically 2 table columns: Phrase & Words

 

So for example, phrase 1 = '10 best aristotle quotes' and what I want to do is to compare that to a list of words AND count how many of those words appear in the phrase and create a percentage as the desired calculation

 

Phrase: 10 best aristotle quotes

Words: best quotes

Match: 2

Percentage: 50% (2 matches/4 words in phrase)

I'd appreaciate all ideas for a Power Query solution.

Also, if this explanation is not clear, please let me know and I will try to make it clearer. Thanks!

 

1 ACCEPTED SOLUTION

Hi @TheOriginalDeeb.

 

If you prefer M code solution. Then @Greg_Deckler's solution will be a great choice.

 

1. Go to Edit Queries => New Query, Blank Query => Advanced Editor. 

 

let
    fnCountMatches = (List1 as list, List2 as list) =>

let
    MyList = List1,  //{"ten","best","aristotle","quotes"},
    MyOtherList = List2,  //{"best","quotes"},
    
    MyNumerator = List.Count(List.Difference(MyList, MyOtherList)),
    MyDenominator = List.Count(MyList),
    MyResult = MyNumerator / MyDenominator

in
    MyResult
in
    fnCountMatches

2. Then go back to your source table. Split the columns first, then create a new custom column and invoke this function with expression:

 

fnCountMatches(#"Changed Type2"[Phrase],#"Changed Type2"[Words])

3.PNG

 

The entire M code:

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjRQSEotLlFILMosLskvyUlVKCzNL0ktVtJRAotDebGxAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Phrase = _t, Words = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Phrase", type text}, {"Words", type text}}),
    #"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(#"Changed Type", {{"Phrase", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Phrase"),
    #"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Phrase", type text}}),
    #"Split Column by Delimiter1" = Table.ExpandListColumn(Table.TransformColumns(#"Changed Type1", {{"Words", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Words"),
    #"Changed Type2" = Table.TransformColumnTypes(#"Split Column by Delimiter1",{{"Words", type text}}),
    #"Added Custom" = Table.AddColumn(#"Changed Type2", "Custom", each fnCountMatches(#"Changed Type2"[Phrase],#"Changed Type2"[Words])),
    #"Changed Type3" = Table.TransformColumnTypes(#"Added Custom",{{"Custom", Percentage.Type}})
in
    #"Changed Type3"

Please be noticed that custom M function cannot be used in Power BI Service currently.

 

Thanks,
Xi Jin.

 

 

View solution in original post

7 REPLIES 7
v-xjiin-msft
Solution Sage
Solution Sage

Hi @TheOriginalDeeb,

 

To achieve your requirement, you can refer to following method:

 

Sample data is like:

 

1.PNG

 

Go to Edit Queries -> Select column Phrase and choose Split Column option. Same to column Words

 

2.PNG3.PNG

 

Then Close & Apply.  Create a measure or a calculated column with following expression to calculate the Percentage:

 

Percentage =
CALCULATE (
    DISTINCTCOUNT ( 'Table'[Words] ),
    FILTER ( 'Table', 'Table'[Phrase] = 'Table'[Words] ),
    ALLEXCEPT ( 'Table', 'Table'[ID] )
)
    / CALCULATE (
        DISTINCTCOUNT ( 'Table'[Phrase] ),
        ALLEXCEPT ( 'Table', 'Table'[ID] )
    )

 

4.PNG

 

Thanks,
Xi Jin.

Thanks so much for taking the time to work on this and help me. 2 slight problems....

 

1. I'd really prefer an M code solution

 

2. I don't want to compare both columns on an individual word level, just phrase to word.

 

I've been able to use Text.Split to add 2 columns with the contents split out into Lists. But I'm stuck on how to count and match

Hi @TheOriginalDeeb.

 

If you prefer M code solution. Then @Greg_Deckler's solution will be a great choice.

 

1. Go to Edit Queries => New Query, Blank Query => Advanced Editor. 

 

let
    fnCountMatches = (List1 as list, List2 as list) =>

let
    MyList = List1,  //{"ten","best","aristotle","quotes"},
    MyOtherList = List2,  //{"best","quotes"},
    
    MyNumerator = List.Count(List.Difference(MyList, MyOtherList)),
    MyDenominator = List.Count(MyList),
    MyResult = MyNumerator / MyDenominator

in
    MyResult
in
    fnCountMatches

2. Then go back to your source table. Split the columns first, then create a new custom column and invoke this function with expression:

 

fnCountMatches(#"Changed Type2"[Phrase],#"Changed Type2"[Words])

3.PNG

 

The entire M code:

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjRQSEotLlFILMosLskvyUlVKCzNL0ktVtJRAotDebGxAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Phrase = _t, Words = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Phrase", type text}, {"Words", type text}}),
    #"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(#"Changed Type", {{"Phrase", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Phrase"),
    #"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Phrase", type text}}),
    #"Split Column by Delimiter1" = Table.ExpandListColumn(Table.TransformColumns(#"Changed Type1", {{"Words", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Words"),
    #"Changed Type2" = Table.TransformColumnTypes(#"Split Column by Delimiter1",{{"Words", type text}}),
    #"Added Custom" = Table.AddColumn(#"Changed Type2", "Custom", each fnCountMatches(#"Changed Type2"[Phrase],#"Changed Type2"[Words])),
    #"Changed Type3" = Table.TransformColumnTypes(#"Added Custom",{{"Custom", Percentage.Type}})
in
    #"Changed Type3"

Please be noticed that custom M function cannot be used in Power BI Service currently.

 

Thanks,
Xi Jin.

 

 

Thanks Xi Jin.

 

I'm still new to Power Query, and not too good with functions and parameters. I can follow Greg's example with the hard-coded list.

But I'm not sure how to pass in my own data (from a table) into the function.

 

I did get the formulas to work for the most part. But there is a problem with stemming. It does not match plural to singular (It won't find 'quote' in 'quotes')

 

I'm wondering if there any other List functions that might work?

Hi @TheOriginalDeeb,

 

You can refer to my sample report here: https://1drv.ms/u/s!AlqSnZZUVHmshBwJn8RyiDli-bVl

 

And for "quote" and "quotes". As I know there's no List function which can do the job. To resolve this issue, I think you need to hard code these words like create a new custom column with expression like below, then pass new column to the function. 

 

if Text.From([Phrase])="quotes" then "quote" else [Phrase]

Thanks,
Xi Jin.

 

Greg_Deckler
Super User
Super User

So, something like this?

 

let
    fnCountMatches = () =>

let
    MyList = {"ten","best","aristotle","quotes"},
    MyOtherList = {"best","quotes"},
    
    MyNumerator = List.Count(List.Difference(MyList, MyOtherList)),
    MyDenominator = List.Count(MyList),
    MyResult = MyNumerator / MyDenominator

in
    MyResult
in
    fnCountMatches


Follow on LinkedIn
@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
Power BI Cookbook Third Edition (Color)

DAX is easy, CALCULATE makes DAX hard...

Hi Greg,

 

I haven't used a lot of functions, so I'm not sure about this setup for analyzing multiple phrases.

 

On thing that I should have pointed out before is that the list or [Words] will vary by row most of the time.

 

So, if there are 100 rows in the [Phrase column], I'll also have 100 different lists in each row of [Words].

Helpful resources

Announcements
Las Vegas 2025

Join us at the Microsoft Fabric Community Conference

March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount!

Feb2025 Sticker Challenge

Join our Community Sticker Challenge 2025

If you love stickers, then you will definitely want to check out our Community Sticker Challenge!

Feb2025 NL Carousel

Fabric Community Update - February 2025

Find out what's new and trending in the Fabric community.