Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Calling all Data Engineers! Fabric Data Engineer (Exam DP-700) live sessions are back! Starting October 16th. Sign up.

Reply
msmays5
Helper II
Helper II

Does Text Contain All Words

Hi all,

 

I have a table RequiredWords with columns [Template] and [Words].

 

TemplateWords
Template 1PERSON;WOMAN;MAN
Template 2CAMERA;TV
Template 3CAT;DOG

 

I also have Table1[Value] that contains free text. My goal is to determine if the text in [Value] contains all of the words for at least one of the templates. While it is not currently a requirement to return the template that matched, I have a feeling that could change in the future, so that'd be a bonus.

 

ValueResult
PERSON random text WOMAN more random text MANMatch (contains all three words of template 1)
DOG and CAMERANo match (does not contain all words for any of the templates)

 

Thanks so much for all of your help!

3 ACCEPTED SOLUTIONS
mahoneypat
Microsoft Employee
Microsoft Employee

Here is one way to do it with these two queries of your example data.  It also shows you which Template had the match (bonus kudos?).  To see how it works, just create a blank query, go to Advanced Editor, and replace the text there with the M code below.

Note that M is case sensitive, and this works because the matching words were all UPPERCASE.  If that is not the case with your real data, add a step to make everything upper or lower case before the lists and comparisons.

 

 

//call this one "Words"
let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WCknNLchJLElVMFTSUQpwDQr297MO9/d19LMGYqVYHSQVRkAVzo6+rkGO1iFhqFLGYKkQaxd/d6XYWAA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Template = _t, Words = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Template", type text}, {"Words", type text}}),
    #"Added Custom" = Table.AddColumn(#"Changed Type", "WordList", each Text.Split([Words], ";")),
    #"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Words"})
in
    #"Removed Columns"

 

 

 

//call this one "Text"
let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("VczBCsIwEIThVxlyUujFRyhaPDWVevAQegjtlghNVpIF9e1dWhG8/sx8zplL0187i+zTxBFCL8Gta2uLyJn+skZTmdbLGLAbOYm/pwK/LJCQifDkPBXwrOv4WLwQDnszVM6cujPUwbFum75WwzLixkxMBYkFX2/lNmjmrK/3Cgb6oUXN4QM=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Value = _t, Result = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Value", type text}, {"Result", type text}}),
    #"Removed Other Columns" = Table.SelectColumns(#"Changed Type",{"Value"}),
    #"Added Custom1" = Table.AddColumn(#"Removed Other Columns", "TextList", each Text.Split([Value], " ")),
    #"Added Custom" = Table.AddColumn(#"Added Custom1", "Custom", each Words),
    #"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom", "Custom", {"Template", "WordList"}, {"Template", "WordList"}),
    #"Added Custom2" = Table.AddColumn(#"Expanded Custom", "Match", each if List.ContainsAll([TextList], [WordList]) then "Yes" else "No"),
    #"Removed Other Columns1" = Table.SelectColumns(#"Added Custom2",{"Value", "Template", "Match"})
in
    #"Removed Other Columns1"

 

 

If this works for you, please mark it as the solution.  Kudos are appreciated too.  Please let me know if not.

Regards,

Pat





Did I answer your question? Mark my post as a solution! Kudos are also appreciated!

To learn more about Power BI, follow me on Twitter or subscribe on YouTube.


@mahoneypa HoosierBI on YouTube


View solution in original post

@msmays5 - try this. Two tables/queries:
This is called Words, and it looks like this - starts with your initial table above, then I added a column to convert it to lists. You can see what one of the embedded lists looks like:

edhans_0-1598321924750.png

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WCknNLchJLElVMFTSUQpwDQr297MO9/d19LMGYqVYHSQVRkAVzo6+rkGO1iFhqFLGYKkQaxd/d6XYWAA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Template = _t, Words = _t]),
    #"Added Custom" = Table.AddColumn(Source, "WordList", each Text.Split([Words], ";"))
in
    #"Added Custom"

Next table is this. It also converts your sentences to a list, then uses List.Generate to cycle through the Words lists (above). A 1 means a match was found in one of the templates, 0 means it wasn't. You can do what you want with the 1/0 records at that point.

edhans_1-1598322054759.png

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WCnANCvb3UyhKzEvJz1UoSa0oUQj393X0U8jNL0pFEQYKKsXqRCu5+LsrAIUVnB19XYMclWJjAQ==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Value = _t]),
    #"Added Custom" = Table.AddColumn(Source, "AllWords", each Text.Split([Value], " ")),
    ContainsWords = 
        Table.AddColumn(
            #"Added Custom",
            "Contains Words",
            each
                let
                    varCurrentWords = [AllWords],
                    varTemplates = Table.RowCount(Words)
                in
            Record.Field(
                Record.Combine(
                    List.Generate(
                        () => [x = 0, y = 0],
                        each [x] < varTemplates,
                        each
                            [
                                y = if List.ContainsAll(varCurrentWords, Words[WordList]{[x]})
                                    then [y] + 1
                                    else [y],
                                x = [x] + 1
                            ]
                    )
                ), 
                "y")
            )
in
    ContainsWords

 

How to use M code provided in a blank query:
1) In Power Query, select New Source, then Blank Query
2) On the Home ribbon, select "Advanced Editor" button
3) Remove everything you see, then paste the M code I've given you in that box.
4) Press Done
5) See this article if you need help using this M code in your model.



Did I answer your question? Mark my post as a solution!
Did my answers help arrive at a solution? Give it a kudos by clicking the Thumbs Up!

DAX is for Analysis. Power Query is for Data Modeling


Proud to be a Super User!

MCSA: BI Reporting

View solution in original post

ziying35
Impactful Individual
Impactful Individual

Hi, @msmays5 

First, convert the RequiredWords form to the following lists form

1.png

// RequiredWords
let
    Source = Table.FromRecords(Json.Document(Binary.Decompress(Binary.FromText("i65WCknNLchJLElVsoIzFQyVdJTC84tSioGCAa5Bwf5+1uH+vo5+1kCsVKuDXZMRkiZnR1/XIEfrkDCcqo1RVIdYu/i7K9XGAgA=", BinaryEncoding.Base64),Compression.Deflate))),
    trans = Table.TransformColumns(Source,{"Words", each Text.Split(_,";")}),
    toRows = Table.ToRows(trans)
in
    toRows

 

Then Table1's query code is written like this:

// Table1
let
    Source = Table.FromRecords(Json.Document(Binary.Decompress(Binary.FromText("i65WCkvMKU1VslIKcA0K9vdTKErMS8nPVShJrShRCPf3dfRTyM0vSkURBgoq1eogdLr4uysApRWcHX1dgxxxSIXoQOXB/JAwpdpYAA==", BinaryEncoding.Base64),Compression.Deflate))),
    result = Table.AddColumn(
                 Source, 
                 "Result", 
                 (r)=>let match_template = Text.Combine(
                                              List.Transform(
                                                  RequiredWords, 
                                                  each if List.ContainsAll({r[Value]}, _{1}, Text.Contains) then _{0} else null 
                                               ), 
                                              ", "
                                            ) 
                      in if match_template ="" then "No match" else "Match: " & match_template
              )
in
    result

 

The converted Table1 table is shown below:

2.png

View solution in original post

8 REPLIES 8
ziying35
Impactful Individual
Impactful Individual

Hi, @msmays5 

First, convert the RequiredWords form to the following lists form

1.png

// RequiredWords
let
    Source = Table.FromRecords(Json.Document(Binary.Decompress(Binary.FromText("i65WCknNLchJLElVsoIzFQyVdJTC84tSioGCAa5Bwf5+1uH+vo5+1kCsVKuDXZMRkiZnR1/XIEfrkDCcqo1RVIdYu/i7K9XGAgA=", BinaryEncoding.Base64),Compression.Deflate))),
    trans = Table.TransformColumns(Source,{"Words", each Text.Split(_,";")}),
    toRows = Table.ToRows(trans)
in
    toRows

 

Then Table1's query code is written like this:

// Table1
let
    Source = Table.FromRecords(Json.Document(Binary.Decompress(Binary.FromText("i65WCkvMKU1VslIKcA0K9vdTKErMS8nPVShJrShRCPf3dfRTyM0vSkURBgoq1eogdLr4uysApRWcHX1dgxxxSIXoQOXB/JAwpdpYAA==", BinaryEncoding.Base64),Compression.Deflate))),
    result = Table.AddColumn(
                 Source, 
                 "Result", 
                 (r)=>let match_template = Text.Combine(
                                              List.Transform(
                                                  RequiredWords, 
                                                  each if List.ContainsAll({r[Value]}, _{1}, Text.Contains) then _{0} else null 
                                               ), 
                                              ", "
                                            ) 
                      in if match_template ="" then "No match" else "Match: " & match_template
              )
in
    result

 

The converted Table1 table is shown below:

2.png

Thanks, @ziying35! There is a complication I wasn't aware of (that your answer solves for!), which is how to handle required phrases. (@edhans and @mahoneypat my apologies for not realizing that need - I'm sure you both would have great solutions to that as well, but because you split at the space, your solutions didn't work for this completely changed requirement)

 

One thing that was not working with @ziying35 solution is that for Template 1, the word MAN is not actually required to return a value -- this value "WOMAN abc PERSON OF THE YEAR" returns true even though the word MAN is not included.  To fix this, I prepended and appended a space to the required Words and the Value fields to ensure that they are unique words. Thanks so much to all of you for your help, this is such a great community

ziying35
Impactful Individual
Impactful Individual

@msmays5 

Try this:

// RequiredWords
let
    Source = Table.FromRecords(Json.Document(Binary.Decompress(Binary.FromText("i65WCknNLchJLElVsoIzFQyVdJTC84tSioGCAa5Bwf5+1uH+vo5+1kCsVKuDXZMRkiZnR1/XIEfrkDCcqo1RVIdYu/i741JrgqTUyLC4BOKKWAA=", BinaryEncoding.Base64),Compression.Deflate))),
    trans = Table.TransformColumns(Source,{"Words", each Text.Split(_,";")}),
    toRows = Table.ToRows(trans)
in
    toRows

 

// Table1
let
    Source = Table.FromRecords(Json.Document(Binary.Decompress(Binary.FromText("i65WCkvMKU1VslIKcA0K9vdTKErMS8nPVShJrShRCPf3dfRTyM0vSkURBgoq1eogdLr4uysApRWcHX1dgxxxSIXoQOXB/JAwFGUQi6AKccgYGRaXoEghS4AMd1GqjQUA", BinaryEncoding.Base64),Compression.Deflate))),
    result = Table.AddColumn(
                 Source, 
                 "Result", 
                 (r)=>let match_template = Text.Combine(
                                              List.Transform(
                                                  RequiredWords, 
                                                  each if List.ContainsAll(Text.SplitAny(r[Value], Text.Remove(Text.Combine({" ".."@"}), {"0".."9"})) , _{1}) then _{0} else null 
                                               ), 
                                              ", "
                                            ) 
                      in if match_template ="" then "No match" else "Match: " & match_template
              )
in
    result
mahoneypat
Microsoft Employee
Microsoft Employee

Here is one way to do it with these two queries of your example data.  It also shows you which Template had the match (bonus kudos?).  To see how it works, just create a blank query, go to Advanced Editor, and replace the text there with the M code below.

Note that M is case sensitive, and this works because the matching words were all UPPERCASE.  If that is not the case with your real data, add a step to make everything upper or lower case before the lists and comparisons.

 

 

//call this one "Words"
let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WCknNLchJLElVMFTSUQpwDQr297MO9/d19LMGYqVYHSQVRkAVzo6+rkGO1iFhqFLGYKkQaxd/d6XYWAA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Template = _t, Words = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Template", type text}, {"Words", type text}}),
    #"Added Custom" = Table.AddColumn(#"Changed Type", "WordList", each Text.Split([Words], ";")),
    #"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Words"})
in
    #"Removed Columns"

 

 

 

//call this one "Text"
let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("VczBCsIwEIThVxlyUujFRyhaPDWVevAQegjtlghNVpIF9e1dWhG8/sx8zplL0187i+zTxBFCL8Gta2uLyJn+skZTmdbLGLAbOYm/pwK/LJCQifDkPBXwrOv4WLwQDnszVM6cujPUwbFum75WwzLixkxMBYkFX2/lNmjmrK/3Cgb6oUXN4QM=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Value = _t, Result = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Value", type text}, {"Result", type text}}),
    #"Removed Other Columns" = Table.SelectColumns(#"Changed Type",{"Value"}),
    #"Added Custom1" = Table.AddColumn(#"Removed Other Columns", "TextList", each Text.Split([Value], " ")),
    #"Added Custom" = Table.AddColumn(#"Added Custom1", "Custom", each Words),
    #"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom", "Custom", {"Template", "WordList"}, {"Template", "WordList"}),
    #"Added Custom2" = Table.AddColumn(#"Expanded Custom", "Match", each if List.ContainsAll([TextList], [WordList]) then "Yes" else "No"),
    #"Removed Other Columns1" = Table.SelectColumns(#"Added Custom2",{"Value", "Template", "Match"})
in
    #"Removed Other Columns1"

 

 

If this works for you, please mark it as the solution.  Kudos are appreciated too.  Please let me know if not.

Regards,

Pat





Did I answer your question? Mark my post as a solution! Kudos are also appreciated!

To learn more about Power BI, follow me on Twitter or subscribe on YouTube.


@mahoneypa HoosierBI on YouTube


@msmays5 - try this. Two tables/queries:
This is called Words, and it looks like this - starts with your initial table above, then I added a column to convert it to lists. You can see what one of the embedded lists looks like:

edhans_0-1598321924750.png

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WCknNLchJLElVMFTSUQpwDQr297MO9/d19LMGYqVYHSQVRkAVzo6+rkGO1iFhqFLGYKkQaxd/d6XYWAA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Template = _t, Words = _t]),
    #"Added Custom" = Table.AddColumn(Source, "WordList", each Text.Split([Words], ";"))
in
    #"Added Custom"

Next table is this. It also converts your sentences to a list, then uses List.Generate to cycle through the Words lists (above). A 1 means a match was found in one of the templates, 0 means it wasn't. You can do what you want with the 1/0 records at that point.

edhans_1-1598322054759.png

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WCnANCvb3UyhKzEvJz1UoSa0oUQj393X0U8jNL0pFEQYKKsXqRCu5+LsrAIUVnB19XYMclWJjAQ==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Value = _t]),
    #"Added Custom" = Table.AddColumn(Source, "AllWords", each Text.Split([Value], " ")),
    ContainsWords = 
        Table.AddColumn(
            #"Added Custom",
            "Contains Words",
            each
                let
                    varCurrentWords = [AllWords],
                    varTemplates = Table.RowCount(Words)
                in
            Record.Field(
                Record.Combine(
                    List.Generate(
                        () => [x = 0, y = 0],
                        each [x] < varTemplates,
                        each
                            [
                                y = if List.ContainsAll(varCurrentWords, Words[WordList]{[x]})
                                    then [y] + 1
                                    else [y],
                                x = [x] + 1
                            ]
                    )
                ), 
                "y")
            )
in
    ContainsWords

 

How to use M code provided in a blank query:
1) In Power Query, select New Source, then Blank Query
2) On the Home ribbon, select "Advanced Editor" button
3) Remove everything you see, then paste the M code I've given you in that box.
4) Press Done
5) See this article if you need help using this M code in your model.



Did I answer your question? Mark my post as a solution!
Did my answers help arrive at a solution? Give it a kudos by clicking the Thumbs Up!

DAX is for Analysis. Power Query is for Data Modeling


Proud to be a Super User!

MCSA: BI Reporting

@mahoneypat @edhans Thank you both for your quick answers! They both worked perfectly (and @mahoneypat, extra kudos for returning the template name!)

Greg_Deckler
Community Champion
Community Champion

@msmays5 - I feel like you are going to need Text to Table - https://community.powerbi.com/t5/Quick-Measures-Gallery/Text-to-Table/td-p/1312929

 

Beyond that, I'll have to do a little work, but I think that will point you in the right direction and you may get to what you want before I have time to sit down and work through it in detail.



Follow on LinkedIn
@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
DAX For Humans

DAX is easy, CALCULATE makes DAX hard...

Thanks @Greg_Deckler for your quick response, I'll take a look. What you linked was DAX (which I'm not opposed to, but I was thinking I'd tackle this in PQ so I can push it back to a Dataflow). The text I need to parse is long and has very high cardinality, so I'd prefer to identify if the text matches a template (either return as logical or if we return the template name, at least it should have low cardinality) and then drop the free text [Value] column.

Helpful resources

Announcements
FabCon Global Hackathon Carousel

FabCon Global Hackathon

Join the Fabric FabCon Global Hackathon—running virtually through Nov 3. Open to all skill levels. $10,000 in prizes!

September Power BI Update Carousel

Power BI Monthly Update - September 2025

Check out the September 2025 Power BI update to learn about new features.

FabCon Atlanta 2026 carousel

FabCon Atlanta 2026

Join us at FabCon Atlanta, March 16-20, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.

Top Kudoed Authors