Microsoft Fabric Community Conference 2025, March 31 - April 2, Las Vegas, Nevada. Use code FABINSIDER for a $400 discount.
Register nowGet inspired! Check out the entries from the Power BI DataViz World Championships preliminary rounds and give kudos to your favorites. View the vizzies.
In my table, I have a column meant for comments which has a short descriptor paragraph. Within this paragraph there may be certain keywords, which are mapped to specific string value categories (I currently have them in a separate lookup table). What I am looking to do is add a column that is a concatenation of all the different categories for which the corresponding keyword is found in the paragraph.
Here is the pseudo-code I have written out for my problem:
For each word in the paragraph
If the word is in the keyword column
Append the corresponding category
Here is an example of the results I'm looking for:
Is this possible in PowerBI? Any help would be greatly appreciated.
Solved! Go to Solution.
Hi @BWilkinson
Thanks for giving a toguh problem to solve, it took me almost 3 hours to resolve.
It is a bit convuluted way of working I could come out in DAX.
I had followed Marc Russo's article
http://sqlblog.com/blogs/marco_russo/archive/2011/12/30/string-comparison-in-dax.aspx
go through the comments and reply by Marc Russo to Tommy.
Ok based on that.
1. I created the raw text file as my source data called - TextData containing the column Text values same as your sample.
2. I created another table called Words which contains two colums, Word and ID. Word is the list of search terms and ID is a running sequence number. Example - Dog,1 ; Test,2 etc,
3. I created a column in TextData table as under.
Found = calculate(
if( countrows(values('Words'[ID]))>1, 0, values(('Words'[ID])) ),
filter(
all('Words'[Word]),
search('Words'[Word],TextData[Text],1,0)<>0
)
)
For explanation see the MArc Russo's article.
4. I created a table called Category containing ID and Cat example - 1, Animal Food ; 2 , Paragraph
5. Created a link between the Found and the Category table on ID using manage relationship.
6. Built a table report
7. Hope this is what you wanted.
8. There are few issues you should be aware that this will only search for the occurance of the search word in the data file. If it has more than one word in the current row then it will not display the corresponding category.
Let me know if this works for you.
Please accept this as solution if it works and give KUDOS.
Cheers
CheenuSing
This is how I solved it (using M)
Table 4
let Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45Wcq1IzC3ISVXSUQpJLS5RitWJVnLMy8xNzAGKuOSngwXc8vNTgNyAzKqqRKXYWAA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Category = _t, Keyword = _t]), #"Changed Type" = Table.TransformColumnTypes(Source,{{"Category", type text}, {"Keyword", type text}}) in #"Changed Type"
Table4
List X1
let Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45Wcq1IzC3ISVXSUQpJLS5RitWJVnLMy8xNzAGKuOSngwXc8vNTgNyAzKqqRKXYWAA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Category = _t, Keyword = _t]), #"Changed Type" = Table.TransformColumnTypes(Source,{{"Category", type text}, {"Keyword", type text}}), #"Removed Other Columns" = Table.SelectColumns(#"Changed Type",{"Keyword"}), Keyword = #"Removed Other Columns"[Keyword] in Keyword
X1
Table 3
let Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WCsnILFYAokSFktTiEoWCxKLE9KLEggylWJ1oJd9KhZT8dIXEklSF3EqFgsyqqkSwOFhTRmKxQl6+QnZqZXl+UUqxUmwsAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Paragraph = _t]), #"Changed Type" = Table.TransformColumnTypes(Source,{{"Paragraph", type text}}), #"Added Custom" = Table.AddColumn(#"Changed Type", "Custom", each (X1)), #"Expanded Custom" = Table.ExpandListColumn(#"Added Custom", "Custom"), #"Added Custom1" = Table.AddColumn(#"Expanded Custom", "Custom.1", each Text.Contains([Paragraph],[Custom],Comparer.OrdinalIgnoreCase)), #"Filtered Rows" = Table.SelectRows(#"Added Custom1", each ([Custom.1] = true)), #"Merged Queries" = Table.NestedJoin(#"Filtered Rows",{"Custom"},Table4,{"Keyword"},"Table4",JoinKind.LeftOuter), #"Expanded Table4" = Table.ExpandTableColumn(#"Merged Queries", "Table4", {"Category"}, {"Category"}), #"Removed Columns" = Table.RemoveColumns(#"Expanded Table4",{"Custom.1"}) in #"Removed Columns"
Table3
Hi @BWilkinson,
Use DAX lookup value need the two fields are same. For your scenairo, all th data type is text. So the upper and lowercase values of each letter is very important. For example, "Test" is not equal to "test", so I create the following sample table.
First, you need to split the Paragraph by the split column button or Power Query statement.
let Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("NYwxCsMwFEOvIv6cS3Rot27djAeBTWwa5xvb0Dqn70+gICR4POScvFLusBAj9oHKxrWxJlnkBDb3L0vdovjFyXMi6AqOiDJR83HQDEPWtz0Xbpd2nSZ27Ip3nB9toZvx9x+qQbz/AQ==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Paragraph = _t, Keyword = _t, Category = _t]), #"Changed Type" = Table.TransformColumnTypes(Source,{{"Paragraph", type text}, {"Keyword", type text}, {"Category", type text}}), #"Split Column by Delimiter" = Table.SplitColumn(#"Changed Type", "Paragraph", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"Paragraph.1", "Paragraph.2", "Paragraph.3", "Paragraph.4", "Paragraph.5"}), #"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Paragraph.1", type text}, {"Paragraph.2", type text}, {"Paragraph.3", type text}, {"Paragraph.4", type text}, {"Paragraph.5", type text}}) in #"Changed Type1"
You will get the following formula.
For here you need to compare very new column to the very word in Keyword column value. It's really complex, because the length of [Paragraph] is not sure. And we usually campare different value in each row, like [Paragraph].This to [Keyword].test, you need to campare the [Paragraph].This to all the values in [Keyword][test, dog, pizza], [there is regular rules. And DAX is unsupported fuzzy seach, so it's impossible to do it using DAX. Thanks for understanding.
Best Regards,
Angelia
Thanks for the response, but this is not what I'm looking for. Based on your suggested solution I would have to manually create M*N lookup columns, where M is the length of my longest paragraph and N is the number of keywords.
At that point I might as well just use a massive if contains statement in one power query column.
Hi @BWilkinson
Thanks for giving a toguh problem to solve, it took me almost 3 hours to resolve.
It is a bit convuluted way of working I could come out in DAX.
I had followed Marc Russo's article
http://sqlblog.com/blogs/marco_russo/archive/2011/12/30/string-comparison-in-dax.aspx
go through the comments and reply by Marc Russo to Tommy.
Ok based on that.
1. I created the raw text file as my source data called - TextData containing the column Text values same as your sample.
2. I created another table called Words which contains two colums, Word and ID. Word is the list of search terms and ID is a running sequence number. Example - Dog,1 ; Test,2 etc,
3. I created a column in TextData table as under.
Found = calculate(
if( countrows(values('Words'[ID]))>1, 0, values(('Words'[ID])) ),
filter(
all('Words'[Word]),
search('Words'[Word],TextData[Text],1,0)<>0
)
)
For explanation see the MArc Russo's article.
4. I created a table called Category containing ID and Cat example - 1, Animal Food ; 2 , Paragraph
5. Created a link between the Found and the Category table on ID using manage relationship.
6. Built a table report
7. Hope this is what you wanted.
8. There are few issues you should be aware that this will only search for the occurance of the search word in the data file. If it has more than one word in the current row then it will not display the corresponding category.
Let me know if this works for you.
Please accept this as solution if it works and give KUDOS.
Cheers
CheenuSing
Thank you very much for linking that article and for providing your solution. I was able to slightly modify your solution to get what I was looking for.
Is there any particular reason you split my lookup table into two queries? By keeping my one lookup table, and changing the "0" in the if statement to a CONCATENATEX(Words, [Category], " "), I was able to properly show multiple categories.
Hi @BWilkinson,
Yeah, because the length of pragrah is not sure, so there is no formula to operate them at once. I am not specific for Power Query, you can post it to Power Query forum to ge more professional support.
Best Regards,
Angelia
March 31 - April 2, 2025, in Las Vegas, Nevada. Use code FABINSIDER for a $400 discount!
Check out the February 2025 Power BI update to learn about new features.
User | Count |
---|---|
85 | |
79 | |
65 | |
52 | |
30 |
User | Count |
---|---|
115 | |
114 | |
71 | |
66 | |
39 |