Microsoft Fabric Community Conference 2025, March 31 - April 2, Las Vegas, Nevada. Use code FABINSIDER for a $400 discount.
Register nowGet inspired! Check out the entries from the Power BI DataViz World Championships preliminary rounds and give kudos to your favorites. View the vizzies.
Hello.
I was wondering if it would be possible to give Power BI a long list of freeform text and also give it a "dictionary" of predefined values then have it search our long list of freeform text and extract those values from the "dictionary".
Example: A survey is sent out asking about competitors and a longform text is returned to us. We want to only extract that competitors name from that longform text.
If theres a why for it to "learn" the format that the competitors are put in and then adapt to learn new ones that would also be adventageous.
Solved! Go to Solution.
See the working here - Open a blank query - Home - Advanced Editor - Remove everything from there and paste the below code to test
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("hVJBbsJADPyKlXNVIVBVcYRWapHghnqhHJysQyw2a+R1iOhr+pa+rJtAWqkq7c2enZn1jnezyWY1Htlhtr3ZZBgcaZQANdWiJyhQCXIqsIkEbBEKL5EUTGDH6ktWCi6JoMSa/an3WFM0Druzn+ai4BOjGaBZD61JFQuCxfIv0CeRSfh4P/cx1fCE6ijEa8gR/ZdgTrqPsCSOTXrGko/DDHPfEDyTSvjRDsJ/gVx5V1mLRtr3T0IRoil2cBfPywq6XKhG9uDSaXhtRqPxvUErur+4PEgwLAw4lHKV9Jt1W1GARVqP9xfWinRHDlq2CgbbxSNM7ibT6fT2fFuVdgdzFdk79N97lRKsIoj8NtSsMFvCODEcTCDH5IkHVKspWG8VDU8pzDQ4LIJjDOfv46VA47QSjhDEUgIghw7ItttP", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Data = _t]),
DictList = List.Buffer(Dict[Word List]),
#"Added Custom" = Table.AddColumn(Source, "Result", each List.First(List.RemoveNulls(List.Transform(DictList,(x)=>if Text.Contains([Data],x, Comparer.OrdinalIgnoreCase) then x else null))))
in
#"Added Custom"
Code for table named Dict
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("bVLbbsIwDP0Vi+f9BN1VE9UQTLwgHlxqSlQ3QU5a1r+fO5LSor3Z5zi2z3H2+8WyKKiHF2LYYA9b14bz4vA04CSWwrmHFbZC7CFHU5K9kSXaQDEsja2Cs/MMPtuLCSQP6JrxSP9h8MyEorPqO+0nTf1Q6IwNEXGCMRKDwRxjdjoZa0I/SyB3uqyIytwZZqziBMbGqWh7jIJZFYdOKxJftM2MV/UPuTRo4Qtrf0d8hckZph/jYakN7RR49UFL0hOGLaGqG3OqxE1jyFxvB58ywnH0jZl5HLE1Ss1oyxm4IYvGe909rcaDo39iIScs3TXNv5xRKIToJwfcmRLhzUmAvFd5EW+T/3bQnA4lhRtuaIfTTpBvdX+8u/D0v/hB2TuKehtbd8iRzEhqDysyXj8grEyXumbcEnyQpDox1TlccbSiwU6XXhwOvw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Word List" = _t])
in
Source
Please post your data for us to have a look.
I cannot post my exact data but here is some dummy data:
Data:
Longform Data | Expected Result From Query |
Amavida | Amavida |
anderson memory care because its closer to girlfirendand family | Anderson |
Testing | |
arbor landuing | Arbor Landing |
Arbor Terrace IL | Arbor Terrace |
Arbor Terrace IL | Arbor Terrace |
Arlington | Arlington |
Aston Gardens | Aston Gardens |
Aston Gardens | Aston Gardens |
Avalon | Avalon |
Berks Leisure Living | Berks Leisure Living |
Blue Heron | Blue Heron |
Blue Heron | Blue Heron |
Blue Heron | Blue Heron |
Blue Heron | Blue Heron |
brightwater | Brightwater |
Goes straight to VM and email doesn't work | |
Contact info doesn't work | |
Goes straight to VM when I call | |
Merged with Contact ID 353999. | |
Chose Brookdale because of the size of their AL 2 bed 3 bath apartment | Brookdale |
staying in Indiana | |
location is not an option |
"Dictionary"
This would be a long list of competitor/community names. Here is a short list for an example. It would be MUCH longer (1000+ items)
Abbey Del Ray South |
Abernethy Laurels Maiden |
Adante |
Addington |
Addington Jupiter |
Addington Place |
Addington Place Clear Lake |
Addison |
Addison Point |
Adora |
Adriatica |
Affinity |
Affinity Monterrey Village |
Alamo Ranch |
Albertville |
Album Ranch |
Alden Ranch |
Alderman Oaks |
Aldersgate |
Alexis Allen |
Alexis Estates |
All Seasons |
Allegro |
Allegro Boynton Beach |
Allegro Jupiter |
Allegro Parkland |
Allegro Renaissance |
Allisonville Meadows |
Alpharetta |
AltaVida Fort Myers |
Alura |
See the working here - Open a blank query - Home - Advanced Editor - Remove everything from there and paste the below code to test
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("hVJBbsJADPyKlXNVIVBVcYRWapHghnqhHJysQyw2a+R1iOhr+pa+rJtAWqkq7c2enZn1jnezyWY1Htlhtr3ZZBgcaZQANdWiJyhQCXIqsIkEbBEKL5EUTGDH6ktWCi6JoMSa/an3WFM0Druzn+ai4BOjGaBZD61JFQuCxfIv0CeRSfh4P/cx1fCE6ijEa8gR/ZdgTrqPsCSOTXrGko/DDHPfEDyTSvjRDsJ/gVx5V1mLRtr3T0IRoil2cBfPywq6XKhG9uDSaXhtRqPxvUErur+4PEgwLAw4lHKV9Jt1W1GARVqP9xfWinRHDlq2CgbbxSNM7ibT6fT2fFuVdgdzFdk79N97lRKsIoj8NtSsMFvCODEcTCDH5IkHVKspWG8VDU8pzDQ4LIJjDOfv46VA47QSjhDEUgIghw7ItttP", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Data = _t]),
DictList = List.Buffer(Dict[Word List]),
#"Added Custom" = Table.AddColumn(Source, "Result", each List.First(List.RemoveNulls(List.Transform(DictList,(x)=>if Text.Contains([Data],x, Comparer.OrdinalIgnoreCase) then x else null))))
in
#"Added Custom"
Code for table named Dict
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("bVLbbsIwDP0Vi+f9BN1VE9UQTLwgHlxqSlQ3QU5a1r+fO5LSor3Z5zi2z3H2+8WyKKiHF2LYYA9b14bz4vA04CSWwrmHFbZC7CFHU5K9kSXaQDEsja2Cs/MMPtuLCSQP6JrxSP9h8MyEorPqO+0nTf1Q6IwNEXGCMRKDwRxjdjoZa0I/SyB3uqyIytwZZqziBMbGqWh7jIJZFYdOKxJftM2MV/UPuTRo4Qtrf0d8hckZph/jYakN7RR49UFL0hOGLaGqG3OqxE1jyFxvB58ywnH0jZl5HLE1Ss1oyxm4IYvGe909rcaDo39iIScs3TXNv5xRKIToJwfcmRLhzUmAvFd5EW+T/3bQnA4lhRtuaIfTTpBvdX+8u/D0v/hB2TuKehtbd8iRzEhqDysyXj8grEyXumbcEnyQpDox1TlccbSiwU6XXhwOvw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Word List" = _t])
in
Source
I'm sorry I dont think I really understand how to implement this solution. I'm not the most advanced with power BI and could use some guidance in the right direction.
In the above screenshot I have pasted in the first block of code and it seems to be working kinda. But this is the first time I've used the advanced editor so I'm not too sure on how it works.
How would I change the longform data in the "Data" column. My data is data that would change month to month.
Where would I paste the second block of code? If I just paste it right below I'm left with a "Token Eof expected." error. How would I also update this list?
You need to get data from your sources into PQ..Delete everything after Source statement.
Let's assume you got it from Excel. Then Source statement generated will be like this
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content]
Put a comma after source statement.
Copy the entire code provided by me starting DictList.
Paste the code.
So your code will become like this
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
DictList = List.Buffer(Dict[Word List]),
#"Added Custom" = Table.AddColumn(Source, "Result", each List.First(List.RemoveNulls(List.Transform(DictList,(x)=>if Text.Contains([Data],x, Comparer.OrdinalIgnoreCase) then x else null))))
in
#"Added Custom"
You will need to change column names here. If your column name is not Word List, say it is MyColumn. Then [Word List] will need to be replaced with [MyColumn].
@NickDSL This solution looks good.
If you want to match multiple dictionary words, then you can update the #"Added Custom" step to concatenate them:
Table.AddColumn(Source, "Result", each Text.Combine(List.Select(DictList, (x) => Text.Contains([Data], x, Comparer.OrdinalIgnoreCase))))
March 31 - April 2, 2025, in Las Vegas, Nevada. Use code FABINSIDER for a $400 discount!
Check out the February 2025 Power BI update to learn about new features.
User | Count |
---|---|
31 | |
30 | |
21 | |
14 | |
10 |
User | Count |
---|---|
21 | |
21 | |
16 | |
10 | |
9 |