Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
I'm quite new to power query. I used to run some M functions in power query to do keywords searching and assign these rows with specific keywords to a category. But it has some issues now. I need exact match rather fuzzy match. Can anyone tell me if there is a workaround?
Here is the example
My input text is work order description (the context of these description is mainly about building fabrics). This is a huge dataset normally more than 10k rows.
Before I run below script I'll get work order description related to lock issue easily. However, the fuzzy match makes my life difficult. It now returns the work order decription contains string "^lock" to me. For example, it thinks the work description with words "blocked toilet....." is a "lock" problem now just because "block" has "lock" ......
Table.AddColumn(Gate, "Door & Door Hardware", each if(List.ContainsAll(Text.SplitAny([#"fully cleaned text"], " "),{"door", "lock"})) then "lock" else if(Text.Contains([#"fully cleaned text"], "lock")) then "lock" else "" )
Thanks in advance.
Solved! Go to Solution.
Hi,
You may download my solution Excel file from here. Go to PowerPivot > Manage
Hope this helps.
A solution that works for me and keeps everything in power query is to put a space at the beginning of your search substring. For example " lock" will pick up all instances where the word "lock" is preceeded by a space " ".
Hi,
Could you share a dataset and show the expected result. Also, would you be OK with a DAX calculated formula as an alternative?
Thanks for replying my question. Here is the screeshot.
Now, if I use the above mentioned nested function searching "lock", it not only returns the text containing the word "lock" but also text containing "block". That's why I want some function can do exact match.
If DAX helps out, I'll surely embrace it 🙂
Hi,
How do we make the software understand that block should be ignored but padlock should be considered (when there is no space between pad and lock)
Yes yes yes, you are right.
I pre-define a collection of words to define it is a lock issue, padlock is one of them as well. I didn't show all my script (trying to catch attention on exact match).
But the issue is, if the text only mention "lock" and I use the word "lock", the result will bring other words containing "^lock"......This makes my life difficult 😞
Hi,
Share some sample data (which i can paste in an Excel file) and a table showing the words that you would like to be treated as lock.
Really appreciated your help. Here are some text.
I cleaned the data from R, so there is no singular and plura problem. My predefined list is shorten the if-else will go through in the order like below:
1) List.ContainsAll({"door","lock"}
2) List.ContainsAll({"key","lock"}
3) text.contain ("lock", "padlock", "locksmith")
The above 1) and 2) will use the nested text.splitany() by space 🙂
Many thanks.
block toilet haywards ac substation. |
wdv supply chain padlock road gate |
dvk lock toilet cubicle door portacom smoko sheed missing lock |
entry door design hand injury potential due proximaty lock door jamb |
sbk southbrook substation - block toilet |
isl - break door lock transfield office |
investigate blockage drain main driveway |
haywards toilet block ac |
ay lock mechanism switchyard b man gate accept key |
condenser internal door lock ac fall apart |
esni lock gate |
hot water unit toilet block need replacement leak flood toilet block |
wvy - rust screw exterior amenity block |
wairaki investigation electronic door lock control door. need replacement |
igh - unblock drain |
Hi,
Is this the result you are expecting (see the second column)?
Yes, I think so. Well, I'm ready to embrace the DAX now. Could you please show me how to do it?
Many thanks!!!!
Hi,
You may download my solution Excel file from here. Go to PowerPivot > Manage
Hope this helps.
Hi,
We created a column on a condition saying that, if the string contains "block" then block else "lock".
Power BI will search for the defined string in the condition and will reteun the expected results.
Hope I have answered your question.
Regards,
Pavan Vanguri.
No, you have not. Suppose there was a word bullock, you would have to create an exception for this as well. There will be countless such words.
Hi @Jun_Wang,
It works for me, please find the below screenshot for your reference.
Go to Edit Queries --> Select the column (Text) --> Add Column Tab --> Conditional Column --> and give the details as in the screenshot. It will work.
Please let me know if you still have any questions. Happy to help you.
Regards,
Pavan Vanguri.
Thanks Pavan, but this fix is not very ideal to me. My dataset normally has 10k rows, the "block" is not the only possible word containing string "^lock". It could be blockade, block, blockage, even sherlock........
I reckon exact match would be easier 🙂
padlock is not an exact match. Therefore why should that be a part of your result?
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.
User | Count |
---|---|
104 | |
101 | |
79 | |
72 | |
64 |
User | Count |
---|---|
142 | |
108 | |
101 | |
81 | |
74 |