Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Migasuke

Dynamic text search in Power Query

Use case description

Imagine we own several stores in one big shopping mall. We noticed that people write their reviews about all stores on the website of the shopping mall. What does it mean? It means we have hundreds or even thousands of reviews and comments in one single page but majority of those comments are irrelevant to us since we own only few shops in the mall and not every single one of them.

Main idea of our solution has following steps:

     1. We provide list of our stores.

     2. We search if those stores are mentioned in any of the comment.
     3.
We keep only those comments which are related to our stores.

Of course, everything should be automatic and possible to run in PBI Service.

 

Initial Set up

To keep the example simple I will work with only two queries:

1. First query is called “Comments” – this is the query which contains all the comments from the Shopping mall. Example is following:

Migasuke_0-1682769889508.png

 

2. The next query is related to our companies called Companies. It’s basically a list of the text strings we want to find:

Migasuke_1-1682769963559.png

So basically we search for three companies called Super Buyyy, Order Anything 27/7 and Massive Discounts ABC.


Solution

1. Let’s start with a very first step. If you have a table similar to my Companies table, be sure you change the table into the list. In case you don’t know how to do it, we can simply do right click on the column header and select Drill Down:

Migasuke_2-1682770099750.png

You should notice that the icon of the query has also changed:

 

Migasuke_3-1682770123612.png

Changing the table to a list is a crucial step because then we can take advantage of List functions in Power Query.


2. In the next step we will go to the Comments table and add a new custom column. Syntax for the query is very simple and we basically refer to our created Companies list:

Migasuke_4-1682770192080.png
Result should look like this:

Migasuke_5-1682770215021.png

 


3. The same way as we did before we will create a custom column, which contains find logic. This time the code is a bit more tricky:

Migasuke_6-1682770256322.png

Now let’s look more closely into the formula.
List.Contains is a function which validates if some of our values from Companies List can be found within Comments column and returns TRUE or FALSE. Basic formula would work only if we had 100% match of the value from List with our Comments. Example can be seen below:

Migasuke_7-1682770310875.png

Since we always compare Comments, which are different from our initial list values, we need to adjust the code by the equation criteria which would be:

(x as text, y as text)=>Text.Contains(Text.Lower(y),Text.Lower(x))

What does the formula above mean? Basically:

  • We define variables x and y as a text.
  • x is represented as our Companies List.
  • y is represented as our Comments.
  • Text.Lower ensures that we work only with lower case text (Super Buyyy vs super buyyy is not the same in terms of comparison)
  • Text.Contains compares our Comments with the List values.

 

Result of our new column should look like this:

Migasuke_8-1682770389878.png
In the final step we can just remove column with Lists and apply filter on values equal to TRUE so we exclude companies we are not interested in:

Migasuke_9-1682770414913.png

 




 




 



 






 

 

 

 

 

 

Comments

Hi,

The M code can be simplified to

List.Contains(Stores[Stores],[Comments],(x as text, y as text)=>Text.Contains(y,x,Comparer.OrdinalIgnoreCase))

What if we want the result as the name of the Store rather than TRUE/FALSE?

Could you please share the revised M code for that.

Hi,
thanks for the tip of

Comparer.OrdinalIgnoreCase

In case someone wants to have comment + name of the store, it can be done in bit reverse order:
1. Have query with Companies (act as main query)
2. Have query (list) as Comments
3. In Companies query we add new column with:

List.FindText(Comments_list,[Column1])

4. We expand the column in to new rows and we have two columns: Name of desired companies with related Comments.

Hi,

Thank you for replying.  This works but is not what i wanted.  I want the column to be created in the Comments Table itself.  I would like to see all the Comments even if the Companies mentioned in the Description there are not in the Comments Table.  So in the spare column of the Comments table, i want to see the Companies when available and null where not available.

Hi @Ashish_Mathur ,
Probably there are other ways but the easiest one I see at this moment is basically follow what I described in "reverse approach" in the previous comment and then just add step where you do Merge (right-join) on comments column with the Comments (list). Of course you need to change list to table, but this can be already part of the Merge step.

In the code bellow I am mergin the list:

= Table.NestedJoin(#"Expanded Custom", {"Custom"},Table.FromList(Comments_list, Splitter.SplitByNothing(), null, null, ExtraValues.Error), {"Column1"}, "Comments", JoinKind.RightOuter)

This approach works well. In case you would need more details you can send me direct message.

 

Hi,

Thank you for replying.  This M code works

let
    Source = Excel.CurrentWorkbook(){[Name="Comments"]}[Content],
    #"Added Custom" = Table.AddColumn(Source, "Custom", each List.Accumulate(#"Stores (3)"[Stores],"",(State,Current)=>if Text.Contains([Comments],Current,Comparer.OrdinalIgnoreCase) then Current & ", "&State else State)),
    #"Added Custom1" = Table.AddColumn(#"Added Custom", "Custom.1", each Text.Combine(List.Select(Text.Split([Custom],","), each _<>" "),", ")),
    #"Removed Columns" = Table.RemoveColumns(#"Added Custom1",{"Custom"})
in
    #"Removed Columns"

Hope this helps.