Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Enhance your career with this limited time 50% discount on Fabric and Power BI exams. Ends September 15. Request your voucher.

Reply
MathMar
New Member

Power Query case : a clever Vlookup like feature

Hello. I'm new in using Power Query and I'm trying to make a query but I'm strugling to make it work. As logic differs from usual language (ie: Python), I have difficulty ti make it work properly.
If you could help, it will be great.

Below is an explanation of what I try to do.

I have a table, containing several columns. One of them is "Job Title". What I want is to classify the various "Job Title" (as it is a free text field, almost all values are different) according to a Dictionary of keywords. Each keyword is in a 2nd table with keyword and associated classification.

Table 1: Job Title

Job Title
gestion de projet
Comptable
Acheteur
y.Assistant(e)
Dir/Resp Informatique
Commercial
y.Autre Fct Achat/Serv. Gen.
Dir/Resp Admin. Et Financier
Gérant/Co Gérant
Dir/Resp Achat
Directeur Général
Directeur/Sous-Directeur
Directeur general
Dir/Resp Système Information
Gerant/Co Gerant
à repréciser
y.Autre Fonction Adm et Financ
President
Dirigeant/Gérant
Dir/Responsable Télécom
IT manager


Table 2: Dictionary

KeywordClassification
projetgestion de projet
projectgestion de projet
prjgestion de projet
consultantgestion de projet
assistantadministratif
comptableadministratif
acheteuradministratif
financeadministratif
responsableresponsable / manager
dirdirecteur / direction
respresponsable / manager
directeurdirecteur / direction
gérantdirecteur / direction
managerresponsable / manager
dsidirecteur / direction
ceodirecteur / direction
cfodirecteur / direction
chefdirecteur / direction
dgdirecteur / direction
rssidirecteur / direction
headresponsable / manager
snrresponsable / manager
seniorresponsable / manager

 

Objective is to perform a classification based on Keyword if "Job title" contains the keyword. A kind of Table.NestedJoin but instead of getting and exact match, we are looking for a contains. And once we got at least a match, we take the corresponding classification. We stop at first match as dictionary is set in a proper order. Please note that we are not case sensitive in our match.
So results we are expecting in our example should be :

Job TitleClassification
Chef de projetgestion de projet
Comptableadministratif
Acheteuradministratif
y.Assistant(e)administratif
Dir/Resp Informatiquedirecteur / direction
Commercialadministratif
y.Autre Fct Achat/Serv. Gen. 
Dir/Resp Admin. Et Financierdirecteur / direction
Gérant/Co Gérantdirecteur / direction
Dir/Resp Achatdirecteur / direction
Directeur Généraldirecteur / direction
Directeur/Sous-Directeurdirecteur / direction
Directeur generaldirecteur / direction
Dir/Resp Système Informationdirecteur / direction
Gerant/Co Gerantdirecteur / direction
à repréciser 
y.Autre Fonction Adm et Financ 
Presidentdirecteur / direction
Dirigeant/Gérantdirecteur / direction
Dir/Responsable Télécomdirecteur / direction
IT managerresponsable / manager


Job Title ClassificationTo do so, we first put our keywords in a list called "KEYWORD"
Then, we try to compare "job title" string to the list in order to see if we have some match.
So we did a "do while" like to compare each "job title" string with any value of "KEYWORD" list. We tried using List.Generated to generate a new column to do so:

Table.AddColumn(#"previous step", "CHECK RES", each List.Generate(
() => [COUNTER = 0, MAX=List.Count(KEYWORD), res="n.a", char=[Job Title] ] ,
each [COUNTER] < [MAX]+1 ,
each if Text.Contains([char],KEYWORD{[COUNTER]})
then [ res= KEYWORD{[COUNTER]}, COUNTER=[COUNTER]+1, MAX=[MAX], char=[char] ]
else [ res= "n.a", COUNTER=[COUNTER]+1, MAX=[MAX], char=[char] ],
each [res] ))

As a result, we got in each row a list containing either a "n.a" for each comparison with our list that failed, and the corresponding keyword for the matching test.

So we got something like:

List = {"n.a", ""n.a", "n.a", "n.a", "n.a", "n.a", "compatble","n.a", …}

 

Job TitleCHECK RES
Chef de projet[List]
Comptable[List]
Acheteur[List]
y.Assistant(e)[List]
Dir/Resp Informatique[List]
Commercial[List]
y.Autre Fct Achat/Serv. Gen.[List]
Dir/Resp Admin. Et Financier[List]
Gérant/Co Gérant[List]
Dir/Resp Achat[List]
Directeur Général[List]



If we look at [List] in row2, we have something like : List = {"n.a", ""n.a", "n.a", "n.a", "n.a", "n.a", "compatble","n.a", "n.a", …}

So it seems to work.
But then we are stucked at next step. We tried then to extract matching value with KEYWORD by using feature such as List.Intersect and then to extract first value but we don't succeed to make the List.Intersect work.

= Table.AddColumn(#"Check Responsability", "Intersect", each List.Intersect({[CHECK RES],KEYWORD}))


So could you help here?
• What is wrong with our List.Intersect? It looks like it doesn't recognized our "cell row" [CHECK RES] as a list. But how to make it then?
• Is the way we are doing is the right one or is there easier way to make it?

Thanks in advance for your advices and sorry for this long post.

 

1 ACCEPTED SOLUTION
Jimmy801
Community Champion
Community Champion

Hello @MathMar 

 

I think List.Generate here is an overkill. I tried to use a approach with Table.SelectRows. Here an example

let
    JobTitle =
    let
        Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("bVFLTsMwEL3KKCuQIDlDVGjVHSLdRV0Y5xGM4nEYO0i9DUtyjlyMOCEprdh5Pu83Lsukhg/GMVWgVtw7QnK8K5ONs21QLw2mKtdvCOhkKk5p7r3xQXG4we3UejCSPcO3tOdXJ1YF89Fh4bEQbVSzYLsgoK0ONJKqkBWQz5R24PSSKa+s4ZQeA20NK9YGs/pu6GVUzjaOfp9XuMi6tKCj67jIcbe57GeF6/z9Wl6BajD+QGb24uTD8G1xDup4toXVFVZTwxcJWhl6bTzk8gCO9XT2MSdhCTmtPAm8qXAOZmpE7v/iOvbxk+gw9M0o4+w03R/IKlZ11Dz+AA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Job Title" = _t]),
        #"Changed Type" = Table.TransformColumnTypes(Source,{{"Job Title", type text}})
    in
        #"Changed Type",
    Dictionary = 
    let
        Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("hZDBDcMgDEVXQTlH6kBRDi4Y4igxESZDdY4uVlpo1UPiXBDwvr/tPwzdluKMueu7gJIpsnFo2t/YN2w1Pp8yG1n2JQOfl4MISVOAW4nLK0Em3wzWLcN9wUMKdsKMezqEnhjYHhcmlK2M1oz/XuZmVmAImD46R2/vcpYASp9C672s8fO5Mqilqk14PlJN4FzzdVW7CakeFqPO/QWf0KsCF/S45GLACcGpGwrrCQgyRUUyvgA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Keyword = _t, Classification = _t]),
        #"Changed Type" = Table.TransformColumnTypes(Source,{{"Keyword", type text}, {"Classification", type text}})
    in
        #"Changed Type",
    BufferDictionary = Table.Buffer(Dictionary),
    AddDictionaryToJobTitle = Table.AddColumn
    (
        JobTitle,
        "Dictionary",
        (add)=> Table.SelectRows
        (
            BufferDictionary,
            each Text.Contains(Text.Lower(add[Job Title]),Text.Lower([Keyword]))
        )
    ),
    #"Added Custom" = Table.AddColumn(AddDictionaryToJobTitle, "Classification", each try [Dictionary][Classification]{0} otherwise null)
in
    #"Added Custom"

Be aware that the last AddColumn is not needed, as you could do this job already in the step before. This is only to show you what table you are getting with the Table.SelectRows. There you can see that in some cases more then one row is found (the selectrows is checking if the keyword is somehow included in the Job Title)

 

Copy paste this code to the advanced editor in a new blank query to see how the solution works. 

If this post helps or solves your problem, please mark it as solution (to help other users find useful content and to acknowledge the work of users that helped you)
Kudoes are nice too

Have fun

Jimmy

 

View solution in original post

7 REPLIES 7
MathMar
New Member

Thanks again.

I correctly writted it. But it is not recognized.

 

Expression.Error : Le nom « Table.FuzzyNestedJoin » n'a pas été reconnu. Veuillez vérifier qu'il est correctement orthographié.

 

I also saw in some screenshot some Fuzzy option when trying to Nest table via the menu. I don’t got these options on my version. So it looks like depending on version of Power Query, some instructions are missing.

 

Regards,

Thanks for your suggestion related to write my last instruction in a different way: L_Keyword_Responsability{List.PositionOfAny([CHECK RES],List.Buffer(L_Keyword_Responsability))-1}

I succeed to make what I want by then using a Table.NestedJoin until I got latest version of Office to be able to use directly Table.FuzzyNestedJoin.
Until we succeed to get from our IT proper version, we would like to use what we built. However, on very large file (>200K records), now request is taking ages. We tried to use Table.Buffer and List.Buffer to speed things but still very very low.
Do you have suggestion to improve our code here to speed up things?

Here is our full code:

let
Source = Csv.Document(File.Contents("F:\all contact.csv"),[Delimiter=",", Columns=41, Encoding=1252, QuoteStyle=QuoteStyle.Csv]),
#"En-têtes promus" = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
#"move Job Title to Lower case" = Table.TransformColumns(#"En-têtes promus",{{"Job Title", Text.Lower, type text}}),
#"Check match with Responsability Matrix" = Table.AddColumn(#"move Job Title to Lower case", "CHECK RES", each List.Generate(
() => [COUNTER = 0, MAX=List.Count(L_Keyword_Responsability), res="n.a", char=[Job Title] ] ,
each ([COUNTER] < [MAX]+1) ,
each if Text.Contains([char],L_Keyword_Responsability{[COUNTER]})
then [ inter=List.Buffer(L_Keyword_Responsability) ,res= inter{[COUNTER]}, COUNTER=[COUNTER]+1, MAX=[MAX], char=[char] ]
else [ res= "n.a", COUNTER=[COUNTER]+1, MAX=[MAX], char=[char] ],
each [res] )),
#"Get corresponding value in the table" = Table.AddColumn(#"Check match with Responsability Matrix", "Intersect", each L_Keyword_Responsability{List.PositionOfAny([CHECK RES],List.Buffer(L_Keyword_Responsability))-1}),
#"Erreurs remplacées" = Table.ReplaceErrorValues(#"Get corresponding value in the table", {{"Intersect", "vide"}}),
#"croisement matrice Responsabilité" = Table.NestedJoin(#"Erreurs remplacées",{"Intersect"},Table.Buffer(Matrix_Responsability),{"Mot clé autour de la responsabilité"},"Matrix_Responsability",JoinKind.LeftOuter),
#"Récupération responsabilité estimée" = Table.ExpandTableColumn(#"croisement matrice Responsabilité", "Matrix_Responsability", {"résponsabilité résumée"}, {"résponsabilité résumée"})
in
#"Récupération responsabilité estimée"

Jimmy801
Community Champion
Community Champion

Hello @MathMar 

 

I think List.Generate here is an overkill. I tried to use a approach with Table.SelectRows. Here an example

let
    JobTitle =
    let
        Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("bVFLTsMwEL3KKCuQIDlDVGjVHSLdRV0Y5xGM4nEYO0i9DUtyjlyMOCEprdh5Pu83Lsukhg/GMVWgVtw7QnK8K5ONs21QLw2mKtdvCOhkKk5p7r3xQXG4we3UejCSPcO3tOdXJ1YF89Fh4bEQbVSzYLsgoK0ONJKqkBWQz5R24PSSKa+s4ZQeA20NK9YGs/pu6GVUzjaOfp9XuMi6tKCj67jIcbe57GeF6/z9Wl6BajD+QGb24uTD8G1xDup4toXVFVZTwxcJWhl6bTzk8gCO9XT2MSdhCTmtPAm8qXAOZmpE7v/iOvbxk+gw9M0o4+w03R/IKlZ11Dz+AA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Job Title" = _t]),
        #"Changed Type" = Table.TransformColumnTypes(Source,{{"Job Title", type text}})
    in
        #"Changed Type",
    Dictionary = 
    let
        Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("hZDBDcMgDEVXQTlH6kBRDi4Y4igxESZDdY4uVlpo1UPiXBDwvr/tPwzdluKMueu7gJIpsnFo2t/YN2w1Pp8yG1n2JQOfl4MISVOAW4nLK0Em3wzWLcN9wUMKdsKMezqEnhjYHhcmlK2M1oz/XuZmVmAImD46R2/vcpYASp9C672s8fO5Mqilqk14PlJN4FzzdVW7CakeFqPO/QWf0KsCF/S45GLACcGpGwrrCQgyRUUyvgA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Keyword = _t, Classification = _t]),
        #"Changed Type" = Table.TransformColumnTypes(Source,{{"Keyword", type text}, {"Classification", type text}})
    in
        #"Changed Type",
    BufferDictionary = Table.Buffer(Dictionary),
    AddDictionaryToJobTitle = Table.AddColumn
    (
        JobTitle,
        "Dictionary",
        (add)=> Table.SelectRows
        (
            BufferDictionary,
            each Text.Contains(Text.Lower(add[Job Title]),Text.Lower([Keyword]))
        )
    ),
    #"Added Custom" = Table.AddColumn(AddDictionaryToJobTitle, "Classification", each try [Dictionary][Classification]{0} otherwise null)
in
    #"Added Custom"

Be aware that the last AddColumn is not needed, as you could do this job already in the step before. This is only to show you what table you are getting with the Table.SelectRows. There you can see that in some cases more then one row is found (the selectrows is checking if the keyword is somehow included in the Job Title)

 

Copy paste this code to the advanced editor in a new blank query to see how the solution works. 

If this post helps or solves your problem, please mark it as solution (to help other users find useful content and to acknowledge the work of users that helped you)
Kudoes are nice too

Have fun

Jimmy

 

Jimmy801
Community Champion
Community Champion

Hello @MathMar 

 

use Tabel.FuzzyNestedJoind to join both tables. Use the treshholp option to find more similarities.

In your solution a code in a new column has to look like this

= Table.AddColumn(#"Check Responsability", "Intersect", each [KEYWORD]{List.PositionOf([CHECK RES],"compatble")})


If this post helps or solves your problem, please mark it as solution (to help other users find useful content and to acknowledge the work of users that helped you)
Kudoes are nice too

Have fun

Jimmy

Hello, thank you for your prompt reply. I wasn’t aware about Fuzzy option.

However, I have the impression that my Power Query version hasn’t this instruction as I got a message error stating that it is not correctly written. Is Fuzzy feature linked to a specific version of Power Query?

Current version I used for Power Query is : Version : 2.59.5135.201 32 bits

 

Regards,

Jimmy801
Community Champion
Community Champion

Hello @MathMar 

 

the correct spelling is this 

Table.FuzzyNestedJoin

I don't know when exactly was added. So try to update to the latest version and it should be available

 

If this post helps or solves your problem, please mark it as solution (to help other users find useful content and to acknowledge the work of users that helped you)
Kudoes are nice too

Have fun

Jimmy

Thanks again.

I correctly writted it. But it is not recognized.

 

Expression.Error : Le nom « Table.FuzzyNestedJoin » n'a pas été reconnu. Veuillez vérifier qu'il est correctement orthographié.

 

I also saw in some screenshot some Fuzzy option when trying to Nest table via the menu. I don’t got these options on my version. So it looks like depending on version of Power Query, some instructions are missing.

 

Regards,

Helpful resources

Announcements
August Power BI Update Carousel

Power BI Monthly Update - August 2025

Check out the August 2025 Power BI update to learn about new features.

August 2025 community update carousel

Fabric Community Update - August 2025

Find out what's new and trending in the Fabric community.

Top Solution Authors