Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Shape the future of the Fabric Community! Your insights matter. That’s why we created a quick survey to learn about your experience finding answers to technical questions. Take survey.

Reply
Greg_Deckler
Super User
Super User

joinAlgorithm and keyEqualityComparers

In looking at the documentation for Table.Join and Table.NestedJoin, there is a brief mention of optional parameters of joinAlgorithm and keyEqualityComparers. However, I cannot find any information on how to use these. Does anyone have any examples of these? 

 

I particularly want to know if these can be used to support unequal matching conditions, for example, only join if > or < versus =.

 

@ImkeF?



Follow on LinkedIn
@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
Power BI Cookbook Third Edition (Color)

DAX is easy, CALCULATE makes DAX hard...
1 ACCEPTED SOLUTION

Now I feel really honoured - thank you guys 🙂

 

Honestly: I haven't recognized these 2 parameters before. And while the MS-documentation holds some things about the joinAlgorithms in here: https://msdn.microsoft.com/en-us/library/mt296614.aspx, there's zero about keyEqualityComparers.

 

But with regards to the desired relative comparison here (< or >), I dare to say that they probably wouldn't help, as M distinguishes between equality and relational comparisons.

 

So in order to not let you down here, please have a look at the following query, which performs a relative lookup (and will hopefully appeal to you DAX-masters 🙂 😞

 

let
    Table1 = #table({"Key1"},{{10},{20},{30},{40}}),
    Table2 = #table({"Key2"},{{9},{19},{29},{39}}),
    RelativeMerge = Table.AddColumn(Table1, "RelativeJoin", 
            (Earlier) => Table.SelectRows(Table2, 
                         each [Key2]<Earlier[Key1])),
    #"Expanded RelativeJoin" = Table.ExpandTableColumn(RelativeMerge, "RelativeJoin", {"Key2"}, {"Key2"})
in
    #"Expanded RelativeJoin"

 

 

(This is nested row-context: Yes, we have evaluation context in M as well & fortunately it doesn't behave like a moving target 🙂 )

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

View solution in original post

25 REPLIES 25
freelensia
Advocate II
Advocate II

Hi @ImkeF 

I converted your code into the function below.

 

(LookInTbl as table, KeywordTbl as table, LookInCol as text, KeywordCol as text) =>
// https://community.powerbi.com/t5/Desktop/joinAlgorithm-and-keyEqualityComparers/m-p/182148#M79842

let
    RelativeMerge = Table.AddColumn(LookInTbl, "RelativeJoin", 
        (Earlier) => Table.SelectRows(KeywordTbl, 
                each Text.Contains(Table.Column(Earlier,LookInCol),Table.Column(KeywordTbl,KeywordCol), Comparer.OrdinalIgnoreCase))),
    ExpandRelativeJoin = Table.ExpandTableColumn(RelativeMerge, "RelativeJoin", {KeywordCol}, {KeywordCol})
in
    ExpandRelativeJoin

Then called it like this:

RelativeMerge = SearchColForKeywords(GetWordsTbl, KeywordTbl, "English", "Keywords"),  

But it generated this error.

Expression.Error: We cannot convert a value of type List to type Text.
Details:
    Value=[List]
    Type=[Type]

 Could you check to see if I wrote something incorrectly?
Thanks!

Hi @freelensia ,

a table reference followed by a column name will return a list and not a text (for the first argument of the Text.Contains-function). Hence the error message. To transform my query into a function you have to operate on a record-level like so instead:

 

let
    LookInTbl = Table.Buffer(#table({"Key1"},{{"Auto"},{"Bus"},{"Autobus"}, {"Car"}})),
    KeywordTbl = Table.Buffer(#table({"Key2"},{{"Auto"},{"Bus"}})),
    LookInCol = "Key1",
    KeywordCol = "Key2",
    fnRelMerge = (LookInTbl as table, KeywordTbl as table, LookInCol as text, KeywordCol as text) =>
    let
            RelativeMerge = Table.AddColumn(LookInTbl, "RelativeJoin", 
            (Earlier) => Table.SelectRows(KeywordTbl, 
                         each Text.Contains(Record.Field(Earlier, LookInCol), Record.Field(_, KeywordCol), Comparer.OrdinalIgnoreCase))),
            #"Expanded RelativeJoin" = Table.ExpandTableColumn(RelativeMerge, "RelativeJoin", {"Key2"}, {"Key2"})
    in
        #"Expanded RelativeJoin",
    CallFunction = fnRelMerge(LookInTbl, KeywordTbl, LookInCol, KeywordCol )
in
    CallFunction

 

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

This works great! I just wanna share my final function. In this function I added 2 steps: filtering out non-matching rows, and removing the additional column. Hope this helps other people.

 

{SearchColForKeywords}
(LookInTbl as table, KeywordTbl as table, LookInCol as text, KeywordCol as text) =>

let
    RelativeMerge = Table.AddColumn(LookInTbl, "RelativeJoin", 
        (Earlier) => Table.SelectRows(KeywordTbl, 
            each Text.Contains(Record.Field(Earlier, LookInCol), Record.Field(_, KeywordCol), Comparer.OrdinalIgnoreCase))),
    ExpandRelativeJoin = Table.ExpandTableColumn(RelativeMerge, "RelativeJoin", {KeywordCol}, {"Keywords found"}),
    FilterRows = Table.SelectRows(ExpandRelativeJoin, each [Keywords found] <> null and [Keywords found] <> ""),
    RemoveColumn = Table.RemoveColumns(FilterRows,{"Keywords found"})
in
    RemoveColumn

@freelensia can you please share a sample file with your function invoked to add a custom column to a table? It looks exactly like what I need but I'm struggling to pick the right table and column references (I'm easily confused!).

 

Edit: for reference, I ended up doing a partial match by using a cartesian product as described in this post:

https://www.myspreadsheetlab.com/power-query-solution-partial-match-count-with-a-condition/

 

Note that this will create a bunch of duplicate rows, so make sure to have an index column for your initial rows (before the cartesian product), then sort (decreasing) by the match column (the one with Text.PositionOf) then use Table.Buffer before you can finally remove duplicates:

https://www.youtube.com/watch?v=rqDdnNxSgHQ

------------------------------------------------
1. How to get your question answered quickly - good questions get good answers!
2. Learning how to fish > being spoon-fed without active thinking.
3. Please accept as a solution posts that resolve your questions.
------------------------------------------------
BI Blog: Datamarts | RLS/OLS | Dev Tools | Languages | Aggregations | XMLA/APIs | Field Parameters | Custom Visuals

@otraverspl see my improved code below.

Hi guys, just sharing my improved function one last time.

 

There was a small problem in my previous solution. If multiple keywords matched, the function will duplicate the row. For example if we have keywords: "cats", "dogs", "birds" and the cell has this text "The dogs chased the cats while the birds watched", you will have 3 identical rows each with one keyword that matched.

So we needed to group and concatenate matched keywords.

 

//Keep rows of table LookInTbl if its value in LookInCol contains any keyword in KeywordCol of KeywordTbl
//Used like this: 
//    NextStep = KeepRowsIfFindAnyKW(LastStep, "Word ID", KeywordLang, KeywordsTbl, "Keywords"),

(LookInTbl as table, IDCol as text, LookInCol as text, KeywordTbl as table, KeywordCol as text) =>

let
    //Search a column for any text from another table https://community.powerbi.com/t5/Desktop/joinAlgorithm-and-keyEqualityComparers/m-p/1013429#M479534
    RelativeMerge = Table.AddColumn(LookInTbl, "RelativeJoin", 
        (Earlier) => Table.SelectRows(KeywordTbl, 
            each Text.Contains(Record.Field(Earlier, LookInCol), Record.Field(_, KeywordCol), Comparer.OrdinalIgnoreCase))), //case insensitive
    ExpandRelativeJoin = Table.ExpandTableColumn(RelativeMerge, "RelativeJoin", {KeywordCol}, {"Keywords Found"}),
    FilterRows = Table.SelectRows(ExpandRelativeJoin, each [Keywords Found] <> null and [Keywords Found] <> ""),

    // Group by first keyword and keep all columns https://community.powerbi.com/t5/Power-Query/Group-by-date-and-keep-all-columns/td-p/794601
    GroupRows = Table.Group(FilterRows, {IDCol}, {{"First Keyword Found", each List.Max([Keywords Found]), type text}, {"All Data", each _, type table}}),
    ExpandAll = Table.ExpandTableColumn(GroupRows, "All Data", List.RemoveItems(Table.ColumnNames(GroupRows[All Data]{0}), {IDCol})), //dynamically expanding all columns except Word ID and Keywords found https://eriksvensen.wordpress.com/2019/10/05/powerquery-control-the-expand-columns-so-it-includes-new-columns/
    SelectRows = Table.SelectRows(ExpandAll,each [First Keyword Found]=[Keywords Found]),
    RemoveCols = Table.RemoveColumns(SelectRows,{"First Keyword Found", "Keywords Found"})
in
    RemoveCols

 

Anonymous
Not applicable

It looks like the documentation for keyEqualityComparers is the same as the joinOptions for Table.FuzzyJoin

 

https://docs.microsoft.com/en-us/powerquery-m/table-fuzzyjoin

 

I've played with them a little and gotten them to work, i.e.:

 

  #"Merged Queries" = Table.NestedJoin(Source, {"Text1", "Text2"}, #"VCP-Microsoft-Unifications", {"Name", "Text2"}, "VCP Microsoft Unifications", JoinKind.LeftOuter, [IgnoreCase = true, IgnoreSpace = false]),

 

Hope that helps.

 

Zig.

 

tringuyenminh92
Memorable Member
Memorable Member

haha, i think you are asking correct guy @Greg_Deckler, @ImkeF will like this Heart

@tringuyenminh92a slight correction - the correct gal Smiley Happy

 

oh, After knowing your correction @Sean, I really want to show my admirable to her.

Now I feel really honoured - thank you guys 🙂

 

Honestly: I haven't recognized these 2 parameters before. And while the MS-documentation holds some things about the joinAlgorithms in here: https://msdn.microsoft.com/en-us/library/mt296614.aspx, there's zero about keyEqualityComparers.

 

But with regards to the desired relative comparison here (< or >), I dare to say that they probably wouldn't help, as M distinguishes between equality and relational comparisons.

 

So in order to not let you down here, please have a look at the following query, which performs a relative lookup (and will hopefully appeal to you DAX-masters 🙂 😞

 

let
    Table1 = #table({"Key1"},{{10},{20},{30},{40}}),
    Table2 = #table({"Key2"},{{9},{19},{29},{39}}),
    RelativeMerge = Table.AddColumn(Table1, "RelativeJoin", 
            (Earlier) => Table.SelectRows(Table2, 
                         each [Key2]<Earlier[Key1])),
    #"Expanded RelativeJoin" = Table.ExpandTableColumn(RelativeMerge, "RelativeJoin", {"Key2"}, {"Key2"})
in
    #"Expanded RelativeJoin"

 

 

(This is nested row-context: Yes, we have evaluation context in M as well & fortunately it doesn't behave like a moving target 🙂 )

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

That was a super helpful example code @ImkeF, thanks so much! I was able to apply it to my own situation super easily!

Fantastic example!  Thank you for providing it.  Do you think the relative merge function you provided below can be modified with a Text.Contains function, so you could essentially do a partial text match merge on two tables?

Sure!

This query will match all words from Table2 who are somehow included in the strings of Table1 in a case insensitive mode:

 

let
    Table1 = Table.Buffer(#table({"Key1"},{{"Auto"},{"Bus"},{"Autobus"}, {"Car"}})),
    Table2 = Table.Buffer(#table({"Key2"},{{"Auto"},{"Bus"}})),
    RelativeMerge = Table.AddColumn(Table1, "RelativeJoin", 
            (Earlier) => Table.SelectRows(Table2, 
                         each Text.Contains(Earlier[Key1],[Key2], Comparer.OrdinalIgnoreCase))),
    #"Expanded RelativeJoin" = Table.ExpandTableColumn(RelativeMerge, "RelativeJoin", {"Key2"}, {"Key2"})
in
    #"Expanded RelativeJoin"

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

Hello ImkeF

 

I'm a newbie and trying to get my head around this which I am unable to do so completely.

 

How and where do I enter the function?

 

If I have two tables and the desired output is Category in Table 1:

 

Table 1                                                                        Table 2

Merchandise                    Category                            Sub-category           Category

============================              =========================

Cornmeal                                                                    Corn                          Corn

Bryers Cornstarch                                                       Wheat                        Wheat

Wenn cornstarch                                                        Car                           Automobile &  Spares

Wheat BrandA                                                            Ford                         Automobile &  Spares

Wheat Brand B                                                           Spares                       Automobile &  Spares

Wheat Germ                                                              Paper                       Paper & Carboard

Car                                                                             Cardboard               Paper & Carboard

Ford                                                                           

Paper Brand 1

Paper Brand 2                                                             

Cardboard box                                                           

Automobile spares 

Spares                                                                        

 

I need to identify if the 'Category' column's string in Table 1 contains any of the values in 'Sub-Category' in Table 2 and put the 'Category' from Table 2 in 'Category' in table 1.  The match should be case insensitive.

 

Thanks in advance for your help!

This SAVED ME Major!

 

It was small, but gave me much to read and apply!

 

I even added in the Table.FirstN to limit my matches to 1. 

 

let Table1 = Table.Buffer(TransactionsOnly), /* Bank Transactions already cleaed in another query*/ Table2 = Table.Buffer(MatchMatrix), /*Excel File With FROM and TO Columns to be used in my compare, from another query*/ FakeMerge = Table.AddColumn( Table1,"RelMerge", (OrigTable) => Table.FirstN( Table.SelectRows( Table2, each Text.Contains(OrigTable[TheDESCR],[From] ,Comparer.OrdinalIgnoreCase) ),1) /*Limits to first row match*/ ), #"Expanded RelMerge" = Table.ExpandTableColumn(FakeMerge, "RelMerge", {"To"}, {"To"}), #"Replaced Value" = Table.ReplaceValue(#"Expanded RelMerge",null,"Other",Replacer.ReplaceValue,{"To"}), #"Changed Type" = Table.TransformColumnTypes(#"Replaced Value",{{"Amount", type number}}), #"Filtered Rows" = Table.SelectRows(#"Changed Type", each ([Source] = " Capital 1 - 1324")) in #"Filtered Rows"

 

ignore the stuff below I am still troubleshooting somethings, but wanted to share this! very powerful stuff! 

 

With my approach as new transactions appear, I can easily update the excel file and it will now have a new value to test against and convert 

Anonymous
Not applicable

Hi @ImkeF ,

 

Sorry for replying to an old post, but needed your valuable help to my query as this problem looked similar to mine:

 

I have a single column table named WordList containing a unique list of values, whose each word value i want to do a partial match with phrases in multiple Attribute columns in another table named Data Extract and get a count of the item codes that show these word values. I want to count only the First occurence of a word value for each item

 

e.g. A word value may occur in a single phrase multiple times OR across multiple Attributes for that Item, but it should be counted as 1.

 

WordList Table:

 

Words

auto

bus

ingenius

natural

color

 

 

DataExtract Table:

 

ItemID    Attr1                       Attr2                        Attr3                      Attr4

10001      natural and soft     put in auto mode      so natural!               get auto; get auto mode

10002      bus at stop             natural icecream       boy is ingenius        more power

10003      let me out!             promote goodwill     simple simon           auto model

10004      matching color      natural color              natural essence       figure it!

10005      god will                  meet at bus stop       similar cases            rosetta stone

10006      mud ringworm       helper columns         rainbow color          geneva convention

10007      masked man          tenor voice                silica gel                   natural smell

10008      microcosm             guns of navarone      bloated cell              enigmatic man

10009      cellular sky             i am natural               color coated            ingenius of him!

10010      ingenius body        maker chambers       old story                  minor changes

 

 

Output Table:

Words         ItemCount          %ItemCount

auto                   2                              =2/10

bus                     2                              =2/10

ingenius             3                              =3/10

natural               5                              =5/10

color                  3                              =3/10

 


How do i find the Item Counts?

 

not sure what you mean with "partial match":

1) Would "Autobus" return a match (or even 2)?

 

and if so: only for "bus" or also for "auto" - leading to the second question:

 

2) Shall this be case sensitive or not?

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

Anonymous
Not applicable

Hi @ImkeF ,

 

Thanks for your reply. 🙂

 

Sorry, i added a new post later with the complete issue on the Desktop board. Got a solution using DAX in PowerBI, but needed a Power Query solution in Excel as i am on Excel 2016 and as the data is huge.

 

To answer your question - sorry again for the wrong word used ie. partial. What i simply meant is:

 

Only search for each word given in the Wordlist Table in the phrases found in the Attribute columns in DataExtract table. and get a count of the items that have atleast one occurence of each word. Also, it will be case-insensitive.

 

You may reply to that new post. Again, apologies for the replying to an old post, instead of starting a new post.

Thank you!

Helpful resources

Announcements
November Carousel

Fabric Community Update - November 2024

Find out what's new and trending in the Fabric Community.

Dec Fabric Community Survey

We want your feedback!

Your insights matter. That’s why we created a quick survey to learn about your experience finding answers to technical questions.