Solved: Re: Complex PowerQuery Merge query using substring...

viera00 · ‎06-24-2016

Hello everyone,

I have a questions about a complex merge I need to do in Power Query.

Imagine the following tables:

Table1:

Column1 | Column2

1 4441

2 4442

3 4443

4 443

Then I have Table2 , as the following table:

Table2:

Column1 | Column2

1 4441098

2 4441097

3 4441011

4 4441122

5 4443111

6 4443123

And the merge should have the rows of Table1 and the rows of Table2 where the "prefix" of Table2.Column2 is Table1.Column2. Take into account that I do not know the number of chars of the prefix, I mean, Table1.Column2 can have any lenght not greater than 7

Regards GV

German Viera
http://slidemodel.com/

ImkeF · ‎06-25-2016

A different way to perform a lookup is to use "Table.SelectRows", although this might be slower - so it is essential that you use the Table.Buffer and for very large table try to partition also. But the basic principle can look as follows:

let
    Table1 = Table.Buffer(#table({"Column1", "Column2"}, {{1, "4441"}, {2, "4442"}, {3, "4443"}, {4, "443"}})),
    Table2 = #table({"Column1", "Column2"}, {{1, "4441098"}, {2, "4441097"}, {3, "4441011"}, {4, "4441122"}, {5, "4443111"}, {6, "4443123"}}),
    Lookup = Table.AddColumn(Table2, "Lookup", (outer) => Table.SelectRows(Table1, each Text.StartsWith(outer[Column2], [Column2]))),
    #"Expanded Lookup" = Table.ExpandTableColumn(Lookup, "Lookup", {"Column2"}, {"Lookup.Column2"})
in
    #"Expanded Lookup"

Advantage here is that you don't need an exact match, but can use all sorts of conditions, in this case "Text.StartsWith".

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

View solution in original post

Simonolsson · ‎04-03-2023

Hello, hope this is ok to make this question in this topic, i was the closest case i could find on this page.

I can't manage to get this code to work, my case i just a little bit different than the origanal post.

I have this:

Sheet 1 = Price master

(column B) = P/N

Example: 16510-96J10

Sheet 2 = Exceptions

(Column A) = Part start

Example: 16510-
(Column E) = AG end

Example: 3

I want to do like a Vlookup from Price master, the result should be "search for first characters" from Exceptions.
In excel i can do it like this, but want is integrated in my Query "=VLOOKUP(LEFT([@[P/N]];5);Exceptions!A:D;4;0)"

The result should be something like this:

Search: 16510-96J10 (number found in "Price master"

The match should be for something like this, match from 2 characters to whole number. whichever comes first.

16

165

1651

16510

Numbers found in "Exceptions"

Outcome: New column in "Price master" with result from AG end ( "3" in this case)

Beckham · ‎06-27-2016

Lets say your unique ID has 6 different possible lengths (2, 3, 4, 5, 6, or 7 characters long)

A way you could possibly do this would be to duplicate your query 6 times, split off a different number of characters in each query, and merge only matching rows to the original table. Then at the end you'd need to append all the tables.

It might be messy and a bit slow, but it should work.

MattAllington · ‎06-24-2016

well there needs to be a rule about the length of the merged characters. It is not important how long the second table are, but it is important how long the characters in the first table are. Your example suggests that there can be 3 or 4 character lenght in the first table. Assuming that is a typo, I would do this.

Load table 2

duplicate column 2

split column 2 at 4 characters

delete the remainder column

join table 1 on table 2 with the new column.

If your 3 character 443 in table is not a typo, you are in a world of trouble - unless you can identify a rule on how to split the column in table 2.

* Matt is an 8 times Microsoft MVP (Power BI) and author of the Power BI Book Supercharge Power BI.
I will not give you bad advice, even if you unknowingly ask for it.

viera00 · ‎06-27-2016

Hi Matt,

Thank you for your reply, but I'm in a world of trouble, as my 3 chars is not a typo. Even worst, I've records of 5 and 2 chars.

Regards,

GV

German Viera
http://slidemodel.com/

ImkeF · ‎06-25-2016

A different way to perform a lookup is to use "Table.SelectRows", although this might be slower - so it is essential that you use the Table.Buffer and for very large table try to partition also. But the basic principle can look as follows:

let
    Table1 = Table.Buffer(#table({"Column1", "Column2"}, {{1, "4441"}, {2, "4442"}, {3, "4443"}, {4, "443"}})),
    Table2 = #table({"Column1", "Column2"}, {{1, "4441098"}, {2, "4441097"}, {3, "4441011"}, {4, "4441122"}, {5, "4443111"}, {6, "4443123"}}),
    Lookup = Table.AddColumn(Table2, "Lookup", (outer) => Table.SelectRows(Table1, each Text.StartsWith(outer[Column2], [Column2]))),
    #"Expanded Lookup" = Table.ExpandTableColumn(Lookup, "Lookup", {"Column2"}, {"Lookup.Column2"})
in
    #"Expanded Lookup"

Advantage here is that you don't need an exact match, but can use all sorts of conditions, in this case "Text.StartsWith".

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

kalcey · ‎04-05-2017

Hey @ImkeF,

Could you please elaborate more on where to write this code?

Is there a way to use the GUI to do that?

Thanks

ImkeF · ‎04-05-2017

Pls check this video, which contains the "less-codiest"-way I can think of (for this specific technique):

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

jfreyha · ‎01-23-2019

Thank you!

what's the difference between inner and outer? are those the join kinds? or is something like "earlier" in DAX?

ImkeF · ‎01-24-2019

Yes, it's like EARLIER in DAX 🙂

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

Anonymous · ‎01-03-2019

Hi @ImkeF,

This solution was very helpful. Really grateful. I have a question regarding the type of join. I would like to use a full outer join, in order to see all rows from the First and Second. I tried this, but I still keep getting only all rows from the second table. Is there a way to get around this

=Table.SelectRows(Table.Buffer(Table1), (FullOuter) => Text.StartsWith([Column2], FullOuter[Columns]))

ImkeF · ‎01-03-2019

Yes, the "Inner" is just a random name for the "inner" of the nested functions in that command. So you can name it anything and it wouldn't change the result.

If you want to have a full outer, you can simply perform a full outer after that first operation (and merge the results of the 2 columns together)

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

Hermes · ‎09-19-2017

This is awesome solution.
I've spend few days trying different ways of achieving same result, but this is by far the best - fastest and cleanest.

I wonder if this could be made into custom function?

Also, is there a solution to speed up subsequent actions after this join? To be honest, performance of PQ after this kind of join falls of the cliff and even simple filter takes ages (and this is for a dimension kind of table, with just about 1000 rows)

MarcelBeug · ‎09-19-2017

The solution below I applied similarly in an Excel / Power Query application I recently developed ( @ImkeF knows: the account matching application), which is used in an interactive way, so it had to be fast.

You can add key columns to Table 2, each with the first n (1-7) characters of the value in Column2, acting as key columns. I did so in below query Table2With7Keys.

Next, you can add a column to Table1 with the length of the value in Column2, and group on that length, so you have partitioned tables for each key length.

Now you can merge those nested tables with Table2, using the key column with the corresponding key length.

Then finish up with some expanding and removal.

Query Table2With7Keys:

let
    Source = Table2,
    #"Added Custom" = Table.AddColumn(Source, "Keys", (x) => Table.FromRows({List.Transform({1..7}, each Text.Middle(x[Column2],0,_))}, List.Transform({1..7}, each "Key"&Text.From(_)))),
    #"Expanded Keys" = Table.ExpandTableColumn(#"Added Custom", "Keys", {"Key1", "Key2", "Key3", "Key4", "Key5", "Key6", "Key7"})
in
    #"Expanded Keys"

Query MergedTables:

let
    Source = Table1,
    #"Added Custom" = Table.AddColumn(Source, "Length", each Text.Length([Column2])),
    #"Grouped Rows" = Table.Group(#"Added Custom", {"Length"}, {{"AllData", each _, type table}}),
    Merged = Table.Buffer(Table.TransformColumns(#"Grouped Rows",{{"AllData", each Table.NestedJoin(_,{"Column2"},Table2With7Keys,{"Key"&Text.From(List.Average(_[Length]))},"Table2Data",JoinKind.LeftOuter), type table}})),
    #"Expanded AllData" = Table.ExpandTableColumn(Merged, "AllData", {"Column1", "Column2", "Table2Data"}, {"Column1", "Column2", "Table2Data"}),
    #"Expanded Table2Data" = Table.ExpandTableColumn(#"Expanded AllData", "Table2Data", {"Column1", "Column2"}, {"Table2.Column1", "Table2.Column2"}),
    #"Removed Columns" = Table.RemoveColumns(#"Expanded Table2Data",{"Length"})
in
    #"Removed Columns"

Specializing in Power Query Formula Language (M)

viera00 · ‎06-27-2016

Hi ImkeF

Interesting Solution.

I will try it and let you know if worked. Thank you very much for your contribution.

Regards,

GV

German Viera
http://slidemodel.com/

Complex PowerQuery Merge query using substrings

Helpful resources

Fabric Community Update - July 2025

Power BI Monthly Update - July 2025

How to Get Your Question Answered Quickly

Party with Power BI’s own Guy in a Cube

Complex PowerQuery Merge query using substrings

Helpful resources

Fabric Community Update - July 2025

Power BI Monthly Update - July 2025

How to Get Your Question Answered Quickly