Get certified in Microsoft Fabric—for free! For a limited time, the Microsoft Fabric Community team will be offering free DP-600 exam vouchers. Prepare now
Hello everyone,
I have a questions about a complex merge I need to do in Power Query.
Imagine the following tables:
Table1:
Column1 | Column2
1 4441
2 4442
3 4443
4 443
Then I have Table2 , as the following table:
Table2:
Column1 | Column2
1 4441098
2 4441097
3 4441011
4 4441122
5 4443111
6 4443123
And the merge should have the rows of Table1 and the rows of Table2 where the "prefix" of Table2.Column2 is Table1.Column2. Take into account that I do not know the number of chars of the prefix, I mean, Table1.Column2 can have any lenght not greater than 7
Regards GV
Solved! Go to Solution.
A different way to perform a lookup is to use "Table.SelectRows", although this might be slower - so it is essential that you use the Table.Buffer and for very large table try to partition also. But the basic principle can look as follows:
let Table1 = Table.Buffer(#table({"Column1", "Column2"}, {{1, "4441"}, {2, "4442"}, {3, "4443"}, {4, "443"}})), Table2 = #table({"Column1", "Column2"}, {{1, "4441098"}, {2, "4441097"}, {3, "4441011"}, {4, "4441122"}, {5, "4443111"}, {6, "4443123"}}), Lookup = Table.AddColumn(Table2, "Lookup", (outer) => Table.SelectRows(Table1, each Text.StartsWith(outer[Column2], [Column2]))), #"Expanded Lookup" = Table.ExpandTableColumn(Lookup, "Lookup", {"Column2"}, {"Lookup.Column2"}) in #"Expanded Lookup"
Advantage here is that you don't need an exact match, but can use all sorts of conditions, in this case "Text.StartsWith".
Imke Feldmann (The BIccountant)
If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!
How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries
Hello, hope this is ok to make this question in this topic, i was the closest case i could find on this page.
I can't manage to get this code to work, my case i just a little bit different than the origanal post.
I have this:
Sheet 1 = Price master
(column B) = P/N
Example: 16510-96J10
Sheet 2 = Exceptions
(Column A) = Part start
Example: 16510-
(Column E) = AG end
Example: 3
I want to do like a Vlookup from Price master, the result should be "search for first characters" from Exceptions.
In excel i can do it like this, but want is integrated in my Query "=VLOOKUP(LEFT([@[P/N]];5);Exceptions!A:D;4;0)"
The result should be something like this:
Search: 16510-96J10 (number found in "Price master"
The match should be for something like this, match from 2 characters to whole number. whichever comes first.
16
165
1651
16510
Numbers found in "Exceptions"
Outcome: New column in "Price master" with result from AG end ( "3" in this case)
Lets say your unique ID has 6 different possible lengths (2, 3, 4, 5, 6, or 7 characters long)
A way you could possibly do this would be to duplicate your query 6 times, split off a different number of characters in each query, and merge only matching rows to the original table. Then at the end you'd need to append all the tables.
It might be messy and a bit slow, but it should work.
well there needs to be a rule about the length of the merged characters. It is not important how long the second table are, but it is important how long the characters in the first table are. Your example suggests that there can be 3 or 4 character lenght in the first table. Assuming that is a typo, I would do this.
Load table 2
duplicate column 2
split column 2 at 4 characters
delete the remainder column
join table 1 on table 2 with the new column.
If your 3 character 443 in table is not a typo, you are in a world of trouble - unless you can identify a rule on how to split the column in table 2.
Hi Matt,
Thank you for your reply, but I'm in a world of trouble, as my 3 chars is not a typo. Even worst, I've records of 5 and 2 chars.
Regards,
GV
A different way to perform a lookup is to use "Table.SelectRows", although this might be slower - so it is essential that you use the Table.Buffer and for very large table try to partition also. But the basic principle can look as follows:
let Table1 = Table.Buffer(#table({"Column1", "Column2"}, {{1, "4441"}, {2, "4442"}, {3, "4443"}, {4, "443"}})), Table2 = #table({"Column1", "Column2"}, {{1, "4441098"}, {2, "4441097"}, {3, "4441011"}, {4, "4441122"}, {5, "4443111"}, {6, "4443123"}}), Lookup = Table.AddColumn(Table2, "Lookup", (outer) => Table.SelectRows(Table1, each Text.StartsWith(outer[Column2], [Column2]))), #"Expanded Lookup" = Table.ExpandTableColumn(Lookup, "Lookup", {"Column2"}, {"Lookup.Column2"}) in #"Expanded Lookup"
Advantage here is that you don't need an exact match, but can use all sorts of conditions, in this case "Text.StartsWith".
Imke Feldmann (The BIccountant)
If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!
How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries
Hey @ImkeF,
Could you please elaborate more on where to write this code?
Is there a way to use the GUI to do that?
Thanks
Pls check this video, which contains the "less-codiest"-way I can think of (for this specific technique):
Imke Feldmann (The BIccountant)
If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!
How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries
Thank you!
what's the difference between inner and outer? are those the join kinds? or is something like "earlier" in DAX?
Yes, it's like EARLIER in DAX 🙂
Imke Feldmann (The BIccountant)
If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!
How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries
Hi @ImkeF,
This solution was very helpful. Really grateful. I have a question regarding the type of join. I would like to use a full outer join, in order to see all rows from the First and Second. I tried this, but I still keep getting only all rows from the second table. Is there a way to get around this
=Table.SelectRows(Table.Buffer(Table1), (FullOuter) => Text.StartsWith([Column2], FullOuter[Columns]))
Yes, the "Inner" is just a random name for the "inner" of the nested functions in that command. So you can name it anything and it wouldn't change the result.
If you want to have a full outer, you can simply perform a full outer after that first operation (and merge the results of the 2 columns together)
Imke Feldmann (The BIccountant)
If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!
How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries
This is awesome solution.
I've spend few days trying different ways of achieving same result, but this is by far the best - fastest and cleanest.
I wonder if this could be made into custom function?
Also, is there a solution to speed up subsequent actions after this join? To be honest, performance of PQ after this kind of join falls of the cliff and even simple filter takes ages (and this is for a dimension kind of table, with just about 1000 rows)
The solution below I applied similarly in an Excel / Power Query application I recently developed ( @ImkeF knows: the account matching application), which is used in an interactive way, so it had to be fast.
You can add key columns to Table 2, each with the first n (1-7) characters of the value in Column2, acting as key columns. I did so in below query Table2With7Keys.
Next, you can add a column to Table1 with the length of the value in Column2, and group on that length, so you have partitioned tables for each key length.
Now you can merge those nested tables with Table2, using the key column with the corresponding key length.
Then finish up with some expanding and removal.
Query Table2With7Keys:
let Source = Table2, #"Added Custom" = Table.AddColumn(Source, "Keys", (x) => Table.FromRows({List.Transform({1..7}, each Text.Middle(x[Column2],0,_))}, List.Transform({1..7}, each "Key"&Text.From(_)))), #"Expanded Keys" = Table.ExpandTableColumn(#"Added Custom", "Keys", {"Key1", "Key2", "Key3", "Key4", "Key5", "Key6", "Key7"}) in #"Expanded Keys"
Query MergedTables:
let Source = Table1, #"Added Custom" = Table.AddColumn(Source, "Length", each Text.Length([Column2])), #"Grouped Rows" = Table.Group(#"Added Custom", {"Length"}, {{"AllData", each _, type table}}), Merged = Table.Buffer(Table.TransformColumns(#"Grouped Rows",{{"AllData", each Table.NestedJoin(_,{"Column2"},Table2With7Keys,{"Key"&Text.From(List.Average(_[Length]))},"Table2Data",JoinKind.LeftOuter), type table}})), #"Expanded AllData" = Table.ExpandTableColumn(Merged, "AllData", {"Column1", "Column2", "Table2Data"}, {"Column1", "Column2", "Table2Data"}), #"Expanded Table2Data" = Table.ExpandTableColumn(#"Expanded AllData", "Table2Data", {"Column1", "Column2"}, {"Table2.Column1", "Table2.Column2"}), #"Removed Columns" = Table.RemoveColumns(#"Expanded Table2Data",{"Length"}) in #"Removed Columns"
Hi ImkeF
Interesting Solution.
I will try it and let you know if worked. Thank you very much for your contribution.
Regards,
GV
Check out the October 2024 Power BI update to learn about new features.
Learn from experts, get hands-on experience, and win awesome prizes.
User | Count |
---|---|
112 | |
89 | |
83 | |
76 | |
64 |
User | Count |
---|---|
146 | |
111 | |
109 | |
102 | |
96 |