Hi all,
I have a tricky one that I'd appreciate some advice on. I'm trying to perform a multi-step merge between two tables. I have come up with two solutions, described below with sample data, but the performance of each is terrible when I try to join tables with 900k rows (I can only get Method 1 to work with production data; Method 2 only works with sample data and just spins forever if I give it more than a few hundred rows). I'm joining between a CSV file and a SQL table.
Problem:
Table 1 is a lookup table and Table 2 has the data I want to augment. The problem is that Table 2 may have a valid value in column A, B, C, or all three, and I want to join to the lookup table in that order of precedence: look up against Table 1 on column A and, if there is a match, use it; otherwise look up on column B and, if there is a match, use it; otherwise look up on column C and, if there is a match, use it; otherwise null.
Table 1:
| A | B | C | Ref |
| A1 | B1 | C1 | 1000 |
| A2 | B2 | C2 | 2000 |
| A3 | B3 | C3 | 3000 |
Table 2:
| A | B | C |
| a1 | nonsense | nonsense |
| nonsense | B2 | nonsense |
| nonsense | nonsense | c3 |
| nonsense | nonsense | nonsense |
| c3 | nonsense | nonsense |
| nonsense | c2 | nonsense |
| nonsense | nonsense | C1 |
| nonsense | b1 | c3 |
Result:
| A | B | C | ID |
| a1 | nonsense | nonsense | 1000 |
| nonsense | B2 | nonsense | 2000 |
| nonsense | nonsense | c3 | 3000 |
| nonsense | nonsense | nonsense | null |
| c3 | nonsense | nonsense | null |
| nonsense | c2 | nonsense | null |
| nonsense | nonsense | C1 | 1000 |
| nonsense | b1 | c3 | 1000 |
Note that the last row has valid matches for both B and C, but the B match is pulled back, according to the precedence I require.
Method 1:
Method 1 just does three table merges and then coalesces the three matched columns to satisfy my precedence requirement. It works in production, but it is very slow.
let
    Source = #"Table 2",
    // Uppercase the key columns so the merges match the lookup's casing
    #"Uppercased Text" = Table.TransformColumns(Source,{{"A", Text.Upper, type text}, {"B", Text.Upper, type text}, {"C", Text.Upper, type text}}),
    // Left-outer merge against the lookup table on each key column in turn
    #"Merged Queries" = Table.NestedJoin(#"Uppercased Text", {"A"}, #"Table 1", {"A"}, "A Matches", JoinKind.LeftOuter),
    #"Expanded A Matches" = Table.ExpandTableColumn(#"Merged Queries", "A Matches", {"Ref"}, {"A Matches.Ref"}),
    #"Merged Queries1" = Table.NestedJoin(#"Expanded A Matches", {"B"}, #"Table 1", {"B"}, "B Matches", JoinKind.LeftOuter),
    #"Expanded B Matches" = Table.ExpandTableColumn(#"Merged Queries1", "B Matches", {"Ref"}, {"B Matches.Ref"}),
    #"Merged Queries2" = Table.NestedJoin(#"Expanded B Matches", {"C"}, #"Table 1", {"C"}, "C Matches", JoinKind.LeftOuter),
    #"Expanded C Matches" = Table.ExpandTableColumn(#"Merged Queries2", "C Matches", {"Ref"}, {"C Matches.Ref"}),
    // Coalesce the three matched Refs in A > B > C precedence order
    #"Added Custom" = Table.AddColumn(#"Expanded C Matches", "Ref", each [A Matches.Ref] ?? [B Matches.Ref] ?? [C Matches.Ref]),
    #"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"A Matches.Ref", "B Matches.Ref", "C Matches.Ref"})
in
    #"Removed Columns"
Method 2:
Method 2 uses a custom function that uses Table.SelectRows to return the value I need. I thought this would be faster, but I can't even get it to work on more than 300 rows. I have tried Table.Buffer() on table 1 and/or table 2 and it still won't complete. I have also tried Table.AddKey(), which didn't do much either.
let
    Source = #"Table 2",
    // Per-row lookup: try a match on A first, then B, then C; returns the
    // matching Ref values, or an empty list if nothing matches
    lookup = (a as text, b as text, c as text) as list =>
        let
            w = Table.SelectRows(#"Table 1", each Text.Lower([A]) = Text.Lower(a)),
            x = Table.SelectRows(#"Table 1", each Text.Lower([B]) = Text.Lower(b)),
            y = Table.SelectRows(#"Table 1", each Text.Lower([C]) = Text.Lower(c)),
            z = if Table.RowCount(w) > 0 then w[Ref] else if Table.RowCount(x) > 0 then x[Ref] else if Table.RowCount(y) > 0 then y[Ref] else {}
        in
            z,
    #"Added Custom" = Table.AddColumn(Source, "ID", each lookup([A], [B], [C])),
    #"Expanded ID" = Table.ExpandListColumn(#"Added Custom", "ID")
in
    #"Expanded ID"
Does anyone have any thoughts on anything else I can try, or something obvious I may have missed? FYI, my real use case has six merges, but I think the principle should still apply.
Why not break the lookup table into three separate one-column tables, then do the joins?
--Nate
I do that in my real-world case using Table.SelectColumns(), but didn't show it in the example.
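Roughly like this for the A lookup, with the same pattern for B and C, and the merges then pointed at these narrow tables instead of #"Table 1" (the LookupA name is just for illustration):

let
    // Narrow two-column lookup keyed on A (the key plus Ref); B and C get equivalent queries
    LookupA = Table.SelectColumns(#"Table 1", {"A", "Ref"})
in
    LookupA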
I don't know if this is going to help you, but give this a try.
Thanks @AlexisOlson. I have tried with normal if/then/else statements and it doesn't improve things much. Method 1 works, and I even have a setup where I filter the subsequent merges when a match has already happened, but it isn't fast by any measure. I was wondering if there was some other technique out there I hadn't considered.
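The filtered setup is roughly this shape, picking up from Method 1's #"Expanded A Matches" step (the step names here are illustrative):

    // Rows that matched on A are set aside; only the rest go into the B merge
    #"Matched On A" = Table.SelectRows(#"Expanded A Matches", each [A Matches.Ref] <> null),
    #"Unmatched On A" = Table.SelectRows(#"Expanded A Matches", each [A Matches.Ref] = null),
    #"Merged Queries1" = Table.NestedJoin(#"Unmatched On A", {"B"}, #"Table 1", {"B"}, "B Matches", JoinKind.LeftOuter),
    // ... same again for the C merge, then Table.Combine the matched and unmatched branches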
You could also try unpivoting both tables before merging.
I don't know if this will be any faster, but you can try something like this:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSjRU0lFSyMvPK04FIgU4QBZVitWJVoJzdJScjHDKYtEN4iQbo5uhgJUNUgRUi08aSSIZtzvQjHA2RJdNMoQ6KxYA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [A = _t, B = _t, C = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"A", type text}, {"B", type text}, {"C", type text}}),
#"Added Index" = Table.AddIndexColumn(#"Changed Type", "Index", 0, 1, Int64.Type),
#"Unpivoted Columns" = Table.UnpivotOtherColumns(#"Added Index", {"Index"}, "Column", "Value"),
#"Uppercased Text" = Table.TransformColumns(#"Unpivoted Columns",{{"Value", Text.Upper, type text}}),
#"Merged Queries" = Table.NestedJoin(#"Uppercased Text", {"Column", "Value"}, Table1, {"Column", "Value"}, "Table", JoinKind.LeftOuter),
#"Expanded Table" = Table.ExpandTableColumn(#"Merged Queries", "Table", {"Ref"}, {"Ref"}),
#"Filtered Rows" = Table.SelectRows(#"Expanded Table", each ([Ref] <> null)),
#"Grouped Rows" = Table.Group(#"Filtered Rows", {"Index"}, {{"Ref", each List.First([Ref]), type text}}),
#"Merged Queries1" = Table.NestedJoin(#"Added Index", {"Index"}, #"Grouped Rows", {"Index"}, "Grouped Rows", JoinKind.LeftOuter),
#"Expanded Grouped Rows" = Table.ExpandTableColumn(#"Merged Queries1", "Grouped Rows", {"Ref"}, {"Ref"}),
#"Removed Columns" = Table.RemoveColumns(#"Expanded Grouped Rows",{"Index"})
in
#"Removed Columns"
Thanks @AlexisOlson. I will give it a try tomorrow and see if the performance improves.
I'd expect your Method 1 to work OK, although I think your coalesce ("??") is probably breaking the query folding. I'd suggest editing it to use if/then rather than ?? to see if that makes any difference (since PQ knows how to turn if/then into a SQL case statement).
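Something like this in place of your Added Custom step (untested, but the step and column names match your Method 1):

    // Nested if/then rather than ??, which stands a better chance of folding to a SQL CASE
    #"Added Custom" = Table.AddColumn(#"Expanded C Matches", "Ref", each
        if [A Matches.Ref] <> null then [A Matches.Ref]
        else if [B Matches.Ref] <> null then [B Matches.Ref]
        else [C Matches.Ref])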
@lasersharks which one is the SQL table? Table 1 or Table 2?