March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Early bird discount ends December 31.
Register NowBe one of the first to start using Fabric Databases. View on-demand sessions with database experts and the Microsoft product team to learn just how easy it is to get started. Watch now
Hi. Per the title, what are the differences between these two, and at which scenario should I choose one over the other?
Solved! Go to Solution.
Hi @Anonymous ,
The first main difference is in the name: "Nested". The Table.NestedJoin function will output your second table as nested tables in a column on your first table. This allows you the option to select which columns you want to expand from the second table.
The absence of "Nested" in Table.Join means the tables just get joined on the key fields specified i.e. all columns from Table1 and all columns from Table2, with no option to select specific columns to expand.
The second main difference is that Table.Join allows the use of an optional 'joinAlgorithm' parameter. This allows you to set the method by which Power Query goes about actually evaluating and joining your data which, when used correctly, can significantly improve join performance. For example, if the keys in both your tables are sorted ascending, you can use the 'JoinAlgorithm.SortMerge' argument to join at incredibly fast speeds. However, you must have a good knowledge of how PQ performs other functions (such as sorting) in order to actually get these gains. For more detail on this, check out Chris Webb's blog:
Why one over the other? Assuming NO query folding then:
Basic Scenario A:
Table1 has 20 columns, Table2 has 40 columns. You only need two columns from Table2 to be joined onto Table1.
I would use Table.NestedJoin, as joining all 40 columns of Table2 would be a huge processing overhead, and also make Table1 quite unwieldy, so using the nested join allows me to just expand the two columns I need.
Basic Scenario B:
Table1 has 10 columns, Table2 has 2 columns. Both tables have more than 1M rows, but both keys are sorted ascending AT SOURCE. I would use Table.Join with the JoinAlgorithm.SortMerge argument to significantly speed up a very large merge. I would expect the gains from this merge type to outweigh the losses from having to remove the second key column that gets brought in from Table2.
These are, as noted, basic scenarios, and your actual needs will likely be more nuanced and complicated than these, but they give a broad idea of why you might select one over the other. Once query folding comes into play it can actually drastically change the assessment of these two scenarios, but that's getting a bit beyond scope I think.
Pete
Proud to be a Datanaut!
I love the solution here, it's very focused on the "what is this"... it also taught me a little bit more than I've learned googling for hours, and I'd like to inject "how and why I'm here..." which I hope will improve peoples ability to fix/troubleshoot later.
It does appear these two things (Table.Join and Table.NestedJoin) are becoming more relevant in powerbi if im trying to frankenstien my dashboard together, like if I have the wrong data types and I'm brushing over this by writing SQL or trying to force powerbi to make the move for me.
Here's HOW/WHY I heard about these two specific buzzwords called; Table.Join and Table.NestedJoin..
0. im running a live connection to postgreSQL
1. if i have a decimal string etc and im calling it a whole number in powerbi
2. when im trying to fix data types in SQL itself sort-of-like...
-- from tableA a inner join tableC c on id::int = c.the_id
3. when it's a table being joined to as a "many" relationship
4. using RLS
My quick naive oversight, feels like it's attempting to act as a sql optimizer with a layer of new restrictions to adhere to, which im comfortable doing in SQL. Hopeful these two things are benefiting me w/o having to dig much deeper here...
Hi @Anonymous ,
The first main difference is in the name: "Nested". The Table.NestedJoin function will output your second table as nested tables in a column on your first table. This allows you the option to select which columns you want to expand from the second table.
The absence of "Nested" in Table.Join means the tables just get joined on the key fields specified i.e. all columns from Table1 and all columns from Table2, with no option to select specific columns to expand.
The second main difference is that Table.Join allows the use of an optional 'joinAlgorithm' parameter. This allows you to set the method by which Power Query goes about actually evaluating and joining your data which, when used correctly, can significantly improve join performance. For example, if the keys in both your tables are sorted ascending, you can use the 'JoinAlgorithm.SortMerge' argument to join at incredibly fast speeds. However, you must have a good knowledge of how PQ performs other functions (such as sorting) in order to actually get these gains. For more detail on this, check out Chris Webb's blog:
Why one over the other? Assuming NO query folding then:
Basic Scenario A:
Table1 has 20 columns, Table2 has 40 columns. You only need two columns from Table2 to be joined onto Table1.
I would use Table.NestedJoin, as joining all 40 columns of Table2 would be a huge processing overhead, and also make Table1 quite unwieldy, so using the nested join allows me to just expand the two columns I need.
Basic Scenario B:
Table1 has 10 columns, Table2 has 2 columns. Both tables have more than 1M rows, but both keys are sorted ascending AT SOURCE. I would use Table.Join with the JoinAlgorithm.SortMerge argument to significantly speed up a very large merge. I would expect the gains from this merge type to outweigh the losses from having to remove the second key column that gets brought in from Table2.
These are, as noted, basic scenarios, and your actual needs will likely be more nuanced and complicated than these, but they give a broad idea of why you might select one over the other. Once query folding comes into play it can actually drastically change the assessment of these two scenarios, but that's getting a bit beyond scope I think.
Pete
Proud to be a Datanaut!
Great answer! Couldn't be better. Thank you.
Can you continue on with your answer assuming query folding is required?
Thank you.
Hi @Kdhosk ,
Query folding pretty much makes this entire discussion moot.
Once folding is in play, the native query will just resolve to left outer join/inner join etc. and these will be processed using the server/DB's algorithm. The native query will also only include the columns retained at the end of your query within the native SELECT clause, so it doesn't matter whether you nest and limit them, or merge the whole lot then remove them afterwards, the query will be optimised as at the last folded step.
Pete
Proud to be a Datanaut!
Thanks for the very detailed explanation @BA_Pete ! Seems to me that for the most part, I will use NestedJoin over Join then.
March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount!
User | Count |
---|---|
20 | |
12 | |
10 | |
9 | |
7 |
User | Count |
---|---|
40 | |
26 | |
16 | |
16 | |
10 |