Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Power BI is turning 10! Let’s celebrate together with dataviz contests, interactive sessions, and giveaways. Register now.

Reply
naninamu
Helper IV
Helper IV

Left outer Join - but only selected columns from both tables

Hi - I feel this should be easy but I can't work it out.

 

I have table A - it has let's say 50 columns, it's quite big.

I have table B - let's say it also has 50 columns.

 

I need to merge these with a Left Outer Join. However, currently it's taking all 50 columns from Table A and then as many columns as I select from Table B.

 

But say I only want to select 5 columns from Table A (as that's all I need and to help performance) - how do I actually do this??

 

Thanks in advance.

14 REPLIES 14
PwerQueryKees
Super User
Super User

Option b is definitely not advisable. From a performance perspective, but also from a code structure perspective.

Option a and c are similar in most respects I think. Although I would probably remove the unwanted columns before the merge, from a maintainability perspective. I don't think perfromance would be very different.

Option c would be my preference from a code structure an clarity perspective. 

Option d I don't know. I am not using PBI, only PW in excel sometimes combined with Power Pivot. 

naninamu
Helper IV
Helper IV

Hi @PwerQueryKees Honestly I don't really know. It was just something I read that a reference would still hit the table twice, but upon doing more reading I'm seeing contradictions.

 

Which of these approaches would you recommend from a performance point of view:

 

a) Merge Table A with Table B into a new table, remove unwanted columns

b) Duplicate Table A, merge with table B, remove unwanted columns

c) Reference Table A, merge with table B, remove unwanted columns

d) Create a calculated table in DAX and perform the merge in DAX in PBI

 

I have done Option a). I feel this is similar to Option c), as in due to the fact that I am merging Table A to create a new table, it is using all the steps Table A needed to get to that point.

 

I also tried Option d, which worked until I came to buld my chart, and it wouldn't accept anything in the Y axis field. No ideal why, obviously a problem with my data types.

 

Thanks in advance!

 

 

Anonymous
Not applicable

Hi

Thanks for the solutions everyone offered, and i want to offer some more information foe user to refer to.

hello @naninamu , you can consider to select the columns you need in table A, create a new query.

let
  source=Table.SelectColumns(TableA,.....)
in 
Source

Then use this new table to merge with table b

 

Best Regards!

Yolo Zhu

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

PwerQueryKees
Super User
Super User

Are you sure that creating a reference to your table and doing the join with the reference will hit the table twice? Normally query folding would just hit a table once, but there are conditions and the differ per connector and DBMS. Best way to see what happens is to give it a try... 

By the way: Making a reference is NOT copying. It does make any differene in performance if you have a query with 10 steps or 2 queries with 5 steps each (provided the steps remain the same off course...)

naninamu
Helper IV
Helper IV

Thanks Poojara. My issue is not the above - I know how to do that - but the fact that down the track I will need to use all the columns in Table A (the left table) so I cant delete them from the merge.

 

I could make a copy of the table, but that impacts performance as it hits the DB twice and has millions of rows.

Poojara_D12
Super User
Super User

Hi @naninamu 

To merge Table A and Table B with a Left Outer Join while selecting only specific columns:

  1. Remove Unnecessary Columns from Table A:

    • In Power Query, select Table A and remove columns you don't need (only keep the 5 columns you want).
  2. Merge the Tables:

    • Use Merge Queries to join Table A and Table B on the key column(s) with a Left Outer Join.
  3. Select Columns from Table B:

    • After merging, expand the new column from Table B and select only the columns you need.

This way, you limit the columns used in the merge, improving performance.

 

Did I answer your question? Mark my post as a solution, this will help others!

If my response(s) assisted you in any way, don't forget to drop me a "Kudos" 🙂

Kind Regards,
Poojara
Data Analyst | MSBI Developer | Power BI Consultant
Consider Subscribing my YouTube for Beginners/Advance Concepts: https://youtube.com/@biconcepts?si=04iw9SYI2HN80HKS 

Did I answer your question? Mark my post as a solution, this will help others!
If my response(s) assisted you in any way, don't forget to drop me a "Kudos"

Kind Regards,
Poojara - Proud to be a Super User
Data Analyst | MSBI Developer | Power BI Consultant
Consider Subscribing my YouTube for Beginners/Advance Concepts: https://youtube.com/@biconcepts?si=04iw9SYI2HN80HKS
Omid_Motamedise
Super User
Super User

Initially romve the columns you do not want and then apply the merge

If my answer helped solve your issue, please consider marking it as the accepted solution. It helps others in the community find answers faster—and keeps the community growing stronger!
You can also check out my YouTube channel for tutorials, tips, and real-world solutions in Power Query with the following link
https://youtube.com/@omidbi?si=96Bo-ZsSwOx0Z36h

But as I said in the above posts, I need some of the columns I'm removing down the track...

ThxAlot
Super User
Super User

If possible, finish joins operation at the RDB side (SQL Server, MySQL, Oracle ...); not only because those "traditional RDB" are more performant in joining, but what's more practical is that they support flexible Non-equi joins.

 

If all you can do is only PQ, try Tables.SelectColumns() to choose necessary columns; then join the resultant tables.



Expertise = List.Accumulate(


        {Days as from Today},


        {Skills and Knowledge},


        (Current, Everyday) => Current & Day.LearnAndPractise(Everyday)


)



Thanks. It's a good point. I do need to do more dev pre PQ. 😛

naninamu
Helper IV
Helper IV

Thank you - if performance isn't impacted my question is redundant then.

 

However, I was aware of only selecting the columns.

 

THe problem is , in Table A for instance,  what if I still need to refer to the full Table for certain visualisations, while the merged table doesn't need those columns? What I mean is, pratically, I'd go to Table A, pick the coumns I need, and then merge with Table B to create a new table. However, Table A now only has the reduced number of columns. What if one of the ones I removed to create the merge is one I actually need in Table A?

 

I see what you mean.

Make a new query as a reference to TableA, select your columns and do the merge.
This will keep your TableA as it is and will give you a new table with the merge result.

In real life,my Table A in  extremely large - 500m rows +. Even if I make a reference, it's still hitting the dB twice isn't it? I was trying to avoid doing this by only selecting the columns needed to at least save some performance - but if columns don't matter I guess that's a moot point.

 

What I did try was to copy the table in DAX and merge it from there. This was successful, although I ran into another issue as I then tried to use this table as part of a scatter chart, and it wouldn't let me drag anything into the Y axis field. Not sure if it's related to how my new table is built, or something completely unrelated... but in theory, is this methond a lot quicker than doing a duplicate/reference in PQ? Cheers, Andrew

PwerQueryKees
Super User
Super User

Performance of left outer join is not impacted by the number of columns.
BUT

In general, always start by selecting only the rows and the columns you need from every table you use.

Selecting the columns you need is easy:

  • Select the columns by ctrl-click on the ones you need
  • Right-Click on any of the selected columns and select "Remove other columns"
  • The new step that is created has a little cogwheel!
  • Click on the cogwheel if you change your mind and want to select more or fewer of the columns

Helpful resources

Announcements
Join our Fabric User Panel

Join our Fabric User Panel

This is your chance to engage directly with the engineering team behind Fabric and Power BI. Share your experiences and shape the future.

June 2025 Power BI Update Carousel

Power BI Monthly Update - June 2025

Check out the June 2025 Power BI update to learn about new features.

June 2025 community update carousel

Fabric Community Update - June 2025

Find out what's new and trending in the Fabric community.