Re: Optimizing Filtering in Power Query: Large Dat...

IOdzelashvili · ‎08-08-2024

Hello there!

Imagine I have two table in Power Query Editor:

Loans, with LoanId, BusinessSegment columns. This table contains over 6 mln rows with unique LoanIds.
LoanBalances with LoanId, YearMonth, BalanceAmount. This table contains over 20 mln rows.

I want to filter the LoanBalances table where the BusinessSegment from the Loans table is "Mass".

the Condition is that I need to do this step in Power Query Editor, these are my tables and I cannot edit them before loading them into Power Query Editor.

I have tried many options available on the internet, but they only work by merging these two tables, which is very slow, because of the numbers of rows in both tables.

Could you help me find a better solution for my issue?

Thanks in advance.

Anonymous · ‎08-13-2024

Hi,

Thanks for the solution PwerQueryKees and watkinnc offered, and i want to offer some more infotmation for user to refet to.

hello @IOdzelashvili , you can refer to the follwing solution.

1.in Loans Table, create a new column to merge the LoanId, BusinessSegment columns, can name it merged.

2.Then in LoanBalances create a custom column.

=let a=List.Buffer(List.Select(Loans[Merged],each Text.Contains(_,"Mass"))),
b=[LoanId]
in List.Count(List.Select(a,each Text.Contains(_,b)))

Then you can filter the new column that it greater than 0.

Best Regards!

Yolo Zhu

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

PwerQueryKees · ‎08-12-2024

@IOdzelashvili if you got a solution working, please share with some explanation! Power query performance on large datasets is always tricky and I would like to learn from your experience...

Anonymous · ‎08-11-2024

A couple more things: I don't know how many results your query gets, but if you're loading them to a page, then you can only print like 2 million some odd rows. Also, is your loan ID column text? If so, that's much slower for that many values.

--Nate

PwerQueryKees · ‎08-11-2024

Joinkinndnis only supported by table.join unfortunately. So if you stick with table.join, you may want to look at the join algorithms: https://powerquery.how/joinalgorithm-type/

PwerQueryKees · ‎08-11-2024

I agree with @Anonymous .

For even more speed you can use the Table.AddJoinColumn function instead because it does not actually bring any data from the joined table.

And JoinKind.LeftSemi could even add a bit more speed.

Anonymous · ‎08-10-2024

I would first filter the loans table like Table.SelectRows(Loans, each [BusinessSegment] = "Mass")

Then sort both tables ascending by LoanID (this is critical for

next step). Then, use the non-nested inner join:

Table.Join(PriorStepOrTableName, "LoanId", LoanBalances, "LoanId", JoinKind.Inner, JoinAlgorithm.SortMerge)

That should be pretty fast.

--Nate

PwerQueryKees · ‎08-09-2024

Not sure if it wil be any faster though. Just try...

PwerQueryKees · ‎08-09-2024

I don't have acces to my laptop. But you can write your own selection function for the selectrows of the loan balances table.

It would be something like this:

Table.Selectrows(LoanBalances, each Table.IsEmpty(Table.Selectrows(Loans, (L) => L[LoanId] = [LoanId] and L[BusinessSegement] "Mass")) = false)

IOdzelashvili · ‎08-09-2024

Merge is doing everything good, just it is too slow and I have to wait sometimes more then 30-40 minute to complete this process.

I hope there is easier way to do this, using M code, but if there is not, okay, I will continuous to merge my queries.

Anonymous · ‎08-08-2024

I mean if you wanted to build a proper data model, you'd be done right now. Load those tables to the data model, make a relationship between the tables, filter to your heart's content. Otherwise, you have to join them.

--Nate

IOdzelashvili · ‎08-09-2024

My goal is not to load too many data and not to increas in size my model.

Also, I am creating data model for self-service, not a report, so it is important to be loaded only needed data for my model users.

Optimizing Filtering in Power Query: Large Datasets with 6M+ Loans and 20M+ Balances

Helpful resources

Fabric Community Update - July 2025

Power BI Monthly Update - July 2025

Party with Power BI’s own Guy in a Cube

Optimizing Filtering in Power Query: Large Datasets with 6M+ Loans and 20M+ Balances

Helpful resources

Fabric Community Update - July 2025

Power BI Monthly Update - July 2025