This is more of a generic question than a specific problem: what's the most efficient way to filter large data sets within Power Query?
Simplified Example:
Table 1: Job Number and Date
Table 2: Job Number and Qty
I want to see the total qty produced within a given timeframe, so I believe I have 3 options to get there...
Option 1: Filter Table 1 to the desired time frame, then merge Table 2 into that query
Option 2: Filter Table 1 to the desired time frame, then merge Table 1 into Table 2 and filter out the nulls in the Qty column
Option 3: Filter Table 1 to the desired time frame, turn it into a list, bring the list into Table 2 (using List.Buffer) and use List.Contains to filter - I like this option as it means I can make the date ranges user editable
Example from one of my queries:
#"Filtered Material" = Table.SelectRows(PV_JobStep_Table, each [StepType] = 113),
// Buffer the job-code list once so it isn't re-evaluated for every row
ListQuery = List.Buffer(PV_JobCosts),
// Keep only rows whose JobCode appears in the buffered list
#"Filtered PV_JobCosts" = Table.SelectRows(#"Filtered Material", each List.Contains(ListQuery, [JobCode])),
The problem I'm having is that when the tables have lots of columns and contain millions of rows, the reports take up to 10 minutes to refresh just to return a hundred or so rows.
I'm using a 64-bit ODBC connection into a Progress database, so there's no option to utilise query folding 😞
Thanks,
I like option 3 best as long as your list is < 1,000 items. More than that and it starts to get slow.
Because you are not folding, you have to process 100% of the data no matter what.
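If you're well past 1,000 keys, one thing worth testing (a sketch only, reusing your step names) is turning the buffered list back into a one-column table and doing an inner merge instead of a per-row List.Contains scan:

KeyTable = Table.FromList(ListQuery, Splitter.SplitByNothing(), {"JobCode"}),
// The inner join acts as a semi-join: rows survive only if their JobCode is in the key table
Joined = Table.NestedJoin(#"Filtered Material", {"JobCode"}, KeyTable, {"JobCode"}, "Keys", JoinKind.Inner),
Result = Table.RemoveColumns(Joined, {"Keys"}),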
If Table 1 isn't huge after the date filter, consider wrapping it in Table.Buffer before you do the merge. Test that out, but save the file first: if it is huge, it can eat up RAM and lock up the app. I speak from experience, having used Table.Buffer on a 3M-record, 30+ column table.
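In M that's just one wrapper step (step and table names assumed, not your actual query):

// Buffer the small, already-filtered side so the merge reads it from memory
BufferedDates = Table.Buffer(#"Filtered Dates"),
Merged = Table.NestedJoin(BufferedDates, {"JobNumber"}, Table2, {"JobNumber"}, "Qty Rows", JoinKind.Inner),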
DAX is for Analysis. Power Query is for Data Modeling
Proud to be a Super User!
MCSA: BI Reporting

Thanks Edhans.
Yeah, most of the time it isn't too bad (I consider a report refresh under 2 minutes acceptable for users) but, as you say, with bigger tables of 2M+ rows things start to slow down considerably.
I've played with Table.Buffer a bit to see if that can speed things up, but it seems really hit and miss: depending on where I use it I get a slight speed increase, no change, or it absolutely kills the report.
Main speed considerations/rules I try to stick to:
- Filter as early as possible (see the sketch after this list)
- Remove unwanted columns before merging
- As few steps as possible and no duplication of steps (so all my change types, renames etc. together)
- Reference tables rather than going to the database multiple times for the same data
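For example, the first two rules applied together (column list trimmed for illustration):

let
    Source = PV_JobStep_Table,
    // Keep only the columns later steps actually need
    Trimmed = Table.SelectColumns(Source, {"JobCode", "StepType"}),
    // Filter as early as possible so every later step touches fewer rows
    Filtered = Table.SelectRows(Trimmed, each [StepType] = 113)
in
    Filtered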
Are there any other 'speed' guidelines people try and use when query folding isn't an option?