Hi friends,
I have a large table "GL" (30k rows) >> I reference this table many times to do different filtering, group-bys, etc., so I now have many tables referencing this "GL" table. But it's getting extremely slow in Power Query.
I would like to find ways to improve the speed. I have the following two practices; will the second one be faster than the first, or are they basically the same? Otherwise, would you have a better idea of how to tackle the case above? Thank you!!!
Practice one: GL (30k rows) >> reference GL many times >> perform different filtering and/or group-bys on each of them.
Practice two: GL (30k rows) >> group by >> get a new table, GL-A (1k rows) >> reference GL-A many times >> get GL-B, GL-C, GL-D, etc., and perform different filtering and/or group-bys on each of them.
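For illustration, here is a minimal M sketch of Practice Two, assuming the source is an Excel table named GL; the file path and column names are placeholders, not taken from the thread:

// Base query: GL (~30k rows, loaded from Excel)
let
    Source = Excel.Workbook(File.Contents("C:\Data\GL.xlsx"), null, true),
    GL = Source{[Item = "GL", Kind = "Table"]}[Data]
in
    GL

// Referenced query: GL-A (~1k rows after grouping).
// Note: referencing GL does not cache its result; each referenced
// query re-evaluates GL against the source on refresh.
let
    Source = GL,
    Grouped = Table.Group(
        Source,
        {"Company", "Project"},
        {{"Total Amount", each List.Sum([Amount]), type number}}
    )
in
    Grouped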
Hi @Anonymous ,
Could you tell me if your problem has been solved by @BA_Pete 's method? If it has, kindly Accept it as the solution so that more people can benefit from it.
Best Regards,
Eyelyn Qin
Hi @Anonymous ,
Broadly speaking, there's no difference between the methods. It doesn't generally matter if you reference or duplicate a query 30 times, you'll usually make 30 calls to the source.
You can speed up refresh by:
1) Assessing whether you actually need 30 separate queries in the first place - try using DAX measures to make the distinctions between some/all of your additional 29 tables instead.
2) Reducing merges in your tables - send the tables to the data model to be related instead.
3) Avoiding whole-table operations in general where possible - functions such as Group By, Pivot, Sort, etc. require the whole table to be loaded into memory. If this is happening across 30 tables, you're likely to get a severe memory backlog.
4) Ensuring that as many queries as possible fold to your source (if relevant) - an SQL server will perform almost all major transformations far faster than Power Query will.
Those are the main areas to start with.
Pete
Proud to be a Datanaut!
Hi Pete,
Appreciated!! If I'm understanding correctly, even if I reference a table (which shrinks from 30k rows to 1k rows via a group-by), it will be no different from referencing the raw data (30k)? Or even worse, since in the first scenario I actually reference twice?
Following your points, I avoided merges, but I cannot really think of ways to avoid using group-bys and referencing tables. I created those queries as I will apply different filters and aggregate along different dimensions later on. Would you suggest any ways? (My data is in Excel, so no privilege to use SQL.)
@Anonymous ,
Pretty much. In your referenced scenario, PQ will go to the source to get the 30k rows. It would then go to the source again to get all 30k rows (but possibly in parallel) for the referenced query. The referenced query would then have additional wait time to perform the Group By operation. This wouldn't necessarily be slower than just going to the source once, depending on your setup with regard to parallel loading.
Some deeper info on this if you've got the stomach for it:
On a foldable source, your second scenario would be much slower than it needed to be. The SQL server would pick up the group operation on the referenced query server-side (assuming it's a simple group) before returning the data. You would then potentially have to wait for the refresh to finish loading the original 30k row query. This is why query folding is such a big deal when working in Power Query/Power BI.
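To make the folding point concrete, here is a hedged sketch against a SQL source - the server, database, and column names are placeholders. A filter-plus-group query like this typically folds into a single SELECT ... WHERE ... GROUP BY statement, which you can confirm by right-clicking the last applied step and choosing View Native Query:

// Hypothetical foldable query against SQL Server
let
    Source = Sql.Database("myserver", "FinanceDB"),
    GL = Source{[Schema = "dbo", Item = "GL"]}[Data],
    // Both steps below fold: the filter becomes a WHERE clause and
    // the group becomes a GROUP BY, so the server does the work.
    Filtered = Table.SelectRows(GL, each [Account] <> 228 and [Account] <> 328),
    Grouped = Table.Group(
        Filtered,
        {"Company", "Project"},
        {{"Total Amount", each List.Sum([Amount]), type number}}
    )
in
    Grouped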
Difficult for me to offer advice around removing groups and references without fully understanding your original data and required outputs, although I would say that I've rarely, if ever, had to duplicate a source query in order to get the results I need. I'm assuming that you're sending the queries to Power BI here, not to Excel.
Happy to have a look though if you're able to share your original (anonymised) data and explain what kind of outputs you need.
Pete
Proud to be a Datanaut!
Hi Pete,
Thanks for sharing the knowledge about query folding. Very interesting to learn about.
File linked: Sample.xlsx
(The numbers are faked. I've wiped out most of the text and deleted unnecessary columns for confidentiality reasons. Would these also affect loading speed?)
The final output I'd like to get in PQ is something like this:
Company | Category | Cost Type | Project | Calculated Amount
The final output I'd like in the PBI report is something like this:
rows: Company > Category > Cost Type; columns: Project; values: Calculated Amount.
My current process in PQ: I perform several filters and aggregations for each referenced table. For example,
for one referenced table: exclude Accounts 228 & 328 >> exclude "No" in Payment type >> group by Company, Cost Type, Project >> add a Category column filled with the value "Cate1" (sketched below).
I follow a similar process for the other 10-ish tables, then append them so I get only one big table to load into Power BI.
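A rough M sketch of one of these referenced tables plus the final append, using the column names and values from the example above; the base query name GL and everything else are assumptions:

// Referenced table for category "Cate1"; GL is the base query
let
    Source = GL,
    RemovedAccounts = Table.SelectRows(Source, each not List.Contains({228, 328}, [Account])),
    RemovedNoPayment = Table.SelectRows(RemovedAccounts, each [Payment type] <> "No"),
    Grouped = Table.Group(
        RemovedNoPayment,
        {"Company", "Cost Type", "Project"},
        {{"Calculated Amount", each List.Sum([Amount]), type number}}
    ),
    AddedCategory = Table.AddColumn(Grouped, "Category", each "Cate1", type text)
in
    AddedCategory

// Final step: append the ~10 category tables into one table for Power BI
// Combined = Table.Combine({Cate1, Cate2, Cate3 /* ...and so on */})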
It takes a long time to load from PQ to PBI, but I'm doing this just so the user side is fast. (That's also something I was told: do all the necessary calculation in PQ; the loading can be done at night, so no one is affected.) But I'm really curious to learn about performance and to see whether there could be any improvement.
@Anonymous ,
Ok, so I'm assuming 'Cate2' might be something like > exclude [Account] 290 & 301, exclude [Gross Type] NR, exclude [Department] NA etc.?
Have you ever used Tabular Editor? You could really easily create a calculation group with each of your categories defined as different calculation items and just use a single SUM(table[Amount]) measure to visualise in a matrix visual.
I'll try and put an example PBIX together for you, but it's POETS day here today, so may have to wait until Monday.
Pete
Proud to be a Datanaut!
@Anonymous ,
Try out the attached PBIX.
I've quickly created two categories using Tabular Editor.
Cat1 defined as per your example. Cat2 defined as per my example.
Chucked it all in a matrix.
Obviously you'll need to reconnect to the same example file in PQ at your end.
Pete
Info on calculation groups:
https://p3adaptive.com/2021/03/calculation-groups-to-the-rescue/
Direct link to Tabular Editor 2 - Portable edition (if you can't install on your computer) ZIP file. Just extract to a folder and run EXE:
https://github.com/TabularEditor/TabularEditor/releases/download/2.16.5/TabularEditor.Portable.zip
Calculation item formulas used:
// Cat 1
CALCULATE (
    SELECTEDMEASURE (),
    NOT verayanTable[Account] IN { 228, 328 },
    verayanTable[Payment type] <> "No"
)

// Cat 2
CALCULATE (
    SELECTEDMEASURE (),
    NOT verayanTable[Account] IN { 290, 301 },
    verayanTable[Gross type] <> "NR",
    verayanTable[Department] <> "NA"
)
Proud to be a Datanaut!
Hi Pete,
First question: yes. Btw, the fields (other than Company, Account, and Amount) are VLOOKUP-style lookups from other tables (List.PositionsOf and search by index); not sure if this would also affect speed.
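For what it's worth, a row-wise lookup built on List.PositionsOf is re-evaluated for every row and can get slow as tables grow; a merge is often faster (though, per Pete's earlier point, relating the tables in the data model may be better still). A hedged sketch, with all table and column names assumed:

// Merge-based lookup: join GL to a lookup table on Account,
// then expand just the column you need.
let
    Merged = Table.NestedJoin(GL, {"Account"}, LookupTable, {"Account"}, "Lookup", JoinKind.LeftOuter),
    Expanded = Table.ExpandTableColumn(Merged, "Lookup", {"Category"})
in
    Expanded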
I will look into Tabular Editor if I'm allowed to install it on my work PC; it sounds powerful. And of course, no hurry - you've been helping a lot. Appreciated, and have a good weekend. Cheers!