This time we’re going bigger than ever. Fabric, Power BI, SQL, AI and more. We're covering it all. You won't want to miss it.
Learn moreDid you hear? There's a new SQL AI Developer certification (DP-800). Start preparing now and be one of the first to get certified. Register now
I’m trying to better understand what really drives CU consumption in Dataflows Gen2, because my observations don’t fully match my expectations.
I have two Dataflows Gen2, both orchestrated via a pipeline:
Dimensions dataflow
Loads 19 tables into a Lakehouse
All tables come from one Excel file on SharePoint, each from a different sheet
Transformations: only Select Columns
Runtime: ~4 minutes
Facts dataflow
Loads 8 tables
Heavy transformations:
Multiple pivot / unpivot
Merges
Two tables have >15M rows
Runtime: ~10 minutes
Intuitively, I expected the facts dataflow to consume significantly more CUs due to:
Much larger row counts
More complex transformations
However, In practice, the dimensions dataflow actually consumes more CUs than the facts dataflow.
What are the main factors that actually influence CU consumption in Dataflows Gen2?
Solved! Go to Solution.
Hi @mrbartuss,
Yes, exactly and the saving is actually proportional to your table count.
Without query referencing (no "Reference" step):
Each of your 19 output tables would independently evaluate its own query all the way to the source. That means 19 separate HTTP requests to SharePoint + 19 full Excel file parses through the Mashup engine. Each one is billed at 12 CU/s for up to 10 minutes.
With query referencing (your current setup) :
The SharePoint read happens once (the base "Source" query), and all 19 output tables branch off that single evaluation. Only the filtering/column selection steps run 19 times which are near-instant in-memory operations.
So in your Dimensions dataflow specifically, query referencing could be saving you up to ~18× the cost of a full source re-read. You're already doing the right thing there.
The remaining optimization lever (point 1) would remove the SharePoint dependency entirely once the file is in OneLake as a Delta table, even that single source read becomes a local Delta scan instead of an HTTP call to SharePoint.
Hi @mrbartuss
We have not received a response from you regarding the query and were following up to check if you had the opportunity to review the information provided. Please feel free to contact us if you have any further questions.
Thank You.
Hi @mrbartuss
Thank you for posting your query in the Microsoft Fabric Community Forum, and thanks to @Tamanchu for sharing valuable insights.
Could you please confirm if your query has been resolved by the provided solution?If you have any more questions, please let us know and we’ll be happy to help.
Regards,
Microsoft Fabric Community Support Team
Hi @mrbartuss,
Great question and your observation is actually expected behavior once you understand how Dataflow Gen2 billing works under the hood.
CU is charged per query, not per dataflow
According to the official Dataflow Gen2 pricing docs, there are three separate billing meters:
The key insight: Standard Compute is billed per output query every output table runs as a separate Mashup engine evaluation. For CI/CD dataflows, the rate is :
Why your Dimensions dataflow costs more
More tables + shorter runtime = more CU than fewer tables + longer runtime. Counterintuitive, but it follows directly from the per-query billing model.
Optimization ideas
Hope this clarifies the billing model!
Thanks for the super‑clear explanation.
Given that, what would you say is the recommended alternative to reading Excel files directly from SharePoint in this setup?
Hi @mrbartuss,
Given your setup, the two most impactful alternatives:
1. Pre-land Excel files into OneLake before the Dataflow runs
Add a Copy activity in your pipeline (before the Dataflow step) to copy the Excel files from SharePoint into a Lakehouse. Then point your Dataflow to the Lakehouse tables instead of SharePoint directly.
Benefits:
→ Docs: Copy activity in Data Factory pipelines Microsoft Learn
2. Use query referencing inside the Dataflow (critical for your Dimensions case)
Since your 19 Dimensions tables all come from the same Excel file, the Mashup Engine is fetching that file 19 times once per output table. Instead, create a single base query that loads the Excel file, then derive your 19 table queries from it by referencing that base query. SharePoint is then hit only once.
Base query → [Excel file from SharePoint]
→ Docs: Query folding and referencing in Dataflow Gen2 Microsoft Learn
In short: for your Dimensions dataflow, query referencing alone should drastically cut your CU consumption without changing your architecture. For Facts, the pipeline Copy → Lakehouse → Dataflow pattern is the cleanest long-term approach.
1. I'll test this approach
2. That's exactly my set up right now - I'm already using 'Reference query'. What you are saying is that if I did not do it this way, even move CUd were consumed?
Hi @mrbartuss,
Yes, exactly and the saving is actually proportional to your table count.
Without query referencing (no "Reference" step):
Each of your 19 output tables would independently evaluate its own query all the way to the source. That means 19 separate HTTP requests to SharePoint + 19 full Excel file parses through the Mashup engine. Each one is billed at 12 CU/s for up to 10 minutes.
With query referencing (your current setup) :
The SharePoint read happens once (the base "Source" query), and all 19 output tables branch off that single evaluation. Only the filtering/column selection steps run 19 times which are near-instant in-memory operations.
So in your Dimensions dataflow specifically, query referencing could be saving you up to ~18× the cost of a full source re-read. You're already doing the right thing there.
The remaining optimization lever (point 1) would remove the SharePoint dependency entirely once the file is in OneLake as a Delta table, even that single source read becomes a local Delta scan instead of an HTTP call to SharePoint.
Check out the April 2026 Fabric update to learn about new features.
Sign up to receive a private message when registration opens and key events begin.