AstridM
Helper I

Loading a database on Fabric is consuming a lot of capacity

Good morning,

I have an F64 capacity, and we are trying to migrate from serverless to Fabric.

We have a pipeline that loads around 300 GB of data from a lakehouse to a database in Fabric, and every time it runs it consumes most of my capacity. It doesn't make sense that loading a database takes so much capacity, and the capacity is consumed by the database, not the pipeline. What could I be doing wrong?

All we do is truncate and insert from a lakehouse.

Thanks

1 ACCEPTED SOLUTION
Aala_Ali
Advocate IV

Hi @AstridM ,

You're absolutely right to question the high capacity usage during your Fabric DB load. Here's a breakdown of what could be happening and how to optimize it:

 What Might Be Going Wrong

  • Truncate + Insert Pattern: Every full reload can trigger costly internal operations like table optimization (V-Order), statistics recalculation, and index maintenance.

  • Warehouse Handles the Load: Even though you're using a pipeline, the heavy capacity hit comes from the Warehouse, not the pipeline activity.

  • No Batching: Loading 300 GB in one operation causes spikes in memory/compute usage and longer lock times.

  • Auto V-Order Optimization: Fabric automatically re-optimizes the table layout unless you control it manually.


 Optimization Suggestions

  1. Avoid Truncating Large Tables

    • Use MERGE, UPSERT, or a staging table with a swap pattern to reduce overhead.

  2. Batch Your Loads

    • Break the data into smaller chunks (e.g., by date) and load in parallel or sequential batches.

  3. Disable and Manually Trigger V-Order

    sql

    ALTER TABLE your_table SET (enable_vorder = false);
    -- Load data
    ALTER TABLE your_table SET (enable_vorder = true);
    OPTIMIZE your_table;

    Note: Only disable V-Order if you plan to optimize manually after the load.

  4. Use Staging Tables

    • Load into a temporary/staging table first, then switch or insert into the main table for better performance control (see the sketch right after this list).

  5. Monitor with Query Insights

    • Use the queryinsights schema or the Fabric Capacity Metrics app to pinpoint which operations are using the most capacity (an example query follows this list).
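To make items 1, 2 and 4 concrete, here is a minimal T-SQL sketch of a batched, staging-table load. All names are hypothetical (a LakehouseDB source, a dbo.Sales target, a dbo.Sales_stage staging table, an OrderDate column to batch on), and it assumes the warehouse can reach the lakehouse table by three-part name; if your pipeline uses a Copy activity instead, the same idea applies, just split the copy by date range.

sql

-- Hypothetical names throughout: LakehouseDB (source), dbo.Sales (live table),
-- dbo.Sales_stage (staging), OrderDate (column used to batch the load).

-- 1) Create an empty staging table with the same shape as the source.
CREATE TABLE dbo.Sales_stage AS
SELECT * FROM LakehouseDB.dbo.Sales WHERE 1 = 0;

-- 2) Load the staging table in smaller, date-bounded batches instead of one 300 GB insert.
--    Repeat (or parameterize in the pipeline) for each date range.
INSERT INTO dbo.Sales_stage
SELECT * FROM LakehouseDB.dbo.Sales
WHERE OrderDate >= '2025-01-01' AND OrderDate < '2025-02-01';

-- 3) Swap staging in as the live table once every batch has landed.
--    If renaming tables isn't supported on your target, insert from staging
--    into the live table here instead of renaming.
DROP TABLE dbo.Sales;
EXEC sp_rename 'dbo.Sales_stage', 'Sales';

The point of the swap is that the live table is only unavailable for the brief rename at the end, rather than sitting empty for the whole load as it does with truncate-and-insert.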
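And for item 5, a starting-point query against the queryinsights views on the warehouse side. The exec_requests_history view and the columns used here are the documented query history for Fabric warehouses, but check what your target actually exposes before relying on it:

sql

-- Most expensive statements over the last six hours, longest-running first.
SELECT TOP 20
    start_time,
    total_elapsed_time_ms,
    row_count,
    status,
    LEFT(command, 200) AS command_preview
FROM queryinsights.exec_requests_history
WHERE start_time >= DATEADD(HOUR, -6, SYSUTCDATETIME())
ORDER BY total_elapsed_time_ms DESC;

Cross-checking the worst offenders here against the interactive vs. background breakdown in the Fabric Capacity Metrics app can show whether the CUs are going to the inserts themselves or to post-load maintenance.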

Final Tips

  • Try loading a smaller sample (e.g., 10 GB) to see where the spike begins.

  • Run your heavy loads during off-peak hours if using shared capacity.

  • Consider temporarily upgrading capacity (e.g., to F128) if loading large datasets regularly.


If my response resolved your query, kindly mark it as the Accepted Solution to assist others. Additionally, I would be grateful for a 'Kudos' if you found my response helpful 🙂.


5 REPLIES
JasonFife
New Member

Loading 300 GB into an Azure SQL Database isn't a problem, and the cost is reasonable. Doing the same load into a Fabric Database shouldn't cost exponentially more, should it?

This is all nice and dandy, but I have a huge complaint here. Upgrading to F128, or for that matter even affording F64, is simply not possible for a lot of SMEs. The cash flow of an SME is not necessarily correlated with its data and compute needs, so you might have users with poor cash flow who need F64 or higher SKUs but can't afford pricing that only a JP Morgan or a Stellantis can. At that point, what choice is left for these SMEs?

Completely agree with you. As we are on a deadline, I will push everything to Synapse and rethink the strategy for how this is going to work. I would like it if we had a better understanding of how capacity is consumed and if the reporting were a bit better. What I can say is that 300 GB is not that much; this is supposed to be a platform for business analysis and a bunch of other things.

You are right, 300 GB doesn't even qualify as big data. What irks me is that, looking at my Fabric Capacity Metrics app, a lot of CUs are consumed by some data warehouse even though I did not spin up any such warehouse! A lot of hidden costs lurk in the shadows even when using only a lakehouse, a few notebooks and pipelines, and one semantic model and report.

 

And it is far from clear whether all this CU reporting is actually correct. Let's just say end users are justified in having doubts, given the long list of issues that plagues the whole platform.
