bw_chec
Frequent Visitor

Fabric Billable Storage After Vacuum

I have been working in my Fabric capacity for 2 months now, on a fairly small (10 GB) dataset that is overwritten every night (due to limitations of the source database).

I ran a maintenance script yesterday that went through my delta tables, optimised them, and vacuumed them with a 7-day retention threshold.
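(For anyone curious what such a maintenance pass looks like, here is a minimal sketch, assuming a Fabric PySpark notebook with a default lakehouse attached and that every table in it is a delta table; the 168-hour value corresponds to the 7-day retention mentioned above.)

```python
# Minimal sketch of an OPTIMIZE + VACUUM maintenance pass over the
# delta tables in the default lakehouse (Fabric PySpark notebook).
from pyspark.sql import SparkSession

# In a Fabric notebook `spark` already exists; getOrCreate() reuses it.
spark = SparkSession.builder.getOrCreate()

# Assumes every catalogued table is a delta table.
for table in spark.catalog.listTables():
    # Compact small parquet files into fewer, larger ones.
    spark.sql(f"OPTIMIZE {table.name}")
    # Physically remove files unreferenced for longer than the retention
    # threshold (168 hours = 7 days). This deletes at the table level;
    # billable OneLake storage can still lag, as discussed below.
    spark.sql(f"VACUUM {table.name} RETAIN 168 HOURS")
```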

 

Today I looked at the current/billable storage metrics in the Metrics App, and they haven't gone down, despite ~100-150 GB of data being cleared yesterday.

 

Any thoughts on why? I'd like to understand the Metrics App better, and to learn best practices for keeping storage costs down.

 

Best,

BW

1 ACCEPTED SOLUTION

Perhaps it's the soft delete.

 

Check again after 7 days; perhaps the storage will have dropped by then.

 

https://learn.microsoft.com/en-us/fabric/onelake/onelake-disaster-recovery#soft-delete-for-onelake-f...


6 REPLIES
v-jingzhan-msft
Community Support

Hi @bw_chec 

 

Here are some potential reasons: 

  • Sometimes, the Metrics App might not update immediately. It could take some time for the changes to reflect in the storage metrics. You might check again after a few hours or the next day to see if the metrics have updated.
  • There might be background processes or operations that are still holding onto the storage space; for example, certain operations might not release storage immediately, even after vacuuming. It's worth checking whether any ongoing background processes could be affecting the storage metrics, and confirming at the table level that the vacuum actually deleted files (see the sketch after this list).
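As a quick way to confirm the vacuum itself worked, you could inspect the table's history: VACUUM writes START/END entries to the delta log, and the END entry records how many files were physically deleted. A minimal sketch (the table name my_table is illustrative):

```python
# Sketch: check at the table level whether VACUUM actually deleted files,
# independently of what the Metrics App reports. `my_table` is illustrative.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

(spark.sql("DESCRIBE HISTORY my_table")
    .filter("operation IN ('VACUUM START', 'VACUUM END')")
    .select("timestamp", "operation", "operationMetrics")
    .show(truncate=False))
# The VACUUM END row's operationMetrics include the number of files
# that were physically removed.
```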

 

Best Practices for Storage Cost Management:

  • Regular Maintenance: Continue to optimize and vacuum your delta tables regularly. This helps in keeping the storage usage efficient (see the retention sketch after this list).
  • Monitor Usage Trends: Use the Metrics App to monitor usage trends and identify any unusual spikes or patterns in storage consumption. This can help in pinpointing areas that need attention.
  • Optimize Data Storage: Ensure that your data storage is optimized by removing unnecessary data and compressing data where possible. This can significantly reduce storage costs. 
  • Review and Adjust Capacity: Regularly review your capacity usage and adjust it based on your needs. This can help in avoiding over-provisioning and reducing costs.
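On the Regular Maintenance point above: how quickly VACUUM can reclaim space is governed by the table's deleted-file retention window. If you want that window pinned explicitly rather than left at the default, Delta Lake exposes it as a table property. A minimal sketch (my_table is illustrative; don't go below 7 days without understanding the impact on time travel and concurrent readers):

```python
# Sketch: pin the Delta deleted-file retention window explicitly, so
# VACUUM can reclaim space after 7 days. `my_table` is illustrative.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

spark.sql("""
    ALTER TABLE my_table
    SET TBLPROPERTIES ('delta.deletedFileRetentionDuration' = 'interval 7 days')
""")
```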

Reference: OneLake capacity consumption example - Microsoft Fabric | Microsoft Learn

 

Hope this is helpful!

 

Best Regards,
Jing
If this post helps, please Accept it as Solution to help other members find it. Appreciate your Kudos!

Hi @v-jingzhan-msft ,

 

Thanks for responding.

 

It has been 2 days, and the storage still hasn't gone down in the Metrics App, even though I can see that a load of parquet files have been deleted.

How can I find which background processes are holding onto the data?

 

When I look at the Utilisation graph, I can see the CU % is constantly sitting at >50% for background operations.

When I explore these timepoints, it says the processes running are notebook/pipeline runs that I had run hours before?

 

[Screenshot: Metrics App Utilisation graph]

Thanks,

Ben

Hi @bw_chec 

 

It has been more than 7 days now. Do you see the storage going down?

 

For the CU % usage line, it's probably because Fabric smooths the CU usage of background operations that have long runtimes and consume heavy CU loads. You can learn more from the documentation: Understand your Fabric capacity throttling - Microsoft Fabric | Microsoft Learn

 

Balance between performance and reliability

Fabric is designed to deliver lightning-fast performance to its customers by allowing operations to access more capacity unit (CU) resources than are allocated to the capacity. Tasks that might take several minutes to complete on other platforms can be finished in mere seconds on Fabric. To avoid penalizing users when operational loads surge, Fabric smooths or averages the CU usage of an operation over a minimum of five minutes, and even longer for high CU usage but short runtime requests. This behavior ensures you can enjoy consistently fast performance without experiencing throttling.

For background operations that have long runtimes and consume heavy CU loads, Fabric smooths their CU usage over a 24-hour period. Smoothing eliminates the need for data scientists and database administrators to spend time creating job schedules to spread CU load across the day to prevent accounts from freezing. With 24-hour CU smoothing, scheduled jobs can all run simultaneously without causing any spikes at any time during the day, and you can enjoy consistently fast performance without wasting time managing job schedules.
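To make the 24-hour smoothing concrete, here is a small back-of-the-envelope sketch; the capacity size and CU-seconds are made-up numbers, not values from the Metrics App. It shows why a notebook or pipeline run that finished hours ago still contributes a steady slice of background CU %:

```python
# Sketch: why a finished background job still shows up in the CU % graph.
# All numbers below are illustrative.

capacity_cus = 8                    # e.g. an F8 capacity provides 8 CUs per second
job_cu_seconds = 120_000            # total CU-seconds consumed by one heavy run
smoothing_window_s = 24 * 60 * 60   # background usage is smoothed over 24 hours

# The job's consumption is spread evenly over the smoothing window, so it
# keeps contributing to background CU % for a full day after it completes.
smoothed_cu_per_s = job_cu_seconds / smoothing_window_s
background_pct = 100 * smoothed_cu_per_s / capacity_cus

print(f"{smoothed_cu_per_s:.2f} CU/s ~= {background_pct:.1f}% of capacity, for 24 hours")
# -> 1.39 CU/s ~= 17.4% of capacity, for 24 hours
```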

 

Best Regards,
Jing

Hi,

 

Yes, the storage went down after around 7 days. Thanks for your help!


From the soft delete documentation: "The current default is 28 days but starting May 2024 we are transitioning to a 7-day default retention period."

Gives a whole new meaning to "current"...
