I have three pipelines and three lakehouses. The first pipeline loads only incremental data from Dataverse into the Fabric bronze layer and overwrites the bronze store with just the modified rows, around 2,000 to 3,000 per run. The second does an upsert from bronze to silver with only the incremental data; silver holds around 80,000 to 90,000 rows and only the modified rows are upserted. The third does the same for gold. I have only about 1 to 2 GB of actual data, but Azure Storage Explorer is showing around 16 GB.
Thank you for reaching out to the Microsoft Fabric Forum Community.
@BhaveshPatel @tayloramy Thanks for the input; it is useful.
In addition to their points: even with VACUUM at 0 retention, the higher storage usage can still be normal in your scenario. Since your pipelines run 12 times a day across Bronze, Silver, and Gold, every MERGE/upsert creates new Delta/Parquet files, and over time this can lead to lots of small active files. VACUUM helps clean up old unused files, but it doesn't combine or shrink the active ones.
Also, Azure Storage Explorer shows the total physical storage being used, not just the actual current table data. That includes active data files, Delta logs, temporary/staging files, and storage across all three lakehouses. So the 16 GB vs 1–2 GB difference is likely due to a mix of frequent small file creation, metadata overhead, and storage across the full pipeline not just old retained data.
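To see where the bytes actually sit, one option is to total file sizes by rough category. The following is a minimal sketch, assuming the lakehouse `Tables` folder is reachable as an ordinary file path (for example through a notebook mount); `storage_breakdown` is a hypothetical helper, not a Fabric API. It cannot tell active Parquet files from unreferenced ones, since that distinction lives in the Delta log rather than in the file system.

```python
import os
from collections import defaultdict


def storage_breakdown(root):
    """Sum file sizes (bytes) under root, grouped into rough categories."""
    totals = defaultdict(int)
    for dirpath, _dirnames, filenames in os.walk(root):
        # Anything under a _delta_log folder is transaction-log metadata,
        # including checkpoint files that happen to be Parquet.
        in_log = "_delta_log" in dirpath.split(os.sep)
        for name in filenames:
            size = os.path.getsize(os.path.join(dirpath, name))
            if in_log:
                totals["delta_log"] += size
            elif name.endswith(".parquet"):
                # Active and unreferenced data files both land here.
                totals["parquet"] += size
            else:
                totals["other"] += size
    return dict(totals)
```

Running it once per lakehouse (the path `/lakehouse/default/Tables` is an assumption about the mount point) and adding up the three results gives a number that is comparable to what Storage Explorer reports.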
If this does not match what you expected, please let us know and we will be happy to help.
Thanks.
Hi,
Thanks for the response.
So what is the solution? Is there any way to reduce the number of small files, or can we delete these files the way VACUUM does?
Hi @shirishmathanka
As far as I know, VACUUM only removes obsolete/unreferenced files; it does not compact active files. Frequent MERGE/upserts create many small active files across Bronze, Silver, and Gold, which is why storage shows more than the actual data.
To reduce storage and improve performance, run OPTIMIZE to compact small files, and consider batching updates or reducing write frequency. Note that OPTIMIZE writes new, larger files and leaves the old small ones unreferenced, so the space is only freed once a later VACUUM removes them.
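The compact-then-clean order can be sketched as a small helper that builds the Delta SQL statements for a list of tables. This is an illustration only: `maintenance_statements` and the table names are hypothetical, and it assumes the statements are later executed one by one in a Spark notebook with `spark.sql(stmt)`. In open-source Delta Lake, a retention below 168 hours is rejected unless `spark.databricks.delta.retentionDurationCheck.enabled` is set to `false`, and very short retention can break readers or writers that are still using older files.

```python
def maintenance_statements(tables, retain_hours=168):
    """Return OPTIMIZE followed by VACUUM statements for each table name."""
    if retain_hours < 0:
        raise ValueError("retain_hours must be non-negative")
    statements = []
    for table in tables:
        # Compact small files first; the old small files become unreferenced.
        statements.append(f"OPTIMIZE {table}")
        # Then delete unreferenced files older than the retention window.
        statements.append(f"VACUUM {table} RETAIN {retain_hours} HOURS")
    return statements


if __name__ == "__main__":
    # Hypothetical table names; in a notebook each line would go to spark.sql().
    for stmt in maintenance_statements(["silver_accounts", "gold_accounts"]):
        print(stmt)
```

Running OPTIMIZE before VACUUM matters here: in the opposite order, the small files that OPTIMIZE replaces would stay on disk until the next VACUUM run.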
I have included the official Microsoft documentation for your review.
Lakehouse and Delta Tables - Microsoft Fabric | Microsoft Learn
Remove unused data files with vacuum - Azure Databricks | Microsoft Learn
If this does not match what you expected, please let us know and we will be happy to help.
Thanks.
Thank you for reaching out to the Microsoft Fabric Forum Community.
I hope the information provided was helpful. If you still have questions, please don't hesitate to reach out to the community.
Hi @shirishmathanka,
To add to what @BhaveshPatel said, when you change a table in a lakehouse/warehouse, the old data is not removed; it just becomes unreferenced.
So if you only change 3 rows, the old parquet file that had all the rows still exists, and a new parquet file with the modifications is created.
This is useful for the time travel feature.
Using the VACUUM command deletes unreferenced files, which frees up storage space but means you can no longer time travel back to those versions.
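The effect of this rewrite-on-change behaviour can be illustrated with a toy model. It is a deliberate simplification and not how Delta Lake is implemented: `ToyDeltaTable` is a hypothetical class, every file holds the same number of rows, and an upsert simply rewrites a given number of whole files. The figures (10 files of 9,000 rows, 4 files touched per run, 12 runs) are assumptions chosen to resemble the scenario in this thread.

```python
class ToyDeltaTable:
    """Toy model: changing a file writes a new copy and keeps the old one."""

    def __init__(self, rows_per_file, num_files):
        self.active = {i: rows_per_file for i in range(num_files)}
        self.unreferenced = {}
        self._next_id = num_files

    def upsert(self, files_touched):
        """Rewrite the oldest active files; old copies become unreferenced."""
        for file_id in sorted(self.active)[:files_touched]:
            rows = self.active.pop(file_id)
            self.unreferenced[file_id] = rows   # kept for time travel
            self.active[self._next_id] = rows   # new copy with the changes
            self._next_id += 1

    def vacuum(self):
        """Delete all unreferenced files (retention of zero)."""
        self.unreferenced.clear()

    def live_rows(self):
        return sum(self.active.values())

    def stored_rows(self):
        return self.live_rows() + sum(self.unreferenced.values())


if __name__ == "__main__":
    table = ToyDeltaTable(rows_per_file=9000, num_files=10)
    for _ in range(12):              # 12 pipeline runs between vacuums
        table.upsert(files_touched=4)
    print(table.live_rows(), table.stored_rows())
    table.vacuum()
    print(table.live_rows(), table.stored_rows())
```

In this model the stored volume grows to several times the live volume between vacuums even though the live row count never changes, and it drops back only when `vacuum` runs.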
Hi @tayloramy and @BhaveshPatel, thanks for the response.
I am running VACUUM once a day with 0 retention, and my pipeline runs 12 times a day, once every 2 hours.
You should use Python notebooks. They make the bronze --> silver --> gold layers easy to work with. Run OPTIMIZE to compact small files, and run VACUUM as well to remove the unreferenced ones. That way you can save a lot on storage costs.