<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Storage increasing daily in Data Engineering</title>
    <link>https://community.fabric.microsoft.com/t5/Data-Engineering/Storage-increasing-daily/m-p/5185362#M16282</link>
    <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.fabric.microsoft.com/t5/user/viewprofilepage/user-id/1417330"&gt;@shirishmathanka&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thank you for reaching out to the Microsoft Fabric Forum Community.&lt;/P&gt;
&lt;P&gt;&lt;a href="https://community.fabric.microsoft.com/t5/user/viewprofilepage/user-id/10288"&gt;@BhaveshPatel&lt;/a&gt;&amp;nbsp;&lt;a href="https://community.fabric.microsoft.com/t5/user/viewprofilepage/user-id/1340679"&gt;@tayloramy&lt;/a&gt;&amp;nbsp;Thanks for your inputs; they are useful.&lt;BR /&gt;&lt;BR /&gt;In addition to the points above: even with VACUUM at 0 retention, higher storage usage can still be normal in your scenario. Since your pipelines run 12 times a day across Bronze, Silver, and Gold, every MERGE/upsert creates new Delta/Parquet files, and over time this can lead to lots of small active files. VACUUM helps clean up old unused files, but it doesn’t combine or shrink the active ones.&lt;/P&gt;
&lt;P&gt;Also, Azure Storage Explorer shows the total physical storage being used, not just the current table data. That includes active data files, Delta logs, temporary/staging files, and storage across all three lakehouses. So the 16 GB vs 1–2 GB difference is likely due to a mix of frequent small file creation, metadata overhead, and storage across the full pipeline, not just old retained data.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If anything here differs from what you expected, please let us know; we are happy to help.&lt;/P&gt;
&lt;P&gt;Thanks.&lt;/P&gt;</description>
    <pubDate>Tue, 19 May 2026 11:14:40 GMT</pubDate>
    <dc:creator>v-priyankata</dc:creator>
    <dc:date>2026-05-19T11:14:40Z</dc:date>
    <item>
      <title>Storage increasing daily</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/Storage-increasing-daily/m-p/5182457#M16219</link>
      <description>&lt;P&gt;I have three pipelines and three lakehouses. The first pipeline loads only incremental data from Dataverse into the Fabric bronze layer and overwrites bronze, so bronze stores only the modified rows (around 2,000 to 3,000). The second upserts only the incremental data from bronze into silver, which has around 80,000 to 90,000 rows. The third does the same for gold, upserting only the modified data. My actual data is only about 1 to 2 GB, but Azure Storage Explorer shows around 16 GB.&lt;/P&gt;</description>
      <pubDate>Thu, 14 May 2026 07:28:38 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/Storage-increasing-daily/m-p/5182457#M16219</guid>
      <dc:creator>shirishmathanka</dc:creator>
      <dc:date>2026-05-14T07:28:38Z</dc:date>
    </item>
    <item>
      <title>Re: Storage increasing daily</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/Storage-increasing-daily/m-p/5182500#M16220</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.fabric.microsoft.com/t5/user/viewprofilepage/user-id/1417330"&gt;@shirishmathanka&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You should use Python notebooks. They make it easy to work across the bronze --&amp;gt; silver --&amp;gt; gold layers. You should run OPTIMIZE to compact the table files, and VACUUM as well. VACUUM removes old files, which can save a lot on storage costs.&lt;/P&gt;</description>
      <pubDate>Thu, 14 May 2026 08:17:44 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/Storage-increasing-daily/m-p/5182500#M16220</guid>
      <dc:creator>BhaveshPatel</dc:creator>
      <dc:date>2026-05-14T08:17:44Z</dc:date>
    </item>
    <item>
      <title>Re: Storage increasing daily</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/Storage-increasing-daily/m-p/5182834#M16221</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.fabric.microsoft.com/t5/user/viewprofilepage/user-id/1417330"&gt;@shirishmathanka&lt;/a&gt;,&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;To add to what&amp;nbsp;&lt;a href="https://community.fabric.microsoft.com/t5/user/viewprofilepage/user-id/10288"&gt;@BhaveshPatel&lt;/a&gt;&amp;nbsp;said, when you change a table in a lakehouse/warehouse, the old data is not removed; it is just no longer referenced by the current table version.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;So if you only change 3 rows, the old Parquet file that had all the rows still exists, and a new Parquet file with the modifications is created.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;This is useful for the time travel feature.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
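&lt;P&gt;A minimal sketch of the behaviour described above, assuming a simplified Delta transaction log in which each line is a JSON add or remove action. The file names and sizes are hypothetical, and checkpoints and other action types are ignored. It shows how a rewritten file stays on disk while no longer being part of the current table version.&lt;/P&gt;
&lt;PRE&gt;
```python
import json


def replay_delta_log(commit_lines):
    """Replay simplified Delta log actions and return the active files.

    Each line is one JSON action. Only "add" and "remove" are handled;
    any other action type is skipped.
    """
    active = {}  # path -> size in bytes for files the current version references
    for line in commit_lines:
        action = json.loads(line)
        if "add" in action:
            active[action["add"]["path"]] = action["add"]["size"]
        elif "remove" in action:
            active.pop(action["remove"]["path"], None)
    return active


# Hypothetical log: version 0 writes one file, version 1 rewrites it
# after a small change to a few rows.
log = [
    '{"add": {"path": "part-0000.parquet", "size": 1000000}}',
    '{"remove": {"path": "part-0000.parquet"}}',
    '{"add": {"path": "part-0001.parquet", "size": 1000200}}',
]

# Hypothetical folder listing: both files are still physically present.
on_disk = {"part-0000.parquet": 1000000, "part-0001.parquet": 1000200}

active = replay_delta_log(log)
unreferenced = {p: s for p, s in on_disk.items() if p not in active}

print("active bytes:", sum(active.values()))
print("unreferenced bytes:", sum(unreferenced.values()))
```
&lt;/PRE&gt;
&lt;P&gt;In this toy case roughly half of the bytes on disk belong to a file the current version no longer references, which is the kind of file VACUUM targets.&lt;/P&gt;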
&lt;P&gt;Using the VACUUM command deletes unreferenced files that are older than the retention threshold. This frees up storage space, but it means you can no longer time travel back to those versions.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 14 May 2026 13:43:20 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/Storage-increasing-daily/m-p/5182834#M16221</guid>
      <dc:creator>tayloramy</dc:creator>
      <dc:date>2026-05-14T13:43:20Z</dc:date>
    </item>
    <item>
      <title>Re: Storage increasing daily</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/Storage-increasing-daily/m-p/5183340#M16228</link>
      <description>&lt;P&gt;Hi &lt;a href="https://community.fabric.microsoft.com/t5/user/viewprofilepage/user-id/1340679"&gt;@tayloramy&lt;/a&gt;&amp;nbsp;and&amp;nbsp;&lt;a href="https://community.fabric.microsoft.com/t5/user/viewprofilepage/user-id/10288"&gt;@BhaveshPatel&lt;/a&gt;,&amp;nbsp;thanks for the responses.&lt;BR /&gt;I am running VACUUM once a day with 0 retention, and my pipelines run 12 times a day, once every 2 hours.&lt;/P&gt;</description>
      <pubDate>Fri, 15 May 2026 07:25:44 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/Storage-increasing-daily/m-p/5183340#M16228</guid>
      <dc:creator>shirishmathanka</dc:creator>
      <dc:date>2026-05-15T07:25:44Z</dc:date>
    </item>
    <item>
      <title>Re: Storage increasing daily</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/Storage-increasing-daily/m-p/5185362#M16282</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.fabric.microsoft.com/t5/user/viewprofilepage/user-id/1417330"&gt;@shirishmathanka&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thank you for reaching out to the Microsoft Fabric Forum Community.&lt;/P&gt;
&lt;P&gt;&lt;a href="https://community.fabric.microsoft.com/t5/user/viewprofilepage/user-id/10288"&gt;@BhaveshPatel&lt;/a&gt;&amp;nbsp;&lt;a href="https://community.fabric.microsoft.com/t5/user/viewprofilepage/user-id/1340679"&gt;@tayloramy&lt;/a&gt;&amp;nbsp;Thanks for your inputs; they are useful.&lt;BR /&gt;&lt;BR /&gt;In addition to the points above: even with VACUUM at 0 retention, higher storage usage can still be normal in your scenario. Since your pipelines run 12 times a day across Bronze, Silver, and Gold, every MERGE/upsert creates new Delta/Parquet files, and over time this can lead to lots of small active files. VACUUM helps clean up old unused files, but it doesn’t combine or shrink the active ones.&lt;/P&gt;
&lt;P&gt;Also, Azure Storage Explorer shows the total physical storage being used, not just the current table data. That includes active data files, Delta logs, temporary/staging files, and storage across all three lakehouses. So the 16 GB vs 1–2 GB difference is likely due to a mix of frequent small file creation, metadata overhead, and storage across the full pipeline, not just old retained data.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
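&lt;P&gt;One way to see where the physical bytes go is to total a table folder by category. The sketch below is only an illustration: it assumes the table folder is reachable as an ordinary file system path, and the function name and the three categories are hypothetical choices, not a Fabric API. The Parquet total covers every Parquet file on disk, whether or not the current table version still references it.&lt;/P&gt;
&lt;PRE&gt;
```python
import os
from collections import defaultdict


def storage_breakdown(table_dir):
    """Sum the bytes under a Delta table folder, grouped by rough category."""
    totals = defaultdict(int)
    for root, _dirs, files in os.walk(table_dir):
        # True when this folder is _delta_log or sits somewhere beneath it.
        rel_parts = os.path.relpath(root, table_dir).split(os.sep)
        in_log = "_delta_log" in rel_parts
        for name in files:
            size = os.path.getsize(os.path.join(root, name))
            if in_log:
                totals["delta_log"] += size      # commit files and checkpoints
            elif name.endswith(".parquet"):
                totals["parquet_data"] += size   # active and unreferenced data files
            else:
                totals["other"] += size          # anything else in the folder
    return dict(totals)


# Example call with a hypothetical path:
# print(storage_breakdown("/path/to/Tables/my_table"))
```
&lt;/PRE&gt;
&lt;P&gt;Running this per table in each of the three lakehouses and adding the results gives a figure comparable to the total that a storage browser reports.&lt;/P&gt;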
&lt;P&gt;If anything here differs from what you expected, please let us know; we are happy to help.&lt;/P&gt;
&lt;P&gt;Thanks.&lt;/P&gt;</description>
      <pubDate>Tue, 19 May 2026 11:14:40 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/Storage-increasing-daily/m-p/5185362#M16282</guid>
      <dc:creator>v-priyankata</dc:creator>
      <dc:date>2026-05-19T11:14:40Z</dc:date>
    </item>
    <item>
      <title>Re: Storage increasing daily</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/Storage-increasing-daily/m-p/5185380#M16283</link>
      <description>&lt;P&gt;Hi,&lt;BR /&gt;Thanks for the response.&lt;/P&gt;&lt;P&gt;So what is the solution? Is there any way to reduce the number of small files, or can we delete these files the way VACUUM does?&lt;/P&gt;</description>
      <pubDate>Tue, 19 May 2026 11:38:28 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/Storage-increasing-daily/m-p/5185380#M16283</guid>
      <dc:creator>shirishmathanka</dc:creator>
      <dc:date>2026-05-19T11:38:28Z</dc:date>
    </item>
    <item>
      <title>Re: Storage increasing daily</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/Storage-increasing-daily/m-p/5186720#M16308</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.fabric.microsoft.com/t5/user/viewprofilepage/user-id/1417330"&gt;@shirishmathanka&lt;/a&gt;&amp;nbsp;&lt;BR /&gt;As far as I know,&amp;nbsp;VACUUM only removes obsolete/unreferenced files; it does not compact active files. Frequent MERGE/upserts create many small active files across Bronze, Silver, and Gold, which is why storage usage appears larger than the actual data.&lt;/P&gt;
&lt;P&gt;To reduce storage and improve performance, run OPTIMIZE to compact small files, and consider batching updates or reducing write frequency. Because OPTIMIZE writes new compacted files and leaves the original small files on disk as unreferenced, run VACUUM afterward to actually reclaim the space.&lt;BR /&gt;&lt;BR /&gt;I have included the official Microsoft documentation for your review.&lt;BR /&gt;&lt;A href="https://learn.microsoft.com/en-us/fabric/data-engineering/lakehouse-and-delta-tables" target="_blank"&gt;Lakehouse and Delta Tables - Microsoft Fabric | Microsoft Learn&lt;/A&gt;&lt;BR /&gt;&lt;A href="https://learn.microsoft.com/en-us/azure/databricks/delta/vacuum" target="_blank"&gt;Remove unused data files with vacuum - Azure Databricks | Microsoft Learn&lt;/A&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;
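&lt;P&gt;As a rough sketch of that maintenance step, the helper below builds the Delta SQL statements for each table: OPTIMIZE first, then VACUUM. The helper name and the table names are hypothetical, and the 168 hour default is simply the usual 7 day Delta retention. It only builds strings; in a Spark notebook each one would be passed to spark.sql, assuming a session named spark is available.&lt;/P&gt;
&lt;PRE&gt;
```python
def maintenance_statements(tables, retain_hours=168):
    """Build OPTIMIZE and VACUUM statements for each table, in that order."""
    statements = []
    for table in tables:
        # Compact many small active files into fewer, larger ones.
        statements.append(f"OPTIMIZE {table}")
        # Then delete unreferenced files older than the retention window,
        # which includes the small files that OPTIMIZE just replaced
        # once they age past that window.
        statements.append(f"VACUUM {table} RETAIN {retain_hours} HOURS")
    return statements


# Hypothetical table names, one per layer.
for stmt in maintenance_statements(["bronze_accounts", "silver_accounts", "gold_accounts"]):
    print(stmt)  # in a Spark notebook: spark.sql(stmt)
```
&lt;/PRE&gt;
&lt;P&gt;Retention windows shorter than the default generally require disabling Delta's retention safety check, and they remove the ability to time travel to the vacuumed versions.&lt;/P&gt;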
&lt;P&gt;If anything here differs from what you expected, please let us know; we are happy to help.&lt;/P&gt;
&lt;P&gt;Thanks.&lt;/P&gt;</description>
      <pubDate>Thu, 21 May 2026 07:22:01 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/Storage-increasing-daily/m-p/5186720#M16308</guid>
      <dc:creator>v-priyankata</dc:creator>
      <dc:date>2026-05-21T07:22:01Z</dc:date>
    </item>
    <item>
      <title>Re: Storage increasing daily</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/Storage-increasing-daily/m-p/5188308#M16359</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.fabric.microsoft.com/t5/user/viewprofilepage/user-id/1417330"&gt;@shirishmathanka&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thank you for reaching out to the Microsoft Fabric Forum Community.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I hope the information provided was helpful. If you still have questions, please don't hesitate to reach out to the community.&lt;/P&gt;</description>
      <pubDate>Mon, 25 May 2026 09:35:21 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/Storage-increasing-daily/m-p/5188308#M16359</guid>
      <dc:creator>v-priyankata</dc:creator>
      <dc:date>2026-05-25T09:35:21Z</dc:date>
    </item>
  </channel>
</rss>

