Reply
frithjof_v
Community Champion

Time travel - Not recommended as a long-term backup solution

Hi,

 

This article https://learn.microsoft.com/en-us/azure/databricks/delta/history says:

 

"Databricks does not recommend using Delta Lake table history as a long-term backup solution for data archival."

 

What are the reasons why it is not recommended to use table history as a long-term backup?

 

Time travel seems like a really convenient feature 😀 I would like to learn more about why it is not recommended as a long-term backup solution for data archival.

 

Thank you! 😀

1 ACCEPTED SOLUTION
Anonymous
Not applicable

Hi @frithjof_v,

Response from internal Team -

A primary reason for not recommending time travel for long-term archival is that the older versions of the data files are stored in the same storage location as the current data. As a result: 1. They are prone to human error, such as accidental deletions and table drops. 2. Maintaining large data volumes as older table versions is costly compared to alternative archival methods such as the Azure Storage cold tier.
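To illustrate why table history is fragile as a backup, here is a toy pure-Python model of a Delta table's storage directory (not Delta Lake code; all file names and ages are hypothetical). Current and historical data files share one location, and a VACUUM with the default 7-day retention permanently deletes the unreferenced historical files, so those versions can no longer be restored by time travel:

```python
# Toy model of a Delta table directory: every data file, current and
# historical, lives under the same storage path. Names and ages are
# hypothetical, chosen only to illustrate the retention behavior.
table_files = {
    "part-000.parquet":  {"age_days": 0,   "referenced_by_current": True},
    "part-old1.parquet": {"age_days": 30,  "referenced_by_current": False},
    "part-old2.parquet": {"age_days": 400, "referenced_by_current": False},
}

def vacuum(files, retention_days=7):
    """Simplified VACUUM rule: keep files referenced by the current table
    version, plus unreferenced files younger than the retention threshold.
    Everything else is deleted permanently."""
    return {name: meta for name, meta in files.items()
            if meta["referenced_by_current"] or meta["age_days"] < retention_days}

after = vacuum(table_files)
# Only the current file survives; both historical versions are gone,
# so a "backup" that relied on them has silently disappeared.
print(sorted(after))  # ['part-000.parquet']
```

Because history sits beside the live data, the same single operation (a VACUUM, a recursive delete, or a table drop) destroys both at once, which is exactly the failure mode a backup is supposed to protect against.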



Hope this is helpful.


4 REPLIES
Anonymous
Not applicable

Hi @frithjof_v ,

Thanks for using Fabric Community.
As I understand it, there are two main reasons why Databricks recommends a short retention period (7 days by default) for Delta Lake table history and advises against using it for long-term backups:

  1. Storage Costs: Every version of your data created through modifications to the Delta table is stored in the table history. This can quickly become expensive as data accumulates over time. Long-term backups would require storing a significant amount of historical data, leading to high storage costs.
  2. Performance Impact: Maintaining a large history can impact the performance of Delta Lake operations. Here's how:
    • VACUUM Operation: VACUUM is a process that cleans up old versions of files in Delta Lake to reclaim storage space. With a large history, the VACUUM operation becomes more complex and time-consuming.
    • Time Travel Queries: While time travel is convenient, querying historical versions requires accessing the relevant data files. A vast history increases the number of files to potentially scan, slowing down queries.
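To make the storage-cost point concrete, here is a rough back-of-envelope sketch in pure Python (all numbers are hypothetical). If a table is fully rewritten on each update, every rewrite leaves a complete set of old data files on disk until VACUUM removes those past the retention threshold, so retained storage grows linearly with the retention window:

```python
def retained_storage_gb(table_gb, rewrites_per_day, retention_days):
    """Rough upper bound on storage held by Delta history when every
    rewrite replaces all data files: old files stay on disk until
    VACUUM removes those older than the retention threshold."""
    current = table_gb
    historical = table_gb * rewrites_per_day * retention_days
    return current + historical

# A hypothetical 100 GB table, rewritten once a day:
print(retained_storage_gb(100, 1, 7))    # default 7-day retention -> 800 GB
print(retained_storage_gb(100, 1, 365))  # a year of "backup" -> 36600 GB
```

Real workloads rewrite only some files per update, so actual growth is usually lower, but the trend is the same: stretching retention from days to years multiplies storage many times over, which is why a cold-tier archive is the cheaper option.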


As you know, Databricks originally developed Delta Lake and continues to actively contribute to the open-source project, so their official recommendation is the guidance to follow here:
Work with Delta Lake table history | Databricks on AWS



Some useful links -
Delta Tables - Advanced Concepts

Hope this is helpful. Please let me know in case of further queries.

Anonymous
Not applicable

Hi @frithjof_v ,

We haven't heard from you since the last response and were just checking back to see whether your query was resolved.
If not, please reply and we will be happy to help with more details.

Thanks


