Hi,
This article https://learn.microsoft.com/en-us/azure/databricks/delta/history says:
"Databricks does not recommend using Delta Lake table history as a long-term backup solution for data archival."
What are the reasons why it is not recommended to use table history as a long-term backup?
Time travel seems like a really convenient functionality 😀 I would like to learn more about why it is not recommended as a long-term backup solution for data archival.
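For context, a minimal sketch of the kind of time travel queries I mean, assuming a notebook session where spark is available (the table path is just a placeholder):

```python
# Query an older snapshot of a Delta table by version number
df_v5 = spark.read.format("delta").option("versionAsOf", 5).load("Tables/my_table")

# ... or by timestamp
df_old = spark.read.format("delta").option("timestampAsOf", "2024-01-01").load("Tables/my_table")

# Inspect the table history that makes time travel possible
spark.sql("DESCRIBE HISTORY delta.`Tables/my_table`").show(truncate=False)
```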
Thank you! 😀
Hi @frithjof_v ,
Thanks for using Fabric Community.
As per my understanding, there are two main reasons why Databricks recommends a shorter retention period (7 days by default) for Delta Lake table history and advises against using it for long-term backups:
1. The older file versions are kept in the same storage location as the live table, so they are exposed to the same accidental deletions and drops as the current data.
2. Keeping large data volumes around as older table versions is expensive compared to dedicated archival storage.
Since Databricks originally developed Delta Lake and continues to actively contribute to the open-source project, their official recommendation is documented here -
Work with Delta Lake table history | Databricks on AWS
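For illustration, a minimal sketch of how the retention settings involved can be inspected and adjusted with Spark SQL (the table name is hypothetical):

```python
# Show the table's current history and its table properties (table name is hypothetical)
spark.sql("DESCRIBE HISTORY my_table").show(truncate=False)
spark.sql("SHOW TBLPROPERTIES my_table").show(truncate=False)

# The two retention windows that bound time travel:
# - delta.logRetentionDuration: how long transaction log entries are kept (default 30 days)
# - delta.deletedFileRetentionDuration: how long removed data files are kept (default 7 days)
spark.sql("""
    ALTER TABLE my_table SET TBLPROPERTIES (
        'delta.logRetentionDuration' = 'interval 30 days',
        'delta.deletedFileRetentionDuration' = 'interval 30 days'
    )
""")

# VACUUM physically deletes data files older than the retention threshold;
# after that, time travel to versions that depend on those files no longer works.
spark.sql("VACUUM my_table RETAIN 720 HOURS")
```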
Some useful links -
Delta Tables - Advanced Concepts
Hope this is helpful. Please let me know in case of further queries.
Hi @frithjof_v ,
We haven't heard back from you on the last response and were just checking in to see whether you got some insights on your query.
Otherwise, we will respond with more details and try to help.
Thanks
Hi @frithjof_v,
Response from the internal team -
A primary reason for not recommending time travel for long-term archival is that the older versions of the data files are stored in the same storage location as the current data. As a result:
1. They are prone to human-error deletions and accidental table drops.
2. Maintaining large data volumes as older table versions is costly compared to alternative archival methods, such as the Azure Storage cold tier.
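As an alternative, here is a minimal sketch (paths and the version number are hypothetical) of copying a pinned table version out to separate archival storage, so the snapshot is independent of the table's own storage and can sit on a cheaper tier:

```python
# Pin a specific version of the Delta table (version number is hypothetical)
snapshot = spark.read.format("delta").option("versionAsOf", 12).load("Tables/my_table")

# Write the snapshot to a separate archival location, outside the table's own storage,
# so it is unaffected by VACUUM or an accidental DROP TABLE (path is hypothetical)
snapshot.write.format("parquet").mode("errorifexists").save(
    "abfss://archive@mystorageaccount.dfs.core.windows.net/my_table/v12"
)
```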
Hope this is helpful.