Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Microsoft is giving away 50,000 FREE Microsoft Certification exam vouchers. Get Fabric certified for FREE! Learn more

Reply
PeterANB
Frequent Visitor

Warehouse, dbt & time travel: how it fits together

Our Case:

Our problem:

  • Using the dbt 'table' materialization defaults to DROP/CTAS statements which work very fast, but negates the time-travel functionality:
    • SELECT * FROM warehouse.gold.<table> OPTION (FOR TIMESTAMP AS OF '<beforeloaddate>')
    • results in 'The specified point-in-time is outside of the data retention period for the table.'
    • This is logical - but not desirable.

Our possible solutions: 

  • 'incremental' materialization (Fabric-equivalent of merge)
    • -> would keep the table & thus the time travel functionality
    • -> likely much slower than Drop + CTAS
    • -> more complex in setup (not all data has an updated_ts to filter on)
  • new custom materialization 'tableTruncate' (Truncate + Insert)
    • -> would keep the table & thus the time travel functionality
    • -> should be marginally slower than Drop + CTAS
    • -> likely not ideal for time travel given the full removal & insertion of data (but storage is cheap)
  • MS Fabric overhauls its time travel logic to be name-based instead of object-based
    • -> would work regardless of loading-approach
    • -> would make me very happy
    • -> not very likely to happen
  • something else?

Anyone run across this dilemma yet?

1 ACCEPTED SOLUTION

Hi @PeterANB ,

That's an interesting approach.  You're absolutely right Fabric’s clone functionality creates a metadata only copy of the table, so it doesn't duplicate the underlying data but references the snapshot at that moment.

Here are some key points to keep in mind.

  • The cloned table remains available indefinitely (beyond the default 30-day time-travel period) as long as it isn't explicitly dropped.

  • Changes to the source table do not affect the cloned table. It stays a frozen version unless refreshed or recreated.

  • Since cloning doesn't duplicate data, it remains cost-effective and performant while preserving historical snapshots.

  • Using parameterized schema names (like gold_YYYYMMDD) allows for dynamic versioning in Power BI, making historical analysis more manageable.

If my response resolved your query, kindly mark it as the Accepted Solution to assist others. Additionally, I would be grateful for a 'Kudos' if you found my response helpful.

View solution in original post

5 REPLIES 5
V-yubandi-msft
Community Support
Community Support

Hi @PeterANB ,

 

As we haven’t heard back from you, we wanted to kindly follow up to check if the solution we provided for your issue worked for you  or let us know if you need any further assistance?

Your feedback is important to us, Looking forward to your response. 

 

Thank You.

V-yubandi-msft
Community Support
Community Support

Hi @PeterANB ,

 

We noticed we haven't received a response from you yet, so we wanted to follow up and ensure the solution we provided addressed your issue. If you require any further assistance or have additional questions, please let us know.

Your feedback is valuable to us, and we look forward to hearing from you soon.

 

Thank You.

Hi @V-yubandi-msft ,

 

As you confirmed it's rather a choice between function and performance,

I started mulling about the alternate approach in time travel - being via the clone-functionality at table level.

if we update the drop-functionality of DBT to precede it by a clone to a variable schema (gold_YYYYMMDD), to my understanding:

- we aren't copying data underneath - just providing an additional 'pointer' to the existing data at that moment

- the cloned table with its underlying data will be maintained until we drop it - regardless of the 30-day timetravel
- the newly created CTAS table will be unrelated to the old one and keep its performant CTAS-approach

- we have an easy way of pointing our PowerBI datasets to the historical data if we make our schemaname a parameter

 

Right now, it's just a theory that remains to be tested .

(it would be great if you could confirm the part about it remaining available even after 30 days)

 

Screenshot 2025-02-11 135049.png

 

Hi @PeterANB ,

That's an interesting approach.  You're absolutely right Fabric’s clone functionality creates a metadata only copy of the table, so it doesn't duplicate the underlying data but references the snapshot at that moment.

Here are some key points to keep in mind.

  • The cloned table remains available indefinitely (beyond the default 30-day time-travel period) as long as it isn't explicitly dropped.

  • Changes to the source table do not affect the cloned table. It stays a frozen version unless refreshed or recreated.

  • Since cloning doesn't duplicate data, it remains cost-effective and performant while preserving historical snapshots.

  • Using parameterized schema names (like gold_YYYYMMDD) allows for dynamic versioning in Power BI, making historical analysis more manageable.

If my response resolved your query, kindly mark it as the Accepted Solution to assist others. Additionally, I would be grateful for a 'Kudos' if you found my response helpful.

V-yubandi-msft
Community Support
Community Support

Hi @PeterANB ,

Thank you for sharing the potential solutions.  The problem arises because dbt's table materialization uses DROP TABLE + CTAS, which deletes the table and its time-travel history. Since Fabric's time-travel is object-based, the history only exists if the table remains unchanged. When the table is dropped, its historical snapshots are lost, resulting in the error: The specified point-in-time is outside of the data retention period for the table.

To maintain time-travel functionality.

  • Use incremental materialization to merge changes instead of dropping and recreating the table.
  • Create a custom materialization, such as TableTruncate (Truncate + Insert), to keep the table intact.

Many users have encountered similar challenges when balancing performance with time-travel functionality in data warehousing.

 

If my response solved your query, please mark it as the Accepted solution to help others find it easily!

And if my answer was helpful, I'd really appreciate a 'Kudos'.



Helpful resources

Announcements
MarchFBCvideo - carousel

Fabric Monthly Update - March 2025

Check out the March 2025 Fabric update to learn about new features.

Notebook Gallery Carousel1

NEW! Community Notebooks Gallery

Explore and share Fabric Notebooks to boost Power BI insights in the new community notebooks gallery.

April2025 Carousel

Fabric Community Update - April 2025

Find out what's new and trending in the Fabric community.