frithjof_v
Community Champion

Dataflow Gen2 - Table not getting dropped and recreated

Hi,

 

According to the documentation about managed settings for new tables:

"Drop and recreate table: To allow for these schema changes, on every dataflow refresh the table is dropped and recreated. Your dataflow refresh might cause the removal of relationships or measures that were added previously to your table."

Dataflow Gen2 data destinations and managed settings - Microsoft Fabric | Microsoft Learn

 

However, in my Lakehouse I can still query old versions of the table from a notebook.

 

frithjof_v_0-1720689132971.png

 

So it seems the table is not actually dropped and recreated when the dataflow gen2 refreshes? 

 

Instead, two operations seem to happen: ReplaceTable and Update. Note that I can still query the old versions of the table by using time travel (e.g. '%%sql SELECT * FROM Table_aa VERSION AS OF 1' works fine).

So it doesn't seem that the table is actually dropped and recreated.

 

I am using the managed settings for new tables (automatic settings) in my dataflow destination settings. 

 

Is the documentation incorrect on this point, or am I missing something?

 

Thank you 😀 

 

frithjof_v_1-1720689727235.png

 

5 REPLIES
v-nuoc-msft
Community Support

Hi @frithjof_v 

 

Each operation that modifies the Delta Lake table creates a new version of the table.

 

You can use the history information to audit operations, roll back tables, or query tables at specific points in time using time travel.

 

Table history retention is determined by a table setting; the default is 30 days.

 

The original table has been replaced by a newer one, but you can view the original table in the notebook by querying the table version.

 

This is normal behavior; it does not mean that the table was not dropped and recreated.

 

Regards,

Nono Chen

If this post helps, please consider accepting it as the solution to help other members find it more quickly.

 

If a table gets dropped and recreated, there should be no history anymore (and time travel should not be possible), isn't that correct?

 

Because dropping a table means deleting the table, including its version history?

 

If I run a command to manually drop a table, e.g. '%%sql DROP TABLE Table_aa'

then there will be no table history anymore because the table has been dropped.

 

What I am saying is I don't think the Dataflow Gen2 really drops the table, as the documentation states.

 

Instead, it seems to me that the Dataflow Gen2 does some kind of overwrite or replace of the table, which is different from a drop, because overwrite and replace operations keep the history.

 

Maybe similar to replace which is mentioned here:

 

https://docs.databricks.com/en/delta/drop-table.html#when-to-replace-a-table

 

So I would like to know whether the behaviour I see is the expected behaviour. If it is, I think the documentation is incorrect: it states that the table will be dropped and recreated, but it doesn't seem to me that the table actually gets dropped.
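
To make the distinction concrete, here is a minimal toy sketch in plain Python. It is not Delta Lake or Fabric code: the `ToyCatalog` class and its methods are hypothetical names made up for illustration, and it assumes, as described above, that a drop removes the version history while a replace only appends a new version.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCatalog:
    """Toy stand-in for a catalog of versioned tables (hypothetical, not Delta Lake)."""

    # table name -> list of (operation, rows); the list index is the version number
    logs: dict = field(default_factory=dict)

    def create_table(self, name, rows):
        if name in self.logs:
            raise ValueError(f"table {name!r} already exists")
        self.logs[name] = [("CreateTable", list(rows))]

    def replace_table(self, name, rows):
        # Assumption: a replace appends a new version and keeps the earlier ones.
        self.logs.setdefault(name, []).append(("ReplaceTable", list(rows)))

    def drop_table(self, name):
        # Assumption: a drop removes the table together with its whole history.
        self.logs.pop(name, None)

    def read(self, name, version=None):
        log = self.logs[name]
        _, rows = log[-1] if version is None else log[version]
        return rows

    def history(self, name):
        return [(version, op) for version, (op, _) in enumerate(self.logs[name])]


# Scenario A: the refresh replaces the table.
replaced = ToyCatalog()
replaced.create_table("Table_aa", [1, 2])
replaced.replace_table("Table_aa", [3, 4])

# Scenario B: the refresh really drops and recreates the table.
recreated = ToyCatalog()
recreated.create_table("Table_aa", [1, 2])
recreated.drop_table("Table_aa")
recreated.create_table("Table_aa", [3, 4])

print(replaced.history("Table_aa"))   # history spans both operations
print(recreated.history("Table_aa"))  # history starts over after the drop
```

In scenario A the old rows are still readable with `replaced.read("Table_aa", version=0)`, which is what I observe after a dataflow refresh. In scenario B only one version exists, which is what I would expect if the table were really dropped.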

Hi @frithjof_v 

 

Your idea is reasonable.

 

"Drop and recreate table: To allow for these schema changes, on every dataflow refresh the table is dropped and recreated. Your dataflow refresh might cause the removal of relationships or measures that were added previously to your table."

 

The documentation mentions that tables are deleted and recreated when the data flow is refreshed, which you can actually think of as an overwriting operation. The new table overwrites the original table.

 

You mentioned:

 

"If I run a command to manually drop a table, e.g. '%%sql DROP TABLE Table_aa'

Then there will be no table history anymore because the table has been dropped."


This is also true because the table has been deleted. These are actually two different operations that the document aims to refresh.

 

I hope I have clarified the matter for you.

 

Regards,

Nono Chen

If this post helps, please consider accepting it as the solution to help other members find it more quickly.

I'm not quite sure I understand.

 

"These are actually two different operations that the document aims to refresh."

 

"The documentation mentions that tables are deleted and recreated when the data flow is refreshed, which you can actually think of as an overwriting operation. The new table overwrites the original table."

 

As I understand it, dropping and recreating a table is a different concept from overwriting a table, especially in the Delta Lake world, where old versions are kept when a table is overwritten but deleted when a table is dropped.

 

So I would like to get clarification about this part of the documentation:

 

"Drop and recreate table: To allow for these schema changes, on every dataflow refresh the table is dropped and recreated."

 

Does the table actually get dropped and recreated?

 

Or does it just get replaced?

 

Ref.

https://docs.databricks.com/en/delta/drop-table.html#when-to-replace-a-table

 

 

If the table doesn't really get dropped, then I think this part of the documentation should be changed because the documentation says that the table gets dropped.

 

However, my understanding of what it means for a table to be dropped may be wrong.

But so far, my understanding is that dropping a table deletes the table, including all of its version history. So it seems to me the documentation is incorrect here, because the version history actually seems to be retained after the dataflow refreshes.

Hi @frithjof_v 

 

We may need more time to discuss the problem.

 

Thank you for your discoveries and questions!

 

If you still have questions about the official documentation, you can provide product feedback in the Feedback section below the documentation.

 

vnuocmsft_0-1720775295807.png

 

Regards,

Nono Chen
