Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
sk_few
Frequent Visitor

Truncate Lakehouse Tables

Hej,

 

How do you actually truncate data in Lakehouse Tables the best way?

 

I have used the below sql coding in Notebook, but it takes 1 minute to delete 1 row.

sk_few_0-1712525400452.png

There are just 3 rows in total in this test-table.

And without the WHERE-clause, it just seems to take forever sometimes. No matter how big the table is.

 

How do you truncate your data in Lakehouse table without deleting the table

7 REPLIES 7
v-cboorla-msft
Community Support
Community Support

Hi @sk_few 

 

Thanks for using Microsoft Fabric Community.

Apologies for the inconvenience.

Notebooks being a viable approach for truncating data in Lakehouse tables in Fabric.

Lakehouse tables typically use append-only storage formats like Delta Lake. You can use DELETE statements with appropriate WHERE clauses to target specific rows for deletion. The DELETE statement doesn't physically remove data. Instead, it marks rows as deleted. This can be slow for small tables as it still needs to manage the metadata.

You can write PySpark scripts to manipulate dataframes and achieve truncating data in Lakehouse.

For additional information please refer : pyspark.sql.DataFrame.drop — PySpark master documentation (apache.org)

 

I hope this information helps. Please do let us know if you have any further questions.

 

Thanks.

Hi,

 

Thank you for taking your time to answer.

That can explain why, I keep maxing out Fabric Capacity, if it goes and remap all rows. I have like 100 million of rows that I delete quite often.

 

I have looked into the link of  DrataDfram.drop, but that is only to be used for cropping columns,.

Please correct me if I am wrong. Is there a specific script that can empty a whole table permanently instead of marking them as Delete?

 

Additionally, I have looked into how to delete a parquet file, as they just keep piling up, taking up more and more storage space. Can you advise on this too?

sk_few_0-1712998141678.png

 

frithjof_v
Continued Contributor
Continued Contributor

I don't have experience with this, but I am guessing that the VACUUM command could be used here?

 

Hope someone else can confirm 😀

 

I just read about the vacuum command, and from what I read it seems to be fit for this kind of purpose, but I don't have the practical experience so don't trust me blindly on this 😊

Hi @sk_few 

 

Apologies for the delay in response from my end.

As @frithjof_v mentioned, you can use VACCUM command.

For more details please refer : Use table maintenance feature to manage delta tables in Fabric

                                               : Delta Lake table optimization and V-Order

 

And also please refer to the similar thread realted to VACCUM command : Solved: Re: Delta Lake VACUUM, how does it work? - Microsoft Fabric Community

 

I hope this information helps. Please do let us know if you have any further queries.

 

Thank you.

Hi @sk_few 

 

We haven’t heard from you on the last response and was just checking back to see if you have a resolution yet. In case if you have any resolution please do share that same with the community as it can be helpful to others.
Otherwise, will respond back with the more details and we will try to help.

 

Thanks.

Hi @sk_few 

 

We haven’t heard from you on the last response and was just checking back to see if you have a resolution yet. In case if you have any resolution please do share that same with the community as it can be helpful to others .
If you have any question relating to the current thread, please do let us know and we will try out best to help you.
In case if you have any other question on a different issue, we request you to open a new thread.

 

Thanks.

Hi @sk_few 

 

We haven’t heard from you on the last response and was just checking back to see if you have a resolution yet.
In case if you have any resolution please do share that same with the community as it can be helpful to others.
Otherwise, will respond back with the more details and we will try to help.


Thanks.

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Kudoed Authors