- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
How do you delete specific rows from a Lakehouse table?
Greetings, all. I have some files loaded into a Lakehouse that I've also loaded into a Lakehouse table. I want to delete some of the rows in the Lakehouse table, but I can't seem to find a way to do that. How does one go about doing this?
Any help appreciated.
Solved! Go to Solution.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi @arpost ,
Thanks for using Fabric Community.
It is possible to delete rows in a Lakehouse table using Spark-SQL in notebooks. I have created a repro of the scenario. I have attached the screenshots for your reference.
Trying to delete the row which has value of 46134
Query succeeded
After deletion of the row:
You can refer this link for more information: Link1
Hope this helps. Do let us know if you have any further issues. Glad to help.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Give the below a try. I won't take credit for it. It was given to me by a MSFT trainer when I proposed the question.
----------------------------------------------------------
%%pyspark
df = spark.sql("SELECT MAX(DataDate) MDate FROM Lakehouse.Table")
MDate = df.agg({"MDate":"max"}).collect()[0]['max(MDate)']
#print(MDate)
sqlStr = "DELETE From Lakehouse.Table WHERE DataDate < '"+str(MDate)+"'"
#print(sqlStr)
#dfDistinct = spark.sql("SELECT DISTINCT DataDate FROM Lakehouse.Table")
#display(dfDistinct)
dml = spark.sql(sqlStr)
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Right, "<", ">" and "=" type of comparisons work.
Problem is only with the word "IN".
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
It doesn't work in a programmatic manner.
If you have pre-defined solution, then you can make the SQL command.
But if you for example create a table of keys and try to delete those rows, that is not possible atm.
Couple of options:
- Delete everything and recreate all of it without the lines you wanted to delete.
- Use Power Query filtering in dataflow and make the dataflow to overwrite the data.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I have the same problem: subqueries are not supported in delete.
I tried to create a temporary table and make a join, but that failed as well.
Since I have only two values, I can hard code my case for now.
Original deletion query:
DELETE FROM DE_LH_200_SILVER_Default.ServiceRequest WHERE updated NOT IN (select distinct DATE(updated) from DE_LH_100_BRONZE_Default.ServiceRequest)
Failed attempt with temp table and join:
CREATE TABLE deletecontent
(
todelete DATE
);
INSERT INTO deletecontent
select distinct DATE(updated) from DE_LH_200_SILVER_Default.ServiceRequest WHERE updated NOT IN (select distinct DATE(updated) from DE_LH_100_BRONZE_Default.ServiceRequest);
SELECT * FROM deletecontent;
SELECT DISTINCT DATE(updated) FROM DE_LH_200_SILVER_Default.ServiceRequest INNER JOIN deletecontent ON deletecontent.todelete = DE_LH_200_SILVER_Default.ServiceRequest.updated;
DELETE FROM DE_LH_200_SILVER_Default.ServiceRequest SR INNER JOIN deletecontent ON deletecontent.todelete = DE_LH_200_SILVER_Default.ServiceRequest.updated;
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Appreciate it, @v-nikhilan-msft! I was hoping there was a way to do it through a UI, but that works.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi @arpost ,
Glad that your query got resolved. Please continue using Fabric Community for help on your queries.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi @arpost ,
Thanks for using Fabric Community.
It is possible to delete rows in a Lakehouse table using Spark-SQL in notebooks. I have created a repro of the scenario. I have attached the screenshots for your reference.
Trying to delete the row which has value of 46134
Query succeeded
After deletion of the row:
You can refer this link for more information: Link1
Hope this helps. Do let us know if you have any further issues. Glad to help.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I went exactly the same way but it doesn´t work for me. Are there any bugs at the moment?
Everything is fine without the where clause (deleting all the data) but not with a where clause, regardless if it´s a clause to text or date column.
Any suggestions?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
It seems to be something with timeout - when I delete just 150 rows, everything works fine!
But when I change the attribute die 'BCA', which is many thousands of rows in the table, it crashes with the following message:
INCONSISTENT_BEHAVIOR_CROSS_VERSION.READ_ANCIENT_DATETIME
I don´t see any connection to datetime in this?
Happy for any idea...
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
These discussions should be still valid for Lakehouses (and probably Warehouses) without schema.
I heard that Lakehouse with schema has more deletion issues.
Is this the case?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
No, I´m using a Lakehouse without schema.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
How are trying to delete have you tried the pyspark method above?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I'm new to SparkSQL. Would anyone happen to know how to acheive the following SQL statement? I get the error: Error: Subqueries are not supported in the DELETE
%%SQL
DELETE FROM <Table>
WHERE [ImportDate] < (SELECT MAX([ImportDate]) FROM <Table>)
Thanks in advance
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I'm trying to do something similar too. Did you find a solution? Thanks

Helpful resources
Join us at the Microsoft Fabric Community Conference
March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount!
Fabric Monthly Update - February 2025
Check out the February 2025 Fabric update to learn about new features.

Subject | Author | Posted | |
---|---|---|---|
11-28-2024 03:10 AM | |||
09-02-2024 04:20 AM | |||
02-18-2024 04:58 AM | |||
12-02-2024 10:57 AM | |||
11-28-2024 10:41 PM |
User | Count |
---|---|
7 | |
4 | |
2 | |
1 | |
1 |
User | Count |
---|---|
13 | |
9 | |
5 | |
5 | |
4 |