I have two tables in the same lakehouse, and one of them (existingData) has rows that need to be deleted based on what's in the other table (newData): any ID found in newData marks a row that must be deleted from existingData. The two tables have identical structure, but the IDs in newData are not contiguous, so I cannot delete by range. Also, subqueries are not supported in Spark SQL, and delete queries using an inner join didn't work. I ended up using a PySpark loop, but it's slow.
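The loop itself isn't shown above, but the underlying requirement is an anti-join. As a hypothetical plain-Python illustration (all table and column names here are invented stand-ins, not the actual lakehouse schema):

```python
# Hypothetical in-memory stand-ins for the two lakehouse tables.
existing_data = [{"ID": 1, "val": "a"}, {"ID": 2, "val": "b"}, {"ID": 3, "val": "c"}]
new_data = [{"ID": 1}, {"ID": 3}]

# Collect the IDs to delete, then keep only the non-matching rows (an anti-join).
ids_to_delete = {row["ID"] for row in new_data}
existing_data = [row for row in existing_data if row["ID"] not in ids_to_delete]
# existing_data is now [{"ID": 2, "val": "b"}]
```

The same keep-only-non-matching logic is what a set-based delete expresses in one statement, instead of one round trip per ID.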
Hi @ebjim
I tried to reproduce the same scenario on my side. I have two tables with the same schema in my lakehouse:
1) List, with 40 rows
2) Listings_3, with 45 rows
I deleted the 40 matching rows from Listings_3 with the code below:
The rows were deleted successfully. SQL MERGE is fast compared with other DML operations: a single MERGE statement can replace several individual DML statements, which can improve query performance because the operations run within one statement and the data in the source and target tables is processed fewer times.
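A minimal Spark SQL sketch of this delete-via-MERGE pattern, assuming the shared key column is named `ID` (adjust to your actual schema), would be:

```sql
-- Delete from Listings_3 every row whose ID also appears in List.
MERGE INTO Listings_3 AS target
USING List AS source
ON target.ID = source.ID
WHEN MATCHED THEN DELETE
```

In the original question's terms, the target would be existingData and the source newData.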
Please try this approach and do let me know if you have any further questions. Hope this helps.
Hi @ebjim
Thanks for using Fabric Community.
At this time, we are reaching out to the internal team to get some help on this. We will update you once we hear back from them.
Thanks