FelixL
Advocate II

Delta ConcurrentAppendException when loading different tables in Lakehouse, but at the same time

I am facing an issue when running multiple concurrent loads into a Fabric Lakehouse that doesn't really make sense to me. I have developed a PySpark framework for running my loads in parallel, utilizing the notebookutils runMultiple function. This means that I fire off multiple jobs concurrently from the same session. Each job then loads separate tables, using some variations of Merge, Overwrite and Replace functionality.
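For context, the orchestration pattern looks roughly like the sketch below (notebook names, args and the concurrency value are placeholders, not my actual framework):

# Minimal sketch of the runMultiple pattern: one Spark session fires off several
# child notebooks in parallel, each loading its own target table(s).
# notebookutils is available as a built-in in Fabric notebooks.
dag = {
    "activities": [
        {"name": "load_dim_customervisittype", "path": "nb_load_dim_customervisittype", "args": {"mode": "merge"}},
        {"name": "load_dim_planogram", "path": "nb_load_dim_planogram", "args": {"mode": "overwrite"}},
        # ...one activity per target table
    ],
    "concurrency": 3,          # how many child notebooks run at the same time
    "timeoutInSeconds": 7200,  # overall timeout for the whole DAG
}

notebookutils.notebook.runMultiple(dag)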
 
The problem I am facing is that I get "ConcurrentAppendException" errors whenever jobs time it so that they load their respective tables within seconds of each other. Note: they never load the same tables, but they all load to tables within the same lakehouse. 
 
I would assume that "ConcurrentAppendException" should not be raised when I am not running any concurrent appends to the same tables. The error messages I see also point to tables other than the one actually being loaded, which makes me believe that there is something fishy going on.
 
Example below... 
 
Job 1 is loading table: gold_sales_dimension_customervisittypeno (success)
Job 2 is loading table: gold_sales_dimension_planogramno (error)
 
As shown in the error message from job 2 below, it fails due to a concurrent append and refers to the "target table" of job 1.
 
Error message from job 2 (i.e. the job loading table gold_sales_dimension_planogramno):

Error message: [DELTA_CONCURRENT_APPEND] ConcurrentAppendException: Files were added to the root of the table by a concurrent update. Please try the operation again.
 
Conflicting commit: {"timestamp":1739965911476,"operation":"UPDATE","operationParameters":{"predicate":["(TargetTable#642166 = gold_sales_dimension_customervisittypeno)"]},"readVersion":1535,"isolationLevel":"Serializable","isBlindAppend":false,"operationMetrics":{"numRemovedFiles":"1","numRemovedBytes":"1585","numCopiedRows":"0","numDeletionVectorsAdded":"0","numDeletionVectorsRemoved":"0","numAddedChangeFiles":"0","executionTimeMs":"2083","numDeletionVectorsUpdated":"0","scanTimeMs":"1541","numAddedFiles":"1","numUpdatedRows":"1","numAddedBytes":"1585","rewriteTimeMs":"541"},"tags":{"VORDER":"true"},"engineInfo":"Apache-Spark/3.5.1.5.4.20241007.4 Delta-Lake/3.2.0.5","txnId":"dc53e62a-437f-47ef-ac62-dddb5850fdd7"}
Refer to https://docs.delta.io/latest/concurrency-control.html for more details.

The entire framework is running fine in Azure Synapse Analytics, and has been for years. I have never seen issues like these there... And the jobs do not (should not) touch each other's tables, ever. So I don't understand where this comes from.

 

One of my scheduled jobs, loading to some 80 unique tables, easily gets 5-6 failed loads due to concurrent appends like the above.

 

Anyone else seen anything like this?

1 ACCEPTED SOLUTION

I am an idiot. After further debugging I found the issue, and it was all on my side. So Fabric/Lakehouse works as intended in this case. Thank you for helping me, and sorry for wasting your time. :'(

 

(the conflict appears in a shared log table that I did not think of as a potential culprit...)


8 REPLIES
V-yubandi-msft
Community Support

Hello @FelixL ,

We wanted to check in as we haven't heard back from you. Did our solution work for you? If you need any more help, please don't hesitate to ask. Your feedback is very important to us. We hope to hear from you soon.

 

Thank You.

Fabamik
Frequent Visitor

I am getting a similar "ConcurrentAppendException" error when trying to update Delta tables in a lakehouse using parallel operations. The queries from the parallel operations are very simple, like "update table set status = {value} where id = {id}", where the id is different in every operation. But it still fails with ConcurrentAppendException. I have looked through various forums and found suggestions like:

1. Retry mechanism - not really feasible, as I can't add this code to every transaction (though see the sketch at the end of this post).

2. Batch updates - not in scope; these notebooks with update queries run as part of a pipeline that executes them in parallel.

3. Partitioning - the id column is the primary key, so partitioning by it would create a lot of files.

4. Isolation level - I am not sure, but I think Delta tables in a Microsoft Fabric Lakehouse use snapshot isolation by default, and I haven't found any way to change this to Serializable.

5. The delta.io docs suggest using a more specific filter, such as one scoped to a partition, but I don't have much data in the table and no other column to narrow the filter with, since I am filtering directly on the primary key.

 

This seems like a very basic requirement in data engineering, but I am struggling to achieve it in Microsoft Fabric. Is there any solution for this?
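For completeness, this is the kind of retry wrapper I mean (a sketch only; table and column names are examples, and it assumes the delta Python package that ships with the Fabric Spark runtime):

import random
import time

from delta.exceptions import ConcurrentAppendException

def update_status_with_retry(spark, table_name, row_id, value, max_attempts=5):
    # Retry the single-row UPDATE when a concurrent writer wins the commit race.
    for attempt in range(1, max_attempts + 1):
        try:
            spark.sql(f"UPDATE {table_name} SET status = '{value}' WHERE id = {row_id}")
            return
        except ConcurrentAppendException:
            if attempt == max_attempts:
                raise
            # Back off with jitter so parallel notebooks do not retry in lockstep.
            time.sleep(2 ** attempt + random.random())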

V-yubandi-msft
Community Support

Hi @FelixL ,

 

@nilendraFabric has provided an exact solution to your issue. Could you kindly confirm whether your problem has been resolved or if you are still encountering any difficulties? Your feedback is valuable to the community, as it assists others who might face a similar issue.

Thank you, @nilendraFabric, for your valuable insights.

If the issue is resolved, please mark it as the Accepted Solution to help others in the community find the answer more easily.

 

Regards,

Yugandhar.

nilendraFabric
Community Champion

Hello @FelixL 

 

Fabric Lakehouse uses a queuing system where concurrency depends on the capacity SKU (e.g., F2 allows 1 concurrent job; F32 allows 8).

If your capacity tier is too low, concurrent writes to any table in the Lakehouse may trigger conflicts, as jobs compete for limited resources.

For 10 concurrent jobs, use at least F16 (burst factor 3 → 32×3 = 96 cores).

 

Upgrade SKUs to match workload parallelism requirements using Microsoft's formula:

 

Required SKU Tier = Ceil(Total Concurrent Jobs / Burst Factor)

https://learn.microsoft.com/en-us/fabric/data-engineering/spark-job-concurrency-and-queueing
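As a rough illustration of how I read that formula (a sketch only; the burst factor differs per SKU, so validate it against the linked documentation):

import math

def required_sku_tier(total_concurrent_jobs, burst_factor):
    # Straight application of the quoted formula; treat the result as a starting
    # point, not a sizing guarantee.
    return math.ceil(total_concurrent_jobs / burst_factor)

print(required_sku_tier(10, 3))  # -> 4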

 

Try partitioning by natural keys to prevent overlapping file operations, for example:
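A sketch with made-up table and column names; the idea is that a partition column plus a partition-scoped merge predicate keeps concurrent writers on disjoint files, per the Delta concurrency-control docs:

from pyspark.sql import SparkSession
from delta.tables import DeltaTable

spark = SparkSession.builder.getOrCreate()

# Write the table partitioned by a natural key.
df = spark.createDataFrame(
    [(1, "2025-06-01", "open"), (2, "2025-06-02", "open")],
    ["id", "load_date", "status"],
)
(df.write
   .format("delta")
   .partitionBy("load_date")
   .mode("overwrite")
   .save("Tables/demo_partitioned"))

# MERGE with an explicit partition filter so writers targeting other partitions
# do not touch the same files.
updates = spark.createDataFrame([(1, "2025-06-01", "done")], ["id", "load_date", "status"])
target = DeltaTable.forPath(spark, "Tables/demo_partitioned")
(target.alias("t")
   .merge(updates.alias("s"), "t.load_date = '2025-06-01' AND t.id = s.id")
   .whenMatchedUpdateAll()
   .whenNotMatchedInsertAll()
   .execute())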

 

 

 

Thanks for your reply, but I think we are talking about different things here...?

I am running one single Spark session, using 12 vCores, on an F64/P1 capacity. This one small session then writes to 80 tables in my lakehouse, using a runMultiple notebook concurrency of 3.


I should have enormous compute headroom, and I am not seeing any concurrency throttling or job queuing.

What I am seeing are error messages saying that jobs try to write to delta tables they are not actually touching. This either needs a good explanation, or appears to be some bug in the lakehouse-onelake sync layer..? 

Makes sense now, it's not a capacity issue. Let me dig down further. My gut feeling says it is related to the following, but let me come back on this:

"isolationLevel":"Serializable"

 

I am an idiot. After further debugging I found the issue, and it was all on my side. So Fabric/Lakehouse works as intended in this case. Thank you for helping me, and sorry for wasting your time. :'(

 

(the conflict appears in a shared log table that I did not think of as a potential culprit...)
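For anyone hitting the same symptom: the conflicting commit in my error is an UPDATE whose predicate filters on a TargetTable column, which is exactly the kind of write every one of my jobs makes against one shared log table. A minimal sketch of the colliding pattern (names are illustrative):

# Every parallel notebook calls this against the SAME unpartitioned log table.
# The rows differ per job, but the UPDATEs rewrite files in the same table, so
# Delta can raise ConcurrentAppendException even though the jobs' target tables
# never overlap.
def mark_load_complete(spark, target_table_name):
    spark.sql(f"""
        UPDATE load_log
        SET status = 'done', finished_at = current_timestamp()
        WHERE TargetTable = '{target_table_name}'
    """)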

Thanks, @FelixL, for sharing the findings. Happy to help anytime.
