Are there any risks with asynchronous Delta merge/concurrent writes from Bronze (not Lakehouse) to Silver (Lakehouse) for tables that are not dependent on each other?
Suppose I have multiple tables, each being merged independently (e.g., {T1: Merge_1, T2: Merge_2, ..., Tn: Merge_n} - same notebook), and there is no overlap or dependency between these tables.
- Are there still any risks or potential issues with running these merges concurrently?
- Has anyone experienced problems with this pattern?
- Are there any resource-level concerns (e.g., cluster contention, throttling, or IO bottlenecks) I should be aware of?
Hi @smpa01 ,
Thank you for reaching out to the Microsoft Fabric Community Forum.
When merging multiple independent tables asynchronously from Bronze (non-Lakehouse) to Silver (Lakehouse Delta), and assuming no cross-table dependencies, the pattern is generally safe, but there are still some risks and practical considerations to keep in mind.
Even if the tables are fully independent, running multiple merge operations at the same time can still cause problems. Each merge consumes CPU, memory, and input/output (IO), so running many merges together, especially on a smaller cluster, can degrade performance or even cause jobs to fail.
Also, if all the merge operations are reading from or writing to the same storage layer, such as ADLS or OneLake, they can create IO bottlenecks or hit bandwidth limits, particularly with large volumes of data. Merge operations also involve shuffling, which uses a lot of memory. If too many of them run at once, this can lead to memory pressure or spilling to disk, which slows things down.
Lastly, when running merges asynchronously (in parallel), it becomes harder to catch errors. If one merge fails, you may not notice it unless you specifically check or handle it in your code.
So, while this pattern is supported and commonly used, it’s important to control the number of concurrent merges, monitor your cluster performance, and handle errors properly to avoid performance or reliability issues.
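If it helps, here is a minimal sketch of that pattern in a Fabric notebook, assuming a hypothetical Bronze ADLS path, hypothetical Silver table names, a shared join key `id`, and an illustrative concurrency cap of four workers; adapt the sources, keys, and cap to your own workload:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from delta.tables import DeltaTable

# Hypothetical list of independent tables; replace with your own names.
tables = ["T1", "T2", "T3"]

def merge_table(name: str) -> str:
    """Merge one Bronze source into its Silver Delta table."""
    # 'spark' is the notebook's built-in Spark session; the path below is a placeholder.
    source_df = spark.read.parquet(f"abfss://bronze@yourstorageaccount.dfs.core.windows.net/{name}")
    target = DeltaTable.forName(spark, f"silver_{name}")  # assumed Silver table name
    (target.alias("t")
           .merge(source_df.alias("s"), "t.id = s.id")    # assumed join key 'id'
           .whenMatchedUpdateAll()
           .whenNotMatchedInsertAll()
           .execute())
    return name

# Cap concurrency so the cluster is not overwhelmed, and surface failures explicitly.
errors = {}
with ThreadPoolExecutor(max_workers=4) as pool:
    futures = {pool.submit(merge_table, t): t for t in tables}
    for f in as_completed(futures):
        try:
            print(f"Merged {f.result()}")
        except Exception as e:  # a failed merge should not go unnoticed
            errors[futures[f]] = e

if errors:
    raise RuntimeError(f"Merges failed for: {list(errors)}")
```

The bounded pool addresses the resource-contention concern, and collecting exceptions per table addresses the error-visibility concern when merges run in parallel.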
Hope this helps. Please reach out if you need further assistance.
Thank you.
Hi @smpa01 ,
We haven’t received an update from you in some time. Could you please let us know if the issue has been resolved?
If you still require support, please let us know; we are happy to assist you. Also, thank you @BhaveshPatel for your helpful response.
Thank you.
Hi @smpa01
There is no harm in following the data lakehouse pattern from Bronze to Silver to Gold. Just follow these best practices:
1. Work on a single Delta table in one notebook at a time, and overwrite the independent table.
2. Utilize Materialized Lake Views in Silver.
3. Use raw data in Bronze (Extract, Python), transform it in Silver (Transform, Spark SQL), and clean it up in Gold (Load, Spark SQL); see the sketch after this list. CPU and IO only become a concern when you are dealing with billions of rows; small tables (~5,000,000 rows) are not a problem in a data lakehouse.
4. Consider an F64 capacity, which is equivalent to a Power BI Premium P1 node.
5. Follow Dataflow Gen2; it provides a data source and a data destination for self-service ETL through the UI.
6. Follow notebooks if you are an advanced Delta Lake user and know exactly what you are doing (Parquet/Delta and the Apache Spark engine). They save a lot of money, but they also carry the risk of deleting a Delta table.
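As a rough illustration of point 3, here is a minimal notebook sketch, assuming a hypothetical Bronze file path, hypothetical column names, and a Silver table called silver_sales; the Spark SQL step stands in for whatever transformation your Silver layer actually needs:

```python
# Extract: read raw Bronze data with Python/PySpark (hypothetical path).
bronze_df = spark.read.parquet("Files/bronze/sales_raw")  # assumed Bronze location
bronze_df.createOrReplaceTempView("sales_raw")

# Transform: shape the data for Silver with Spark SQL (assumed columns).
silver_df = spark.sql("""
    SELECT id,
           CAST(order_date AS DATE) AS order_date,
           TRIM(customer_name)      AS customer_name,
           amount
    FROM sales_raw
    WHERE amount IS NOT NULL
""")

# Load: overwrite the independent Silver Delta table (per point 1 above).
silver_df.write.format("delta").mode("overwrite").saveAsTable("silver_sales")
```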
Hi @smpa01 ,
Could you please confirm if the issue has been resolved? If a solution has been found, it would be greatly appreciated if you could share your insights with the community. This would be helpful for other members who may encounter similar issues.
Thank you.
Hi @smpa01 ,
Thank you for confirming. Please share your findings once your DEV run is complete, and we will be happy to assist with any further questions.
Thank you.
Hi @smpa01 ,
I wanted to follow up on our previous suggestions. We would like to hear back from you to ensure we can assist you further.
Thank you.
I will update once I run a DEV, expected this week.