dbeavon3
Memorable Member

Poor parallel processing problem in pipelines (aka P5)

When Microsoft was building ADF parallel pipelines, they made the decision to use static partitioning.  The work items that go into a parallel loop are assigned to workers in advance, and the assignment is never adjusted, even if some items complete faster than others.

 

In most engines that schedule parallel work, load balancing is done dynamically: a worker that finishes early picks up the next pending item.  It is very frustrating that Microsoft did things this way.  On many occasions, I have wasted time trying to accommodate this silly limitation in ADF.  It would be so much easier for Microsoft to solve this in a centralized way for the benefit of all their customers.
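To make the difference concrete, here is a minimal simulation sketch of the two approaches. It is not ADF or Fabric code. The round-robin assignment in `static_makespan` is an assumption made for illustration (the post only says the work is partitioned in advance), and the function names and durations are hypothetical.

```python
import heapq


def static_makespan(durations, workers):
    """Total run time when items are assigned to fixed queues up front.

    Assumption: items are dealt round-robin into `workers` queues, and each
    queue runs its items one after another with no re-balancing.
    """
    queue_totals = [0] * workers
    for index, minutes in enumerate(durations):
        queue_totals[index % workers] += minutes
    return max(queue_totals)


def dynamic_makespan(durations, workers):
    """Total run time when each idle worker pulls the next pending item."""
    free_at = [0] * workers  # time at which each worker becomes idle
    heapq.heapify(free_at)
    for minutes in durations:
        start = heapq.heappop(free_at)  # earliest available worker
        heapq.heappush(free_at, start + minutes)
    return max(free_at)


# One large item (60 minutes) followed by nineteen small ones (10 minutes each).
durations = [60] + [10] * 19
workers = 4

print(static_makespan(durations, workers))   # 100
print(dynamic_makespan(durations, workers))  # 70
```

In this made-up case the small items queued behind the large one become the stragglers: under the static scheme three workers sit idle from minute 50 onward while the fourth keeps running until minute 100.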

 

About three years ago I reported this as an ADF bug because it was mind-boggling to me that the partitioning would be so unfriendly. 
Is there any plan to fix this issue in "Fabric" now that Microsoft has moved the same technology here from their "ADF" platform?

 

 

 

NOTE: Below is what this problem looks like in Fabric.  Notice that there is one item in the parallel loop that is larger than the others, and it causes "stragglers" to be executed long after all the other parallel workers have gone to sleep.

 

dbeavon3_0-1736868018130.png

 

 

I'm sure this P5 issue is familiar to others.  Has anyone else contacted Microsoft? If you are reading this, would you please take a turn opening a support ticket about the scheduling bug?  Given that they have been aware of the problem for years, I'm not optimistic that this will be fixed until every customer is complaining.  The bug is described in their docs, but they won't fix it for whatever reason.  ... I'm guessing here, but fixing this would probably reduce customer spend in ADF, because integration runtimes and virtual networks would not be active for as long as before.  It may seem immaterial, but if all parallel loops for all ADF customers were shortened by just ~10 minutes, it would certainly amount to hundreds of thousands of dollars per year (possibly even millions).  Normally I won't even start looking at workarounds until my pipelines are running an hour longer than they should.  ... What a waste.

 

 

1 ACCEPTED SOLUTION

Hi @v-kpoloju-msft 

The update is from Microsoft:
https://learn.microsoft.com/en-us/azure/data-factory/pipeline-trigger-troubleshoot-guide#degree-of-p...


The problem is that this is an obvious bug and they choose not to fix it, despite the fact that customers have struggled for years:

 

dbeavon3_0-1738694878900.png

 

 

Nearly any modern concurrent runtime or threading library allows tasks to be re-balanced while processing is underway.   Customers of ADF will expect it to perform dynamic load balancing as well, especially given the excessive cost of the underlying compute and network components.

 

The workarounds are often complex, and involve predicting how long each item will take to run before you run it.  This prediction is not always accurate, and building it can take even more programming effort than the work that is done inside the loop.
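As an illustration of that kind of workaround, the sketch below pre-balances items into buckets using predicted durations (a greedy "longest first" assignment), so that each bucket could then be run as its own sequential loop. This is only a sketch under stated assumptions: `balance_by_prediction` and the item names are hypothetical, the predicted minutes are invented, and nothing here is an ADF or Fabric API.

```python
import heapq


def balance_by_prediction(predicted_minutes, buckets):
    """Assign items to buckets so the predicted totals are roughly even.

    Greedy longest-processing-time-first: take items from longest to
    shortest and give each one to the bucket with the smallest total so far.
    """
    heap = [(0, bucket) for bucket in range(buckets)]  # (total minutes, bucket id)
    plan = [[] for _ in range(buckets)]
    ordered = sorted(predicted_minutes.items(), key=lambda kv: kv[1], reverse=True)
    for name, minutes in ordered:
        total, bucket = heapq.heappop(heap)  # least-loaded bucket
        plan[bucket].append(name)
        heapq.heappush(heap, (total + minutes, bucket))
    return plan


# Hypothetical predictions; in practice these are guesses from past runs.
predicted = {"a": 60, "b": 10, "c": 10, "d": 10, "e": 10}
plan = balance_by_prediction(predicted, 2)
print(plan)  # [['a'], ['b', 'c', 'd', 'e']]
```

The plan is only as good as the predictions. If item "a" actually takes 20 minutes instead of 60, the first bucket finishes early and sits idle, which is exactly the situation a dynamic scheduler would avoid without any prediction at all.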

 

 

 


5 REPLIES
v-kpoloju-msft
Community Support

Hi @dbeavon3,

 

We regret the inconvenience you are experiencing and acknowledge your requirements. However, we are unable to raise the support ticket on your behalf.

 

Kindly submit the support ticket using the link provided below.
https://learn.microsoft.com/en-us/power-bi/support/create-support-ticket

 

Thank you for your understanding.

Hi @dbeavon3,

 

Since we haven't heard back from you, we wanted to follow up regarding your ticket.

Could you please provide an update on the status of your ticket? It will help other members of the community who have similar problems solve them faster.

 

Thank you.

Hi @dbeavon3,

 

We apologize for the inconvenience. Unfortunately, we do not have an immediate solution. However, we will escalate this issue to our internal team to gather insights from various perspectives and resolve it as soon as possible.

 

Thank you.

Hi @dbeavon3,

If the issue has been resolved, we kindly request you to share the resolution or key insights here to help others in the community. If we don’t hear back, we’ll go ahead and close this thread.

Should you need further assistance in the future, we encourage you to reach out via the Microsoft Fabric Community Forum and create a new thread. We’ll be happy to help.

 

Thank you for your understanding and participation.
