When Microsoft built ADF parallel pipelines, they decided to use static partitioning. The workloads going into a parallel loop are prepared in advance and are not adjusted, even if some items complete faster than others.
In most scheduling engines for parallel work, load balancing is done dynamically. It is very frustrating that Microsoft did things this way. On many occasions, I have wasted time trying to accommodate this silly limitation in ADF. It would be so much easier for Microsoft to solve this in a centralized way for the benefit of all their customers.
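To make the difference concrete, here is a minimal simulation sketch. It is not ADF's or Fabric's actual scheduler: the round-robin pre-assignment, the function names, and the item durations are all assumptions chosen for illustration. It compares the total wall-clock time of a loop where items are pre-assigned to fixed queues against one where each free worker pulls the next item from a shared queue.

```python
import heapq

def static_makespan(durations, workers):
    """Pre-assign items round-robin to fixed queues (an assumed scheme);
    each queue then runs its items one after another."""
    queue_load = [0] * workers
    for i, d in enumerate(durations):
        queue_load[i % workers] += d
    return max(queue_load)

def dynamic_makespan(durations, workers):
    """Shared queue: each item goes to whichever worker becomes free first."""
    free_at = [0] * workers          # time at which each worker is next idle
    heapq.heapify(free_at)
    for d in durations:
        start = heapq.heappop(free_at)
        heapq.heappush(free_at, start + d)
    return max(free_at)

# Hypothetical loop: one 30-minute item and eight 5-minute items, 3 workers.
durations = [30, 5, 5, 5, 5, 5, 5, 5, 5]
print("static :", static_makespan(durations, 3))   # 40
print("dynamic:", dynamic_makespan(durations, 3))  # 30
```

In this made-up case the statically partitioned loop takes 40 minutes because two small items are stuck behind the large one, while the other two queues sit idle after 15 minutes. With a shared queue the loop finishes in 30 minutes, which is the duration of the largest item alone.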
About three years ago I reported this as an ADF bug because it was mind-boggling to me that the partitioning would be so unfriendly.
Is there any plan to fix this issue in "Fabric" now that Microsoft has moved the same technology here from their "ADF" platform?
NOTE: Below is what this problem looks like in Fabric. Notice that there is one item in the parallel loop that is larger than the others, and it causes "stragglers" to be executed long after all the other parallel workers have gone to sleep.
I'm sure this P5 issue is familiar to others. Has anyone else contacted Microsoft? If you are reading this, would you please take a turn opening a support ticket about the scheduling bug? Given that they have been aware of the problem for years, I'm not optimistic that this will be fixed until every customer is complaining. The bug is described in their docs, but they won't fix it for whatever reason. ... I'm guessing here, but fixing this would probably reduce the customer spend in ADF, because integration runtimes and virtual networks would not be active for as long as before. It may seem immaterial, but if all parallel loops for all ADF customers were shortened by just ~10 mins, it would certainly amount to $100's of thousands per year (possibly even $ millions). Normally I won't even start looking at workarounds until my pipelines are running an hour longer than they should. ... What a waste.
Hi @dbeavon3,
We regret the inconvenience you are experiencing and acknowledge your requirements. However, we are unable to raise the support ticket on your behalf.
Kindly submit the support ticket using the link provided below.
https://learn.microsoft.com/en-us/power-bi/support/create-support-ticket
Thank you for your understanding.
Hi @dbeavon3,
Since we haven't heard back from you, we wanted to follow up regarding your ticket.
Could you please provide an update on the status of your ticket? It will help other members of the community who have similar problems solve them faster.
Thank you.
Hi @v-kpoloju-msft
The update is from Microsoft:
https://learn.microsoft.com/en-us/azure/data-factory/pipeline-trigger-troubleshoot-guide#degree-of-p...
The problem is that this is an obvious bug and they choose not to fix it, despite the fact that customers have struggled for years:
Any concurrent or threaded programming language nowadays will allow tasks to be re-balanced while processing is underway. Customers of ADF will expect it to perform the dynamic load-balancing, especially given the excessive cost of the underlying compute and the underlying network components.
The workarounds can often be complex, and involve predicting how long something will take to run, before you run it. This prediction is not always accurate, and working on that prediction can take even more programming effort than the work that is done inside the loop.
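For illustration, one common shape of such a workaround is to pre-balance the batches yourself from predicted durations, using a greedy "longest first, into the lightest batch" heuristic. The sketch below is only that: the item names and predicted seconds are hypothetical, `balance_batches` is not part of any ADF or Fabric API, and it assumes each resulting batch would then be processed sequentially by one parallel branch.

```python
import heapq

def balance_batches(predicted, batch_count):
    """Greedy balancing: take items from longest to shortest predicted
    duration and put each one into the batch with the least total so far."""
    heap = [(0, i) for i in range(batch_count)]   # (current load, batch index)
    batches = [[] for _ in range(batch_count)]
    loads = [0] * batch_count
    ordered = sorted(predicted.items(), key=lambda kv: kv[1], reverse=True)
    for name, secs in ordered:
        load, i = heapq.heappop(heap)
        batches[i].append(name)
        loads[i] = load + secs
        heapq.heappush(heap, (loads[i], i))
    return batches, loads

# Hypothetical predictions (in minutes); real ones would have to come
# from run history or some other estimate.
predicted = {"big": 30, "a": 5, "b": 5, "c": 5, "d": 5,
             "e": 5, "f": 5, "g": 5, "h": 5}
batches, loads = balance_batches(predicted, 3)
print(batches)
print(loads)
```

The result is only as good as the predictions. If "big" actually takes 10 minutes and "a" takes 40, the pre-built batches are unbalanced again, which is exactly the extra effort and fragility described above.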