Hello!
I have been attempting to use Dataflow Gen2 CI/CD with parameters, with very little success. The goal was to reuse the dataflow within a pipeline For Each activity using parallel/concurrent executions.
I have experimented with altering batch size (even going as low as 2), but have had to move to sequential execution, as it seems this functionality simply does not work reliably.
The pattern of errors I have encountered:
1) The pipeline fails when running the dataflow: Dataflow refresh job failed with status:
Failed. Error Info: { errorCode: JobInstanceStatusFailed, message: Job instance failed without detail error, requestId: 609e5561-ad0d-40ca-b5df-20d6a0dcad86 }
Oftentimes this error appears to be thrown while the pipeline is checking the dataflow refresh status; the dataflow can still succeed, yet the error stays with the pipeline run.
2) When the dataflow actually does fail, drilling into the errors on the dataflow run details simply results in the Fabric interface itself throwing an error.
I am hoping this feedback can be passed on to the product team, as it is frustrating not being able to see why things are failing when executed concurrently; these issues do not occur when executing sequentially.
Thanks!
Internally, Dataflow Gen2 jobs may hit compute or metadata contention when too many parallel refreshes are triggered in a short interval. The Dataflow execution engine sometimes fails to forward exceptions back to the parent pipeline, and the Fabric UI does not yet provide robust debugging/logging for concurrent Dataflow executions. Please note that Dataflow Gen2 is still evolving its support for robust parameter handling in parallel executions, especially when resource reuse or contention is present.
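Because exceptions are not always forwarded, the status the pipeline reports may not match the status of the dataflow refresh itself. Below is a minimal sketch of checking the refresh directly; it assumes the Fabric Job Scheduler REST endpoint (GET /v1/workspaces/{workspaceId}/items/{itemId}/jobs/instances/{jobInstanceId}) applies to Dataflow Gen2 refresh runs, and all IDs and the token are placeholders:

```python
import requests

# Placeholder values for illustration only.
WORKSPACE_ID = "<workspace-id>"
DATAFLOW_ID = "<dataflow-item-id>"
JOB_INSTANCE_ID = "<job-instance-id>"   # the dataflow refresh run to inspect
TOKEN = "<aad-bearer-token>"

# Query the job instance directly; the status returned here may differ from
# what the pipeline activity reported when exception forwarding fails.
url = (
    f"https://api.fabric.microsoft.com/v1/workspaces/{WORKSPACE_ID}"
    f"/items/{DATAFLOW_ID}/jobs/instances/{JOB_INSTANCE_ID}"
)
resp = requests.get(url, headers={"Authorization": f"Bearer {TOKEN}"})
resp.raise_for_status()

instance = resp.json()
# Compare this status with the pipeline activity result to confirm whether the
# refresh actually succeeded despite a JobInstanceStatusFailed error.
print(instance.get("status"), instance.get("failureReason"))
```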
As a temporary workaround, try the following:
That said, it is always ideal to follow the best practices listed below:
Please give Kudos and Accept as Solution if the reply was helpful; it will benefit other community members who face the same issue.
Thank you @Vinodh247 for the detailed response!
For others potentially running into this issue:
Throttle parallelism: Instead of completely disabling parallelism, set Batch count = 2 and introduce a Wait activity (2–5 seconds) between executions to reduce race conditions. I experimented with different batch sizes and wait times (even using rand() to randomize dataflow kick-offs); while this did improve things, it never fully resolved the issue. A sketch of the same throttling idea, applied outside the pipeline, is shown after this list.
Isolate Dataflows per iteration: If possible, clone the dataflow for testing and assign different names to each execution path to test whether the issue is caused by shared state or metadata locks. I am actually doing this in another project and it works well; the issue is that there is a limit on the number of workspace items, and it becomes extremely onerous when there are more than a handful of iterations.
Avoid Parameter Binding in Highly Parallel Jobs: Parameters in Dataflow Gen2 often get lost or mismatched when triggered concurrently. For now, externalize transformations to Notebooks or Spark job definitions if possible. This is unfortunate; hopefully a more robust and performant option will be available via Dataflow Gen2 in the future.
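To make the throttling point concrete, here is a minimal sketch of staggered, concurrency-capped refresh triggering outside the pipeline. It assumes the Fabric Job Scheduler on-demand job endpoint; the jobType value, the parameter payload shape, and all IDs/tokens are placeholders or assumptions rather than confirmed API details:

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor

import requests

# Placeholder values for illustration only.
WORKSPACE_ID = "<workspace-id>"
DATAFLOW_ID = "<dataflow-item-id>"
TOKEN = "<aad-bearer-token>"

def trigger_refresh(region: str) -> int:
    # Stagger submissions with a small random delay (jitter) so concurrent
    # refreshes do not all hit the service in the same instant.
    time.sleep(random.uniform(2, 5))

    # On-demand refresh via the Fabric Job Scheduler REST API; the jobType
    # value for Dataflow Gen2 is an assumption here.
    url = (
        f"https://api.fabric.microsoft.com/v1/workspaces/{WORKSPACE_ID}"
        f"/items/{DATAFLOW_ID}/jobs/instances?jobType=Refresh"
    )
    resp = requests.post(
        url,
        headers={"Authorization": f"Bearer {TOKEN}"},
        # Hypothetical payload shape for passing a dataflow parameter per iteration.
        json={"executionData": {"parameters": {"Region": region}}},
    )
    return resp.status_code

# Cap concurrency at 2, mirroring "Batch count = 2" in the For Each activity.
regions = ["east", "west", "north", "south"]
with ThreadPoolExecutor(max_workers=2) as pool:
    results = list(pool.map(trigger_refresh, regions))
print(results)
```

The same cap-plus-jitter idea applies inside the pipeline itself (Batch count plus a Wait activity); the sketch simply shows it in code form for testing outside the pipeline.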
I will be sure to provide feedback via Fabric; I was not aware of that option.
Thanks again!