ClimberScott
Frequent Visitor

Sudden jump in time to execute pipeline

Hello, 

 

SUMMARY: Pipeline is taking 5 minutes longer per dataflow, yet underlying dataflows show no increase in time taken to run. 

 

I have a master pipeline process I created that ingests a number of daily files from blob storage. It has a number of steps, including sub-pipelines, notebooks and so on. This process has always taken around 3 hours to complete. However, in the last few days, the time to process has jumped to 8 hours! 

 

Digging into the details, there are 2 sub-pipelines that each run 21 Gen2 dataflows that are causing most of the issues.  These 2 pipelines used to take around 30 minutes each, and this has ballooned to just under 3 hours each. 

 

I have reviewed the run history for the pipelines, and what I see is that each of the 21 dataflows is now taking roughly 340 seconds longer to process. This varies by dataflow, with some taking 420 seconds longer and others 318 seconds longer, but it is consistently in that range, which makes me think there's a common cause.

 

What is interesting is that when I drill down into an individual dataflow and check its recent run history, I see NO increase in the time taken to run. So the delay is coming from the pipeline; it doesn't appear to be due to the dataflow itself.
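One way to quantify the gap described above is to subtract the dataflow's own refresh duration from the duration of the pipeline activity that wrapped it; whatever is left is pipeline-side overhead. A minimal sketch, using made-up timestamps for illustration (the real values would come from your pipeline run history and dataflow refresh history):

```python
from datetime import datetime

def overhead_seconds(activity_start, activity_end, refresh_start, refresh_end):
    """Overhead = time the pipeline activity spent outside the dataflow's own refresh."""
    fmt = "%Y-%m-%dT%H:%M:%S"
    activity = datetime.strptime(activity_end, fmt) - datetime.strptime(activity_start, fmt)
    refresh = datetime.strptime(refresh_end, fmt) - datetime.strptime(refresh_start, fmt)
    return activity.total_seconds() - refresh.total_seconds()

# Hypothetical example: a 400 s pipeline activity wrapping a 60 s dataflow refresh
print(overhead_seconds(
    "2026-02-20T06:00:00", "2026-02-20T06:06:40",   # pipeline activity: 400 s
    "2026-02-20T06:05:30", "2026-02-20T06:06:30"))  # dataflow refresh: 60 s
# 340.0
```

Running this per dataflow over the last few weeks would show whether the overhead jumped on a specific date, which helps when raising a support ticket.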

 

The dataflows in one pipeline read and write to the bronze lakehouse, whereas the dataflows in the other read from bronze and write to the silver lakehouse (ie different destination lakehouses but similar issues). 

 

If anyone has any suggestions on what to look at as a potential cause, I'd be happy to hear them!

 

Thanks in advance. 

1 ACCEPTED SOLUTION

Changing to a notebook would start up faster, around 3-5 seconds using a 'StarterPool'. StarterPools are warm servers ready to run. If you change the environment, for example by using a managed identity in a workspace, Fabric will no longer use the StarterPool and will have to spin up a custom cluster to use the managed identity, thus taking just as long as the dataflow to spin up.

That being said, once the first startup is done, whether dataflow or notebook, the subsequent runs will be faster.

10 REPLIES
cassidy
Power Participant

Also experiencing this - thought I was crazy. The Dataflow being triggered takes 60 seconds to refresh, but the Pipeline stays on that Dataflow node for 3.5 minutes - a big waste of time, and it wasn't like that until recently.

Weirdly, Pipelines call them Gen2 Dataflows when they are not.

"dataflowType": "Dataflow-Gen2"

cyberalan
Frequent Visitor

I have this problem as well. Rather than changing to a notebook, does anyone have another workaround for this issue?

Which dataflows are you using? We have experienced the same problem since last Friday.

 

For us the solution was converting the Dataflow Gen2 entities to Dataflow Gen2 (CI/CD). The next run showed exactly the same overhead as we saw before the problem occurred. It seems that the overhead for the "old" Dataflow Gen2 has increased.

 

I've raised the issue here as well: Re: Sudden increase Pipeline overhead cost - Microsoft Fabric Community

So I would recommend, if you use Dataflow Gen2, converting those to Dataflow Gen2 (CI/CD) and seeing if the problem persists.

 

Hope this is helpful


[Tip] Keep CALM and DAX on.
[Solved?] Hit “Accept as Solution” and leave a Kudos.
[About] Chiel | SuperUser (2023–2) |

Two things you can try are: first, make sure StarterPool is being used (if Managed Identity or a custom Spark setup was added, it may force a new cluster to spin up each time). Second, run your dataflows in parallel batches instead of strictly one after another, so they can reuse the same Spark session and avoid repeated startup delays in Microsoft Fabric.
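If you end up orchestrating the refreshes from a notebook rather than as chained pipeline activities, batching them in parallel is straightforward with the standard library. A minimal sketch, where `refresh_fn` is a placeholder for whatever call actually triggers a refresh in your setup (an assumption, not a real Fabric API):

```python
from concurrent.futures import ThreadPoolExecutor

def run_in_batches(dataflow_names, refresh_fn, batch_size=4):
    """Trigger refreshes in parallel batches instead of strictly one after another.

    dataflow_names: list of identifiers for the dataflows to refresh.
    refresh_fn: placeholder callable that refreshes one dataflow and returns a result.
    batch_size: how many refreshes run concurrently; tune to your capacity limits.
    """
    results = []
    with ThreadPoolExecutor(max_workers=batch_size) as pool:
        for i in range(0, len(dataflow_names), batch_size):
            batch = dataflow_names[i:i + batch_size]
            # pool.map preserves input order within each batch
            results.extend(pool.map(refresh_fn, batch))
    return results

# Hypothetical usage with 21 dataflows and a stubbed refresh call
outcomes = run_in_batches([f"df_{n}" for n in range(21)],
                          lambda name: name + ":ok",
                          batch_size=7)
```

With 21 dataflows and a fixed per-run startup cost, a batch size of 7 pays the startup penalty roughly 3 times instead of 21, at the cost of higher concurrent load on the capacity.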

 

v-priyankata
Community Support

Hi @ClimberScott 

Thank you for reaching out to the Microsoft Fabric Forum Community.

@Thomaslleblanc Thanks for the inputs.

I hope the information provided by users was helpful. If you still have questions, please don't hesitate to reach out to the community.

 

Hi @ClimberScott 

Hope everything's going smoothly on your end. I wanted to check whether the issue got sorted out. If you have any other issues, please reach out to the community.

Thomaslleblanc
Super User

Gen2 Dataflows always introduce overhead when they are triggered from a pipeline, and this overhead occurs before the actual dataflow execution begins. When a pipeline calls a dataflow, Fabric or Synapse must allocate compute resources, spin up the dataflow engine, and validate dependencies before any transformation work starts. If the underlying capacity is under load or experiencing throttling, this startup phase can grow significantly, sometimes adding several minutes per run.

This aligns directly with the behavior you are seeing. The pipeline reflects an additional three hundred to four hundred seconds for each dataflow, while the individual dataflow run history shows no increase in execution time. The dataflow logs measure only the internal transformation work, but they do not capture the startup or provisioning delay that happens outside the dataflow itself.

If you have a Fabric capacity, look at the Fabric Capacity Metrics app to see whether queueing or throttling is happening. If you are on a Pro license instead, you are sharing capacity with other Pro workloads, and there is no way to see how others are affecting you.

Hello @Thomaslleblanc do you recommend changing the dataflows to notebooks? Or is there another way to process these steps more quickly? 

Changing to a notebook would start up faster, around 3-5 seconds using a 'StarterPool'. StarterPools are warm servers ready to run. If you change the environment, for example by using a managed identity in a workspace, Fabric will no longer use the StarterPool and will have to spin up a custom cluster to use the managed identity, thus taking just as long as the dataflow to spin up.

That being said, once the first startup is done, whether dataflow or notebook, the subsequent runs will be faster.

ClimberScott
Frequent Visitor

Just to add, I have NOT enabled staging or fast copy. I'm a bit anxious about making any changes to this process, as it is a production system!
