Hi everyone,
I have a notebook with several cells that runs very fast interactively, in two minutes or less. When it is scheduled through a pipeline, the duration can reach up to 5 minutes. Do you know what could be causing this and how to solve it?
Hi @rgsalido,
Thank you for the update.
The behavior you're experiencing is normal when running a notebook through a pipeline. Pipelines typically start a new Spark session for each run, which adds extra time compared to running the notebook manually. Because your pipeline runs every 5 minutes, session startup is likely causing most of the delay.
Even with a session tag applied, Spark may still start a new session if the previous one isn't active or if the compute resources are busy.
To improve performance, you can try these steps:
Use a consistent session tag in the Notebook activity so Fabric can reuse the Spark session when possible.
Enable high-concurrency or session sharing for pipeline notebooks, if your workspace supports it. This helps the pipeline connect to an existing Spark application instead of starting a new one.
Check your Spark pool capacity. If other jobs are using the pool, session startup may be slower because executors aren't available right away.
Review the Spark UI timeline for idle periods, which often show Spark waiting for resources, shuffle, or I/O, rather than issues in your code.
If your pipeline needs to run frequently, consider keeping a warm session active with the same tag so notebook runs can attach to it faster.
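A quick way to tell session startup apart from genuinely slow cells is to timestamp the work inside the notebook itself. Below is a minimal plain-Python sketch (the helper names `timed_step` and `CELL_TIMINGS` are illustrative, not a Fabric API):

```python
import time

# Record the wall-clock duration of each logical step so pipeline-run
# timings can be compared against interactive-run timings.
CELL_TIMINGS = {}

def timed_step(name, fn, *args, **kwargs):
    """Run fn, record its elapsed seconds under `name`, and return its result."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    CELL_TIMINGS[name] = time.perf_counter() - start
    return result

# Example: a cheap transform vs. a step that simulates waiting on compute.
timed_step("transform", lambda: sum(range(1000)))
timed_step("simulated_wait", time.sleep, 0.2)
```

If every timed step inside the notebook takes the same few seconds in both run modes, but the pipeline run is still minutes longer end to end, the extra time is spent before your first cell executes, which points at session startup rather than your code.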
Thank you.
Hi @rgsalido,
We haven't heard back from you on the last response and wanted to check whether your query has been answered.
If not, please reply with more details and we will try to help.
Thank you.
Hi @rgsalido,
I wanted to follow up on our previous suggestions regarding the issue. We would love to hear back from you to ensure we can assist you further.
Thank you.
I don't have any more notebooks running in that pipeline. The pipeline runs every 5 minutes. Maybe the pipeline+notebook pattern is not the most efficient.
Ideally, the Spark session should be reused for each execution. I have used the tag in the notebook activity, but it doesn't always use the same session.
Hi @rgsalido,
Yes, you are right: running a notebook interactively is faster than running it through a pipeline. The pipeline adds orchestration overhead and typically starts a new Spark session for each run, whereas an interactive notebook can keep reusing an already-running Spark session for its Python/Scala code against Spark and Delta Lake.
( https://spark.apache.org/docs/latest/api/python/getting_started/index.html )
Hi @rgsalido,
Thank you @Ugk161610 @Srisakthi and @Gpop13 for your replies.
As we have not received a response from you yet, I would like to confirm whether you have successfully resolved the issue or if you require further assistance.
Thank you for your cooperation. Have a great day.
Hi @Ugk161610 ,
I agree the Spark session start time accounts for the idle period at the beginning of the job in the image, but I'm a little confused about these timings:
at 12:13:00 the job started running, yet from 12:14:00 to 12:14:40 it shows idle again. Any idea why it was idle?
Regards,
Srisakthi
Hi @Srisakthi ,
That idle period usually means Spark was waiting (for executors, shuffle, I/O, or a short blocking operation) — not that your code was wrong. Check the Spark UI stage timeline and executor allocation around 12:14 to see which of the above matches your run, then apply the small fixes above.
If you want, paste the exact stage timeline or a screenshot of the Spark UI for 12:13–12:15 and I’ll point to the most likely cause.
– Gopi Krishna
Hi @rgsalido - Do you have more notebooks running as part of that pipeline? Is high concurrency ON for notebooks?
Also, when you say the cells are taking long, did you check the notebook snapshot after the run to compare whether each cell is taking longer, or whether it is just the startup of the Spark compute that is taking long?
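One way to make that comparison concrete: subtract the sum of per-cell durations (read off the notebook snapshot) from the total pipeline run duration; whatever the cells do not account for is startup/orchestration overhead. A small sketch, with an illustrative helper name:

```python
def startup_overhead(total_run_seconds, cell_seconds):
    """Estimate session-startup/orchestration time as the portion of the
    total run that the notebook cells themselves don't account for."""
    return total_run_seconds - sum(cell_seconds)

# Example: a 5-minute pipeline run whose cells only total 2 minutes
overhead = startup_overhead(300, [40, 50, 30])  # -> 180 seconds of overhead
```

A large positive overhead on pipeline runs, with the same cell durations as interactive runs, supports the session-startup explanation discussed above.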