arlindTrystar
Advocate I

Notebook taking longer in pipeline, compared to running it on its own.

Can someone explain this to me?

[Screenshot: arlindTrystar_0-1726818798484.png]

Why is the timing in the snapshot details different from the timing in the run details?

When run on its own, the notebook takes no more than 2-3 minutes. When run as part of a pipeline, it takes longer.

For context: this notebook is part of a pipeline that copies tables from the source to Azure and then triggers this notebook to load them into the silver lakehouse. Multiple tables are loaded, and after each one is loaded as raw, we trigger this notebook.

2 ACCEPTED SOLUTIONS
Anonymous
Not applicable

Hi @arlindTrystar,

Thanks for the reply from frithjof_v.

In Fabric, the notebook takes longer in the pipeline than on its own because the pipeline triggers it after each table is loaded, and the cumulative effect of these sequential runs can increase the overall duration. Run on its own, the notebook executes only once.

The difference in timing between the snapshot details and the run details can be attributed to several factors:

There could be delays in allocating resources or initializing the environment, which are captured in the snapshot but not in the run details.

The snapshot might include additional overhead from logging and monitoring processes that is not accounted for in the run details.
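
As a rough illustration of the two effects described above (per-run overhead and accumulation across sequential runs), the arithmetic can be sketched in Python. All durations below are made-up placeholders, not measurements from this thread, and the function names are hypothetical.

```python
from datetime import timedelta


def startup_overhead(reported_total: timedelta, cell_execution: timedelta) -> timedelta:
    """Time in one run that was not spent executing notebook cells."""
    return reported_total - cell_execution


def cumulative_duration(per_run_total: timedelta, table_count: int) -> timedelta:
    """Total time when the notebook runs once per table, one run after another."""
    return per_run_total * table_count


# Hypothetical numbers, for illustration only.
reported_total = timedelta(minutes=5)              # duration of one pipeline-triggered run
cell_execution = timedelta(minutes=2, seconds=30)  # time the cells themselves took
table_count = 10                                   # tables loaded by the pipeline

overhead = startup_overhead(reported_total, cell_execution)
total = cumulative_duration(reported_total, table_count)
print(f"overhead per run: {overhead}, total for {table_count} tables: {total}")
```

Comparing the two figures for a single run in this way should show how much of the gap comes from session start-up rather than from the notebook's own work.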

Best Regards,
Yang
Community Support Team

If any post helps, please consider accepting it as the solution to help other members find it more quickly.
If I have misunderstood your needs or you still have problems, please feel free to let us know. Thanks a lot!

frithjof_v
Super User

You can use a master notebook in the pipeline to trigger the child notebook runs.

This way, the notebook runs can share the same Spark session, meaning you won't have to wait for cluster start-up on each notebook run.

https://learn.microsoft.com/en-us/fabric/data-engineering/notebook-utilities#reference-a-notebook
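
A minimal sketch of the master notebook pattern, assuming it runs inside a Fabric notebook where `notebookutils.notebook.run` is available. The child notebook name `load_to_silver`, the `table_name` parameter, and the table list are hypothetical. The runner is passed in as an argument so the loop can also be exercised outside Fabric.

```python
from typing import Any, Callable, Dict, Iterable


def run_child_for_each_table(
    tables: Iterable[str],
    run_notebook: Callable[[str, int, Dict[str, Any]], Any],
    child_notebook: str = "load_to_silver",  # hypothetical child notebook name
    timeout_seconds: int = 600,
) -> Dict[str, Any]:
    """Run the child notebook once per table, sequentially, and collect exit values."""
    results: Dict[str, Any] = {}
    for table in tables:
        # "table_name" is a hypothetical parameter the child notebook would read.
        results[table] = run_notebook(child_notebook, timeout_seconds, {"table_name": table})
    return results


# Inside a Fabric notebook (not runnable elsewhere), the call would look like:
# results = run_child_for_each_table(["customers", "orders"], notebookutils.notebook.run)
```

With this layout the pipeline would need only one Notebook activity, pointing at the master notebook, instead of one notebook run per table.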

There is also an approach called thread pooling, which is said to be even faster.
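
The thread does not spell out what thread pooling looks like here. One common reading, offered as an assumption, is to submit the child notebook runs to Python's standard `concurrent.futures.ThreadPoolExecutor` from a single notebook. In the sketch below the child notebook name, the `table_name` parameter, and `max_workers=4` are all hypothetical, and the runner is injected (in Fabric it would be `notebookutils.notebook.run`).

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, Sequence, Tuple


def run_children_in_thread_pool(
    tables: Sequence[str],
    run_notebook: Callable[[str, int, Dict[str, Any]], Any],
    child_notebook: str = "load_to_silver",  # hypothetical child notebook name
    timeout_seconds: int = 600,
    max_workers: int = 4,                    # hypothetical concurrency limit
) -> Dict[str, Any]:
    """Run the child notebook for several tables concurrently from one notebook."""

    def run_one(table: str) -> Tuple[str, Any]:
        # "table_name" is a hypothetical parameter the child notebook would read.
        return table, run_notebook(child_notebook, timeout_seconds, {"table_name": table})

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in input order; an exception raised in any
        # worker is re-raised here when its result is consumed.
        return dict(pool.map(run_one, tables))
```

Whether concurrent runs are actually faster depends on capacity and session limits, which this sketch does not address.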

High concurrency in data pipelines will probably be a no-code, out-of-the-box solution when it is released. It is on the roadmap:

https://learn.microsoft.com/en-us/fabric/release-plan/data-engineering#investment-areas

For now, however, I think the master notebook -> child notebook pattern is the available option.

3 REPLIES
frithjof_v
Super User

Are you running the notebook inside a ForEach activity in the data pipeline, so that there are multiple parallel executions of the notebook?
