FatimaArshad
New Member

Fabric pipeline - Notebook crashes after 3 hours

Hi,

 

I have created a modelling notebook. It contains four models and usually takes 12 hours to run. When I put it in a Fabric Pipeline, it generates this error after 3 hours:

[Screenshot attachment: FatimaArshad_0-1740143073173.png - the error message]

 

I am unsure of how to solve this.

 

Edit: Added configurations 

[Screenshot attachment: FatimaArshad_0-1740157915054.png - notebook configuration]

 

1 ACCEPTED SOLUTION
nilendraFabric
Community Champion

 

Thanks for sharing the details.

Non-distributed Python training (SARIMAX/GAM) creates single-threaded workloads that don't leverage Spark's parallelism.

Each notebook call in a pipeline creates a new Spark session (even for simple I/O), causing Livy session throttling.

The nested loops create 1,600 variables × 4 models = 6,400 sequential tasks, overwhelming small compute nodes.

Some possible solutions:

Enable High Concurrency Mode
Reduces Spark session overhead by 70% through session sharing.

Use Spark Job Definitions
Queueable batch jobs avoid interactive capacity limits.

Restructure the training workflow using Spark's native distributed processing instead of Python loops (see the sketch below). Combine this with batch job definitions and proper resource allocation to stay within Fabric's capacity limits while maintaining throughput.
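A minimal sketch of what that restructuring could look like, assuming the training data sits in a long-format Delta table with `series_id`, `ds` and `y` columns; the table name, column names and the SARIMAX order are placeholders, not taken from this thread:

```python
# Hypothetical sketch: distribute per-series SARIMAX fits with a grouped-map
# pandas UDF instead of a sequential Python loop. Table and column names are
# assumptions, not the poster's actual schema.
import pandas as pd
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, StringType, DoubleType
from statsmodels.tsa.statespace.sarimax import SARIMAX

spark = SparkSession.builder.getOrCreate()

# Long format: one row per (series_id, ds, y) observation.
df = spark.read.table("lakehouse.training_data")

result_schema = StructType([
    StructField("series_id", StringType()),
    StructField("aic", DoubleType()),
])

def fit_one_series(pdf: pd.DataFrame) -> pd.DataFrame:
    # Each group (one dependent variable) is fitted in its own executor task,
    # so the 1,600 fits run in parallel across the cluster instead of in a loop.
    pdf = pdf.sort_values("ds")
    model = SARIMAX(pdf["y"], order=(1, 1, 1)).fit(disp=False)
    return pd.DataFrame({"series_id": [pdf["series_id"].iloc[0]],
                         "aic": [model.aic]})

results = df.groupBy("series_id").applyInPandas(fit_one_series, schema=result_schema)
results.write.mode("overwrite").saveAsTable("lakehouse.model_metrics")
```

In practice you would return fitted parameters or forecasts rather than just the AIC, but the shape of the job is the same: each of the 1,600 series becomes an independent Spark task instead of one iteration of a driver-side loop.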

 

 

If this is helpful, please accept the answer.


9 REPLIES
v-achippa
Community Support

Hi @FatimaArshad,

 

Thank you for reaching out to Microsoft Fabric Community.

 

Thank you @nilendraFabric for addressing the issue.

 

As we haven’t heard back from you, we wanted to kindly follow up to check whether the solution provided by the super user resolved your issue, or let us know if you need any further assistance.
If our super user response resolved your issue, please mark it as "Accept as solution" and click "Yes" if you found it helpful.

 

Thanks and regards,

Anjan Kumar Chippa

Hi @FatimaArshad,

 

We wanted to kindly follow up to check if the solution provided by the super user resolved your issue.
If our super user response resolved your issue, please mark it as "Accept as solution" and click "Yes" if you found it helpful.

 

Thanks and regards,

Anjan Kumar Chippa

Hi @FatimaArshad,

 

As we haven’t heard back from you, we wanted to kindly follow up to check if the solution provided by the super user resolved your issue.
If our super user response resolved your issue, please mark it as "Accept as solution" and click "Yes" if you found it helpful.

 

Thanks and regards,

Anjan Kumar Chippa

FatimaArshad
New Member

@nilendraFabric Do you think this is a compute issue?

My initial thinking is resource and compute constraints. Which SKU are you on, and what are the sizes of these models? Please share further details.

Linear Regression, SARIMAX, GAM, XGBoost. Training is done on 1,600 dependent variables. One loop goes through a set of features, a second goes through the models, and a third applies the models to all of these variables.
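For readers following the thread, a self-contained toy reconstruction of that loop structure (all data and names here are made up, and it is scaled down from 1,600 targets to 20 and from four model families to two) shows why everything runs as one long sequential chain on the driver:

```python
# Hypothetical, scaled-down reconstruction of the nested loops described above.
# Every fit runs one after another on the driver, so nothing uses Spark's parallelism.
import numpy as np
from sklearn.linear_model import LinearRegression
from xgboost import XGBRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))                                   # toy feature matrix
targets = {f"y_{i}": rng.normal(size=500) for i in range(20)}    # stand-in for 1,600 targets
feature_sets = {"all": list(range(10)), "first5": list(range(5))}
models = {"linear": LinearRegression, "xgboost": XGBRegressor}

fitted = {}
for fs_name, cols in feature_sets.items():          # loop 1: feature sets
    for model_name, Model in models.items():        # loop 2: model types
        for target_name, y in targets.items():      # loop 3: dependent variables
            fitted[(fs_name, model_name, target_name)] = Model().fit(X[:, cols], y)
```

With the real 1,600 targets and all four model families, the innermost call runs roughly 6,400 times back to back, which is the sequential workload the accepted answer describes.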

nilendraFabric
Community Champion

Hello @FatimaArshad 

 

which F Sku are you on ?

I am using Spark 3.4, compute small, 1-4 nodes. But I only use Spark to read and write to Delta tables; I use Python mainly.
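A minimal sketch of that pattern (table and column names are placeholders): Spark is touched only for the Delta read and write, and all of the fitting happens in pandas/statsmodels on the driver, which is exactly the single-threaded workload described in the accepted answer:

```python
# Hypothetical sketch of "Spark only for Delta I/O, Python for the modelling".
# Table names are placeholders; all training happens on the driver node.
import pandas as pd
from pyspark.sql import SparkSession
from statsmodels.tsa.statespace.sarimax import SARIMAX

spark = SparkSession.builder.getOrCreate()

# Spark is used once to read the Delta table into pandas on the driver...
pdf = spark.read.table("lakehouse.training_data").toPandas()

# ...then the modelling is pure Python on the driver: one core does every fit in turn.
rows = []
for series_id, grp in pdf.groupby("series_id"):
    fit = SARIMAX(grp.sort_values("ds")["y"], order=(1, 1, 1)).fit(disp=False)
    rows.append({"series_id": series_id, "aic": fit.aic})

# ...and Spark is used once more to write the results back as a Delta table.
spark.createDataFrame(pd.DataFrame(rows)).write.mode("overwrite").saveAsTable("lakehouse.model_metrics")
```

Compare this with the applyInPandas sketch under the accepted solution, where the same per-series fits are pushed out to the executors instead of queuing up on the driver.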
