Hi All,
I am using an F4 capacity, and sometimes it takes more than 10 minutes to start Spark in a notebook that is part of a pipeline. Please refer to screenshot 1: it took 11 minutes to start.
1) What could be the reason? How can I check this and make it start within 3 minutes?
2) Please refer to screenshot 2. Sometimes the notebook gets stuck and says we have hit the Spark limits. How can I check this and make sure we are not hitting the limits? Where can I monitor these limits?
Screenshot:1
Screenshot:2
Hi @NagaRK
Thank you for contacting the Microsoft Fabric Community Forum.
The long startup times for notebooks and Spark job failures in Microsoft Fabric are mainly due to capacity limitations with your current F4 SKU. In the first screenshot, the notebook takes over 11 minutes to start, and the Spark cluster remains in the "Starting" state. This usually happens when there aren't enough Spark compute resources or when demand is high, such as during peak times or when several jobs run at once. Since F4 is a lower-tier SKU, it has limited compute and concurrency, which can cause delays. Upgrading to a higher SKU like F8 or F16 can help by providing more resources and faster startups. Enabling warm pools, if available, can also reduce cold start times.
The second screenshot shows a notebook failing to run due to hitting Spark compute or API rate limits, resulting in an HTTP 429 error. This happens when the capacity is already at its limit for concurrent Spark jobs or API requests, often because multiple pipelines or notebooks are running at once. To avoid this, monitor Spark job usage with the Spark Monitoring Hub and use the Capacity Metrics App in the Admin Portal to track resource use. Also, make sure notebooks close Spark sessions promptly to free up resources for other jobs.
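One concrete way to "close Spark sessions promptly" is to stop the session in the final cell of each pipeline notebook so the capacity slot is released for queued jobs. A minimal sketch, assuming the Fabric/Synapse `mssparkutils` runtime module (only available inside a Fabric Spark session, so the import is guarded):

```python
# Hedged sketch: release the Spark session at the end of a Fabric notebook
# so its compute is freed for other queued jobs on the capacity.

def stop_spark_session() -> bool:
    """Stop the current Spark session when running inside Fabric.

    Returns True if a session was stopped, False when no Fabric
    runtime is available (e.g. running locally)."""
    try:
        # mssparkutils only exists inside the Fabric/Synapse Spark runtime
        from notebookutils import mssparkutils
    except ImportError:
        return False  # not in a Fabric notebook; nothing to stop
    mssparkutils.session.stop()  # ends the session, releasing Spark cores
    return True

# In a pipeline notebook, call this from the last cell:
# stop_spark_session()
```

Alternatively, enabling the "session timeout" setting on the notebook (or stopping the session from the pipeline's notebook activity settings) achieves the same effect without code.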
For additional guidance, refer to the official Microsoft Fabric documentation on Understand your Fabric capacity throttling - Microsoft Fabric | Microsoft Learn.
Regards,
Karpurapu D,
Microsoft Fabric Community Support Team.
Thanks @v-karpurapud. Is it a good approach to scale up the SKU to F16 before we start the pipelines and scale it back down to F2 or F4 once the pipeline completes, to save cost?
Hi @NagaRK
Yes, scaling up to F16 before starting pipelines and then scaling down afterward is an effective way to reduce Spark startup latency and avoid resource throttling. This approach optimizes performance during peak workloads and helps manage costs during less active periods. You can adjust the scale manually via the Azure portal or automate the process using the Azure CLI, REST APIs, or Logic Apps, depending on your requirements. Also ensure that the lower-tier SKU can handle any ongoing or background tasks after scaling down.
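To illustrate the automation option, here is a hedged sketch of the scale-up/scale-down call against the Azure Resource Manager REST API. The resource type `Microsoft.Fabric/capacities` and the `api-version` below are assumptions to verify against the current Azure documentation, and authentication (obtaining a bearer token) is out of scope:

```python
# Hedged sketch: build the ARM PATCH request that changes a Fabric
# capacity's SKU, e.g. F4 -> F16 before a pipeline run and back after.
# Resource type and api-version are assumptions; check current Azure docs.

def build_scale_request(subscription_id: str, resource_group: str,
                        capacity_name: str, sku: str) -> tuple[str, dict]:
    """Return the (url, body) pair for a PATCH that changes the capacity SKU."""
    url = (
        "https://management.azure.com"
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Fabric/capacities/{capacity_name}"
        "?api-version=2023-11-01"
    )
    body = {"sku": {"name": sku, "tier": "Fabric"}}
    return url, body

# Scale up to F16 before the pipeline, then back down to F4 afterwards:
url, body = build_scale_request("sub-id", "my-rg", "mycapacity", "F16")
# requests.patch(url, json=body,
#                headers={"Authorization": f"Bearer {token}"})
```

The same PATCH can be issued from a pipeline Web activity or a Logic App before the first notebook activity and again after the last one, so the capacity only runs at F16 while the pipeline is active.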
Regards,
Karpurapu D,
Microsoft Fabric Community Support Team.
You need at least an F64 capacity, which is equivalent to a Power BI Premium P1 capacity; an F4 provides only very small virtual machines. Your capacity is small and supports only a small number of Spark pools, so most likely your notebook had to wait in a queue for a pool to become available.