FilipO
Advocate I

Slow starter pool session initialization

I am currently experimenting with an F2 capacity. Whenever I try to run a notebook, it takes between 2 and 2.5 minutes to start the Spark cluster, even though I'm using the starter pool. All Spark-related settings are left at the workspace defaults. I have tried both running notebooks manually and triggering them from Data Factory.

 

FilipO_0-1698763546062.png

 

 

If I go into a trial workspace and run a notebook, the Spark session starts in just a few seconds. This makes me think it might be a limitation that depends on which capacity you are using. But the documentation gives no indication of this: https://learn.microsoft.com/en-us/fabric/data-engineering/spark-compute

23 REPLIES
FilipO
Advocate I

I have not worked with notebooks for a few weeks, but I realized yesterday that this issue seems to have been resolved: notebooks on my F2 capacity are now starting within 10 seconds. Could we get some type of confirmation from Microsoft?

FilipO
Advocate I

I have unmarked the previous reply as the solution. For me, the main issue here is that we have no proper way of tracking these types of issues. I have reported multiple issues that have been confirmed by Microsoft, but nothing is added to the known issues page: https://support.fabric.microsoft.com/en-us/known-issues - as the page is not updated, we cannot really know whether an issue has been acknowledged. The page also gives a false impression of few or no issues in Fabric. It is difficult to maintain credibility with our customers when we don't have transparency from Microsoft.

Agreed! A Microsoft support engineer told me last week that an issue was in their internal known issues tracker, but that it would not be added to the public known issues page. Very frustrating.

ramonsuarez
Advocate II

It is getting worse for me; sometimes it just fails after trying to start for a long time.

ramonsuarez_0-1704805639448.png

 

FilipO
Advocate I

Glad to get a confirmation I'm not doing something wrong and that it will be fixed. Thank you!

Hi @FilipO ,

Glad to know that you got some insights for your query. 
Please continue using Fabric Community for any help regarding your queries.

v-gchenna-msft
Community Support

Hi @FilipO,

Apologies for the issue you have been facing.
We are reaching out to the internal team to get more information related to your query and will get back to you as soon as we have an update.

Hi @FilipO ,

Apologies for the issue you have been facing. We have an update from the team.
We are impacted by a limitation in supporting 1-node starter pools, mainly for the F2 capacity SKU. The team has added this to the Known Issues section of the documentation, and the latest version should be published by the end of this week. We are working on a fix to enable the 5-second session start experience for F2 capacity. I will update this thread once the fix is rolled out in a few weeks.

Thank you

I'm also experiencing extremely slow start times (2-5 minutes) when using a custom environment. My environment uses all default Spark settings; all it does is install a pip package.

This should not be an accepted solution. The issue is still there. I'm on an F4 and facing startup times of 2+ minutes.

Agreed, this should not be closed - we are also using F4 and the Spark session is not starting; it is actually timing out for us...

wiegelman_0-1709127730224.png

Diagnostics below:

Diagnostic ID: 22a1d13c-d334-4828-8215-dfe8e65dfb60

Timestamp: 2024-02-28T13:38:08.717Z

Message: [object CloseEvent]

JSON
{
  "type": "close",
  "timeStamp": 731861.099999994,
  "code": 1000,
  "reason": "{\"reason\":\"Session error or stopped.\",\"state\":\"session-completed\"}",
  "wasClean": true,
  "target": {
    "protocolsProfile": [
      7,
      4173
    ]
  },
  "currentTarget": {
    "protocolsProfile": [
      7,
      4173
    ]
  },
  "isTrusted": true
}

Additional info: InstanceId: 2491b595-9a6b-4066-bd0d-0e41152e1754
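
For anyone trying to read diagnostics like the ones above: the "reason" field is itself a JSON string, so it has to be decoded a second time to get at the session state. Below is a minimal Python sketch, assuming the payload has been copied out as text. The helper name `summarize_close_event` is made up for illustration and is not part of any Fabric API. Code 1000 is the standard WebSocket "normal closure" code, so the connection itself closed cleanly and the useful information is the nested reason and state.

```python
import json

# Trimmed copy of the CloseEvent payload from the diagnostics above.
# A raw string keeps the backslash-escaped quotes intact for the JSON parser.
close_event_text = r'''
{
  "type": "close",
  "code": 1000,
  "reason": "{\"reason\":\"Session error or stopped.\",\"state\":\"session-completed\"}",
  "wasClean": true
}
'''


def summarize_close_event(raw: str) -> dict:
    """Decode the outer event, then the JSON string nested in "reason".

    Hypothetical helper for illustration only.
    """
    event = json.loads(raw)
    reason_text = event.get("reason", "")
    try:
        inner = json.loads(reason_text)
    except (TypeError, ValueError):
        # "reason" was plain text rather than nested JSON.
        inner = None
    if not isinstance(inner, dict):
        inner = {"reason": reason_text}
    return {
        "code": event.get("code"),
        "was_clean": event.get("wasClean"),
        "reason": inner.get("reason"),
        "state": inner.get("state"),
    }


summary = summarize_close_event(close_event_text)
for key, value in summary.items():
    print(f"{key}: {value}")
```

The fallback branch is there because nothing in this thread shows that "reason" is always nested JSON; if it arrives as plain text, the sketch returns it unchanged with no state.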

This is a separate issue from the rest of this thread. If you search the forums for "Livy session has failed" you will see a number of discussions.

At least the issue is acknowledged - hopefully it gets fixed soon

wiegelman_0-1709137139788.png

 

Notebook startup is getting even slower for me, sometimes taking more than 8 minutes. On average they start in about 2'45", but more and more startups are now taking much longer.

ramonsuarez_0-1704364424941.png

 

Changing to an F4 capacity did not change anything. 

 

This issue is not solved, and the thread status should be changed to reflect that it is still open.

@v-gchenna-msft , has the fix been implemented? It has been ~8 weeks since you said that the fix would be "rolled out in few weeks" and I am still seeing slow startup times on the F2 capacity.

 

Thanks

Hi @cheresier ,

We are reaching out to the internal team to get more information related to your query and will get back to you as soon as we have an update.

Another two weeks have passed. Any update on this? Many thanks!

Hi @FilipO ,

No, not yet. We have a deployment freeze in December and this is still being worked on. The team is targeting Q1 for this to be rolled out to all production regions. I will update this thread once support for smaller capacities is enabled.

Which capacities are affected by this issue right now beyond F2 and F4? 

F8 appears to be affected as well
