If your code is making many REST calls in a loop, especially across many workspaces, that can kill performance. Each iteration spins up a network call to an endpoint that may be routing through a private link, which can add significant latency.
The Spark cluster itself has some startup overhead (2 to 3 minutes is normal), but the bigger problem is the per-call overhead once your cluster is already up.
Instead of calling labs.admin.list_workspace_users() in a loop for each workspace, check whether sempy or the Fabric REST APIs offer an aggregated/bulk endpoint. If not, consider making the calls in parallel: Spark can parallelize tasks using RDD or DataFrame transformations, or you could use ThreadPoolExecutor in Python. One synchronous call per iteration is going to be slow, especially if you have to do it hundreds or thousands of times.
If you must do them one-by-one, see if you can at least reduce the total calls or do them asynchronously.
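To make the ThreadPoolExecutor suggestion concrete, here is a minimal sketch. The `list_users` function below is a self-contained placeholder; in a real Fabric notebook you would replace its body with the actual `labs.admin.list_workspace_users()` call (the exact parameters of that sempy-labs function may differ, so check its docs). Since the calls are I/O-bound, threads let the network latency overlap instead of being paid once per workspace in sequence.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def list_users(workspace_id):
    # Placeholder so the sketch runs standalone. In a notebook, replace with
    # something like: labs.admin.list_workspace_users(workspace=workspace_id)
    return {"workspace": workspace_id, "users": ["user@example.com"]}

def fetch_all_users(workspace_ids, max_workers=8):
    """Fan the per-workspace REST calls out across a thread pool."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit every workspace up front, then collect as each call finishes.
        futures = {pool.submit(list_users, ws): ws for ws in workspace_ids}
        for future in as_completed(futures):
            results[futures[future]] = future.result()
    return results

workspaces = [f"ws-{i}" for i in range(20)]
all_users = fetch_all_users(workspaces)
print(len(all_users))  # one entry per workspace
```

Keep `max_workers` modest (something like 4 to 16): the Fabric admin APIs are rate-limited, so too many concurrent callers just trades latency for 429 throttling responses.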
Hi @SriThiru ,
As we haven’t heard back from you, we wanted to kindly follow up and check whether the solution provided worked for you. Let us know if you need any further assistance.
If our response addressed your issue, please mark it as "Accept as solution" and click "Yes" if you found it helpful.
Regards,
Chaithanya.