If your code is making many REST calls in a loop, especially across many workspaces, that can kill performance. Each iteration spins up a network call to an endpoint that may be routing through a private link, which can add significant latency.
The Spark cluster itself has some startup overhead (2 to 3 minutes is normal), but the bigger problem is the per-call overhead once your cluster is already up.
Instead of calling labs.admin.list_workspace_users() in a loop for each workspace, check whether sempy or the Fabric REST APIs offer an aggregated/bulk endpoint. If not, consider making the calls in parallel: Spark can parallelize tasks using RDD or DataFrame transformations, or you could use ThreadPoolExecutor in Python. One synchronous call per iteration is going to be slow, especially if you have to do it hundreds or thousands of times.
If you must do them one-by-one, see if you can at least reduce the total calls or do them asynchronously.
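To make the ThreadPoolExecutor suggestion concrete, here is a minimal sketch. The `list_users` function below is a self-contained placeholder; in a real Fabric notebook you would replace its body with the actual `labs.admin.list_workspace_users()` call (the exact parameters of that sempy-labs function may differ, so check its docs). Since the calls are I/O-bound, threads let the network latency overlap instead of being paid once per workspace in sequence.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def list_users(workspace_id):
    # Placeholder so the sketch runs standalone. In a notebook, replace with
    # something like: labs.admin.list_workspace_users(workspace=workspace_id)
    return {"workspace": workspace_id, "users": ["user@example.com"]}

def fetch_all_users(workspace_ids, max_workers=8):
    """Fan the per-workspace REST calls out across a thread pool."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit every workspace up front, then collect as each call finishes.
        futures = {pool.submit(list_users, ws): ws for ws in workspace_ids}
        for future in as_completed(futures):
            results[futures[future]] = future.result()
    return results

workspaces = [f"ws-{i}" for i in range(20)]
all_users = fetch_all_users(workspaces)
print(len(all_users))  # one entry per workspace
```

Keep `max_workers` modest (something like 4 to 16): the Fabric admin APIs are rate-limited, so too many concurrent callers just trades latency for 429 throttling responses.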
Hi @SriThiru ,
As we haven’t heard back from you, we wanted to kindly follow up and check whether the solution provided worked for you. Let us know if you need any further assistance.
If our response addressed your issue, please mark it as "Accept as solution" and click "Yes" if you found it helpful.
Regards,
Chaithanya.