Hi team
I want to run notebooks in parallel, but I see that the time it takes to run them through a pipeline is longer than the time taken to run them individually. Is there a setting that needs to be changed?
Also, is there a way for each notebook to not spin up its own cluster but use the same cluster? The notebooks seem to be queued as well. Why?
Hi @priyankabis ,
Running notebooks in parallel can sometimes lead to longer execution times due to factors such as resource contention, the overhead of managing parallel tasks, or inefficient configurations.
For your first question, first make sure that the cluster has enough resources (CPU, memory) to handle multiple notebooks running in parallel. If resources are limited, tasks may compete for them and cause delays. Second, check whether the cluster configuration can handle parallel execution efficiently. Running notebooks through a pipeline may also incur additional overhead compared to running them individually, for example setup and teardown time, data transfers between stages, or other pipeline management tasks.
For your second question, you can implement a clustering policy that forces all notebooks to use a shared cluster. This helps prevent the creation of multiple clusters and ensures efficient utilization of resources. If notebooks are queuing, it may be due to capacity limitations of the cluster. Increasing the cluster size or optimizing the notebooks to run more efficiently can help reduce queuing.
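One way to have several notebooks share a single Spark session, rather than each pipeline activity requesting (and queuing for) its own, is to orchestrate them from one driver notebook. Below is a minimal sketch, assuming Fabric's mssparkutils is available in your runtime; the notebook names and the "run_date" argument are placeholders.

```python
# Minimal sketch: run several child notebooks in parallel from one driver
# notebook so they all reuse the driver's Spark session, instead of each
# pipeline activity starting (and queuing for) its own session.
# Assumption: mssparkutils is available in the Fabric runtime; the notebook
# names and the "run_date" argument below are placeholders.
from concurrent.futures import ThreadPoolExecutor

from notebookutils import mssparkutils

child_notebooks = ["Notebook_A", "Notebook_B", "Notebook_C"]  # placeholder names

def run_child(name: str) -> str:
    # notebook.run(path, timeout_in_seconds, arguments) returns whatever the
    # child notebook passes to mssparkutils.notebook.exit(...)
    return mssparkutils.notebook.run(name, 1800, {"run_date": "2025-01-01"})

# Keep the pool small: all children share the same executors, so launching
# too many at once just recreates the resource contention described above.
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(run_child, child_notebooks))

print(results)
```

Compared with separate pipeline notebook activities, this trades per-notebook isolation for a single shared session, so resource limits and failures affect all children together.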
Best Regards
Yilong Zhou
If this post helps, then please consider accepting it as the solution to help the other members find it more quickly.
Can you direct me to the page that mentions the clustering policy?
Hi @priyankabis ,
You can check out the document below, which describes how to attach a notebook to a cluster:
Notebook compute resources | Databricks on AWS
Best Regards
Yilong Zhou
If this post helps, then please consider accepting it as the solution to help the other members find it more quickly.
Hi
I don't think this documentation is right for Fabric. This documentation is for Databricks.
Hi @priyankabis ,
I'm sorry the official documentation above didn't help you, but I think you can enable High Concurrency Mode. This mode allows multiple notebooks to share a Spark session.
1. Navigate to the Data Engineering/Science section.
2. Select the Spark Compute menu.
3. If the High Concurrency Mode option is disabled, enable it.
You can also look at this link: Introducing High Concurrency Mode in Notebooks for Data Engineering and Data Science workloads in Mi...
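As a hedged example of what this looks like in practice (assuming your Fabric runtime exposes mssparkutils.notebook.runMultiple, as recent runtimes do), you can launch several notebooks in parallel inside the current high-concurrency session with a single call; the notebook names below are placeholders.

```python
# Sketch only: run a set of notebooks in parallel within the *current* Spark
# session, which is what high concurrency mode is meant to enable.
# Assumption: runMultiple is available in your runtime version, and
# "Notebook_A"/"Notebook_B" are placeholder notebook names in this workspace.
from notebookutils import mssparkutils

# Simplest form: run both notebooks concurrently with default settings and
# collect their exit values.
results = mssparkutils.notebook.runMultiple(["Notebook_A", "Notebook_B"])
print(results)

# runMultiple also accepts a DAG-style dictionary if you need per-notebook
# arguments, timeouts, or dependencies; see the official documentation for
# the exact structure supported by your runtime.
```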
Best Regards
Yilong Zhou
If this post helps, then please consider accepting it as the solution to help the other members find it more quickly.