Hi everyone,
I am building a metadata driven orchestration system in Fabric where jobs are triggered across Bronze, Silver, and Gold layers. One area I am exploring is how to start downstream notebooks programmatically in a way that uses Spark resources efficiently.
From what I can tell there are two main options:
Use the Fabric REST API (called via the Semantic Link library inside a notebook) to start the downstream notebook
Call notebookutils.run from within another notebook
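Very roughly, I picture the two options like this (the IDs, notebook name, and job payload below are placeholders based on my reading of the docs, not tested code):

# Option 1: submit the notebook as an on-demand job via the Fabric REST API (Semantic Link)
import sempy.fabric as fabric

client = fabric.FabricRestClient()
workspace_id = "<workspace-guid>"   # placeholder
notebook_id = "<notebook-guid>"     # placeholder
client.post(
    f"v1/workspaces/{workspace_id}/items/{notebook_id}/jobs/instances?jobType=RunNotebook",
    json={"executionData": {"parameters": {"layer": {"value": "silver", "type": "string"}}}},
)

# Option 2: run the notebook inline from the current notebook (notebookutils is built in)
result = notebookutils.notebook.run("silver_load", 1800, {"layer": "silver"})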
My concern is Spark concurrency and resource usage. Ideally I would like jobs to start in a high concurrency session so they do not spin up a brand new Spark context every time, since that adds overhead.
Has anyone found a best practice for programmatically starting notebooks in Fabric so that Spark resources are reused efficiently? For example, is there a recommended way to target a high concurrency session, or clear tradeoffs between using the REST API and notebookutils?
Looking forward to hearing how others are tackling this.
Thanks,
Taylor
Solved! Go to Solution.
It seems like the best way to do this is to simply have my main orchestration notebook run a pipeline, and then in the pipeline configure it to run the desired notebook in a high concurrency session.
Not ideal, but functional I suppose.
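For anyone who finds this later: the trigger from the orchestration notebook can just be a Job Scheduler call against the pipeline item. A minimal sketch, assuming the on-demand job API, with placeholder IDs and a parameter payload I haven't verified:

import sempy.fabric as fabric

client = fabric.FabricRestClient()
workspace_id = "<workspace-guid>"   # placeholder
pipeline_id = "<pipeline-guid>"     # placeholder: pipeline whose notebook activity targets a high concurrency session

# Run the pipeline on demand; the pipeline decides which notebook to execute
client.post(
    f"v1/workspaces/{workspace_id}/items/{pipeline_id}/jobs/instances?jobType=Pipeline",
    json={"executionData": {"parameters": {"notebook_name": "silver_load"}}},
)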
@tayloramy There are two main routes:
1. Fabric REST API via Semantic Link
2. notebookutils.run() from another notebook
On the Spark efficiency side: for orchestration across Bronze, Silver, and Gold layers, the REST API approach is generally preferred. Let me know if you want help setting up a sample flow.
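If it helps, here's a rough sketch of submitting and polling a notebook job with Semantic Link's FabricRestClient. The IDs are placeholders and the status values are from my reading of the Job Scheduler docs, so please verify before relying on it:

import time
import sempy.fabric as fabric

client = fabric.FabricRestClient()
workspace_id = "<workspace-guid>"   # placeholder
notebook_id = "<notebook-guid>"     # placeholder

# Submit an on-demand notebook job; the API responds with 202 Accepted and a Location header
response = client.post(
    f"v1/workspaces/{workspace_id}/items/{notebook_id}/jobs/instances?jobType=RunNotebook"
)
job_instance_id = response.headers["Location"].rstrip("/").split("/")[-1]

# Poll the job instance until it reaches a terminal state
while True:
    job = client.get(
        f"v1/workspaces/{workspace_id}/items/{notebook_id}/jobs/instances/{job_instance_id}"
    ).json()
    if job["status"] in ("Completed", "Failed", "Cancelled"):
        break
    time.sleep(30)

print(job["status"])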
Hi @tayloramy
The most effective way to run this is as an ETL flow: Python --> Apache Spark (data lake) --> Delta Lake layers. There are two approaches: the full Bronze, Silver, and Gold approach (ETL), or a Bronze layer only, with views used for the subsequent layers.
In the notebook, use Connect and start a new standard session.
There are some limitations with high concurrency sessions. For example, you cannot use Scala; only Python and Spark SQL are supported.
Hi @tayloramy ,
First of all, and independent of the REST API, enabling high concurrency mode in the workspace settings would be a first step towards using resources more efficiently.
Here is the Microsoft documentation if not already known.
Best regards
Hi @spaceman127,
Thanks for the response.
Yes, I'm familiar with high concurrency Spark sessions, but as far as I know a notebook can join a high concurrency session either manually when running it in the interface, or when started from a pipeline, which is outlined here: https://learn.microsoft.com/en-us/fabric/data-engineering/configure-high-concurrency-session-noteboo...
I'm wondering if there's any way to run a notebook in a high concurrency session from inside another notebook, either using Semantic Link or notebookutils, or through another method.
I'm also curious about what high concurrency looks like across different workspaces. The docs say the notebooks should be in the same workspace, but the word "should" makes me think it might be possible to do this cross-workspace, though it would likely be very unsupported.
Thanks,
Taylor
Hi @tayloramy ,
All right.
To answer your question:
Yes, you can do that with notebookutils.run.
I just tested it again.
Here's an example:
# Example: run another notebook from the current one
result = notebookutils.notebook.run(
    "nb2",                                    # notebook name
    1800,                                     # timeout in seconds
    {"param1": "test"},                       # parameters, if the notebook uses them
    "12345678-1234-1234-1234-c55a97b0f92a"    # workspace ID
)
print(result)
And here is the documentation for it.
https://learn.microsoft.com/en-us/fabric/data-engineering/notebook-utilities
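And if the called notebook needs to hand something back to the caller, it can end with notebookutils.notebook.exit; the value comes back as the result of notebookutils.notebook.run. For example:

# Inside "nb2": the string passed to exit() is returned to the calling notebook
notebookutils.notebook.exit("rows_loaded=1042")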
As far as the REST API is concerned, I could imagine that it would work.
However, I haven't checked that yet.
I hope it helps you.
Best regards
Hi @spaceman127,
I understand how to run a notebook with both notebookutils and sempy, my question is: Is there a way to do this while also running the notebook in a high concurrency spark pool instead of the default pool?
Hi @tayloramy ,
OK, I understand.
I don't think that will work. You can, of course, have multiple Spark pools in a workspace, but you can't do that with notebookutils.notebook.run().
Fabric works with sessions.
The default pool is always used. I'm not aware of any other way of doing this at the moment.
Best regards
This is my understanding as well.
My goal is to optimize Spark resources, and if I'm programmatically starting a bunch of notebooks, each in its own starter pool, that is not very optimal.
I was hoping someone here would have some magic sauce I could explore.
I think my approach going forward will be to run the notebooks from a pipeline, and then run the pipelines from my orchestration layer; that way, in the pipeline, I can set up the high concurrency settings for Spark.
I always try to optimize my notebooks as much as possible.
In this case, however, there is little chance of success.
I will continue testing in this area as soon as I have time. If I find anything, I will get back to you.