I believe most of you have come across the situation where a notebook needs to be run from the current workspace. It can easily be done by selecting the current workspace (e.g. DevWorkspace) and the notebook (e.g. Notebook1) as shown below:
This works perfectly, but an issue pops up when the pipeline is deployed to another workspace (e.g. ProdWorkspace). At the time of writing this post, there is no Fabric deployment rule available for a data pipeline to overwrite the workspace from "DevWorkspace" to "ProdWorkspace".
Setting the Fabric workspace dynamically in a pipeline activity is easy thanks to the availability of a system variable.
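For example, on a Notebook activity you can switch the Workspace setting to dynamic content and reference that system variable. A minimal sketch, assuming @pipeline().DataFactory resolves to the ID of the workspace the pipeline runs in (which it does in Fabric pipelines at the time of writing):
@pipeline().DataFactory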
However, there is no built-in function to set the NotebookId dynamically. We could hard-code the NotebookId of Notebook1 from DevWorkspace, but the NotebookId of the same notebook would be different in ProdWorkspace.
Until Microsoft provides a proper solution to tackle this issue, we need to build our own solution to set the NotebookId dynamically, irrespective of the workspace the pipeline runs in.
I am going to explain, in three steps, how I achieved this so that deployment to a different workspace goes smoothly.
Background:
- Created workspaces DevWorkspace and ProdWorkspace
- Created lakehouses LH_BRONZE, LH_SILVER and LH_GOLD (medallion architecture)
- Created Notebook1 and Notebook2, which need to run dynamically in both DevWorkspace and ProdWorkspace
Step 1: Create a notebook named Notebook_config under DevWorkspace and write PySpark code to list all the Fabric items in DevWorkspace and store the information as a Delta table WorkspaceItems in the bronze lakehouse.
Cell 1: set the default lakehouse to LH_BRONZE. This must be the first cell in the notebook.
%%configure -f
{
    "defaultLakehouse": {
        "name": "LH_BRONZE"
    }
}
Cell 2: list the workspace items and upsert them into the Delta table.
import pandas as pd
import sempy.fabric as fabric
from delta.tables import *

# Call the Fabric REST API to list all items in the current workspace
client = fabric.FabricRestClient()
workspaceId = fabric.get_workspace_id()
response = client.get(f"/v1/workspaces/{workspaceId}/items")
pd_df = pd.json_normalize(response.json()['value'])

# Let the merge evolve the target schema if the API response adds columns
spark.sql("SET spark.databricks.delta.schema.autoMerge.enabled = true")

df_source = spark.createDataFrame(pd_df)
target_table_path = "Tables/WorkspaceItems"

if DeltaTable.isDeltaTable(spark, target_table_path):
    print("The table already exists, so merging data...")
    targetDeltaTable = DeltaTable.forPath(spark, target_table_path)
    (targetDeltaTable.alias("t")
        .merge(df_source.alias("s"), "s.id = t.id")
        .whenMatchedUpdateAll()
        .whenNotMatchedInsertAll()
        .whenNotMatchedBySourceDelete()  # remove rows for items deleted from the workspace
        .execute()
    )
else:
    print("Creating new table and inserting data...")
    df_source.write.format("delta").mode("append").save(target_table_path)
After a successful run of Notebook_config, the Delta table WorkspaceItems is created and loaded with all the Fabric items, including the NotebookIds of Notebook1 and Notebook2 from DevWorkspace, as shown below:
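If you want to sanity-check the load from a notebook cell, a minimal PySpark sketch like the one below reads the table back by the same path the notebook wrote to (the columns id, displayName and type come from the items API response):

# Read the WorkspaceItems Delta table back by path and list the notebooks
df = spark.read.format("delta").load("Tables/WorkspaceItems")
df.filter(df["type"] == "Notebook").select("id", "displayName", "type").show(truncate=False)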
Step 2: Create a data pipeline named p_set_variables which returns the NotebookIds of Notebook1 and Notebook2 using the Lookup, Filter and Set variable activities, as shown below; the underlying expressions are sketched after the figures.
Fig: p_set_variables data pipeline with return variables
Fig: Lookup setting
Fig: Filter Notebook1 setting
Fig: Filter Notebook2 setting
Fig: Set variable settings as Pipeline return value
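For reference, the expressions behind those settings look roughly like the sketch below. The activity names (Lookup WorkspaceItems, Filter Notebook1) and the variable name NotebookId1 are placeholders for whatever you named yours, and the Lookup is assumed to query the WorkspaceItems table in LH_BRONZE:

-- Lookup query:
SELECT id, displayName, type FROM WorkspaceItems

-- Filter items:
@activity('Lookup WorkspaceItems').output.value

-- Filter Notebook1 condition:
@and(equals(item().type, 'Notebook'), equals(item().displayName, 'Notebook1'))

-- Set variable NotebookId1 (pipeline return value):
@activity('Filter Notebook1').output.Value[0].id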
Step 3: Create a data pipeline named p_demo_pipeline1 where Notebook1 and Notebook2 need to run. Add an Invoke pipeline (legacy) activity to invoke the data pipeline p_set_variables (from Step 2), which returns the NotebookIds of Notebook1 and Notebook2 as pipeline return values; the underlying expressions are sketched after the figures below.
Fig: Invoking p_set_variables
Fig: Setting run Notebook1 dynamically using Workspace Id and Notebook Id.
Fig: Setting run Notebook2 dynamically using Workspace Id and Notebook Id.
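Behind those settings, the Notebook activities consume the return values of the invoked pipeline. A minimal sketch, assuming the Invoke pipeline activity is named Invoke p_set_variables and the return values are named NotebookId1 and NotebookId2 (adjust both to your own names):

-- Workspace ID (dynamic content):
@pipeline().DataFactory

-- Notebook ID for Notebook1:
@activity('Invoke p_set_variables').output.pipelineReturnValue.NotebookId1

-- Notebook ID for Notebook2:
@activity('Invoke p_set_variables').output.pipelineReturnValue.NotebookId2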
Please note that after you deploy the Fabric workspace items from DevWorkspace to ProdWorkspace, you need to run Notebook_config once to load the Fabric item list into the Delta table WorkspaceItems in the lakehouse (LH_BRONZE) in the production workspace.
Very helpful, thank you!
Hi @KhagendraWagle ,
Thank you so much for sharing this; it's very informative for anyone who has encountered similar issues.
Best regards,
Adamk Kong