KhagendraWagle
Frequent Visitor

Set NotebookId dynamically in notebook activity of data pipeline irrespective of Fabric workspace

I believe most of you have come across the situation where a notebook needs to be run from the current workspace. It can easily be done by selecting the current workspace (e.g. DevWorkspace) and the notebook (e.g. Notebook1) as shown below:

 

KhagendraWagle_0-1732519906265.png

 

This works perfectly, but the issue pops up when the pipeline is deployed to another workspace (e.g. ProdWorkspace). At the time of writing this post, there is no Fabric deployment rule available for data pipelines to override the workspace from "DevWorkspace" to "ProdWorkspace".

Setting the Fabric workspace dynamically in a pipeline activity is easy thanks to the availability of the system variable @pipeline().DataFactory.
 
KhagendraWagle_1-1732519933715.png
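For example, the Workspace setting of the Notebook activity can be switched to dynamic content and set to the expression below (a minimal sketch of what the screenshot shows):

@pipeline().DataFactory

This resolves to the ID of the workspace the pipeline is currently running in, so the same pipeline works unchanged in both DevWorkspace and ProdWorkspace.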

 

There is no built-in function available to set the NotebookId dynamically. We could hard-code the NotebookId of Notebook1 from DevWorkspace, but the NotebookId of the same notebook is different in ProdWorkspace.

Until Microsoft provides a proper solution to this issue, we need to build our own way to set the NotebookId dynamically, irrespective of the workspace the pipeline runs in.

 

I am going to explain in 3 steps how I have achieved this, so that deployment to a different workspace goes smoothly.

Background:

- Created workspaces DevWorkspace and ProdWorkspace

- Created lakehouses LH_BRONZE, LH_SILVER and LH_GOLD (medallion architecture)

- Created Notebook1 and Notebook2, which need to be run dynamically in DevWorkspace and ProdWorkspace

 

KhagendraWagle_2-1732519981691.png

 

Step 1: Create a notebook named Notebook_config under DevWorkspace and write PySpark code to list all the Fabric items in DevWorkspace and store the information as a delta table WorkspaceItems in the bronze lakehouse.

 

Cell1# set the default lakehouse to LH_BRONZE. This must be the first cell in the notebook.

 

%%configure -f
{
    "defaultLakehouse": {
        "name": "LH_BRONZE"
    }
}

 

Cell2# list the workspace items and load them into the delta table

 

import pandas as pd
import sempy.fabric as fabric
from delta.tables import DeltaTable

# REST client for the Fabric API, scoped to the current workspace
client = fabric.FabricRestClient()
workspaceId = fabric.get_workspace_id()

# List all items (notebooks, lakehouses, pipelines, ...) in the workspace
response = client.get(f"/v1/workspaces/{workspaceId}/items")
pd_df = pd.json_normalize(response.json()['value'])

# Allow the merge to evolve the table schema if the API response gains columns
spark.sql("SET spark.databricks.delta.schema.autoMerge.enabled = true")

df_source = spark.createDataFrame(pd_df)

target_table_path = "Tables/WorkspaceItems"

if DeltaTable.isDeltaTable(spark, target_table_path):
    print("The table already exists, merging data...")
    targetDeltaTable = DeltaTable.forPath(spark, target_table_path)
    (targetDeltaTable.alias("t")
        .merge(df_source.alias("s"), "s.id = t.id")
        .whenMatchedUpdateAll()
        .whenNotMatchedInsertAll()
        .whenNotMatchedBySourceDelete()  # drop rows for items deleted from the workspace
        .execute()
    )
else:
    print("Creating new table and inserting data...")
    df_source.write.format("delta").mode("append").save(target_table_path)
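To sanity-check the load, you can query the table in a follow-up cell. A minimal sketch (column names such as id, displayName and type come from the Fabric REST items response):

Cell3# verify the loaded items

display(spark.sql("SELECT id, displayName, type FROM WorkspaceItems ORDER BY type, displayName"))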

 

 

After a successful run of Notebook_config, the delta table WorkspaceItems is created and loaded with all the Fabric items, including the NotebookIds of Notebook1 and Notebook2 from DevWorkspace, as shown below:

 

KhagendraWagle_4-1732520360470.png

 

Step 2: Create a data pipeline named p_set_variables which returns the NotebookIds of Notebook1 and Notebook2 using Lookup, Filter and Set variable activities, as shown below (expression sketches follow the figures):

 

KhagendraWagle_0-1732520788979.png

        Fig: p_set_variables data pipeline with return variables

 

KhagendraWagle_1-1732520820336.png

        Fig: Lookup setting

 

KhagendraWagle_2-1732520838952.png

        Fig: Filter Notebook1 setting

 

KhagendraWagle_3-1732520861211.png

        Fig: Filter Notebook2 setting

 

KhagendraWagle_4-1732520882004.png

         Fig: Set variable settings as Pipeline return value
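In case the screenshots are hard to read, here is a sketch of the expressions used (the activity names are assumptions based on the figures above):

Lookup WorkspaceItems: source is the WorkspaceItems table in LH_BRONZE, with "First row only" unchecked.

Filter Notebook1:
Items: @activity('Lookup WorkspaceItems').output.value
Condition: @and(equals(item().type, 'Notebook'), equals(item().displayName, 'Notebook1'))

Set variable (Pipeline return value) Notebook1Id:
Value: @activity('Filter Notebook1').output.Value[0].id

Filter Notebook2 and its return value Notebook2Id follow the same pattern.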

 

Step 3: Create a data pipeline named p_demo_pipeline1 where Notebook1 and Notebook2 need to be run. Add an Invoke pipeline (legacy) activity and invoke the data pipeline p_set_variables (Step 2), which returns the NotebookIds of Notebook1 and Notebook2 as pipeline return values. The exact expressions are sketched after the figures below.

 

KhagendraWagle_5-1732520948485.png

        Fig: Invoking p_set_variables 

 

KhagendraWagle_6-1732520975132.png

        Fig: Setting run Notebook1 dynamically using Workspace Id and Notebook Id. 

 

KhagendraWagle_7-1732520989285.png

        Fig: Setting run Notebook2 dynamically using Workspace Id and Notebook Id. 
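For reference, the Notebook activity settings in p_demo_pipeline1 look roughly like this (Notebook1Id/Notebook2Id are the return value names set in Step 2):

Workspace: @pipeline().DataFactory
Notebook: @activity('Invoke p_set_variables').output.pipelineReturnValue.Notebook1Id

The Invoke pipeline (legacy) activity exposes the values returned by p_set_variables under output.pipelineReturnValue, which is what makes this wiring possible.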

 

Please note that once you deploy the Fabric workspace items from DevWorkspace to ProdWorkspace, you need to run Notebook_config once to load the Fabric item list into the delta table WorkspaceItems in the lakehouse (LH_BRONZE) in the production workspace.

Please share your thoughts on this. If anyone has a simpler alternative solution, please share it.

Regards
Khagendra
3 REPLIES
balveertanwar
New Member

Hi @KhagendraWagle, this is a beautiful solution for a limited set of notebooks and pipelines. If we are using thousands of items, what could be the best approach? Do you have any solution to automate this scenario?

 

Best Regards,

Balveer

JohannUVA
New Member

Very helpful, thank you!

v-kongfanf-msft
Community Support

Hi @KhagendraWagle ,

 

Thank you so much for sharing this, it's very informative for anyone who has encountered similar issues before.

 

Best regards,

Adamk Kong
