One coding task I reuse often is using a Notebook to retrieve settings and information about the workspace, environment, and so on, to support dynamic, metadata-driven development. I usually run an initial notebook as the first step in a Pipeline; it sets up logging, retrieves lakehouse/workspace information, etc. The notebook then exits, returning a JSON string with the values gathered. For example:

import json

# All of these would have been pre-populated earlier in the code as needed....
ret_cfg = {
    "workspace_id": workspace_id,
    "workspace_name": workspace_name,
    "environment": environment,
    "metadata_lakehouse_id": lh_metadata_id,
    "bronze_lakehouse_id": lh_bronze_id
}
# Exit notebook with configuration
mssparkutils.notebook.exit(json.dumps(ret_cfg))

In the calling orchestration pipeline, I then have to create an individual Variable for each value I need and set each one from the notebook's Output. However, as the number of return values increases, the Pipeline becomes cluttered and unmanageable.

I'm proposing a JSON variable or Key/Value variable type (or something similar) in addition to the String/Array/Boolean/Integer types, along with a way to parse and/or read the JSON value returned from the notebook. This could also be useful in situations other than this one.
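To illustrate the kind of parsing I have in mind, here is a minimal sketch of reading the returned JSON string back into key/value form in plain Python. It assumes the whole string is available as a single value (for example, passed on to a downstream notebook as one string parameter); the names cfg_json, REQUIRED_KEYS and parse_config, and the sample values, are hypothetical and only for illustration.

```python
import json

# Hypothetical stand-in for the JSON string returned by the setup notebook.
cfg_json = json.dumps({
    "workspace_id": "ws-123",
    "workspace_name": "demo-workspace",
    "environment": "dev",
    "metadata_lakehouse_id": "lh-meta-1",
    "bronze_lakehouse_id": "lh-bronze-1",
})

# Keys the downstream code expects (assumed to match the exit payload above).
REQUIRED_KEYS = (
    "workspace_id",
    "workspace_name",
    "environment",
    "metadata_lakehouse_id",
    "bronze_lakehouse_id",
)


def parse_config(raw: str) -> dict:
    """Parse the JSON string and check that every expected key is present."""
    cfg = json.loads(raw)
    if not isinstance(cfg, dict):
        raise ValueError("Configuration must be a JSON object")
    missing = [key for key in REQUIRED_KEYS if key not in cfg]
    if missing:
        raise KeyError(f"Missing configuration keys: {missing}")
    return cfg


cfg = parse_config(cfg_json)
print(cfg["environment"])
```

A single parsed object like this is what I would like the Pipeline to be able to hold in one variable, instead of one Variable per key.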