Hi there,
I'd like to understand the deployment pipeline functionality when it comes to working with lakehouses. I have two development workspaces named bronze and silver, each containing a lakehouse. The bronze lakehouse has some folders and files, and the silver lakehouse has a shortcut to the bronze lakehouse root folder. I noticed that when I deploy the dev workspaces to my test workspaces, the lakehouses are created but they are empty. Is this intentional? I have other Fabric items in my workspaces, like notebooks and pipelines, that depend on the content inside the lakehouse, so I'm wondering how people handle this situation when deploying their workspace from dev to test to production. Is a certain amount of manual setup required after the deployment pipeline runs to configure structures and shortcuts in the lakehouse? Do you have some method to generate those items? I'm trying to understand the best practices around setting these things up. Any help would be greatly appreciated!
Hi @AdamFry ,
Thanks for using Fabric Community.
Right now, lakehouses are created empty. We are working on improving that so the lakehouse definition can be exported and deployed. This will arrive in pieces, so it will take some time until the full definition is available.
In the meantime, here is what we recommend with the current options.
Docs to refer to:
Lakehouse deployment pipelines and git integration - Microsoft Fabric | Microsoft Learn
Best practices for lifecycle management in Fabric - Microsoft Fabric | Microsoft Learn
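For example, since the deployed lakehouses start out empty, one interim approach is a post-deployment script that recreates the folder structure through OneLake's ADLS Gen2-compatible endpoint. A minimal sketch (the workspace name, lakehouse name, and folder list are placeholders, and the endpoint details are worth verifying against the OneLake docs):
# Hypothetical post-deployment step: recreate the Files folder layout in a
# freshly deployed, empty lakehouse using OneLake's ADLS Gen2-compatible API.
$workspace = "bronze_test"       # placeholder workspace name
$lakehouse = "lakehouse_bronze"  # placeholder lakehouse name
$folders   = @("Files/raw", "Files/curated")  # placeholder folder layout
# Token for the storage audience (assumes the Az.Accounts module and an
# active Connect-AzAccount session)
$token = (Get-AzAccessToken -ResourceUrl "https://storage.azure.com").Token
foreach ($folder in $folders) {
    # ADLS Gen2 Path Create: PUT with ?resource=directory creates a folder
    $uri = "https://onelake.dfs.fabric.microsoft.com/$workspace/$lakehouse.Lakehouse/${folder}?resource=directory"
    Invoke-RestMethod -Method Put -Uri $uri -Headers @{
        Authorization  = "Bearer $token"
        "x-ms-version" = "2023-11-03"
    }
}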
Hope this is helpful. Please let me know in case of further queries.
Hi everyone,
I have another question related to this topic. How can we deploy data pipelines across workspaces? What is the most efficient way to achieve this?
If I have a data pipeline with different activities (copy data, dataflows) that have different sources/destinations, how can I change them during the deployment process? What are the best practices for deploying data pipelines?
Thanks in advance!
I've marked this as a solution, but I have another question. Please let me know if it makes more sense to create a new topic/thread, since this is related but a bit different. It's again about deployment pipelines and lakehouses, but now considering how deployment pipelines handle the lakehouse sources in notebooks.
I have a bronze_dev workspace with a lakehouse named lakehouse_bronze. If I add a notebook in the bronze workspace that has lakehouse_bronze attached as a lakehouse source, and the notebook reads the files in lakehouse_bronze and writes them back as Delta tables, then when I deploy from bronze_dev to bronze_test, am I correct that the notebook is created in bronze_test but still references the lakehouse in bronze_dev? Is it normal practice after deployment to go through all the notebooks and reconfigure their lakehouse sources? Is there a way for the deployment pipeline to update that lakehouse reference automatically?
Hi Adam, Fabric deployment pipelines allow creating deployment rules for notebooks that parametrize the default lakehouse connection. More about this here: https://learn.microsoft.com/en-us/fabric/data-engineering/notebook-source-control-deployment
Thank you so much, I totally missed the topic of deployment rules in the deployment pipelines. I really appreciate the help getting me in the right direction!
@AdamFry You can set up a rule to change the default lakehouse of the notebook, but as far as I know, you'll still have to manually change every other lakehouse reference one by one... I think that's a needed improvement.
I'm using a PowerShell script in an Azure DevOps pipeline for exactly this purpose: it updates every single connection in all my notebooks, pipelines, etc.
Would it be possible for you to share your powershell script that you run to reconfigure these from AzureDevOps? I assume you'd need to strip out some personal/company information from it but if you're able to share a sanitized version of it, that would be a huge help!
Sure, no problem at all. The process is this:
1. You deploy your DEV Fabric environment to Production using the built-in Fabric deployment pipeline without any changes. As a result, you get a copy of your DEV environment in PROD with all the DEV connections.
2. Assuming you have a Git connection in your PROD environment, you commit all your latest PROD changes to Git. As a result, you get your latest changes (but with updated object IDs) in PROD's Git repo.
3. Then you run the following PowerShell script (either manually, by pulling the PROD Git repo to your machine, or, like me, using a DevOps pipeline).
param(
    $folderPath = "."
)

$csvPath = "$folderPath/NPD2PRD_values.csv"
$fileExtensions = @("*.json", "*.py")

# Read the CSV file into a variable
$replacements = Import-Csv -Path $csvPath

# Get all files with the specified extensions, including subfolders
$files = @()
foreach ($fileExtension in $fileExtensions) {
    $files += Get-ChildItem -Path $folderPath -Recurse -Filter $fileExtension
}

# Loop through each file
foreach ($file in $files) {
    # Read the content of the file
    $content = Get-Content -Path $file.FullName -Raw
    Write-Output "Replacing contents of $file..."

    # Perform the replacements
    foreach ($replacement in $replacements) {
        $oldValue = $replacement.NPD
        $newValue = $replacement.PRD
        # Escape only the search pattern; escaping the replacement value
        # would insert literal backslashes into the output
        $content = $content -replace [regex]::Escape($oldValue), $newValue
    }

    # Write the modified content back to the file
    Set-Content -Path $file.FullName -Value $content -NoNewline
}
Note the CSV file that contains all the connections the script replaces. It looks like this:
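(The actual values are environment-specific; a hypothetical example, using the NPD and PRD column headers the script reads:)
NPD,PRD
<dev-workspace-id>,<prod-workspace-id>
<dev-lakehouse-id>,<prod-lakehouse-id>
<dev-connection-string>,<prod-connection-string>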
After that, you commit the files, update your PROD Fabric workspace from Git, and get all your production connections.
Thank you so much!
No worries! If you're interested, here's the DevOps pipeline that pulls the production branch after the dev (or NPD) one, runs the script, and commits the changes back to Git:
trigger:
- prd

pool:
  vmImage: windows-latest

steps:
- checkout: self
  persistCredentials: true

- script: |
    git config --global user.email "andrey.baranov@....com.au"
    git config --global user.name "ProdLaunch Workflow"
    git checkout -b prd
  displayName: "Checkout production branch"

- task: PowerShell@2
  inputs:
    filePath: '$(Build.SourcesDirectory)/Replace_NPD2PRD_values.ps1'
    arguments: '-folderPath "$(Build.SourcesDirectory)/"'
  displayName: "Getting the repo ready for release"

- script: |
    git add .
    git commit -m "Preparing for production release"
    git push origin HEAD:prd
  displayName: "Commit production branch."
Thank you so much. That makes sense. For now I will look at creating a setup notebook and pipeline that can configure all the manual bits. It would be amazing if the deployment pipeline had an option to run a specified arbitrary pipeline or notebook after the deployment process to handle these types of configurations.
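In the meantime, one piece of that setup could be a script that recreates the silver-to-bronze shortcut after each deployment. This is only a rough sketch assuming the Fabric OneLake Shortcuts REST API; every ID is a placeholder and the payload shape should be checked against the official docs.
# Hypothetical post-deployment step: recreate the bronze-root shortcut in the
# deployed silver lakehouse via the Fabric OneLake Shortcuts REST API.
# All IDs below are placeholders for illustration.
$silverWorkspaceId = "<test-silver-workspace-id>"
$silverLakehouseId = "<test-silver-lakehouse-id>"
$bronzeWorkspaceId = "<test-bronze-workspace-id>"
$bronzeLakehouseId = "<test-bronze-lakehouse-id>"

# Token for the Fabric API (assumes the Az.Accounts module and an active
# Connect-AzAccount session; newer Az versions return a SecureString token)
$token = (Get-AzAccessToken -ResourceUrl "https://api.fabric.microsoft.com").Token

# Shortcut definition: a OneLake shortcut named bronze_root under Files,
# pointing at the bronze lakehouse's Files folder
$body = @{
    path   = "Files"
    name   = "bronze_root"
    target = @{
        oneLake = @{
            workspaceId = $bronzeWorkspaceId
            itemId      = $bronzeLakehouseId
            path        = "Files"
        }
    }
} | ConvertTo-Json -Depth 5

$uri = "https://api.fabric.microsoft.com/v1/workspaces/$silverWorkspaceId/items/$silverLakehouseId/shortcuts"
Invoke-RestMethod -Method Post -Uri $uri -Headers @{ Authorization = "Bearer $token" } -ContentType "application/json" -Body $body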