
MikeH_SDE
Regular Visitor

Data Replication in Dev & Test Workareas for Data Engineering

We are going to have Dev, Test, and Production Data Engineering workareas managed by deployment pipelines. We are dealing with a very large enterprise-scale telecoms data lakehouse. What are the methods and best practices for selectively replicating the lakehouse to the Test workarea, and to Dev with its multiple feature branches? We do not want to write anything to Prod tables until the pipelines are in Prod.

Thanks

Mike

1 ACCEPTED SOLUTION

Hi @MikeH_SDE,

Thanks for reaching out to the Microsoft Fabric community forum.

 

Yes, you can create a trigger configuration table that specifies which pipelines run in each environment (Dev, Test, and Production). The table can hold flags indicating which pipelines are active per environment, and the deployment process should include logic that reads this table and determines which pipelines are enabled for the current deployment context.
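As an illustrative sketch of that idea (using an in-memory SQLite table as a stand-in for a lakehouse config table; all table, column, and pipeline names here are hypothetical, not part of any Fabric API):

```python
import sqlite3

# In-memory SQLite stands in for a lakehouse config table (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE pipeline_config (
        pipeline_name TEXT NOT NULL,
        environment   TEXT NOT NULL,    -- 'Dev', 'Test', or 'Prod'
        enabled       INTEGER NOT NULL, -- 1 = active in this environment
        PRIMARY KEY (pipeline_name, environment)
    )
""")

# Flag which pipelines are active per environment.
conn.executemany(
    "INSERT INTO pipeline_config VALUES (?, ?, ?)",
    [
        ("ingest_cdr_daily",      "Dev",  1),
        ("ingest_cdr_daily",      "Test", 1),
        ("ingest_cdr_daily",      "Prod", 1),
        ("full_history_backfill", "Dev",  0),  # too expensive for Dev/Test
        ("full_history_backfill", "Test", 0),
        ("full_history_backfill", "Prod", 1),
    ],
)

def enabled_pipelines(env: str) -> list[str]:
    """Return the pipelines flagged as active for the given environment."""
    rows = conn.execute(
        "SELECT pipeline_name FROM pipeline_config "
        "WHERE environment = ? AND enabled = 1 ORDER BY pipeline_name",
        (env,),
    )
    return [name for (name,) in rows]
```

A deployment script or orchestrating pipeline would call `enabled_pipelines(current_env)` and trigger only those, so the Prod-only backfill never runs in Dev or Test.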

 

By using configurable pipeline controls and execution logic keyed to the deployment environment, you can avoid unnecessary data backfilling in your Dev and Test environments. This saves resources and keeps the environments clearer and easier to manage for testing and development.

 

  • Create a configuration process where each pipeline has a flag or setting that indicates its intended environments (Dev, Test, or Prod). Modify your deployment process to execute only the pipelines enabled for the current environment context.
  • For example, when deploying to Dev, run only the pipelines marked for Dev execution in your configuration. The same logic can be applied in your deployment scripts for Test and Prod.
  • Use staging tables or intermediate storage to prepare data before it is replicated to Dev and Test. This allows transformation and cleansing without touching production assets. Implement a pipeline that pulls data from the staging area into the Dev/Test environments based on the configuration settings.
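The environment-gated execution described in these steps might be sketched as follows (pure Python; the pipeline registry, environment variable name, and pipeline names are all hypothetical stand-ins for actual Fabric pipeline invocations):

```python
import os

# Hypothetical registry: pipeline name -> environments it may run in.
PIPELINES = {
    "stage_to_dev_copy":   {"Dev", "Test"},          # pulls prepared data from staging
    "incremental_load":    {"Dev", "Test", "Prod"},
    "historical_backfill": {"Prod"},                 # never backfill Dev/Test
}

def pipelines_to_run(env: str) -> list[str]:
    """Select only the pipelines flagged for the current deployment context."""
    return sorted(name for name, envs in PIPELINES.items() if env in envs)

# The deployment context would typically come from a variable set per workspace.
current_env = os.environ.get("DEPLOY_ENV", "Dev")
print(pipelines_to_run(current_env))
```

The point of the design is that the same deployed artifacts run everywhere; only the small environment flag changes per stage, so nothing writes to Prod tables until the context actually is Prod.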

For more detail, please refer to the documentation:

https://learn.microsoft.com/en-us/fabric/cicd/deployment-pipelines/deploy-content?tabs=new#deploying...

https://learn.microsoft.com/en-us/fabric/cicd/deployment-pipelines/get-started-with-deployment-pipel...

 

I hope these suggestions give you good ideas. If you need any further assistance, feel free to reach out.

 

If this post helps, then please give us Kudos and consider accepting it as a solution, to help other members find it more quickly.

 

Thank you. 

 


3 REPLIES
MikeH_SDE
Regular Visitor

@v-tsaipranay Thanks for the advice. Two things we still need to solve are:

  1. How do we make changes and fixes, such as adding new columns and populating them for historical data, or correcting bad data, without rerunning the whole pipeline history?
  2. Where do we create and keep our development assets, such as data-discovery notebooks? We will want to keep these and use them on production data for discovery tests, though they would need to be built against development data. If we promote those development, discovery, and test items with the dev pipelines, it could get messy. I am considering separate engineering and analytics workspaces: engineering would have source control and be used for pipelines, while analytics would not, and would be used by analysts who don't want the hassle of source control, and for the engineers' tests, etc.
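For question 1, one common lakehouse pattern is to evolve the schema in place and backfill only the affected rows, rather than replaying pipeline history. A minimal sketch of that pattern (SQLite here stands in for Spark SQL on Delta tables, where `ALTER TABLE ... ADD COLUMNS` and targeted `UPDATE`/`MERGE` are supported; the table and column names are made up):

```python
import sqlite3

# SQLite stands in for a Delta table (hypothetical telecoms call table).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calls (call_id INTEGER PRIMARY KEY, duration_s INTEGER)")
conn.executemany("INSERT INTO calls VALUES (?, ?)", [(1, 60), (2, 125), (3, 30)])

# 1. Add the new column in place -- existing rows get NULL, no history replay.
conn.execute("ALTER TABLE calls ADD COLUMN duration_band TEXT")

# 2. Backfill only the rows that need it with a targeted UPDATE
#    (on Delta this would be an UPDATE or MERGE, not a full reload).
conn.execute("""
    UPDATE calls
    SET duration_band = CASE WHEN duration_s >= 120 THEN 'long' ELSE 'short' END
    WHERE duration_band IS NULL
""")

# 3. Correct known-bad data surgically, scoped by a predicate.
conn.execute("UPDATE calls SET duration_s = 35 WHERE call_id = 3")
```

The scoping predicates (`WHERE duration_band IS NULL`, `WHERE call_id = 3`) are what keep the fix from touching rows that are already correct.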
MikeH_SDE
Regular Visitor

I am wondering, since I believe data does not promote (only definitions and processes promote), whether we can have a trigger configuration table that sets which pipelines run in which environment.

The deployment process would change which environment column is checked, so that in Dev and Test we only run the pipelines we need. Otherwise Dev and Test would backfill everything on a test run anyway.
