DebbieE
Community Champion

Dataflow version control

I am working with a team that is worried that when a dataflow changes, many of the reports built on it will break, because the people who create the datasets and reports don't know that it has been changed.

 

What they have come up with is dataflow versions, so people would work against a specific version of a dataflow:

Say Dataflow V1

Dataflow V2 

Dataflow V3, etc.

 

I don't think this is a good idea after reading up on governance. The data duplication, for example, is a no-go for me, but they would like more information on why this isn't a good idea.

 

If anyone has any thoughts on this, that would be great.

4 REPLIES
FireFighter1017
Advocate III

OK, this may not help you with "why this is a bad idea", and it may be too late, but hopefully it will help you and others understand your team's struggles.

Microsoft has made solutions available since you posted your question, so I think you should read this.

 

What they are after is the same thing my team wanted around the time you posted this question: source code version control. Power BI was, and still is, really bad at it, but it was far worse back in 2022.

 

Let's first address the data protection issue:

The data is irrelevant for this specific requirement. But to comply with data protection, you would need a similar landscape (DEV-QA-PROD) on your data sources, which is not always technically possible or cost-effective. The other option is to create and secure DEV and QA workspaces so that only developers and testers have access, and still use prod data.

 

Use deployment pipelines for testing and detect issues before rolling out changes in production

The solution your team implemented is close to what Microsoft expects us to do: deployment pipelines. The only difference is that your DEV (Dataflow V1), QA (Dataflow V2) and PROD (Dataflow V3) dataflows would have the same name, only in separate workspaces.

You should look into deployment pipelines as it would simplify deployments and mitigate the risk of connecting to the wrong dataflow by mistake.

 

That being said, deployment pipelines aren't 100% safe against report failures.  There's always a chance that some other team is connecting to your dataflow and you will break their reports.

So you need a means to roll back changes.

Deployment pipelines can help only if you detect the issue in either the DEV or QA workspace. Once the changes are in the PROD workspace, there's no way to roll back changes, unless...

 

Add a version control tool outside of Power BI to archive and document changes

In parallel with deployment pipelines, your teams should consider using Git or SharePoint for version control.

We have been using SharePoint to manage versions of datasets as .pbix files and dataflows as JSON exports.

We are now planning to use our corporate GitHub for more effective version control, saving .pbix files as .pbip so that Git can detect changes and store only those changes instead of a complete copy of the .pbix file.
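
For the dataflow JSON exports mentioned above, one thing that may help Git show only real changes is writing each export in a stable form before committing. This is a minimal sketch, not something from our setup: the function name `canonical_json` is made up for illustration, and it assumes the key order inside the export carries no meaning (array order is left untouched).

```python
import json


def canonical_json(text: str) -> str:
    """Re-serialize a JSON export with sorted keys and fixed indentation.

    Assumption: object key order in the export is not meaningful, so
    sorting keys only removes noise from diffs. Array order is preserved.
    """
    parsed = json.loads(text)
    # sort_keys + fixed indent give the same text for the same content,
    # regardless of how the export tool ordered or spaced the keys.
    return json.dumps(parsed, indent=2, sort_keys=True, ensure_ascii=False) + "\n"


if __name__ == "__main__":
    # Two hypothetical exports with identical content but different layout.
    export_a = '{"name": "Sales", "entities": [{"name": "Orders"}], "version": "1.0"}'
    export_b = '{"version":"1.0","entities":[{"name":"Orders"}],"name":"Sales"}'
    print(canonical_json(export_a) == canonical_json(export_b))
```

Running every export through a step like this before `git add` should mean a commit diff reflects edits to the dataflow definition rather than formatting differences.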

 

Other teams in our organization are already doing this and are looking to integrate code deployment directly from a GitHub repo to Power BI dataflows using a CLI.

 

For sure, data protection is the basic requirement and there is a way to address this issue if you are dealing with sensitive data.

But you also definitely want to make sure you have rollback capabilities in case something goes wrong.

 

P.S.: as for @samaguire's answer, it would require additional Microsoft Azure capacity, whereas deployment pipelines combined with Git not only add no cost but also give you rollback capabilities, commit history and documentation of changes. Because in the end, after rolling back the changes, you still need to know what changes were made in order to fix the issue they caused. And where many changes were made, Git would help you pinpoint the exact commit that contains the changes you want to undo.

 

I hope this helps not only you but others that may stumble upon this post.

 

samaguire
Advocate II

This may be a bit late; however, I came across this while searching for something else. Having literally just finished reading the MS Doc for BYO ADLS Gen2 for Dataflows, I can say that if they want version control they should set this up. The system takes snapshots of not only the data but also the dataflow metadata with every refresh. See: Configuring dataflow storage to use Azure Data Lake Gen 2 - Power BI | Microsoft Docs

MFelix
Super User

Hi @DebbieE ,

 

I use some dataflows on my projects and, as you mention, we do not tend to use versions of the dataflows. I don't know what type of changes you are referring to, but if the change is adding columns or data within the dataflow, that should not cause any issues; the worst that can happen is that the added columns aren't picked up.

 

However, if the changes involve a formatting change, renaming or deleting, then your users can have an issue.
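
To make that distinction concrete, here is a rough sketch that compares two versions of an entity's columns and flags the changes likely to break downstream datasets. Everything in it is hypothetical: the `classify_changes` and `is_breaking` names, the column names and the type labels are made up, and it assumes you can get each version's schema as a simple column-to-type mapping. A rename shows up as one removed column plus one added column.

```python
from typing import Dict, List

Schema = Dict[str, str]  # column name -> data type label (hypothetical format)


def classify_changes(old: Schema, new: Schema) -> Dict[str, List[str]]:
    """Group column differences between two schema versions."""
    return {
        # New columns: usually harmless, they are just not picked up yet.
        "added": sorted(c for c in new if c not in old),
        # Deleted or renamed columns: downstream queries that use them fail.
        "removed": sorted(c for c in old if c not in new),
        # Same column, different type: treated here as a formatting change.
        "retyped": sorted(c for c in old if c in new and old[c] != new[c]),
    }


def is_breaking(changes: Dict[str, List[str]]) -> bool:
    """Treat removals and type changes as breaking; additions as safe."""
    return bool(changes["removed"] or changes["retyped"])


if __name__ == "__main__":
    v1 = {"CustomerId": "Int64", "Name": "Text", "Created": "Date"}
    v2 = {"CustomerId": "Text", "FullName": "Text", "Created": "Date", "Region": "Text"}
    result = classify_changes(v1, v2)
    print(result, is_breaking(result))
```

A check like this could run before a change is published, so that only the breaking cases trigger a warning to report owners.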

 

Using the lineage view you can check which datasets are connected to the dataflow and send warnings about updates. Another option is to create a Power Automate flow to send an alert to users when there are refreshes or updates to the dataflow.

 

https://docs.microsoft.com/en-us/power-query/dataflows/send-notification-when-dataflow-refresh-compl...

The link above is for a refresh of a dataflow, but maybe it can be adjusted for an update.
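
If you want to script the lineage check instead of using the lineage view, something along these lines might work as a starting point. It is only a sketch: it assumes you have already fetched a list of dataset-to-dataflow links for the workspace (for example from the Power BI REST API), and the payload shape with `value`, `datasetObjectId` and `dataflowObjectId` is my assumption about that response, so please verify it against the official documentation. The function name and the IDs are made up.

```python
from typing import Any, Dict, List


def datasets_for_dataflow(links_payload: Dict[str, Any], dataflow_id: str) -> List[str]:
    """Return the IDs of datasets linked to the given dataflow.

    Assumed payload shape (verify against the API documentation):
    {"value": [{"datasetObjectId": "...", "dataflowObjectId": "..."}, ...]}
    """
    wanted = dataflow_id.lower()
    dataset_ids = {
        link["datasetObjectId"]
        for link in links_payload.get("value", [])
        # IDs are compared case-insensitively in case the casing differs.
        if link.get("dataflowObjectId", "").lower() == wanted
    }
    return sorted(dataset_ids)


if __name__ == "__main__":
    # Hypothetical sample payload with made-up IDs.
    sample = {
        "value": [
            {"datasetObjectId": "ds-2", "dataflowObjectId": "DF-A"},
            {"datasetObjectId": "ds-1", "dataflowObjectId": "df-a"},
            {"datasetObjectId": "ds-3", "dataflowObjectId": "df-b"},
        ]
    }
    print(datasets_for_dataflow(sample, "df-a"))
```

The resulting list of dataset IDs is what you would use to decide who needs a warning before a dataflow change goes out.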

 

Hope this helps to give you some guidance.


Regards

Miguel Félix



DebbieE
Community Champion

Can I give them any specific reasons why version control is the wrong move? I mentioned extra complexity and data duplication, and they are still arguing for it. So some good reasons not to do it are what I am after.
