Story time. I was trying to extract a couple of distinct resources (a notebook and a pipeline) out of a PBI workspace by way of git.
... I configured a new git repo and committed the things I wanted. After downloading all the assets via zip file (indirectly from the Azure DevOps portal), I forgot to detach git from this PBI workspace again. I have done this in the past without problems, so I was fairly comfortable with the approach.
Later in the day I returned to the PBI workspace to find that it was almost entirely gone. I don't have a good understanding of what happened, and haven't found a pattern that explains what was left behind. At first glance it looks like most dataflows, datasets/models, pipelines, and notebooks are all gone!
The only things that remain are the things that were committed to git, and lakehouses.
Of all the things that were lost, the datasets are the most shocking. It seems like datasets should not be so ephemeral, and should not simply disappear based on the git connection. The purpose of git is for managing source code (not for DATA). If the use of git is in direct conflict with some other concern (like preserving 10 GB of data in a dataset), then the git integration should back off. It should never assume it can just blow away the customer's databases at will. Or if it needs to do that, it should triple-check with the user beforehand! Or perhaps there should be a workspace setting that says whether I actually care about the data in the workspace or not, so that the git integration takes that into consideration.
Anyway, I wanted to share the story in case it helps others. I almost certainly did something "stupid", and it was a learning experience which I'm not likely to forget very soon. Thankfully I wasn't stupid enough to play with this functionality in a production workspace. Has anyone else experienced this? I'm assuming that I can call Mindtree and get my workspace restored, right?
Hi @dbeavon3 ,
Thanks for posting in Microsoft Fabric Community,
I completely understand your frustration. Git in Fabric is designed for managing source code, not large datasets, which is why it can sometimes lead to unintended deletion of untracked resources like datasets. Git integration should ideally not affect datasets or important data by default.
To help prevent something like this in the future, here are a couple of suggestions:
Just wanted to check in—were you able to restore your workspace through Mindtree?
Since the "hot" recycle bin retention is 7 days, I hope you were able to recover everything. Let me know if you need any further assistance!
Best regards,
Vinay.
Hi @dbeavon3
It seems that while using Git integration in Power BI, you committed some assets (like notebooks and pipelines) to a Git repo and forgot to detach it afterward. This caused Power BI to sync your workspace with the Git repo, likely removing datasets, dataflows, and other assets not tracked in Git. Git doesn’t handle data storage, so it might have overwritten or deleted datasets in the process.
Since this wasn’t in a production workspace, you should contact Mindtree or Microsoft Support to see if they can restore your lost data. Going forward, always ensure Git is detached after use and back up datasets separately to avoid this issue.
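As a rough illustration of the "not tracked in Git" idea above, here is a minimal Python sketch that compares a hand-written inventory of workspace items against the items committed to the repo and lists whatever is missing from Git. Everything in it is an assumption for illustration: the function name `items_at_risk`, the item types, and the item names are hypothetical, and it does not call any Fabric or Power BI API. It only shows the comparison you might want to do before leaving a workspace connected.

```python
def items_at_risk(workspace_items, tracked_items):
    """Return (type, name) pairs that exist in the workspace but not in Git.

    Both arguments are iterables of (item_type, item_name) tuples.
    The comparison ignores case. This is a hypothetical helper, not
    part of any Microsoft tooling.
    """
    tracked = {(kind.lower(), name.lower()) for kind, name in tracked_items}
    return sorted(
        (kind, name)
        for kind, name in workspace_items
        if (kind.lower(), name.lower()) not in tracked
    )


# Hypothetical inventory, typed by hand from the workspace item list.
workspace = [
    ("Notebook", "ExtractSales"),
    ("DataPipeline", "NightlyLoad"),
    ("SemanticModel", "SalesModel"),
    ("Dataflow", "CleanOrders"),
]

# Hypothetical list of what was actually committed to the repo.
tracked = [
    ("Notebook", "ExtractSales"),
    ("DataPipeline", "NightlyLoad"),
]

at_risk = items_at_risk(workspace, tracked)
for kind, name in at_risk:
    print(f"Not in Git: {kind} '{name}'")
```

Anything this prints is an item you would want to back up or commit before syncing, or a reason to disconnect Git from the workspace first.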
Did I answer your question? Mark my post as a solution, this will help others!
If my response(s) assisted you in any way, don't forget to drop me a "Kudos" 🙂
Kind Regards,
Poojara
Data Analyst | MSBI Developer | Power BI Consultant
Consider subscribing to my YouTube channel for beginner and advanced concepts: https://youtube.com/@biconcepts?si=04iw9SYI2HN80HKS
>> This caused Power BI to sync your workspace with the Git repo, likely removing datasets, dataflows, and other assets not tracked in Git.
It seems like very poor judgement for Microsoft to blow away datasets without some sort of user verification. Nobody would think it is a reasonable design for a git integration to delete large amounts of user data.
Based on what developers understand about git (from other product integrations), this one behaves in a very unusual and unexpected way. For example, whenever I integrate git with a coding workspace on my desktop, I never worry about the sudden loss of uncommitted assets! Given how strangely this git integration works, there should be a lot more guardrails!
Interestingly, Microsoft did not delete the lakehouse assets in the workspace. On some level they seem to understand that deleting data is unacceptable. But for some reason they don't hesitate to delete PBI datasets.
Hey mate, thanks for your post.
Did you manage to recover the deleted files eventually? The same thing happened to me. I contacted MS Support and they said there is no way to recover artifacts from a workspace once they have been deleted via git... which sounds weird to me, but it could be the case. Just checking whether you eventually managed to recover the lost data somehow. I was so surprised that a company like MS doesn't have a disaster recovery option for this sort of action in a publicly available product.
Thanks.
I appreciate your post, and your sincere concern. The reason for the public post is because I'm not that embarrassed to share my mistake, and I want others to be aware of the potential risks in attaching a workspace to a git repo. I'm guessing lots of people have made similar mistakes, but I wasn't actually finding any of the horror stories when I googled for it.
... Again, it was an experience that was unlike every other experience I've had with git. Normally the use of git should give people a warm-and-fuzzy sense of confidence and reassurance that their assets will not be lost.... But in Power BI, it is the exact opposite! In the future, I will always be nervous to introduce git to a workspace; and I'm guessing I'll probably never be brave enough to hook it up to a production workspace.
>> unintended deletion of untracked resources like datasets. Git integration should ideally not affect datasets or important data by default.
Right. There should at least be a configuration option (checkbox) that determines whether a user wants their datasets to be managed under the same umbrella as their python notebooks.
>> "Store Datasets Separately"
I immediately came to the same conclusion you did. Perhaps I had missed some "best practice" whereby users are warned away from enabling git on any workspace that contains data or datasets? I'm guessing that these PBI workspaces will eventually need to "specialize", and serve in the capacity of EITHER hosting data, OR hosting programming assets (... but NEVER serving in both capacities.) Otherwise the risks of the git integration are too high and Fabric is liable to blow away our stuff.
I don't know if Microsoft is listening, but I think there is an INCORRECT assumption that all the users of Power BI know how to open a support ticket and know how to ask for an engineer to clean up the messes that happen. This incorrect assumption seems to allow Microsoft to create solutions that take unnecessary risks on behalf of their users. For each PBI user (like myself) who is very familiar with the SOP of calling Mindtree to clean up a PBI mess, there are probably two or three users who are NOT familiar with it. Those people are fending for themselves, and are probably not recovering any of their data in a similar scenario. I think there is a "moral hazard" that is present, given that the Microsoft PG assumes that Mindtree will step in and clean up all the messes that arise from the incorrect use of this git integration.
>> I'm assuming that I can call Mindtree and get my workspace restored, right?
Clock's ticking. 7 days in the "hot" recycle bin, 30 days max retention period.