Starting December 3, join live sessions with database experts and the Microsoft product team to learn just how easy it is to get started
Learn moreShape the future of the Fabric Community! Your insights matter. That’s why we created a quick survey to learn about your experience finding answers to technical questions. Take survey.
I have a dataflow with 8+ sources.
Within this dataflow, over time I've moved from a development server to a production server, and I've referenced both dev and prod databases.
I've also noticed that I can reference a server & database in a single steps (Source = Sql.Database( <server>, <db> )) instead of two steps, and so I've edited all my entity queries to use this more concise approach. It seems that this single-line reference produces a distinct 'data source' in the lineage view from a 2-line approach, which I understand.
The solution is described in the documentation:
Troubleshooting dataflow issues - connection to the data source - Power Query | Microsoft Learn
Hi @FrankPreusker , thanks for sharing this link. Wasn't easy but it ended up working for me at the end. I'll share my experience, hoping it can help others.
If like me your solution is 100% cloud based, you can skip to the very end of the page Frank refered to in his previous message because it's all for On-premises datasource... except for the "Purple Note" at the bottom on the page. I followed the instructions from that note and it worked ! Not easy I must say, but hey, better then nothing.
My case is a 4 year old dataflow that has a tone of old connections artefacts :
The current Query is has simple as it gets. It uses one connection to another dataflow from another PBI env. There's not trace anywhere of the connections you see in the lineage.
The instructions say : "To remove this connection, in the Power BI service, select the settings icon, go to Manage Connections and Gateways > Connections select the ellipsis menu by the connection name, then select Removeto remove the old/unused data source connection".
The challenge here is to figure out which connections are "old/unused", so you can safely delete. Because I'm the main developper of this solution since it started 4 years ago, I have a pretty good idea. But that screen had 38 connections listed and I can see how it could get daunting.
HINT! Something I discovered later. The little circular arrows under the Status column are buttons that allow you to check the connection status. And you can even get more details on the status by clicking on it after it displays. It not bullet proof method but I think it helps in the decision making regarding which connections you can safely delete.
I did all the deleting I thought was OK (I kept 10 of the 38, not bad for a 4year old clean up). Then I went back to the lineage view and it was all clean !
Note that I didn't even have to do this part of the instruction : "Then go to the Home page, select the workspace, select the dataflow to open it, and then in the Table name page, select Edit tables. Then in the Power Query editor, select Save & Close for the dataflows to be updated to the current data source connection and to remove the older connection."
Also note that the active datasource is not showing. I think that's because connections to another dataflow from another PBI workspace is current not supported in the lineage view... sounds weird right? Let me know if you know more about that last point.
Hope this can help someone.
Have a nice day,
--mo
I created an idea, please vote if you want this issue to be fixed:
https://ideas.fabric.microsoft.com/ideas/idea/?ideaid=1eedc819-8bf2-ee11-a73e-6045bd7cb2b6
done! ty
I am having the same issue now. I am hoping Microsoft will fix this.
I tried the project options workaround mentioned in another comment, but it didn't work in my case.
The only thing that worked for me was exporting the dataflow as a json file and creating a new dataflow from the json, but creating a new dataflow will break downstream usage so I don't want to do that.
I know this is an old post, but figured I would share something that worked for me...
One thing that I did to remove old SQL connections on a dataflow was when EDITing the existing table on the dataflow, I went into OPTIONS > PROJECT OPTIONS, changed the DATA GATEWAY to "(none)", clicked OK.
I then received a "Credentials are required to connect to the SQL source" error message on the query.
I went back into OPTIONS > PROJECT OPTIONS, changed the DATA GATEWAY back to the correct gateway that contains the connection.
Received a "We could not evaluate this query due to invalid or missing credentials" message.
Clicked on CONFIGURE CONNECTION...the connection to my correct gateway popped up, I clicked on CONNECT, and then clicked on SAVE and CLOSE.
After the validation completed and I closed the power query editor screen and dataflow, the lineage view no longer contains the old connections and it only contains the current active connection.
I did this successfully on 20+ dataflows that recently had a SQL server change (same query, just new server name).
Hopefully this helps.
This worked for me too, with one small modification. It actually wouldn't allow me to change the gateway to "None." I tried several times, but every time I clicked "OK", it automatically reverted back to the same gateway that was selected before. What actually worked was changing it to a different gateway, and then changing it back again. Thank you for pointing me in the right direction!
Your suggestion about taking "None", then you get the creds issue, then you select the desired gateway, worked for me. MUCH APPRECIATED!!!
@JenM , you're right, the problem is not resolved.
@v-kelly-msft , the workarounds you suggested don't address the issue at hand.
Here's my similar use case:
The Import from CDS dataflows has 2 entities from the same exact source (source A in the image below). The lineage shows an old artefact of a previously used connection that is no longer active in this dataflows. This is very confusing for the people responsible to maintain it!
When I go to Edit Entities, I can confirm via the Advanced Editor that both entitites are connected to source A.
I also tried to go to the Settings view of this dataflow and it gets worst there! There's a warning saying I can't refresh this dataflow because one of the credentials are invalid. Well, the invalid one is NOT used anymore. And to add to the confusion, there is also a 3rd source listed here which is neither used in that dataflow nor showing up in the lineage view.
I'm desperate to fix this and any help would be greatly appreciated!
Thank you.
Hi @JenM ,
Actually there's no such functionality to work it out,but I have a workaround,that is, you can click on the bottom right corner which I marked in red circle,then you will see that all related to the current datasource will be highlighted,and others will be grayed out.
Here is the related reference.
https://docs.microsoft.com/en-us/power-bi/collaborate-share/service-data-lineage
Sorry, but your workaround doesn't address the bug. If I click on the arrows for the dataflow, all 7 upstream data sources get highlighted. Only one of them is valid, the others should not exist, and they provide bad intelligence.
My (horrible) workaround is to export the dataflow as json, delete the dataflow and then reimport the json to a new dataflow. When I do this, the lineage diagram correctly shows just a single datasource for the dataflow. Of course this breaks every single usage of the dataflow, so in a high-usage scenario this would be unacceptible.
Hi @JenM ,
Is your issue solved?
No, the problem is not resolved. Then only way to remove obsolete data sources is to export and reimport the dataflow (as JSON). But that breaks any downstream usage...
Hi @JenM ,
If so ,I have a simple way for you:
Connect the dateflow in desktop,then go to data source setting and clear your data source then reenter it:
Finally save as a new report or cover the previous report and publish to service.
I guess this way will help solve your issue.
The question was how to remove old Data Sources from a Power BI Dataflow, not a Report in Power BI Desktop. Please see the answer posted by @EG151 above, which actually resolved the issue: Re: Obsolete data sources in dataflow lineage - Microsoft Fabric Community
Hi @JenM ,
You can edit the power queries and you will be able to change the data source.
Thanks,
Sai Teja
User | Count |
---|---|
37 | |
32 | |
20 | |
11 | |
8 |
User | Count |
---|---|
52 | |
42 | |
28 | |
12 | |
11 |