Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Be one of the first to start using Fabric Databases. View on-demand sessions with database experts and the Microsoft product team to learn just how easy it is to get started. Watch now

Reply
JenM
Advocate II
Advocate II

Obsolete data sources in dataflow lineage

I have a dataflow with 8+ sources.
Within this dataflow, over time I've moved from a development server to a production server, and I've referenced both dev and prod databases.

 

I've also noticed that I can reference a server & database in a single steps (Source = Sql.Database( <server>, <db> )) instead of two steps, and so I've edited all my entity queries to use this more concise approach. It seems that this single-line reference produces a distinct 'data source' in the lineage view from a 2-line approach, which I understand.

 
My issue is that my lineage view shows every data source I've ever used. I expect it to show the current datasource only - In this case, it seems like it should just show one (#3):
 
  1. currentserver.database.windows.net
  2. DevDB-currentserver.database.windows.net
  3. ProdDB-currentserver.database.windows.net
  4. ProdDB-obsoleteserver.database.windows.net
What am I missing?
16 REPLIES 16
FrankPreusker
Advocate III
Advocate III

Hi @FrankPreusker , thanks for sharing this link. Wasn't easy but it ended up working for me at the end.  I'll share my experience, hoping it can help others.

 

If like me your solution is 100% cloud based, you can skip to the very end of the page Frank refered to in his previous message because it's all for On-premises datasource... except for the "Purple Note" at the bottom on the page. I followed the instructions from that note and it worked !  Not easy I must say, but hey, better then nothing.

 

My case is a 4 year old dataflow that has a tone of old connections artefacts :

molegris_0-1720708363082.png

The current Query is has simple as it gets.  It uses one connection to another dataflow from another PBI env. There's not trace anywhere of the connections you see in the lineage.

molegris_1-1720708444996.png

 

The instructions say : "To remove this connection, in the Power BI service, select the settings icon, go to Manage Connections and Gateways > Connections select the ellipsis menu by the connection name, then select Removeto remove the old/unused data source connection". 

 

The challenge here is to figure out which connections are "old/unused", so you can safely delete. Because I'm the main developper of this solution since it started 4 years ago, I have a pretty good idea. But that screen had 38 connections listed and I can see how it could get daunting. 

molegris_2-1720708904137.png

 

HINT! Something I discovered later. The little circular arrows under the Status column are buttons that allow you to check the connection status. And you can even get more details on the status by clicking on it after it displays. It not bullet proof method but I think it helps in the decision making regarding which connections you can safely delete.

molegris_3-1720709090763.png

 

I did all the deleting I thought was OK (I kept 10 of the 38, not bad for a 4year old clean up). Then I went back to the lineage view and it was all clean !

Note that I didn't even have to do this part of the instruction : "Then go to the Home page, select the workspace, select the dataflow to open it, and then in the Table name page, select Edit tables. Then in the Power Query editor, select Save & Close for the dataflows to be updated to the current data source connection and to remove the older connection."

molegris_4-1720709495873.png

Also note that the active datasource is not showing. I think that's because connections to another dataflow from another PBI workspace is current not supported in the lineage view... sounds weird right? Let me know if you know more about that last point.

 

Hope this can help someone.

Have a nice day,

 

--mo

 

 

 

frithjof_v
Community Champion
Community Champion

I created an idea, please vote if you want this issue to be fixed:

 

https://ideas.fabric.microsoft.com/ideas/idea/?ideaid=1eedc819-8bf2-ee11-a73e-6045bd7cb2b6

done!  ty

frithjof_v
Community Champion
Community Champion

I am having the same issue now. I am hoping Microsoft will fix this.

 

I tried the project options workaround mentioned in another comment, but it didn't work in my case.

The only thing that worked for me was exporting the dataflow as a json file and creating a new dataflow from the json, but creating a new dataflow will break downstream usage so I don't want to do that.

EG151
Frequent Visitor

I know this is an old post, but figured I would share something that worked for me...

One thing that I did to remove old SQL connections on a dataflow was when EDITing the existing table on the dataflow, I went into OPTIONS > PROJECT OPTIONS, changed the DATA GATEWAY to "(none)", clicked OK. 

I then received a "Credentials are required to connect to the SQL source" error message on the query. 

I went back into OPTIONS > PROJECT OPTIONS, changed the DATA GATEWAY back to the correct gateway that contains the connection.

Received a "We could not evaluate this query due to invalid or missing credentials" message.

Clicked on CONFIGURE CONNECTION...the connection to my correct gateway popped up, I clicked on CONNECT, and then clicked on SAVE and CLOSE. 

After the validation completed and I closed the power query editor screen and dataflow, the lineage view no longer contains the old connections and it only contains the current active connection.

I did this successfully on 20+ dataflows that recently had a SQL server change (same query, just new server name).

Hopefully this helps.

This worked for me too, with one small modification. It actually wouldn't allow me to change the gateway to "None." I tried several times, but every time I clicked "OK", it automatically reverted back to the same gateway that was selected before. What actually worked was changing it to a different gateway, and then changing it back again. Thank you for pointing me in the right direction!

Your suggestion about taking "None", then you get the creds issue, then you select the desired gateway, worked for me.   MUCH APPRECIATED!!!

molegris
Advocate IV
Advocate IV

@JenM , you're right, the problem is not resolved.

@v-kelly-msft , the workarounds you suggested don't address the issue at hand.

 

Here's my similar use case:

 

The Import from CDS dataflows has 2 entities from the same exact source (source A in the image below).  The lineage shows an old artefact of a previously used connection that is no longer active in this dataflows.  This is very confusing for the people responsible to maintain it!

 

molegris_0-1595352097748.png

When I go to Edit Entities, I can confirm via the Advanced Editor that both entitites are connected to source A.  

I also tried to go to the Settings view of this dataflow and it gets worst there!  There's a warning saying I can't refresh this dataflow because one of the credentials are invalid.  Well, the invalid one is NOT used anymore.  And to add to the confusion, there is also a 3rd source listed here which is neither used in that dataflow nor showing up in the lineage view.

 

I'm desperate to fix this and any help would be greatly appreciated!

 

Thank you.

v-kelly-msft
Community Support
Community Support

Hi @JenM ,

 

Actually there's no such functionality to work it out,but I have a workaround,that is, you can click on the bottom right corner which I marked in red circle,then you will see that all related to the current datasource will be highlighted,and others will be grayed out.

 

Untitled.png

 

Here is the related reference.

https://docs.microsoft.com/en-us/power-bi/collaborate-share/service-data-lineage

 

Best Regards,
Kelly
Did I answer your question? Mark my post as a solution!

 

Sorry, but your workaround doesn't address the bug. If I click on the arrows for the dataflow, all 7 upstream data sources get highlighted. Only one of them is valid, the others should not exist, and they provide bad intelligence.

 

My (horrible) workaround is to export the dataflow as json, delete the dataflow and then reimport the json to a new dataflow. When I do this, the lineage diagram correctly shows just a single datasource for the dataflow. Of course this breaks every single usage of the dataflow, so in a high-usage scenario this would be unacceptible.

Hi @JenM

 

Is your issue solved?

 

Best Regards,
Kelly
Did I answer your question? Mark my post as a solution!
 

No, the problem is not resolved. Then only way to remove obsolete data sources is to export and reimport the dataflow (as JSON). But that breaks any downstream usage...

Hi @JenM ,

 

If so ,I have a simple way for you:

Connect the dateflow in desktop,then go to data source setting and clear your data source then reenter it:

Annotation 2020-04-07 094342.pngAnnotation 2020-04-07 094858.png

 

Finally save as a new report or cover the previous report and  publish to service.

I guess this way will help solve your issue.

 

Best Regards,
Kelly
Did I answer your question? Mark my post as a solution!

The question was how to remove old Data Sources from a Power BI Dataflow, not a Report in Power BI Desktop. Please see the answer posted by @EG151 above, which actually resolved the issue: Re: Obsolete data sources in dataflow lineage - Microsoft Fabric Community

Hi @JenM ,

 

You can edit the power queries and you will be able to change the data source.

 

Thanks,

Sai Teja 

Helpful resources

Announcements
Las Vegas 2025

Join us at the Microsoft Fabric Community Conference

March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount!

Dec Fabric Community Survey

We want your feedback!

Your insights matter. That’s why we created a quick survey to learn about your experience finding answers to technical questions.

ArunFabCon

Microsoft Fabric Community Conference 2025

Arun Ulag shares exciting details about the Microsoft Fabric Conference 2025, which will be held in Las Vegas, NV.

December 2024

A Year in Review - December 2024

Find out what content was popular in the Fabric community during 2024.