LearninPowerBI
New Member

Question about how Dataflows and Datasets work

Hello fellow Power BI users,

I just started using Power BI dataflows and have some questions that I am hoping someone can answer.

I have two dataflows, Dataflow A and Dataflow B. Dataflow A connects to SQL Server 'XYZ' and a PostgreSQL DB 'PDB'. Dataflow B connects to a few tables from Dataflow A, some additional Oracle tables, and an Excel file. What I noticed is that it also connects to SQL Server 'XYZ' directly and brings in a few tables that could have been taken from Dataflow A, but I am not sure why the developer chose to connect to the SQL Server separately. I verified that the tables and underlying data are the same in both places. Now I have been tasked with optimizing Dataflow B, as it is consuming a lot of resources on the Premium capacity. To do that, I need to get a few things clarified.

 

1. Can we have a standalone dataflow, i.e. one with no dataset being populated at the end?

2. Does the data from a dataflow get saved anywhere? I believe yes, in Azure Data Lake Storage Gen2.

3. When we trigger a refresh of a dataflow that uses another dataflow (in my case, a refresh of Dataflow B), does it run the other dataflow as well (Dataflow A)? Or will it simply get the data from the Gen2 storage where the output of Dataflow A is saved? (Related to my second question above.)

4. If I have both dataflows set to refresh every 2 hours, will there be any conflict?

5. Will using tables from the existing dataflow, instead of connecting to the database tables again, help reduce overall refresh time and capacity resource utilization?

 

Thanks!

1 ACCEPTED SOLUTION
lbendlin
Super User

1. Yes, but why would you do that?

2. Yes, as Parquet files in the Azure cloud. They are not directly accessible.

3. No. Refreshing Dataflow B does not run Dataflow A; B reads whatever output A last stored.

4. Define "conflict". They will potentially be out of sync. But refreshes for dataflows and datasets do not affect the "current" data until the refresh completes successfully, at which point the new data is swapped in. If the refresh fails, the "current" data continues to be available.

5. Depends. The purpose of a dataflow is to shield you (the developer) from slow data sources. It does nothing for your report users, and it is not very useful if you have a high-performance data source.
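To make points 3 and 4 concrete, here is a toy Python model of the behaviour described above. It is only a sketch of the idea, not how Power BI is implemented: `ToyDataflow`, `refresh`, `current`, and the sample rows are all hypothetical names made up for illustration. It shows that refreshing B reads A's last stored output without refreshing A, that B can therefore lag behind the source, and that a failed refresh leaves the previous output in place.

```python
from typing import Callable, Optional


class ToyDataflow:
    """Hypothetical stand-in for a dataflow; not a Power BI API."""

    def __init__(self, name: str, load: Callable[[], list]):
        self.name = name
        self._load = load                      # pulls rows from the source(s)
        self.current: Optional[list] = None    # last successfully stored output

    def refresh(self) -> bool:
        try:
            staged = self._load()   # build the new output off to the side
        except Exception:
            return False            # failure: `current` is left untouched
        self.current = staged       # success: swap the new output in
        return True


source_rows = [1, 2]
a = ToyDataflow("A", lambda: list(source_rows))
# B reads A's stored output; refreshing B never calls a.refresh().
b = ToyDataflow("B", lambda: [r * 10 for r in a.current])

a.refresh()
b.refresh()

source_rows.append(3)
b.refresh()                 # A has not refreshed yet, so B is out of sync
stale_b = list(b.current)

a.refresh()
b.refresh()                 # now B picks up the new row via A's output
fresh_b = list(b.current)


def broken_load() -> list:
    raise RuntimeError("source unavailable")


a._load = broken_load
failed_ok = a.refresh()     # the refresh fails, A keeps its previous output
```

In this model, scheduling both dataflows at the same time does not break anything, but B may pick up A's previous output if A has not finished yet. Scheduling B after A is one way to reduce that lag.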


2 REPLIES
LearninPowerBI
New Member

Thank you!
