LearninPowerBI
New Member

Question about how Dataflows and Dataset work

Hello fellow PowerBI users,

I just started using Power BI dataflows and have some questions that I am hoping someone can answer.

I have two dataflows, Dataflow A and Dataflow B. Dataflow A connects to SQL Server 'XYZ' and a PostgreSQL DB 'PDB'. Dataflow B connects to a few tables from Dataflow A, some additional Oracle tables, and an Excel file. What I noticed is that it also connects directly to SQL Server 'XYZ' and brings in a few tables that could have been taken from Dataflow A; I am not sure why the developer chose to connect to the SQL Server separately. I verified that the tables and the underlying data are the same in both places. Now I have been tasked with optimizing Dataflow B, as it is consuming a lot of resources on the Premium capacity. To do that, I need a few things clarified.

 

1. Can we have a standalone dataflow, i.e. one with no dataset being populated at the end?

2. Does the data from a dataflow get saved anywhere? I believe yes, in Azure Data Lake Storage Gen2.

3. When we trigger a refresh of a dataflow that uses another dataflow (in my case, a refresh of Dataflow B), does it run the other dataflow as well (in my case, Dataflow A), or will it simply get the data from the Gen2 storage where the output of Dataflow A is saved? (Related to my second question above.)

4. If I have both dataflows set to refresh every 2 hours, will there be any conflict?

5. Will using tables from the existing dataflow, instead of connecting to the database tables again, help reduce overall refresh time and capacity resource utilization?

 

Thanks!

1 ACCEPTED SOLUTION
lbendlin
Super User

1. Yes, but why would you do that?

2. Yes, as Parquet files in the Azure cloud. They are not directly accessible.

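3. No. Refreshing Dataflow B does not trigger a refresh of Dataflow A; it reads the output already stored for Dataflow A.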

4. Define "conflict". They will potentially be out of sync. But refreshes for dataflows and datasets do not affect the "current" data until the refresh completes successfully, at which point the new data is swapped in. If the refresh fails, the "current" data continues to be available.

5. Depends. The purpose of a dataflow is to shield you (the developer) from slow data sources. It does nothing for your report users, and is not very useful if you have a high-performance data source.
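
To make the swap-in behaviour from point 4 concrete, here is a toy Python sketch. It only models the idea (stage the new data, replace the "current" copy on success, leave it alone on failure) and is not how Power BI is actually implemented. The class `DataflowStore`, its `refresh` method, and the `failing_source` helper are hypothetical names invented for this illustration.

```python
class DataflowStore:
    """Toy model of swap-on-success refresh (an assumption, not Power BI code)."""

    def __init__(self, initial_rows):
        # "current" is what downstream consumers read at any moment.
        self.current = list(initial_rows)

    def refresh(self, load_fn):
        """Run load_fn; swap its result in only if it finishes without error."""
        try:
            staged = list(load_fn())  # build the new copy off to the side
        except Exception:
            return False              # refresh failed: "current" is untouched
        self.current = staged         # refresh succeeded: swap the new copy in
        return True


def failing_source():
    # Stands in for a data source that is down during the refresh.
    raise RuntimeError("source unavailable")


store = DataflowStore(["row-1"])
ok_first = store.refresh(lambda: ["row-1", "row-2"])  # succeeds, data swapped in
ok_second = store.refresh(failing_source)             # fails, old data kept
```

After the second call, `store.current` still holds the rows from the first successful refresh, which is the "current data continues to be available" case. Two dataflows on the same 2-hour schedule would each behave this way independently, so the worst case is that Dataflow B reads a slightly older copy of Dataflow A's output.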


2 REPLIES
LearninPowerBI
New Member

Thank you!
