Hello fellow PowerBI users,
I just started using PowerBI dataflows and have some questions which I am hoping someone can answer.
I have two dataflows, Dataflow A and Dataflow B. Dataflow A connects to SQL Server 'XYZ' and a PostgreSQL database 'PDB'. Dataflow B connects to a few tables from Dataflow A, plus some additional Oracle tables and an Excel file. What I noticed is that it also connects to SQL Server 'XYZ' directly and brings in a few tables that it could have taken from Dataflow A, and I am not sure why the developer chose to connect to the SQL Server separately. I verified that the tables and underlying data are the same in both places. Now I have been tasked with optimizing Dataflow B, as it is consuming a lot of resources on the Premium capacity. To do that, I need a few things clarified.
1. Can we have a standalone dataflow, i.e. one with no dataset being populated at the end?
2. Does the data from a dataflow get saved anywhere? I believe it does, in Azure Data Lake Storage Gen2.
3. When we trigger a refresh of a dataflow that uses another dataflow (in my case, a refresh of Dataflow B), does it run the other dataflow as well (in my case, Dataflow A), or will it simply get data from the Gen2 storage where the output of Dataflow A is saved? (Related to my 2nd question above.)
4. If I have both dataflows set to refresh every 2 hours, will there be any conflict?
5. Will using tables from the existing dataflow help reduce overall refresh time and capacity resource utilization, compared with connecting to the database tables again?
Thanks!
1. Yes, but why would you do that?
2. Yes, as Parquet files in the Azure cloud. Not directly accessible.
3. No.
4. define "conflict". They potentially will be out of sync. But refreshes for dataflows and datasets do not impact the "current" data until the refresh completes successfully at which point the data will be swapped in. If the refresh fails then the "current" data continues to be available.
5. Depends. The purpose of a dataflow is to shield you (the developer) from slow data sources. It does nothing for your report users, and is not very useful if you have a high performance data source.
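Since refreshing Dataflow B does not trigger Dataflow A (point 3), two independent 2-hour schedules can leave B reading stale output from A. One way to keep them in step is to orchestrate the refreshes yourself with the Power BI REST API: start A, poll until its latest refresh transaction finishes, then start B. Below is a minimal sketch under stated assumptions; the workspace ID, dataflow IDs, and Azure AD access token are placeholders you would supply, and error handling is kept to a minimum.

```python
# Hypothetical sketch: refresh Dataflow A, wait for it to complete, then
# refresh Dataflow B via the Power BI REST API. GROUP_ID, DATAFLOW_*_ID
# and the bearer token below are placeholders, not values from the thread.
import json
import time
import urllib.request

API = "https://api.powerbi.com/v1.0/myorg"

def refresh_url(group_id, dataflow_id):
    # POSTing to this endpoint starts a dataflow refresh.
    return f"{API}/groups/{group_id}/dataflows/{dataflow_id}/refreshes"

def transactions_url(group_id, dataflow_id):
    # GETting this endpoint lists recent refresh transactions and statuses.
    return f"{API}/groups/{group_id}/dataflows/{dataflow_id}/transactions"

def _call(url, token, method="GET", body=None):
    # Small helper around urllib for authenticated JSON calls.
    req = urllib.request.Request(
        url,
        data=json.dumps(body).encode() if body is not None else None,
        method=method,
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        raw = resp.read()
        return json.loads(raw) if raw else None

def refresh_and_wait(group_id, dataflow_id, token, poll_seconds=60):
    # Kick off the refresh, then poll the newest transaction until it
    # is no longer in progress; return its final status string.
    _call(refresh_url(group_id, dataflow_id), token, method="POST",
          body={"notifyOption": "MailOnFailure"})
    while True:
        time.sleep(poll_seconds)
        latest = _call(transactions_url(group_id, dataflow_id), token)["value"][0]
        if latest["status"] != "InProgress":
            return latest["status"]

# Orchestration: A finishes before B starts, so B always reads fresh data.
# refresh_and_wait(GROUP_ID, DATAFLOW_A_ID, token)
# refresh_and_wait(GROUP_ID, DATAFLOW_B_ID, token)
```

Alternatively, simply staggering the two schedules (e.g. A on the hour, B at half past) avoids most drift without any scripting, as long as A reliably finishes before B begins.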
Thank you!