Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Enhance your career with this limited time 50% discount on Fabric and Power BI exams. Ends August 31st. Request your voucher.

Reply
Anonymous
Not applicable

Performance issues while using one dataflow as source for many Datasets

We have a dataflow that's a bit large, with around 55 million records and we use it as a data source for four datasets. 

All datasets use Import mode to get the data from this dataflow. 

Three of those, use only this dataflow as a data source. The fourth uses this same dataflow plus another dataflow which is small with ~100k records only.

 

We have noticed that when we trigger those four datasets to refresh concurrently, the performance is much worse then if we trigger them individually, one at a time. At first we thought it could be a matter of server resources, but we tried running those on an almost dedicated premium P1 capacity and we still found issues, including an error with the following message:

Data source error:The operation was throttled by Power BI Premium because there were too many datasets being processed concurrently. Please try again later.


To put that to the test, we tried refreshing 5 big datasets, all having big dataflows as sources as well, but different dataflows, not the same one.  We did not see the same performance issues in that test, refresh times weren't far away from the expected.

 

I know dataflows are CSV files in Azure Data Lake in the background, so I'm wondering if anybody knows if there's some good practice to follow here, with regards to sharing a dataflow to be imported by many datasets. Is there such a thing as a dataset locking down a dataflow? For example, making it unavailable for other datasets?

If that is the case, then the more concurrent dataset refreshes are executed using the same dataflow, the slower the performance would be. If that is so, I'd be surprised since one of the main advantages of a dataflow is exactly reusability. 

 

Are we missing something here? Is there some setting in the premium capacity that can enhance parallel reading of a dataflow for datasets?

3 REPLIES 3
V-lianl-msft
Community Support
Community Support

Hi @Anonymous ,

 

See if it will help you.

https://radacad.com/how-to-use-dataflow-to-make-the-refresh-of-power-bi-solution-faster 

 

Best Regards,
Liang
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

Anonymous
Not applicable

Hi Liang,

 

The article you shared only speaks about creating dataflows and then using them as a datasource for your datasets, which is what I stated we're already doing. 

 

Thanks,

Daniel

lbendlin
Super User
Super User

Your P1 capacity has a very limited set of renderers - six concurrent and 2400 per hour. What I would do is investigate if your dataset refresh is calling the dataflow twice, similar to what is described in this blog https://blog.crossjoin.co.uk/2020/07/05/why-is-power-bi-running-my-sql-query-twice/ . 

 

Have you tried to use the dataflow source in direct query mode?

Helpful resources

Announcements
July PBI25 Carousel

Power BI Monthly Update - July 2025

Check out the July 2025 Power BI update to learn about new features.

Join our Fabric User Panel

Join our Fabric User Panel

This is your chance to engage directly with the engineering team behind Fabric and Power BI. Share your experiences and shape the future.

June 2025 community update carousel

Fabric Community Update - June 2025

Find out what's new and trending in the Fabric community.