buildbod
Most Valuable Professional

Dataflow size limit

I'm considering breaking one of my larger models into a series of Dataflows to improve the overall refresh performance of the model. I was wondering whether an individual dataflow has a size limit. Also, if several dataflows are combined into a single model, will refreshing the model force a refresh of the associated dataflows, or will they refresh independently based on their individual refresh frequencies?

11 REPLIES
aldupler
Microsoft Employee

The biggest limit I've run into is the refresh timeout (2 hours). I'm running a lot of native queries against AAS. These run slow as it is, but my feeling is that dataflows are slower than a regular service refresh. As a result, I've broken a big model up into entities across 5-6 different dataflows in the same workspace, then staggered the refresh of each by two hours over the course of the night. Now everything runs fine.
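
For anyone curious, a native query against AAS from a dataflow is written in M along these lines. This is a minimal sketch only; the server URL, model name, and DAX query are placeholders, not my actual setup:

let
    // Placeholder AAS server and model; the [Query] option sends a native
    // DAX query, which AAS evaluates and returns row by row
    Source = AnalysisServices.Database(
        "asazure://westus.asazure.windows.net/myserver",
        "MyModel",
        [Query = "EVALUATE 'Sales'"]
    )
in
    Source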

Hi there

Yes, extracting data out of AAS can potentially be very slow if you are using it as a row-by-row data extraction tool.

Staggering the dataflow refreshes like that will work.





GilbertQ
Super User

Hi there

From what I understand, there is a size limit in terms of storage depending on whether you have Power BI Pro (10 GB) or Power BI Premium (100 TB).

If you break the model up, each dataflow would sit in its own separate data store.

As it currently stands, you would then manage the refresh of the dataflows to ensure that the data is up to date. If you are using Power BI Premium, you could use Linked Entities or Computed Entities (depending on requirements).
Then you would need to import the dataflows into your Power BI Desktop file (PBIX) and upload that to the Power BI Service.
Once in the Power BI Service, you would then schedule the refresh for your PBIX file.
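
To give a rough idea of the import step, when you connect from Power BI Desktop (Get Data > Power BI dataflows) the generated M looks something like the sketch below; the GUIDs and the entity name are placeholders that Desktop fills in for you:

let
    Source = PowerBI.Dataflows(null),
    // Placeholder workspace and dataflow IDs
    Workspace = Source{[workspaceId = "00000000-0000-0000-0000-000000000000"]}[Data],
    Dataflow = Workspace{[dataflowId = "11111111-1111-1111-1111-111111111111"]}[Data],
    // "Sales" stands in for whichever entity you pick
    Sales = Dataflow{[entity = "Sales"]}[Data]
in
    Sales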





buildbod
Most Valuable Professional

Thank you. That confirms my thinking.

 

Does the refresh of the PBIX model force a refresh of the associated dataflows or will they refresh independently based on their individual refresh frequencies?

Hi there

The only way they would be refreshed together is if you use Power BI Premium and Linked Entities.





As long as you don't append the "partitions" until you ingest the data into a model, you're golden.

Hi,

 

Can you please explain the reason behind the suggestion not to append the results of the different dataflows/partitions?

 

KR Alex

 

buildbod
Most Valuable Professional

I've successfully broken my large model out into a number of Dataflows and they are all refreshing on their own schedules 🙂 I am facing a new challenge: any Dataflow over about 300 MB will refresh in the Dataflows workload and display in the Power BI Desktop Query Editor, but fails to load when I apply the query in Power BI Desktop. The error is:

 

Failed to save modifications to the server. Error returned: 'OLE DB or ODBC error: [DataSource.Error] Received an unexpected EOF or 0 bytes from the transport stream..'.

 

When applying, it will happily load the data up to a point - normally close to the full size, since it pauses for a long time at the same value - and then crashes with the error. I'm going to break the larger items out into smaller Dataflows and then combine them as a new table in Power BI Desktop. That should reduce the sizes being transferred over the wire.
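
For what it's worth, combining the smaller dataflow entities back into a single table in Desktop is a one-step append in M. A minimal sketch, where Sales2018, Sales2019, and Sales2020 are hypothetical queries that each point at one of the smaller dataflows:

let
    // Each input query loads one of the smaller dataflow entities; disable
    // load on the inputs so only the combined table lands in the model
    Combined = Table.Combine({Sales2018, Sales2019, Sales2020})
in
    Combined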

I have larger tables, so you might want to put in a support ticket.
buildbod
Most Valuable Professional

Yes. I've opened a ticket and I am awaiting a solution.

No. The schedules are separate. You can even have multiple dataflows with separate schedules feed the same report, which can give you an extremely crude approximation of incremental refresh without Premium capacity.
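
As a sketch of that pattern (all names and the cutoff date here are hypothetical): one dataflow holds history and refreshes rarely, another holds recent rows and refreshes frequently, and the model appends the two:

let
    // SalesHistory comes from a rarely refreshed dataflow, SalesRecent from
    // a frequently refreshed one; the cutoff is a placeholder matching
    // however the two dataflows split the data
    CutOff = #date(2024, 1, 1),
    OldRows = Table.SelectRows(SalesHistory, each [OrderDate] < CutOff),
    Combined = Table.Combine({OldRows, SalesRecent})
in
    Combined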
