Anonymous
Not applicable

What is the proper use case for Dataflows?

My organization is fairly new to Power BI, and I'm not 100% sure we're using Dataflows as they're intended. For instance, we have a star schema and want to consume that data through Power BI. However, instead of hitting the tables directly on DB2, we created a Dataflow for each dimension and fact table, grouped the Dataflows into categorical linked entities, and scheduled refreshes. The only transformation happening to the data along the way is friendly naming.

 

The first hurdle we hit was that our fact tables were too large to load into a Dataflow. Now we have this massive star schema modeled, with all the dimension Dataflows refreshing, but the dataset is too large to refresh because it exceeds our available memory.

 

In the past, with other BI tools, we would have built the model against the tables directly. Without truly knowing the best use case for a Dataflow, I can't help but think we're using Dataflows for something they weren't intended for. We are essentially creating a copy of the database in the service, hoping to gain some optimization through Dataflows. Does this sound like a practical application of this tool? I can't help but feel we should have used DirectQuery against the tables on DB2 from the start.

3 REPLIES
Greg_Deckler
Super User

@Anonymous - In general, Dataflows were created by Microsoft to make queries reusable. Originally, if you had 5 Power BI PBIX files, you might have the same query and transformations in each one, so if the source data changed and you needed to make adjustments, you ended up modifying all 5 PBIX files. Dataflows give you a single place to make those modifications, and you can then reuse that dataflow in each of your PBIX files.
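To illustrate the reuse pattern: once a dataflow exists, each PBIX consumes it through the Power BI dataflows connector in Power Query rather than repeating the source query. A minimal M sketch, assuming a hypothetical "Finance" workspace, "Dimensions" dataflow, and "DimCustomer" entity (the navigation field names in your generated code may differ, e.g. IDs instead of names):

```powerquery
let
    // Connect to dataflows the signed-in user can access
    Source = PowerBI.Dataflows(null),
    // Navigate workspace -> dataflow -> entity (all names are placeholders)
    Workspace = Source{[workspaceName = "Finance"]}[Data],
    Dataflow = Workspace{[dataflowName = "Dimensions"]}[Data],
    DimCustomer = Dataflow{[entity = "DimCustomer"]}[Data]
in
    DimCustomer
```

The point is that the source query and transformations live once in the dataflow; every PBIX just navigates to the finished entity.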

 

So, that's the original intent of dataflows, or what they originally brought to the table. Now, if you are having issues with the size of your data source, you have 2 options:

1. See if you can implement incremental refresh

2. Move to DirectQuery
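For option 1: incremental refresh is configured in the Power BI UI, but it depends on the query filtering the source by the reserved `RangeStart`/`RangeEnd` DateTime parameters, which Power BI supplies per partition at refresh time. A sketch in Power Query (M), assuming a hypothetical DB2 ODBC DSN, fact table, and `OrderDate` column; your connector and names will differ:

```powerquery
let
    // RangeStart and RangeEnd must be defined as DateTime parameters in Power Query
    Source = Odbc.DataSource("dsn=DB2_DSN", [HierarchicalNavigation = true]),
    Fact = Source{[Name = "SALES_FACT"]}[Data],
    // Use >= and < (not <=) so rows on a boundary are not loaded twice.
    // If this filter folds to DB2, each partition pulls only its own rows.
    Filtered = Table.SelectRows(Fact,
        each [OrderDate] >= RangeStart and [OrderDate] < RangeEnd)
in
    Filtered
```

This only helps if the filter folds back to the source; if it doesn't, the service still reads the full table before filtering.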

 

If you move to DirectQuery, dataflows are basically out of the picture, unless you use the preview feature that lets you DirectQuery a dataflow, but that's kind of a different thing.



Anonymous
Not applicable

@Greg_Deckler  - Thanks for the advice.  Are you saying there is an incremental refresh option for the dataset?  I know of the incremental refresh option for dataflows, but the dataflows refresh fine.  We only run into memory issues when we attempt to refresh the dataset built from the dataflows.

 

We enabled the enhanced compute engine and tried DirectQuery against the dataflows themselves. We ended up using about 1/3 of the tables with DirectQuery and 2/3 as dataflows, and still exceeded our memory. We may just switch over to DirectQuery against DB2 and see how that works.

Hi @Anonymous ,

 

1. For the role of Dataflows, you can study this document:

“As data volume continues to grow, so does the challenge of wrangling that data into well-formed, actionable information. We want data that’s ready for analytics, to populate visuals, reports, and dashboards, so we can quickly turn our volumes of data into actionable insights. With self-service data prep for big data in Power BI, you can go from data to Power BI insights with just a few clicks...”

Self-service data prep in Power BI 

 

2. “We only run into memory issues when we attempt to refresh the dataset built from the dataflows.”

Maybe your local computer’s configuration cannot load such a large dataset. You can refer to these links and try to reduce the query cache:

Are Your Power BI Performance Issues Due To High Memory Consumption? 

Ten Techniques for Optimising Memory Usage in Microsoft Power BI 

Performance Tip for Power BI; Enable Load Sucks Memory Up 

 

Incremental refresh in Power BI 

 

Best regards,
Lionel Chen

If this post helps, then please consider accepting it as the solution to help other members find it more quickly.
