Reply
SamD02
Regular Visitor

Incremental Refresh in Power BI Dataflow Gen1 has been refreshing for the past 13 hours

Hi All,

I have a Gen1 Dataflow containing monthly data from 2016-09 onward. A new month's data loads on the 1st of each month, containing that month's data. The table is quite large, more than 21M records.

I want to go with incremental refresh, with settings like: store 30 years of data, refresh only the past 2 days of data (since each month's rows are all dated yyyy-MM-01).

With the above configuration, my dataflow has been refreshing for the past 13 hours. I know the first load takes time, but is this amount of time normal?

Is there something wrong with my configuration settings that is causing this? 

Appreciate any advice/suggestions!

3 ACCEPTED SOLUTIONS
lbendlin
Super User

If the data source for your incremental refresh does not fold then you will not see any performance improvements, you will only get the "free" partition management.

 

NOTE:  The partition size limit is 10 GB.  Your amount of data is not nearly large enough for incremental refresh to make much sense (unless your data source has timeout issues)
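To make the folding point concrete, a source query only benefits from incremental refresh when the date filter can be pushed down to the source. A minimal sketch in Power Query M, assuming a SQL source (the server, database, table, and column names here are hypothetical; in a Gen1 dataflow the service injects an equivalent date filter on the column chosen in the incremental refresh settings, rather than the explicit RangeStart/RangeEnd parameters used by datasets):

```m
// Sketch only: hypothetical SQL source and column names.
let
    Source = Sql.Database("myserver", "mydb"),
    Sales  = Source{[Schema = "dbo", Item = "Sales"]}[Data],
    // Filtering on the native column directly after the source step
    // lets the filter fold into a WHERE clause at the source, so only
    // the requested partition's rows are ever retrieved.
    Filtered = Table.SelectRows(
        Sales,
        each [LoadDate] >= RangeStart and [LoadDate] < RangeEnd
    )
in
    Filtered
```

If an earlier step breaks folding (for example, a native query or a custom function), the service has to pull all rows and filter locally, which is why a non-folding source sees no speedup.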


rohit1991
Super User

Hi @SamD02 ,
It's expected that the initial load for incremental refresh in Power BI Dataflow Gen1 can take a significant amount of time, especially with a large dataset like yours containing over 21 million records. Since you're storing 30 years of data and applying a refresh policy for only the past 2 days, the first refresh has to load and partition the entire historical dataset, which is resource-intensive and time-consuming. A 13-hour refresh duration for the first load, while long, may still be within the realm of normal depending on the complexity of your transformations, data source performance, and dataflow design.

 

However, to optimize performance, ensure that your queries are folded properly (i.e., query folding is applied), avoid unnecessary transformations before filters are applied, and consider breaking down the data into smaller, more manageable tables if possible. Once the initial load is done, subsequent refreshes should be significantly faster as only recent partitions will be refreshed.
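The "apply filters before other transformations" point can be sketched in M (server, table, and column names are hypothetical):

```m
// Sketch: keep foldable steps (filters, column selection) first,
// and put folding-breaking steps (e.g. arbitrary custom columns) last,
// so they run only on the already-reduced row set.
let
    Source   = Sql.Database("myserver", "mydb"),
    Sales    = Source{[Schema = "dbo", Item = "Sales"]}[Data],
    Filtered = Table.SelectRows(
        Sales,
        each [LoadDate] >= #datetime(2016, 9, 1, 0, 0, 0)
    ),
    Kept     = Table.SelectColumns(Filtered, {"LoadDate", "Amount"}),
    // Non-folding step last: computed locally, but only on filtered rows.
    WithYear = Table.AddColumn(Kept, "Year", each Date.Year([LoadDate]), Int64.Type)
in
    WithYear
```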

 

Passionate about leveraging data analytics to drive strategic decision-making and foster business growth.

Connect with me on LinkedIn: Rohit Kumar.


v-pnaroju-msft
Community Support

Thank you, @lbendlin and @rohit1991, for your responses.

Hi @SamD02,

We appreciate your inquiry through the Microsoft Fabric Community Forum.
In addition to the information provided by @lbendlin and @rohit1991 , please follow the steps below to optimize performance:

  1. Adjust the historical data range. If your data begins in 2016, storing 30 years of history may create unnecessary overhead. Consider reducing the stored-history value to 10 years to cut partition overhead and streamline metadata management.

  2. Since your data consistently loads dated as yyyy-MM-01, consider increasing the refresh window to 5 days (from the current 2 days) to provide a buffer and avoid missed partitions when a refresh runs late.

  3. If feasible, consider upgrading to Dataflow Gen2, which leverages an enhanced compute engine for faster refreshes and improved scalability.
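The buffer logic behind step 2 can be illustrated with simple date arithmetic in M (the dates are hypothetical):

```m
// Illustration: with a 2-day refresh window, a refresh that runs on the
// 4th of the month only covers dates on or after the 2nd, so rows dated
// yyyy-MM-01 fall outside the window and are never picked up.
let
    RefreshDate = #date(2025, 7, 4),              // hypothetical run date
    WindowStart = Date.AddDays(RefreshDate, -2),  // 2-day policy -> 2025-07-02
    MonthStamp  = #date(2025, 7, 1),              // how the monthly rows are dated
    IsCovered   = MonthStamp >= WindowStart       // false: the load is missed
in
    IsCovered
```

A 5-day window keeps the 1st inside the refreshed range even if the refresh slips a few days.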

Additionally, for better insights into what might be slowing down the refresh, please utilize Power Query diagnostics or the Dataflow refresh history in the workspace.

Kindly refer to the following link for more information:
Dataflows best practices - Power BI | Microsoft Learn

If you find our response helpful, please mark it as the accepted solution and provide kudos. This will assist other community members facing similar queries.

Thank you.


6 REPLIES 6
v-pnaroju-msft
Community Support

Hi SamD02,

We are following up to see if your query has been resolved. Should you have identified a solution, we kindly request you to share it with the community to assist others facing similar issues.

If our response was helpful, please mark it as the accepted solution and provide kudos, as this helps the broader community.

Thank you.




