smpa01
Super User

Dataflow Gen2 incremental refresh parameters

smpa01_0-1750877517088.png

 

I am trying to understand this conceptually and came up with the following. Is my understanding correct? I am simply trying to work out which fields to use for the filter column, the change-detection column, and the correct sliding-window parameter for a mutable data source, so that the dataflow ends up optimized. If someone from MS could take a look and verify, that would be great.

 

I understand the above screenshot as below:

Assumption:
The filter column (OrderDate) is used with the assumption that the data is mutable (i.e., updates can happen to past records), and that all changes will occur within a sliding window of the past 14 days.

If updates happen beyond the 14-day window, those changes will not be picked up unless you adjust the “Extract data from the past” setting accordingly.

 

1. 🔍 Find only recent rows from new_data / the current load by utilizing the Filter and Extract parameters

SELECT * FROM new_data
WHERE OrderDate >= DATEADD(day, -14, GETDATE())

2. 📦 Split that data into daily buckets by utilizing Bucket Parameter

GROUP BY CAST(OrderDate AS DATE)

3. 🧠 For each bucket, find the latest change using the change-detection column

MAX(ModifiedDate)

4. 🔁 Compare with what was previously loaded

 

For each daily bucket:

- If MAX(ModifiedDate) from new_data ≠ MAX(ModifiedDate) from old_data → reload that bucket

- If they’re equal → skip that bucket
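The per-bucket comparison described in steps 2–4 can be sketched as a small Python simulation. This is illustrative only: the sample rows, the dates, and names such as `buckets_to_reload` and `old_max` are invented for the example; it mimics the behavior described above, not Fabric's actual engine.

```python
from datetime import date, datetime

# Rows arriving in the current load (inside the 14-day window)
new_data = [
    {"OrderDate": date(2025, 6, 20), "ModifiedDate": datetime(2025, 6, 25, 9, 0)},
    {"OrderDate": date(2025, 6, 20), "ModifiedDate": datetime(2025, 6, 21, 8, 0)},
    {"OrderDate": date(2025, 6, 21), "ModifiedDate": datetime(2025, 6, 21, 7, 0)},
]

# MAX(ModifiedDate) per daily bucket as recorded at the previous refresh
old_max = {
    date(2025, 6, 20): datetime(2025, 6, 21, 8, 0),  # stale: source changed since
    date(2025, 6, 21): datetime(2025, 6, 21, 7, 0),  # unchanged
}

def buckets_to_reload(rows, previous):
    # Step 2: GROUP BY CAST(OrderDate AS DATE) → daily buckets
    new_max = {}
    for r in rows:
        b = r["OrderDate"]
        new_max[b] = max(new_max.get(b, r["ModifiedDate"]), r["ModifiedDate"])
    # Steps 3-4: reload only buckets whose MAX(ModifiedDate) differs
    return [b for b, m in new_max.items() if previous.get(b) != m]

print(buckets_to_reload(new_data, old_max))  # [datetime.date(2025, 6, 20)]
```

Only the 2025-06-20 bucket is reloaded, because its MAX(ModifiedDate) moved since the previous refresh; the 2025-06-21 bucket is skipped.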

 

Sliding Window consideration for mutable data

Filter-by column is utilized with the assumption that the user is working with mutable data that has a sliding window of 14 days.
If updates happen outside this window, they will not be picked up unless the window is increased.
Any changes beyond this window will not be refreshed incrementally.
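The window cut-off can be shown with a minimal sketch (the refresh date and the sample orders are assumptions made up for the example):

```python
from datetime import date, timedelta

today = date(2025, 6, 26)              # assumed refresh time for this example
window_start = today - timedelta(days=14)

orders = [
    {"OrderDate": date(2025, 6, 24), "just_updated": True},  # inside the window
    {"OrderDate": date(2025, 5, 1),  "just_updated": True},  # updated, but outside
]

# Only rows whose filter column falls inside the sliding window are re-evaluated
in_window = [o for o in orders if o["OrderDate"] >= window_start]
print(len(in_window))  # 1  -> the May 1 update is silently missed
```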

 

My use case

⚠️ The maximum number of buckets is 50 and I must set the bucket size to Day, so the furthest I can go back is a 50-day sliding window (i.e., the timeframe of data that might change and should be re-evaluated during each refresh).

 

If I set the bucket size to Month (anything other than Day), incremental refresh doesn't care which day changed, just whether the whole month did. So the data ends up not being refreshed. Is that correct?
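To illustrate the granularity point, here is a hypothetical sketch of how a bucket key collapses day-level changes at Month granularity (the `bucket_key` helper is invented for illustration, not a Fabric API):

```python
from datetime import date

def bucket_key(d, granularity):
    # Day buckets track changes per day; Month buckets collapse a whole month
    return d if granularity == "day" else (d.year, d.month)

changed_days = [date(2025, 6, 3), date(2025, 6, 17)]

day_buckets = {bucket_key(d, "day") for d in changed_days}      # two buckets reload
month_buckets = {bucket_key(d, "month") for d in changed_days}  # one bucket: all of June
print(len(day_buckets), len(month_buckets))  # 2 1
```

With Month granularity both changed days fall into the single (2025, 6) bucket, so any change anywhere in June triggers a reload of the entire month's data.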

 

If I have mutable data where a row keeps an old timestamp (i.e., the source application does not update it when application users change other fields of the same row), will those updates be excluded by this incremental refresh because the timestamp is beyond the 50-day window? Is that correct?

 

Thank you in advance.

1 ACCEPTED SOLUTION
v-hashadapu
Community Support

Hi @smpa01 , Thank you for reaching out to the Microsoft Community Forum.

 

Yes, your understanding of incremental refresh is correct and aligns with how it works in Microsoft Fabric and Power BI. When dealing with mutable data, using a filter column like OrderDate defines a sliding window, for example, the last 14 or 50 days. Only data within this window is considered during each refresh. Any changes outside it will be ignored unless you increase the window.

 

Within that window, the data is logically split into buckets based on your chosen granularity, typically daily in your case. For each bucket, the system checks whether the maximum value of ModifiedDate has changed compared to the last refresh. If it has, the bucket is refreshed; if not, it’s skipped. This is how Fabric efficiently refreshes only the changed data.

 

You're also right that monthly buckets behave differently: if anything in the month changes, the entire month's data is refreshed. It doesn't track day-level changes within the month, so daily buckets are better for precise, recent updates, especially when you're working within the 50-bucket system limit. You are also correct that if ModifiedDate isn't updated when a record changes, the system will miss that change entirely. Incremental refresh relies fully on the accuracy of this column to detect updates.

 

Incremental refresh and real-time data for semantic models

Incremental refresh in Dataflow Gen2

Overview of query evaluation and query folding in Power Query

Direct Lake overview

How Direct Lake mode works with Power BI reporting

 

If this helped solve the issue, please consider marking it “Accept as Solution” so others with similar queries may find it more easily. If not, please share the details, always happy to help.
Thank you.

lbendlin
Super User

The filter column (OrderDate) is used with the assumption that the data is mutable (i.e., updates can happen to past records), and that all changes will occur within a sliding window of the past 14 days.

Incremental refresh fundamentally only works with immutable partition boundary fields.

Yes, that's right. Immutable data is easy to manage (just append new rows after yesterday's rows). Managing mutable data is more challenging, which is why the Delta Merge or UPSERT concept comes into play in a db/lakehouse environment. But there are sources that can only be consumed by a dataflow, and making sense of incremental refresh for that same mutable data source scenario was the objective of creating this post. In business cases one works not only with immutable but also with mutable data.


at the end of the day this falls back onto manual partition management. That is something you can do with Semantic Models.  I am not aware of a way to do that with the buckets of DF Gen2.

