Does incremental refresh do anything if I don't have a data destination?
Is a data destination required for incremental refresh to work? I'm able to set up incremental refresh without a data destination, but does that actually do anything? My issue is that incremental refresh doesn't work for lakehouse destinations (and their very useful automatic schema updates).
For example:
Dataflow A has incremental refresh enabled, but without a destination. Dataflow B gets data from Dataflow A, and Dataflow B's destination is a lakehouse.
I now have what I want: incremental refresh with a lakehouse destination. But is the incremental refresh in Dataflow A actually doing anything if there's no data destination?
Hi @mrozzano,
Incremental refresh requires a data destination to work. Without one, the incremental refresh setting has nowhere to store the refreshed data, so the setting effectively does nothing.
In your scenario, Dataflow A has incremental refresh enabled but no destination, while Dataflow B gets its data from Dataflow A and writes to a lakehouse. Because Dataflow A has no destination to store the refreshed data, its incremental refresh won't actually do anything.
For incremental refresh to work properly, you need to ensure that the dataflow has a valid destination, so that the refreshed data can be stored and accessed by subsequent dataflows or reports.
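As a rough mental model of this point (a minimal sketch with made-up names, not the Fabric API), incremental refresh with no destination is effectively a no-op because the evaluated rows have nowhere to land:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Dataflow:
    incremental_refresh: bool
    destination: Optional[str]   # e.g. "lakehouse", "warehouse", or None

    def refresh(self, changed_rows: list) -> list:
        """Return what actually gets persisted by one refresh run."""
        if self.destination is None:
            return []            # nowhere to store the refreshed data
        return changed_rows      # only the changed rows are written

dataflow_a = Dataflow(incremental_refresh=True, destination=None)
print(dataflow_a.refresh(["row 1", "row 2"]))   # [] -> the setting has no effect
```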
For more details, you can refer to the documentation below:
Incremental refresh in Dataflow Gen2 - Microsoft Fabric | Microsoft Learn
Best Regards,
Adamk Kong
If this post helps, then please consider accepting it as the solution to help other members find it more quickly.
Thank you for clarifying!
Some feedback: because a data destination is required to use incremental refresh, I suggest dimming/disabling that menu item until an appropriate data destination (warehouse, Azure SQL database) is selected, and only then making incremental refresh selectable.
Otherwise it contradicts the documentation or causes confusion: I've enabled incremental refresh without selecting a data destination. Hence my question: is it doing anything?
An incremental dataflow refresh works as follows:
"
- Evaluate Changes: The dataflow compares the maximum value in the change detection column with the previous refresh. If the value has changed, the bucket is marked for processing.
- Retrieve Data: The dataflow retrieves data for the changed buckets in parallel, loading it into the staging area.
- Replace Data: The dataflow replaces the data in the destination with the new data, ensuring only the updated buckets are affected. Any historical data or data that is outside the range of buckets marked for processing is not touched or changed. This way you can retain long term history in your destination.
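To make these three steps concrete, here is a minimal Python/pandas sketch of the same idea. The bucket granularity, column names, and data are made up for illustration; this is not the Dataflow Gen2 engine or its API, just the logic it describes:

```python
import pandas as pd

def incremental_refresh(source: pd.DataFrame,
                        destination: pd.DataFrame,
                        last_max_per_bucket: dict) -> pd.DataFrame:
    """Simulate one incremental refresh run over date buckets."""
    refreshed = destination.copy()
    for bucket, rows in source.groupby("bucket"):
        current_max = rows["modified_at"].max()           # 1) evaluate changes
        if current_max == last_max_per_bucket.get(bucket):
            continue                                       # bucket unchanged: leave it alone
        # 2) retrieve the data for the changed bucket (here: just take its rows)
        # 3) replace only that bucket in the destination; historical buckets are untouched
        refreshed = pd.concat([refreshed[refreshed["bucket"] != bucket], rows],
                              ignore_index=True)
        last_max_per_bucket[bucket] = current_max
    return refreshed

state = {}
dest = pd.DataFrame(columns=["bucket", "id", "modified_at"])
src = pd.DataFrame({"bucket": ["2024-01-01", "2024-01-02"],
                    "id": [1, 2],
                    "modified_at": ["09:00", "09:05"]})
dest = incremental_refresh(src, dest, state)   # both buckets load
dest = incremental_refresh(src, dest, state)   # nothing changed: no buckets reprocessed
```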
If you don't have a data destination, the dataflow cannot replace the updated values. I have not tested this, but I suspect the rows that changed will simply be inserted into your second dataflow's destination. Those inserted rows could be "duplicates" of rows from previous loads (differing in at least the column you selected for change detection per bucket), and nothing gets replaced. Your lakehouse will therefore eventually contain more rows than the original source, and it will keep growing as time passes.
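Continuing the sketch above (again with made-up names), an append-only downstream load illustrates that growth: nothing is ever replaced, so every re-pulled version of a row is kept:

```python
import pandas as pd

def downstream_append(lakehouse: pd.DataFrame, pulled: pd.DataFrame) -> pd.DataFrame:
    # Append-only load: whatever the upstream hands over just gets added.
    return pd.concat([lakehouse, pulled], ignore_index=True)

lakehouse = pd.DataFrame(columns=["bucket", "id", "modified_at"])
first_pull  = pd.DataFrame({"bucket": ["2024-01-01"], "id": [1], "modified_at": ["09:00"]})
second_pull = pd.DataFrame({"bucket": ["2024-01-01"], "id": [1], "modified_at": ["10:00"]})  # same record, changed later

lakehouse = downstream_append(lakehouse, first_pull)
lakehouse = downstream_append(lakehouse, second_pull)
print(len(lakehouse))   # 2 rows for 1 logical record -> the table keeps growing
```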
