March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Early bird discount ends December 31.
Register NowBe one of the first to start using Fabric Databases. View on-demand sessions with database experts and the Microsoft product team to learn just how easy it is to get started. Watch now
Hi,
I have a question about implementing Incremental Refresh on a dataset that connects solely to Dataflows. I tried setting it up but it appears Dataflows do not support query folding.
Few questions:
- Is it possible in any way to get Incremental Refresh working on a dataset that runs on Dataflows at this moment?
- If not, is there anything on the roadmap to enable this?
- If not, is there any way to refresh only part of a Dataflow using a workaround?
It feels really weird to have all the data in Azure Data Lake (Dataflows) but not being able to load it into a dataset due to memory issues.
Hi @Anonymous ,
You can refer the following documentation to configure incremental refresh for dataflow:
Using incremental refresh with Power BI dataflows
Incremental Refresh for Dataflows
Best Regards
Rena
@yingyinr Hi - you did not answer the actual question.
@Anonymous is asking whether a user can connect a dataset (using PBI desktop) to an existing dataflow and then turn on incremental refresh on the dataset.
We can have incremental refresh datasets for database sources, but can we do the same against a dataflow source? It seems to allow me to do so when I tested it, but I cannot find any official documentation on the subject.
@v-yiruan-msft Did you actually read my message? I'm not trying to set up Incremental Refresh in Dataflows, I'm trying to set it up for a dataset that uses Dataflows as a data source.
Could you provide any information on that or perhaps even address my actual questions?
Hey @v-yiruan-msft,
Would you happen to have an answer on the questions I asked? Or maybe a colleague could help you out?
Thanks so much in advance.
Hi @Anonymous,
Sorry for delay. There are two ways to configure incremental refresh: one is configure increment refresh in Power BI Desktop (it is available for Pro users) and another is use increment refresh with Power BI dataflows(currently only for Premium users). Currently you are connecting to dataflow as data source in Power BI Desktop, since the original data source is the dataflow, you can just configure increment refresh for the dataflow in Power BI Service.
Best Regards
Rena
Hi @v-yiruan-msft,
Thanks for your answer. It is absolutely clear to me that I can set up incremental refresh in dataflows. This is what I currently do. So the main benefit of this is that we only refresh the last 14 days of data each morning (approx. 2 million rows of data). This is working well and I'm happy with the feature.
However, our total dataset consists of approximately 70 million rows of data. And each morning, when I'm refreshing my dataset, it will refresh all of these 70 million rows of data. Even when I implement incremental refresh on the dataset, it will still refresh all rows. What I would expect it to do is refresh only the last 14 days of data, so that it only has to refresh 500 MB of data, instead of the entire ~7 GB.
How can I make sure that my dataset will only refresh the part that is refreshed in dataflows (the incremental part)?
Hi @Anonymous ,
According to the official documentation, it is recommend to use data sources which support query folding when configure increment refresh. Otherwise, it may execute the complete refresh... You can check the following sentences:
Query folding
It's important the partition filters are pushed to the source system when queries are submitted for refresh operations. To push filtering down means the datasource should support query folding. Most data sources that support SQL queries support query folding. However, data sources like flat files, blobs, and web feeds typically do not. In cases where the filter is not supported by the datasource back-end, it cannot be pushed down. In such cases, the mashup engine compensates and applies the filter locally, which may require retrieving the full dataset from the data source. This can cause incremental refresh to be very slow, and the process can run out of resources either in the Power BI service or in the on-premises data gateway if used.
So please check if the original data sources using in dataflow support query folding or not. You can refer this video to check it.
In addition, hope the following documenations can help you.
Power BI Incremental Refresh and Query Folding
Power BI - Checking Query Folding with View Native Query
Best Regards
Rena
Hi @v-yiruan-msft ,
Thanks again for your answer.
I do want to emphasize that I'm not trying to set up incremental refresh in Dataflows. I already got that working. I'm trying to set up incremental refresh in the dataset that connects to dataflows. So, because dataflows is the source, the question is if dataflows supports query folding. Does it?
Hi @Anonymous ,
Yes, please check whether the underlying data source used in the dataflow support query folding or not. For the data sources that do not support query folding, you can set incremental refresh, but it may retrieve the entire data anyways.
Best Regards
Rena
@v-yiruan-msft How is the underlying data source relevant? I'm connecting to Dataflows. The dataset doesn't care where I get the data from before that point, does it?
The question is, when I'm refreshing my dataset, do the queries it sends to dataflows get folded and is Incremental Refresh possible in this case? Or will it always load in the entire dataflow?
We have the very same question (dataset incremental refresh using dataflows as source, regardless of how the dataflow itself gets its data).
Has this been answered?
The simple answer is NO.
Many have tried workarounds, but nothing official from @microsoft.
@pat_mecee Have you tested this? I ran a test and it seemed to work, but I did not proceed (my responsibility is just to create the dataflow sources and I do not use PBI Desktop or datasets much).
It would be nice to have official communication here.
"Dataflow incremental refresh and dataset incremental refresh are designed to work in tandem. It's acceptable and supported to have an incrementally refreshing entity in a dataflow, fully loaded into a dataset, or a fully loaded entity in a dataflow incrementally loaded to a dataset.
Both approaches work according to your specified definitions in the refresh settings."
March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount!
Your insights matter. That’s why we created a quick survey to learn about your experience finding answers to technical questions.
Arun Ulag shares exciting details about the Microsoft Fabric Conference 2025, which will be held in Las Vegas, NV.
User | Count |
---|---|
39 | |
26 | |
15 | |
11 | |
10 |
User | Count |
---|---|
58 | |
51 | |
23 | |
14 | |
11 |