I have a staging workspace where I set up a dataflow that imports data from an on-prem SQL Server database. On one table I set up an incremental refresh and it takes 30 seconds to update.
In a separate workspace dataflow, I linked to that same incrementally refreshed table. However, it takes 3 minutes to update.
Do I need to set up another incremental refresh for that linked table?
Edit: The linked table has a Referenced table associated with it. So do I set up an incremental refresh on the Referenced table?
Hey @arock-well ,
If there are chains of dataflows and you want shorter refresh durations in the subsequent stages, you have to configure incremental refresh for each instance of that table.
I do not understand your 2nd question, as I do not know how the referenced table will impact the increment, or how it will be impacted by it.
Hopefully, this provides some help to tackle your challenge.
Regards,
Tom
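For reference, the incremental refresh policy Tom mentions relies on each query filtering by the `RangeStart` and `RangeEnd` datetime parameters that Power BI manages for you. A minimal sketch of that pattern for the staging dataflow; the server, database, table, and `ModifiedDate` column names are assumptions, not from the original post:

```m
// Minimal sketch of the query shape incremental refresh expects.
// RangeStart and RangeEnd are the datetime parameters Power BI manages;
// server, database, table, and column names are placeholders.
let
    Source = Sql.Database("on-prem-server", "StagingDb"),
    Sales = Source{[Schema = "dbo", Item = "Sales"]}[Data],
    // This filter can fold back to SQL Server, so only the
    // changed date range is read on each refresh
    Filtered = Table.SelectRows(
        Sales,
        each [ModifiedDate] >= RangeStart and [ModifiedDate] < RangeEnd
    )
in
    Filtered
```

Each instance of the table that should refresh incrementally (including linked/referenced copies in downstream dataflows) needs its own policy defined over an equivalent filter.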
@TomMartens Do I also need to set up a separate incremental refresh on any datasets that connect to an already incrementally refreshed dataflow? This post says there is no need to do that, but I'm interested in your take on it: Solved: Re: Help with incremental refresh dataflow and dat... - Microsoft Fabric Community
Hey @arock-well ,
as always, it depends. A dataflow is a data source from the perspective of the dataset. From my experience, loading data from a dataflow is often much faster than importing data from the data source directly. The reason for this is simple: all the heavy lifting is done inside the dataflow. Loading data from a dataflow into a table should always be like "SELECT * FROM TABLE." There is no need for any transformation, no need for data shaping.
But then, if you add 10k rows incrementally to a dataflow that already holds 1BN rows, then of course loading all 1BN rows into the dataset takes more time than adding only the increment.
Your choice: you pay for a faster dataset refresh with added solution complexity.
What I do: I load dataflows incrementally, to get the cleaned and beautified data as fast as possible, then I give the full load to the dataset a try.
The reason for this: I always want to keep my solution as fast, but also as simple, as possible.
Hopefully, this helps you find your way.
Regards,
Tom
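If you do decide the dataset should also refresh incrementally, the dataset table needs the same parameter-driven filter before the policy can be enabled on it. A hedged sketch, assuming the dataflow is consumed through the Power Platform dataflows connector; the workspace, dataflow, entity, and column names are placeholders:

```m
// Dataset-side sketch: the same RangeStart/RangeEnd pattern, applied to
// the table loaded from the dataflow. Workspace, dataflow, and entity
// names are placeholders; adjust the navigation steps to your tenant.
let
    Source = PowerPlatform.Dataflows(null),
    Workspaces = Source{[Id = "Workspaces"]}[Data],
    Workspace = Workspaces{[workspaceName = "Staging"]}[Data],
    Dataflow = Workspace{[dataflowName = "StagingFlow"]}[Data],
    Sales = Dataflow{[entity = "Sales", version = ""]}[Data],
    Filtered = Table.SelectRows(
        Sales,
        each [ModifiedDate] >= RangeStart and [ModifiedDate] < RangeEnd
    )
in
    Filtered
```

Note that this filter is applied after the data leaves the dataflow, so unlike a SQL source it does not fold; the gain comes from the dataset only loading the partitions the policy marks as changed, which is the trade-off Tom describes above.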
@TomMartens Thank you for that. I'm assuming I would also need to set up an incremental refresh for any datasets that use that computed entity, too?