Aventuran
New Member

Gen 1 incremental Refresh Migration to Fabric

Hi everyone,

 

We are currently evaluating the migration of our architecture from Power BI Dataflows Gen1 to Microsoft Fabric (Dataflows Gen2), and we are facing several challenges. We would really appreciate feedback or guidance based on your experience.

 

### Current situation
- Our reports are built in import mode using Dataflows Gen1.
- Most of our dataflows use incremental refresh:
  - We store ~2 years of historical data
  - We refresh only the last ~3 months
- Data sources are on-premises SQL Server, many of them exposed via stored procedures.
- We trigger dataflow refreshes using Power Automate (since data availability time varies daily).

 

### Issues encountered in Fabric

 

1. **Incremental refresh in Dataflows Gen2**
   - From what we understand, incremental refresh requires:
     - A destination such as Lakehouse/Warehouse.
     - Query folding and a Date/DateTime column.
   - In our case:
     - We are using stored procedures as source
     - We are unable to get incremental refresh working properly
   - Additionally, the storage/retention behavior differs from Gen1, so we need to rethink the architecture (e.g. landing data in a Lakehouse).
   - In our tests, Dataflows Gen2 appear slower than Gen1 for similar workloads.
 

 

 
2. We are considering switching to pipelines (Copy Activity, etc.)
   - However:
     - Our data loads do not finish at a fixed time
     - We currently rely on Power Automate to trigger Gen1 dataflows dynamically

### Our concerns

 

Given these limitations, we are unsure about the best approach:

 

- Should we fully migrate to Dataflows Gen2?
- Should we move to pipelines + Lakehouse instead of Dataflows?
- Is it recommended to keep Gen1 for certain scenarios (e.g. incremental refresh + Power Automate trigger)?

 

Thanks in advance!

 

 

1 ACCEPTED SOLUTION
tayloramy
Super User

Hi @Aventuran

 

While Dataflow Gen1 is still supported, it is no longer receiving improvements or new development effort, so it is a good idea to start thinking about a migration to Gen2. Microsoft has committed to giving at least 12 months' notice before retiring Gen1 dataflows, and that notice has not yet been given, so you have some time to figure out your migration path.

 

There are some major architectural differences that you need to work with when using dataflow gen 2, mainly that dataflow gen 2 does not have internal storage like gen 1 does, making a data store like a lakehouse or a warehouse mandatory. 

 

For your scheduling problems, Dataflow Gen2 can be scheduled with a pipeline, or you can use the Job Scheduler API to trigger it from Power Automate and keep your existing scheduling: https://learn.microsoft.com/en-us/rest/api/fabric/core/job-scheduler/run-on-demand-item-job?tabs=HTT...

 

Copy activities and almost all other Fabric Data Factory items can be run the same way, so if you want to move to copy activities and scheduling is your only roadblock, you can trigger them from Power Automate using this API.
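As a rough illustration of what that call looks like, here is a minimal Python sketch that builds the "run on demand item job" request. The function names, the placeholder IDs, and the token handling are assumptions for illustration; in Power Automate the equivalent would be an HTTP action with the same URL and an `Authorization` header. The `job_type` value depends on the item: `Pipeline` is the value used for data pipelines, and the value for a Dataflow Gen2 should be checked in the API documentation rather than taken from this sketch.

```python
import urllib.request

FABRIC_API = "https://api.fabric.microsoft.com/v1"


def build_run_job_request(workspace_id, item_id, job_type, token):
    """Build (but do not send) the POST that starts an on-demand job for a Fabric item.

    workspace_id, item_id: GUIDs of the workspace and the item (placeholders here).
    job_type: item-specific job type string, e.g. "Pipeline" for a data pipeline.
    token: a Microsoft Entra access token for the Fabric API (acquiring it is out of scope).
    """
    url = (
        f"{FABRIC_API}/workspaces/{workspace_id}"
        f"/items/{item_id}/jobs/instances?jobType={job_type}"
    )
    return urllib.request.Request(
        url,
        method="POST",
        headers={"Authorization": f"Bearer {token}"},
    )


def run_job(request):
    """Send the request and return the Location header, which points at the job instance."""
    with urllib.request.urlopen(request) as response:
        return response.headers.get("Location")
```

The request is accepted asynchronously, so a caller that needs to know when the load finished would poll the returned job instance URL instead of assuming the job is done when the call returns.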

 

The decision to move to pipelines instead of dataflows really depends on your data, how much transformation is needed, and which features you need.

 

As for the incremental data refresh, you will need some sort of watermark column to determine when rows were added, removed, or changed for any sort of incremental load to work nicely. If the built-in incremental loads of copy jobs or Dataflow Gen2 don't work for you, then it might be worth switching your incremental loads to notebooks instead, which will give you far more control over how the data is loaded and how the incremental window is determined.
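To make the watermark idea concrete, here is a minimal sketch in plain Python rather than in a Fabric notebook. The column names `id` and `modified_at` and the function name are hypothetical; in a real notebook the target would be a Lakehouse table and the upsert would typically be a merge, but the logic is the same: take only rows newer than the stored watermark, apply them, and store the new watermark.

```python
from datetime import datetime


def incremental_upsert(target, source_rows, last_watermark,
                       key="id", watermark="modified_at"):
    """Apply rows changed since last_watermark to target (a dict keyed by `key`).

    Returns the new watermark to persist for the next run.
    """
    new_watermark = last_watermark
    for row in source_rows:
        if row[watermark] > last_watermark:
            target[row[key]] = row  # insert a new row or overwrite a changed one
            new_watermark = max(new_watermark, row[watermark])
    return new_watermark


# Example run: one changed row, one new row, one row that is already loaded.
target = {1: {"id": 1, "amount": 10, "modified_at": datetime(2026, 1, 5)}}
source = [
    {"id": 1, "amount": 12, "modified_at": datetime(2026, 1, 12)},
    {"id": 2, "amount": 7, "modified_at": datetime(2026, 1, 11)},
    {"id": 3, "amount": 3, "modified_at": datetime(2026, 1, 2)},
]
new_mark = incremental_upsert(target, source, datetime(2026, 1, 10))
```

Note that a watermark alone does not reveal hard deletes in the source; those usually need a soft-delete flag or a periodic full comparison.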

 






5 REPLIES 5
v-abhinavmu
Community Support

Hi @Aventuran,

May I check whether this issue has been resolved? If not, please feel free to contact us with any further questions.


Thank you

v-abhinavmu
Community Support

Hi @Aventuran,

Thank you for posting your query in the Microsoft Fabric Community Forum, and thanks to @lbendlin & @stoic-harsh & @tayloramy for sharing valuable insights.

 

Could you please confirm if your query has been resolved by the provided solutions? This would be helpful for other members who may encounter similar issues.

 

Thank you for being part of the Microsoft Fabric Community.

lbendlin
Super User

Think about what incremental refresh is: automated partition management based on a temporal axis.

 

Think about what Dataflows Gen1 are: CSV files in Azure Blob storage.

 

So your process amounts to a round of musical chairs where you reshape existing partitions (your SPs) and then do incremental refresh just to end up with another set of "partitions" (the CSV files). 

 

A simple approach would be to do your own partition management (as you are already doing with the SPs) and then load these partitions into a lakehouse, where you can refresh or overwrite them individually. Then in your warehouse and/or semantic model you can recombine the partitions as needed.
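As a sketch of what "your own partition management" could look like for the setup described in the question (about two years kept, the last three months reloaded), the helper below computes monthly partition keys. The function name and the `YYYY-MM` key format are assumptions for illustration; each key would map to whatever partition unit you choose in the lakehouse.

```python
from datetime import date


def month_keys(today, keep_months=24, refresh_months=3):
    """Return (all_keys, refresh_keys) as 'YYYY-MM' strings, oldest first.

    all_keys: every monthly partition inside the retention window.
    refresh_keys: the most recent partitions to reload and overwrite.
    """
    index = today.year * 12 + (today.month - 1)  # month counter since year 0
    keys = [
        f"{m // 12:04d}-{m % 12 + 1:02d}"
        for m in range(index - keep_months + 1, index + 1)
    ]
    return keys, keys[-refresh_months:]


all_keys, refresh_keys = month_keys(date(2026, 6, 15))
# refresh_keys is ['2026-04', '2026-05', '2026-06']; all_keys starts at '2024-07'
```

Partitions that exist in the lakehouse but are no longer in `all_keys` are the ones to drop, and only those in `refresh_keys` need to be requested from the stored procedures on each run.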

stoic-harsh
Solution Supplier

Hey @Aventuran,

Gen1 should work just fine, but since it won't receive any updates, you should plan a gradual migration toward Gen2 / Fabric-native architecture rather than building new dependencies on Gen1. Also, there is no need to consider pipelines if dataflows meet your requirements.

A setup I'd suggest:

  • Keep Gen1 temporarily for critical workloads at the beginning.

  • Convert stored procedures to views wherever possible. Views and tables support query folding in Gen2.

    • Parameterized SPs may work too, but are less reliable.
    • Also, since query folding breaks for SPs, that might explain the slower performance of Gen2 in your tests.
  • Pipelines are a long-term solution for SP-heavy scenarios. As long as dataflows meet your requirements, you don't need to push for pipelines.
    • Consider pipelines only if your setup can't move away from SPs (for instance, if the business rules are too complex to put into views).
    • Power Automate can still trigger Fabric items via REST APIs, and Dataflow Gen2 can be called within pipelines, so triggering refreshes based on events won't be a problem.
  • Gradually move everything from Gen1 to newer Fabric-native items.

Hope this helps!

Best,

Harshit

