Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Did you hear? There's a new SQL AI Developer certification (DP-800). Start preparing now and be one of the first to get certified. Register now

Reply
M98
New Member

Dataflow Gen2 Incremental Load

I’m performing an incremental load in a Dataflow Gen2 using the LastModifiedDate column from the source table. This column is stored in Unix epoch format, so I convert it to a full datetime using this Power Query transformation:

 

#datetime(1970,1,1,0,0,0) + #duration(0,0,0,[LastModifiedDate]/1000)

 

I use this converted datetime to compare against a source reference table to determine which rows are new or updated, and then load only those rows into the destination Lakehouse table.

 

But, when I run the dataflow, it causes the SQL Server to crash, it seems unable to handle the load. My question is: could this failure be caused by the incremental load not capturing all changes correctly, or maybe my incremental load is not configured correctly? In other words, is it possible that the reference table used for the incremental logic is out of sync with the destination table, leading to this issue?

1 ACCEPTED SOLUTION

The process has already been folded, but I suspect the SQL crash could also be due to concurrent access from other applications connected to the same SQL server. The additional load might have contributed to the failure.

View solution in original post

8 REPLIES 8
v-ssriganesh
Community Support
Community Support

Hello @M98,

Hope everything’s going great with you. Just checking in has the issue been resolved or are you still running into problems? Sharing an update can really help others facing the same thing.

Thank you.

The process has already been folded, but I suspect the SQL crash could also be due to concurrent access from other applications connected to the same SQL server. The additional load might have contributed to the failure.

Hi @mmmmm_98,

Thanks for the update. If the process is already folded, that’s a good sign query folding is working as intended. Your suspicion about concurrent access from other applications adding load to the SQL Server is plausible and could definitely contribute to the crash, especially if the server is already under strain. To mitigate this, consider scheduling your Dataflow run during a low-traffic period or coordinating with other app owners to reduce simultaneous load. You could also check SQL Server logs for resource usage spikes during the failure to confirm. 

Best regards,
Ganesh Singamshetty

Hello @M98

Could you please confirm if your query has been resolved by the provided solutions? This would be helpful for other members who may encounter similar issues.

 

Thank you for being part of the Microsoft Fabric Community.

v-ssriganesh
Community Support
Community Support

Hello @M98,

We hope you're doing well. Could you please confirm whether your issue has been resolved or if you're still facing challenges? Your update will be valuable to the community and may assist others with similar concerns.

Thank you.

 

v-ssriganesh
Community Support
Community Support

Hi @M98,
Thank you for reaching out to the Microsoft fabric community forum.

The SQL Server crash is unlikely to be directly caused by the incremental load missing changes or being misconfigured, though these could contribute to larger data pulls. More likely, the issue stems from broken query folding, where the full dataset is processed instead of just incremental rows due to the datetime conversion happening before filtering. A reference table out of sync could amplify this by reloading excessive data, but the crash points more to the load configuration. Ensuring folding and proper sync should resolve it.

Best regards,
Ganesh Singamshetty.



AntoineW
Super User
Super User

Hello @M98 , 

You’re using Dataflow Gen2 in Microsoft Fabric with an incremental load pattern based on LastModifiedDate (in Unix epoch).
Your logic:

#datetime(1970,1,1,0,0,0) + #duration(0,0,0,[LastModifiedDate]/1000)

is perfectly valid to convert Unix timestamps to UTC datetime.
However, the issue seems to appear during refresh execution (SQL Server crash or overload).


🔍Root Causes to Consider

1. Transformation applied row-by-row before filtering

When you convert [LastModifiedDate] before applying the incremental filter, the transformation forces Power Query to materialize and compute all rows from the source before it can filter them.

This means your SQL Server is being asked to:

  • Pull all data (not just new/updated rows),

  • Apply the conversion logic on each row,

Then compare to your reference table.

⚠️Result: high CPU and memory load → SQL Server crash or timeout.

 

💡Recommendation:
Perform filtering at the source using native SQL, or delay conversion until after filtering.

 

2. Reference table out of sync with destination

If your incremental logic uses a “reference” table (for example, a stored Max(LastModifiedDate) from the previous load) but this table is not updated transactionally with the destination table, you can end up:

  • Re-loading existing rows (duplicates or merge conflicts),

  • Missing updates (if the reference table lags behind),

  • Triggering large scans due to reprocessing overlaps.

This can amplify data volume, making SQL Server appear to “crash”.

 

Best practice:

  • Always update the reference table after a successful write to the destination.

  • Ensure both are in the same Lakehouse or database for atomicity.


3. Incremental logic not pushed down to SQL Server

Power Query transformations (like your #datetime + #duration) are not foldable for most SQL sources.
If query folding breaks, Power BI/Fabric will extract the entire dataset and process it in the Dataflow runtime instead of pushing a filtered query to SQL Server.

 

You can check this by right-clicking the step → “View Native Query”.
If it’s disabled, the folding is broken.

 

Fix:
Perform the conversion in SQL before connecting the Dataflow. Example: 

SELECT *, DATEADD(SECOND, LastModifiedDate/1000, '1970-01-01') AS LastModifiedUTC
FROM dbo.SourceTable

Then use LastModifiedUTC directly in your incremental logic.


🧠 Recommended Approach

Here’s a robust incremental pattern for Dataflow Gen2 using Unix timestamps:

1. In SQL Source (preferred):

 
SELECT *, DATEADD(SECOND, LastModifiedDate / 1000, '1970-01-01') AS LastModifiedUTC 
FROM dbo.SourceTable
WHERE LastModifiedDate > @LastLoadEpoch

→ Folding stays intact, server filters data efficiently.

 

2. In Dataflow Gen2:

Parameterize your incremental filter:

 
Filtered = Table.SelectRows(Source, each [LastModifiedUTC] > RangeStart and [LastModifiedUTC] <= RangeEnd)
  • Enable Incremental Refresh in the UI.

  • Ensure your destination Lakehouse table uses append mode.

 

3. After each load:

Update your reference table with Max(LastModifiedUTC) from the new batch.

 

Official References : 

https://learn.microsoft.com/en-us/power-query/best-practices

https://learn.microsoft.com/en-us/fabric/data-factory/dataflow-gen2-incremental-refresh

https://learn.microsoft.com/en-us/power-query/query-folding-basics

 

Hope it can help you !

Best regards,

Antoine

Hi AntoineW, thanks for your response. I tried the second method, but the append option isn’t visible  to use. I noticed that I can’t enable incremental load if the table update method is set to append  it only works when it’s set to replace.

mmmmm_98_0-1760342122540.png

 

Helpful resources

Announcements
April Fabric Update Carousel

Fabric Monthly Update - April 2026

Check out the April 2026 Fabric update to learn about new features.

Fabric SQL PBI Data Days

Data Days 2026 coming soon!

Sign up to receive a private message when registration opens and key events begin.

New to Fabric survey Carousel

New to Fabric Survey

If you have recently started exploring Fabric, we'd love to hear how it's going. Your feedback can help with product improvements.