Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Next up in the FabCon + SQLCon recap series: The roadmap for Microsoft SQL and Maximizing Developer experiences in Fabric. All sessions are available on-demand after the live show. Register now

Reply
Suzanne_Shuford
Frequent Visitor

Data Flow Gen2 ODBC connection to AWS for Incremental Refresh

 Hi, I am in higer Ed and we are not currently on Fabric License. 
I have a query against an AWS Athena table using  an AWS ODBC Connection.

The table is upward of 17 million rows and approximately 70 columns, many are ID columns.  I would like to incrementally update this data.

I have try many ways to do this unsuccessfully for one reason or anther.   Either it does not fold or it does not do an incremental refesh.

My current state:  dsn, catalog and table are in the source (suggestion in troubleshooting)

Primary query: set to allow staging, it has no destination (Microsoft doc)

Reference query: does not allow staging, it is set to not require folding (it does in Power Query) ,this is where the incremental update is set.  I want to refresh an entire year, when the update field has changed. 

Destination is a lake house table columns are not dynamic, options set to limit 10

Parameters are RangeStart, RangeEnd points to a query to todays date.

 

I would really love it if anyone has an issue with AWS and figured this out.   (Dates are set as they are per another blog, stating date issues)

This query runs, but it does not appear to do incremental updates.  The second time I run it after adding additional data is longer then the first.  

    // 1. Format dates for Athena

    StartText = DateTime.ToText(RangeStart, "yyyy-MM-dd"),

    EndText   = DateTime.ToText(RangeEnd,   "yyyy-MM-dd"),

 

    // 2. The SQL Statement with manual folding built-in

    FullQuery = "

        SELECT

            journal_line_wid,

            as_entered_company, (lots of fields)

        FROM edl_workday_fin.cur_journal_transactions_vw

        WHERE accounting_date >= (select date '" & StartText & "') and accounting_date < (select date '" & EndText & "')",

 

    // 3. The Actual Data Connection

    Source = Odbc.Query("dsn=DSNName;Catalog=AwsDataCatalog;Schema=SchemaName;",

        FullQuery

    )

in

    Source

1 ACCEPTED SOLUTION
bariscihan
Resolver II
Resolver II

Hi Suzanne,

What you’re seeing is usually expected with ODBC + “hand-built SQL” in Dataflow Gen2: the incremental refresh feature only behaves incrementally when the RangeStart/RangeEnd filter step can be query-folded back to the source. When we inject RangeStart/RangeEnd into a SQL string (or use Odbc.Query directly), the engine often can’t detect/validate folding, so it ends up re-reading more data than you expect.

Here are two practical paths:

Option A (try to make it fold):

  1. Keep RangeStart / RangeEnd as Date/DateTime parameters (don’t convert them to text for the main filter logic).
  2. Use a normal Power Query filter step on your date column first (e.g., Table.SelectRows with >= RangeStart and < RangeEnd) and only then rely on folding.
  3. If you must use a native query, wrap it with Value.NativeQuery and pass the folding option (where supported) instead of Odbc.Query. This is the pattern Microsoft documents for enabling folding with native queries:
    • Value.NativeQuery(Source, SqlText, null, [EnableFolding=true])

If Athena’s ODBC driver still doesn’t fold reliably, incremental refresh will not be effective (and refresh times can even grow on subsequent runs).

Option B (recommended for Athena/ODBC): do incremental outside Dataflow IR If folding can’t be guaranteed through the Athena ODBC driver, the reliable approach is:

  • Land data into OneLake/Lakehouse via a pipeline/Copy activity (or another ingestion method that can push predicate to Athena),
  • Use a watermark (max accounting_date / last_updated) and load only new/changed rows,
  • Then MERGE into a Lakehouse Delta table. This avoids depending on ODBC folding and gives you deterministic incremental behavior.

Also note: Dataflow Gen2 incremental refresh updates destination buckets using replace, not append, so make sure your bucket size is appropriate (e.g., daily/monthly) and that your “change detection” column is a true last-updated timestamp if you want updates within the same date range handled correctly.

Docs reference for Dataflow Gen2 incremental refresh behavior (bucket replace): https://learn.microsoft.com/fabric/data-factory/dataflow-gen2-incremental-refresh

Docs reference for enabling folding on native queries (Value.NativeQuery + EnableFolding): https://learn.microsoft.com/power-query/native-query-folding

Hope this helps — if you can confirm whether “View Native Query” is enabled/disabled on the filter step (or whether Athena ODBC supports folding for your query), we’ll know immediately whether Option A is feasible or if Option B is the best route.

View solution in original post

3 REPLIES 3
v-kpoloju-msft
Community Support
Community Support

Hi @Suzanne_Shuford

Thank you for reaching out to the Microsoft Fabric Community Forum. Also, thanks to @bariscihan, for those inputs on this thread.

Has your issue been resolved? If the response provided by the community member @bariscihan, addressed your query, could you please confirm? It helps us ensure that the solutions provided are effective and beneficial for everyone.

Hope this helps clarify things and let me know what you find after giving these steps a try happy to help you investigate this further.

Thank you for using the Microsoft Community Forum.

Hi @Suzanne_Shuford

Just wanted to follow up. If the shared guidance worked for you, that’s wonderful hopefully it also helps others looking for similar answers. If there’s anything else you'd like to explore or clarify, don’t hesitate to reach out.

Thank you.

bariscihan
Resolver II
Resolver II

Hi Suzanne,

What you’re seeing is usually expected with ODBC + “hand-built SQL” in Dataflow Gen2: the incremental refresh feature only behaves incrementally when the RangeStart/RangeEnd filter step can be query-folded back to the source. When we inject RangeStart/RangeEnd into a SQL string (or use Odbc.Query directly), the engine often can’t detect/validate folding, so it ends up re-reading more data than you expect.

Here are two practical paths:

Option A (try to make it fold):

  1. Keep RangeStart / RangeEnd as Date/DateTime parameters (don’t convert them to text for the main filter logic).
  2. Use a normal Power Query filter step on your date column first (e.g., Table.SelectRows with >= RangeStart and < RangeEnd) and only then rely on folding.
  3. If you must use a native query, wrap it with Value.NativeQuery and pass the folding option (where supported) instead of Odbc.Query. This is the pattern Microsoft documents for enabling folding with native queries:
    • Value.NativeQuery(Source, SqlText, null, [EnableFolding=true])

If Athena’s ODBC driver still doesn’t fold reliably, incremental refresh will not be effective (and refresh times can even grow on subsequent runs).

Option B (recommended for Athena/ODBC): do incremental outside Dataflow IR If folding can’t be guaranteed through the Athena ODBC driver, the reliable approach is:

  • Land data into OneLake/Lakehouse via a pipeline/Copy activity (or another ingestion method that can push predicate to Athena),
  • Use a watermark (max accounting_date / last_updated) and load only new/changed rows,
  • Then MERGE into a Lakehouse Delta table. This avoids depending on ODBC folding and gives you deterministic incremental behavior.

Also note: Dataflow Gen2 incremental refresh updates destination buckets using replace, not append, so make sure your bucket size is appropriate (e.g., daily/monthly) and that your “change detection” column is a true last-updated timestamp if you want updates within the same date range handled correctly.

Docs reference for Dataflow Gen2 incremental refresh behavior (bucket replace): https://learn.microsoft.com/fabric/data-factory/dataflow-gen2-incremental-refresh

Docs reference for enabling folding on native queries (Value.NativeQuery + EnableFolding): https://learn.microsoft.com/power-query/native-query-folding

Hope this helps — if you can confirm whether “View Native Query” is enabled/disabled on the filter step (or whether Athena ODBC supports folding for your query), we’ll know immediately whether Option A is feasible or if Option B is the best route.

Helpful resources

Announcements
FabCon and SQLCon Highlights Carousel

FabCon &SQLCon Highlights

Experience the highlights from FabCon & SQLCon, available live and on-demand starting April 14th.

New to Fabric survey Carousel

New to Fabric Survey

If you have recently started exploring Fabric, we'd love to hear how it's going. Your feedback can help with product improvements.

Join our Fabric User Panel

Join our Fabric User Panel

Share feedback directly with Fabric product managers, participate in targeted research studies and influence the Fabric roadmap.

March Fabric Update Carousel

Fabric Monthly Update - March 2026

Check out the March 2026 Fabric update to learn about new features.