I have a local folder, RawData. This folder has data for different days:
RawData_20250601.xlsx, and so on.
The file RawData_20250601.xlsx contains historical data from 2025-01-01.
On 2025-06-03, data for RawData_20250602.xlsx is extracted and uploaded to this folder.
How do I set up incremental refresh to load only RawData_20250602?
Going forward, the need is to incrementally refresh only the latest RawData file and not all the files in the RawData folder.
I need the steps to achieve the above in Power BI Desktop and Power BI Service.
Cheers
CheenuSing
The solution is doable if one has a Premium Per User license.
If the files are still in the SharePoint folder, the same data gets appended again even if no new data has been loaded.
I think Power Automate is a good option.
I need steps in Power Automate to check whether any file is present in
the incremental data folder, then move it to a staging dataflow, then append the staging data to the historical dataflow, and at the end move the file from the incremental folder to an archive folder in SharePoint.
Can you help with the above?
Cheers
CheenuSing
Hi @CheenuSing ,
In the solution I suggested, the same data won't get appended again unless the dataflow is refreshed again while no new file has been added. This won't occur since we will use an event based trigger to trigger the dataflow. Using Power Automate and the Fabric REST API, the dataflow will only refresh when a new file is added to SharePoint. If no new file has been added, the dataflow will not refresh. In the dataflow, you can also filter each run to only pull in the most recent file. This way, there will be no need to orchestrate file movement in SharePoint, as each run will only look at the latest file.
If this helped, please mark it as the solution so others can benefit too. And if you found it useful, kudos are always appreciated.
Thanks,
Samson
Hi @SamsonTruong ,
How can we ensure that the target dataflow, which is a combination of the history and the current file, does not lose the daily incremental files previously loaded?
Also, can you share a brief Power Automate flow?
Cheers
CheenuSing
Hi @CheenuSing , with Dataflow Gen2 you have the option of landing the data in a storage layer such as a Fabric Lakehouse. In this case, unlike Dataflow Gen1, the data is not stored in the dataflow. The Dataflow Gen2 is then used as an ETL tool that appends the data to your storage layer.
In terms of a Power Automate flow, an example would be to use the following:
If this helped, please mark it as the solution so others can benefit too. And if you found it useful, kudos are always appreciated.
Thanks,
Samson
Hi @CheenuSing ,
Just following up on this thread.
The guidance shared by @SamsonTruong and @Nasif_Azam earlier outlines how Dataflow Gen2 works as an ETL tool, with data stored in a Fabric Lakehouse.
If that helped resolve your scenario, please consider marking it as the Accepted Answer; this helps others in the community with similar questions.
If you're still facing issues or need further clarification, feel free to share more details; we'd be happy to assist further!
Thank you for being a part of the Microsoft Fabric Community!
Hi @CheenuSing
I wanted to follow up on this discussion.
The guidance provided by @SamsonTruong and @Nasif_Azam earlier explains how Dataflow Gen2 functions as an ETL tool, utilizing data stored in a Fabric Lakehouse. If this information has resolved your issue, please consider marking it as the Accepted Answer. This will assist others in the community who may have similar questions.
If you are still experiencing challenges or require further clarification, please don’t hesitate to share more details. We are here to help!
Hey @CheenuSing ,
To set up incremental refresh for only the latest file (e.g., RawData_20250602.xlsx) from a local or SharePoint folder in Power BI, you can follow this two-part approach:
Issue
You have a folder (local or SharePoint) containing Excel files named like:
RawData_20250601.xlsx RawData_20250602.xlsx ...
Each file contains historical data starting from January 1, 2025. You want to load only the latest file each day and refresh only that.
Idea: Instead of classic Power BI incremental refresh (which is date-based), you’ll use Power Query logic to dynamically pick the latest file and load only its data.
Steps in Power BI Desktop
1. Connect to the Folder
Use Get Data > Folder (or SharePoint folder) to connect to the RawData folder, click “Combine Files”, then go into the Power Query Editor and do the following:
In the Source query:
Add a new column to extract the date from the filename (the yyyyMMdd stamp starts right after the RawData_ prefix):
= Table.AddColumn(Source, "FileDate", each try Date.FromText(Text.Middle([Name], 8, 8), [Format = "yyyyMMdd"]) otherwise null, type date)
Sort FileDate descending.
Keep the top 1 row using: Home > Keep Top Rows > 1
If Power BI created a custom function like Transform Sample File, invoke it only on the filtered row.
Now only the latest file is processed.
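Put together, the above steps roughly correspond to the following M sketch (the folder path and step names are illustrative assumptions, not your exact query):

let
    // Assumption: adjust to your local RawData folder (or its SharePoint equivalent)
    Source = Folder.Files("C:\Data\RawData"),
    // Extract the yyyyMMdd stamp from names like RawData_20250601.xlsx
    AddedFileDate = Table.AddColumn(Source, "FileDate", each try Date.FromText(Text.Middle([Name], 8, 8), [Format = "yyyyMMdd"]) otherwise null, type date),
    // Sort newest first and keep only the top row, i.e. the latest file
    SortedByDate = Table.Sort(AddedFileDate, {{"FileDate", Order.Descending}}),
    LatestFileRow = Table.FirstN(SortedByDate, 1)
in
    LatestFileRow

The generated Transform Sample File function is then invoked against LatestFileRow only.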
Steps in Power BI Service
In Power BI Service:
Go to Dataset Settings.
Set Refresh Schedule (e.g., daily).
Make sure gateway is configured:
For local files: use On-premises Data Gateway.
For SharePoint: no gateway needed.
Optional Enhancement: Using a Parameter
Use a Power Query parameter like LatestDate and filter the files accordingly (here Source refers to the step that already contains the FileDate column):
= Table.SelectRows(Source, each [FileDate] = LatestDate)
You can even make it dynamic by setting LatestDate to:
= List.Max(Source[FileDate])
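A minimal sketch wiring those two snippets together, assuming an illustrative folder path and step names (AddedFileDate is the step that adds the FileDate column):

let
    // Assumption: adjust the path to your RawData folder
    Source = Folder.Files("C:\Data\RawData"),
    AddedFileDate = Table.AddColumn(Source, "FileDate", each try Date.FromText(Text.Middle([Name], 8, 8), [Format = "yyyyMMdd"]) otherwise null, type date),
    // "Parameter" computed on the fly: the most recent date found across all files
    LatestDate = List.Max(AddedFileDate[FileDate]),
    // Keep only the file(s) stamped with that latest date
    LatestFile = Table.SelectRows(AddedFileDate, each [FileDate] = LatestDate)
in
    LatestFile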
If you found this solution helpful, please consider accepting it and giving it a kudos (Like); it’s greatly appreciated and helps others find the solution more easily.
Best Regards,
Nasif Azam
Hi @Nasif_Azam ,
I have added the latest data to the existing file. In the above approach, will it overwrite the existing file or append the data?
Cheers
CheenuSing
Hey @CheenuSing ,
My approach was mostly right for the original scenario where a new file is created every day and you load only the latest file based on the filename (e.g., RawData_20250602.xlsx).
However, you clarified: "I have added the latest data to the existing file."
You are no longer using a new file per day; you're now appending new data to the same Excel file, which changes the solution strategy.
What My Approach Actually Does
My logic to pick the latest file from a folder using filename-based filtering was spot-on if the scenario involved one new file per day.
I explained how to filter and process only the latest file using Power Query, a creative workaround when traditional incremental refresh doesn't fit.
Where My Approach Didn't Work
My approach did not address the case where:
Data is continuously added to the same Excel file (instead of new files each day)
The need is to incrementally process only the newly added data
In such a case, it could load the entire file again every refresh, which defeats the goal of incremental refresh.
If you found this solution helpful, please consider accepting it and giving it a kudos (Like); it’s greatly appreciated and helps others find the solution more easily.
Best Regards,
Nasif Azam
Hi @Nasif_Azam ,
My need is like this:
On Day 1, the folder contains File 1, which is loaded into Power BI.
On Day 2, the user uploads File 2 to the folder, which contains new data only.
On Day 2, I need to append the data from File 2 to the File 1 data already loaded in the data model.
Please suggest steps, first in Desktop to test and then in the Service.
Cheers
CheenuSing
Hey @CheenuSing ,
Thanks for the clarification; now I fully understand the workflow you're aiming for.
Updated Scenario
Updated Solution Strategy
This is a hybrid approach, combining Power Query logic and Power BI's append strategy, though not using the traditional Incremental Refresh feature. Here's how you can do it:
Steps in Power BI Desktop
As before, connect to the folder and add a FileDate column extracted from the filename:
= Table.AddColumn(Source, "FileDate", each try Date.FromText(Text.Middle([Name], 8, 8), [Format = "yyyyMMdd"]) otherwise null, type date)
You need to store Day 1's loaded data somewhere persistently, because Power BI Desktop doesn't keep previous refresh states. Two options:
Option 1: Power BI Dataflow
Create a Power BI Dataflow in the Service.
One query loads the existing historical data (could be stored in a CSV, SharePoint list, or OneDrive Excel).
A second query loads the latest file using the above steps.
Use Append Queries to combine both inside the dataflow (see the append sketch after Option 2 below).
Option 2: Staging Dataset or Table
Load Day 1 data into a Power BI table or export to SharePoint/Excel/Azure SQL.
On Day 2, load the new file and use Append Queries to combine with that persisted table.
Save or export again after combining, so future refreshes don’t lose the history.
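For either option, the append itself is a single step in Power Query. A minimal sketch, assuming two queries named HistoricalData (the persisted history) and LatestFile (the newest file's rows, built as above); the names are illustrative and both queries must expose the same columns:

let
    // HistoricalData and LatestFile are separate queries in the same dataflow or pbix
    // Assumption: both return tables with identical column names and types
    Combined = Table.Combine({HistoricalData, LatestFile})
in
    Combined

After combining, write the result back to wherever the history lives (the staging file, SharePoint list, or Lakehouse table) so the next refresh starts from the enlarged history.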
Steps in Power BI Service
1. Publish the Report
2. Set Scheduled Refresh
Go to Dataset Settings > Scheduled Refresh
For:
Local folders ➜ Use On-premises Gateway
SharePoint folders ➜ No gateway needed
3. Auto-Append Strategy
If you use a Dataflow plus a staging Excel file (like HistoricalData.xlsx), let Power BI keep appending the new file's data to it.
You could also use Power Automate to move processed files to another folder after refresh to avoid duplicates.
Summary
You’ll extract and filter the latest file in Power Query.
To append and retain history, use Dataflows or external persisted storage.
Power BI Desktop alone does not support true incremental appends unless integrated with an external store.
If you found this solution helpful, please consider accepting it and giving it a kudos (Like); it’s greatly appreciated and helps others find the solution more easily.
Best Regards,
Nasif Azam
Hi @Nasif_Azam ,
Create a Power BI Dataflow in Service.
One query loads existing historical data (could be stored in a CSV, SharePoint list, or OneDrive Excel).
Second query loads the latest file using the above steps.
Use Append Queries to combine both inside the dataflow.
The above is OK for the history and the Day 2 file combined as the full set.
On Day 3, if we use the same approach, the full set will contain only the historical and Day 3 data. The Day 2 data is lost.
So how do we overcome this?
Option 2: Staging Dataset or Table
Load Day 1 data into a Power BI table or export to SharePoint/Excel/Azure SQL.
On Day 2, load the new file and use Append Queries to combine with that persisted table.
Save or export again after combining, so future refreshes don’t lose the history
How is the saving or export to be done? Please add code or screenshots from Power BI Service.
Cheers
CheenuSing
Hi @CheenuSing ,
Here is the solution I would recommend: use a Dataflow Gen2 connected to your SharePoint folder. In the dataflow, add a step to filter out all files except the most recently added one. The filter condition can be on the file name, for example, since your file names are incrementing. This ensures the dataflow only loads the most recent file. Then have the dataflow append the data to your target, as sketched below.
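A rough sketch of that dataflow query, assuming illustrative SharePoint site and folder names; because the yyyyMMdd stamp makes the file names sort naturally, the maximum Name is the newest file:

let
    // Assumption: replace with your SharePoint site URL
    Source = SharePoint.Files("https://contoso.sharepoint.com/sites/Data", [ApiVersion = 15]),
    // Keep only the Excel files sitting in the RawData folder (illustrative path filter)
    RawDataFiles = Table.SelectRows(Source, each Text.Contains([Folder Path], "RawData") and [Extension] = ".xlsx"),
    // File names increment (RawData_20250601.xlsx, RawData_20250602.xlsx, ...), so the max Name is the latest file
    LatestName = List.Max(RawDataFiles[Name]),
    LatestFile = Table.SelectRows(RawDataFiles, each [Name] = LatestName)
in
    LatestFile

The dataflow's data destination (for example a Lakehouse table) would then be set to append rather than replace, so each run adds only the latest file's rows.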
To take this a step further and fully automate it, you can create a Power Automate flow that detects when a new file lands in your SharePoint folder and then calls the Fabric REST API to kick off the dataflow.
If this helped, please mark it as the solution so others can benefit too. And if you found it useful, kudos are always appreciated.
Thanks,
Samson