March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Early bird discount ends December 31.
Register NowBe one of the first to start using Fabric Databases. View on-demand sessions with database experts and the Microsoft product team to learn just how easy it is to get started. Watch now
Hello.
Please help with advice.
For a long time, data for the report is downloaded daily from folder (about 800 Excel files). Every day the file for the current day is added, respectively the number of files and the volume of data uploaded is constantly increasing. In rare cases there is not only new file, but also corrected file for some previous date.
It was decided to load only new files in the future (filter by modification date), adding new data to the Orders table, which contains all data for all years.
Example: today we need to load files Orders_13102022.xls (new data) and Orders_29092022.xls (updated data).
As an option, I can after loading recent data in Stage tables:
1. Load into the Orders data from Orders_13102022.xls.
2. Delete from Orders data for 29092022 and load data from Orders_29092022.xls.
Please advise if there is a better solution.
Thank you in advance.
With best regards
Option 1:
Instead of loading files, you could explore sharepoint folder as a source ( https://learn.microsoft.com/en-us/power-query/connectors/sharepointfolder). The folder will have all your xls files. Say tomorrow you generate Orders_14102022.xls and save it in the sharepoint folder. The data will automatically come in Power Query since the new file is in the sharepoint folder.
Say you want to make a change on Orders_29092022.xls. You open Orders_29092022.xls located in the sharepoint folder and make the changes or overwrite with the new Orders_29092022.xls. Those changes will be reflected in power Query as power Query will always refresh by loading all the xls files located in the Sharepoint folder.
Only con of this is when you're using Sharepoint folder and a large number of xls files, the refresh time becomes very long. Another alternative below.
Option 2:
Create a Consolidated file. Where all the 800 files are appended into one master excel sheet. Then create a power automate or Robotic Automation flow where everytime a new file is generated, that the data from that file and copy pasted it into the master file.
If you want to make a change on a previous date, let's say 29092022, manually open the master file, filter out the 29092022 data, delete them and paste the new one in.
Thanks for the reply.
The task is just to no longer upload the entire array of files from SharePoint every time, but to upload only new and/or changed files.
For example, I will upload all 800 files for the last time today, and from tomorrow I will upload only new/modified files and do all transformations already in PowerQuery. In this case, I need to save a table with data from all files in PBI and then load only data from new files into it.
I'm trying to find a solution.
March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount!
Your insights matter. That’s why we created a quick survey to learn about your experience finding answers to technical questions.
Arun Ulag shares exciting details about the Microsoft Fabric Conference 2025, which will be held in Las Vegas, NV.