Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Find everything you need to get certified on Fabric—skills challenges, live sessions, exam prep, role guidance, and more. Get started

Reply
dgolovanova
Frequent Visitor

Remove and add data based on condition

Hello.
Please help with advice.
For a long time, data for the report is downloaded daily from folder (about 800 Excel files). Every day the file for the current day is added, respectively the number of files and the volume of data uploaded is constantly increasing. In rare cases there is not only new file, but also corrected file for some previous date.
It was decided to load only new files in the future (filter by modification date), adding new data to the Orders table, which contains all data for all years.

Example: today we need to load files Orders_13102022.xls (new data) and Orders_29092022.xls (updated data). 

As an option, I can after loading recent data in Stage tables:
1. Load into the Orders data from Orders_13102022.xls.
2. Delete from Orders data for 29092022 and load data from Orders_29092022.xls.

Please advise if there is a better solution.

Thank you in advance.

With best regards

2 REPLIES 2
Imrans123
Advocate V
Advocate V

Option 1:

 

Instead of loading files, you could explore sharepoint folder as a source ( https://learn.microsoft.com/en-us/power-query/connectors/sharepointfolder). The folder will have all your xls files. Say tomorrow you generate Orders_14102022.xls and save it in the sharepoint folder. The data will automatically come in Power Query since the new file is in the sharepoint folder.

 

Say you want to make a change on Orders_29092022.xls. You open Orders_29092022.xls located in the sharepoint folder and make the changes or overwrite with the new Orders_29092022.xls. Those changes will be reflected in power Query as power Query will always refresh by loading all the xls files located in the Sharepoint folder.

 

Only con of this is when you're using Sharepoint folder and a large number of xls files, the refresh time becomes very long. Another alternative below.

 

Option 2: 

Create a Consolidated file. Where all the 800 files are appended into one master excel sheet. Then create a power automate or Robotic Automation flow where everytime a new file is generated, that the data from that file and copy pasted it into the master file. 

 

If you want to make a change on a previous date, let's say 29092022, manually open the master file, filter out the 29092022 data, delete them and paste the new one in. 

 

 

Thanks for the reply.
The task is just to no longer upload the entire array of files from SharePoint every time, but to upload only new and/or changed files.
For example, I will upload all 800 files for the last time today, and from tomorrow I will upload only new/modified files and do all transformations already in PowerQuery. In this case, I need to save a table with data from all files in PBI and then load only data from new files into it.

I'm trying to find a solution.

Helpful resources

Announcements
Europe Fabric Conference

Europe’s largest Microsoft Fabric Community Conference

Join the community in Stockholm for expert Microsoft Fabric learning including a very exciting keynote from Arun Ulag, Corporate Vice President, Azure Data.

Power BI Carousel June 2024

Power BI Monthly Update - June 2024

Check out the June 2024 Power BI update to learn about new features.

RTI Forums Carousel3

New forum boards available in Real-Time Intelligence.

Ask questions in Eventhouse and KQL, Eventstream, and Reflex.

Top Solution Authors
Top Kudoed Authors