WalmZ
New Member

Get data from source, save that data to a table each day, retain records that have been deleted

Hi all,

This is my first time posting on this forum. I have sought help and came across this platform after trying for several days, without success, to get my head around something I am attempting to do. Hopefully someone here can help. I am sure this must have been asked before, but I cannot find a similar question.

I am connecting to some data in Power Query from an Excel file. It is a file containing transaction/invoice information. This source file grows each day, so it contains all previous transactions plus the new information for that day. Accounts can appear several times if there has been an update to the account on a later date.

 

So in summary: I want to connect to the data and copy it all to a new table. Then the following day, connect to the file again (now updated) and bring anything new into my copied table (and then remove duplicates). I have managed to achieve this, but the issue is that if an invoice has been paid it drops off the source file, and therefore drops off the copied table. I do not want this to happen. I want my copied table to retain all information that has ever been on the source file.

I hope this makes sense. I appreciate any advice.
Thanks
Lee


4 REPLIES
christinepayton
Super User

If you're trying to retain a history for the file in a way that's accessible to Power BI, you could use Power Automate on a schedule to copy it to a new file, using the date in the filename to make each copy unique. If you then use the folder connector, you could get all versions that ever existed. It might also be a good idea to have Power Automate set a "snapshot date" field as part of the file copy, or to extract it from the filename when you're querying, so that you have some idea of when the snapshots are from.
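For illustration, a minimal Power Query sketch of that idea, assuming the snapshots land in a folder such as C:\Snapshots with names like Invoices_2024-06-01.xlsx - the folder path, file naming pattern, and column names here are placeholders, not from the thread:

    let
        // Read every snapshot file Power Automate has copied into the folder (path is hypothetical)
        Source = Folder.Files("C:\Snapshots"),
        // Keep only Excel files in case anything else lands in the folder
        ExcelOnly = Table.SelectRows(Source, each [Extension] = ".xlsx"),
        // Derive the snapshot date from the file name, e.g. "Invoices_2024-06-01.xlsx"
        WithSnapshotDate = Table.AddColumn(
            ExcelOnly,
            "Snapshot Date",
            each Date.FromText(Text.BetweenDelimiters([Name], "_", ".xlsx")),
            type date
        ),
        // Pull the first object out of each workbook; adjust if your data isn't the first sheet
        WithData = Table.AddColumn(WithSnapshotDate, "Data", each Excel.Workbook([Content], true){0}[Data]),
        // Keep just the snapshot date and the data, then expand the invoice columns you need
        Trimmed = Table.SelectColumns(WithData, {"Snapshot Date", "Data"}),
        Expanded = Table.ExpandTableColumn(Trimmed, "Data", {"Account", "Invoice", "Amount"})
    in
        Expanded

The "Snapshot Date" column then tells you which daily copy each row came from, even after an invoice disappears from the live export.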

 

It sounds like you could use a real records management system here, though, rather than trying to manage this in Excel. Dynamics or even a basic SharePoint list would be better - at least in a SharePoint list you can trigger actions based on things happening to the record, such as the status changing to paid, which gives you an indication of what changed and when.

BA_Pete
Super User

Hi Lee,

 

Power Query is not a data warehouse and is not designed to retain data beyond its immediate use.

The PQ refresh process works like this:

Wipe all data > Import available data from source > Perform transformations > Push to Data Model.

As such, there's no function to create and retain history/archives of changing source data.

 

The closest native function (within the Power BI universe) would be Incremental Refresh, which comes with its own significant caveats:

https://learn.microsoft.com/en-us/power-bi/connect-data/incremental-refresh-overview 
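For reference, incremental refresh works by filtering the query on two reserved datetime parameters, RangeStart and RangeEnd, which Power BI then uses to build and refresh partitions. A minimal sketch of that filter step, assuming RangeStart and RangeEnd have already been created as datetime parameters, and with a placeholder file path, sheet name, and 'Invoice Date' column that aren't from the thread:

    let
        // Connect to the source workbook (path and sheet name are placeholders)
        Source = Excel.Workbook(File.Contents("C:\Data\Transactions.xlsx"), true),
        Invoices = Source{[Item = "Invoices", Kind = "Sheet"]}[Data],
        // Incremental refresh requires filtering on the reserved RangeStart/RangeEnd parameters
        Typed = Table.TransformColumnTypes(Invoices, {{"Invoice Date", type datetime}}),
        Filtered = Table.SelectRows(Typed, each [Invoice Date] >= RangeStart and [Invoice Date] < RangeEnd)
    in
        Filtered

The incremental refresh policy itself is then configured on the table in Power BI Desktop, on top of this filter step.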

 

Beyond this, you will need to arrange some kind of data warehousing or storage capability at your end to keep the history that you can then query with PQ.

 

Pete



Now accepting Kudos! If my post helped you, why not give it a thumbs-up?

Proud to be a Datanaut!




WalmZ
New Member

Hi BA_Pete,

Thank you for taking the time to reply to me.

 

If Power Query is not the best tool for this, do you have any other suggestions for how I could go about this in Excel? Essentially taking a report, making a copy, then downloading the report the following day, merging the new report with the copy, and removing duplicates - so I am building a rolling report day by day?

Thanks, Lee

 

BA_Pete
Super User

It depends on what quantity of data we're talking about.

 

The simplest solution would probably be to save every source file into a SharePoint or network folder then connect to that using either the 'SharePoint Folder' or 'Folder' connector, then add a Remove Duplicates step into the Transform Sample File query. However, if we're talking about hundreds of thousands of rows and 50-odd columns in every file (and growing) then this may not be feasible (or just incredibly slow and/or resource-intensive).
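To make that concrete, here's a rough sketch of what the combined query could look like, assuming each daily copy has the same layout and that a row is uniquely identified by an 'Invoice Number' and 'Last Updated' pair - the folder path and column names are placeholders, not from the thread:

    let
        // Every daily copy of the source workbook, saved into one folder (path is a placeholder)
        Source = Folder.Files("\\server\share\InvoiceHistory"),
        ExcelOnly = Table.SelectRows(Source, each [Extension] = ".xlsx"),
        // Read the first object of each workbook and stack them into one table;
        // adjust if your data isn't the first sheet
        Loaded = Table.AddColumn(ExcelOnly, "Data", each Excel.Workbook([Content], true){0}[Data]),
        Combined = Table.Combine(Loaded[Data]),
        // The Remove Duplicates step: keep one row per invoice/update combination
        Deduped = Table.Distinct(Combined, {"Invoice Number", "Last Updated"})
    in
        Deduped

Because every file that ever landed in the folder is read, invoices that later drop off the live export are still present from the older copies.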

If the files are actually individually very large, then I think you'll want to look into a proper data warehousing solution.

 

Pete



Now accepting Kudos! If my post helped you, why not give it a thumbs-up?

Proud to be a Datanaut!



