Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Score big with last-minute savings on the final tickets to FabCon Vienna. Secure your discount

Reply
RichardJ
Responsive Resident
Responsive Resident

Advice for best way to structure and import a dataset (multiple excel files) for refresh performance

Hi,

Up until now I have dealt with small data sets. For the next job I had intended to use the same approach but I’m asking for advice on whether there is a better way to structure or store the source data.

 

A machine takes several measurements and saves the measurements into an excel file. I'm importing each Excel file and extracting the key information (Date, Component Number, Measured Point, Measurement Value, etc).

 

I'm using Power BI to look into the Folder where the files are stored to retrieve and import the excel files.

 

Over time there are going to be tens of thousands of excel files and it will take increasingly longer for Power BI to read each Excel file and import the results.

 

The Folder structure is currently as shown

Directory Structure.JPG

which enables me to extract from the File Path whether the Measurement is to be included in the 'Current' analysis or is 'Archived'.

 

Archived FlagArchived Flag

 

Sometimes we would want to include Archived data in the analysis but most of the time it will be current values only (perhaps the last couple of months files).

 

My concern is that when new measurement files are put into the Current Folder, upon refresh the archived folder will read all the files in there too. This will become time consuming for every refresh.

 

The Excel files are the only available data source (can’t connect to the machine or write straight to a database from the machine).

Once the files are in the archived directory the contents of the files will never change.

 

Is there a better way to approach this and avoid an increasingly long refresh time?

 

Is it possible to set a flag/update a setting  to prevent the refresh of the 'Archived' excel files in the instances where I really want to only refresh the 'Current' files and not the 'Archived' ones ?

 

Could I automatically/periodically migrate the 'archived' table into a separate database somehow to save repeatedly re-opening archived excel files?


Hope the questions makes sense - thanks for any assistance,

Cheers,

Richard

 

1 ACCEPTED SOLUTION
lbendlin
Super User
Super User

Use file based incremental refresh. Microsoft may tell you it cannot be done but that's incorrect. It's just that some Power Query commands are not possible as they mess with the "query folding" aspect.

 

https://www.poweredsolutions.co/2020/01/19/incremental-refresh-for-files-in-a-folder-or-sharepoint-p...

 

Oh, and a biased opinionated advice: Don't use Excel files. Use CSV files instead. Much faster to work with.

View solution in original post

2 REPLIES 2
lbendlin
Super User
Super User

Use file based incremental refresh. Microsoft may tell you it cannot be done but that's incorrect. It's just that some Power Query commands are not possible as they mess with the "query folding" aspect.

 

https://www.poweredsolutions.co/2020/01/19/incremental-refresh-for-files-in-a-folder-or-sharepoint-p...

 

Oh, and a biased opinionated advice: Don't use Excel files. Use CSV files instead. Much faster to work with.

RichardJ
Responsive Resident
Responsive Resident

Thanks @lbendlin. I didn't realise that incremental refresh could be used in this instance so i'll give it a go.

Appreciate your opinion on csv versus excel also.

Cheers,

Richard

Helpful resources

Announcements
August Power BI Update Carousel

Power BI Monthly Update - August 2025

Check out the August 2025 Power BI update to learn about new features.

August 2025 community update carousel

Fabric Community Update - August 2025

Find out what's new and trending in the Fabric community.