Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Be one of the first to start using Fabric Databases. View on-demand sessions with database experts and the Microsoft product team to learn just how easy it is to get started. Watch now

Reply
RichardJ
Responsive Resident
Responsive Resident

Advice for best way to structure and import a dataset (multiple excel files) for refresh performance

Hi,

Up until now I have dealt with small data sets. For the next job I had intended to use the same approach but I’m asking for advice on whether there is a better way to structure or store the source data.

 

A machine takes several measurements and saves the measurements into an excel file. I'm importing each Excel file and extracting the key information (Date, Component Number, Measured Point, Measurement Value, etc).

 

I'm using Power BI to look into the Folder where the files are stored to retrieve and import the excel files.

 

Over time there are going to be tens of thousands of excel files and it will take increasingly longer for Power BI to read each Excel file and import the results.

 

The Folder structure is currently as shown

Directory Structure.JPG

which enables me to extract from the File Path whether the Measurement is to be included in the 'Current' analysis or is 'Archived'.

 

Archived FlagArchived Flag

 

Sometimes we would want to include Archived data in the analysis but most of the time it will be current values only (perhaps the last couple of months files).

 

My concern is that when new measurement files are put into the Current Folder, upon refresh the archived folder will read all the files in there too. This will become time consuming for every refresh.

 

The Excel files are the only available data source (can’t connect to the machine or write straight to a database from the machine).

Once the files are in the archived directory the contents of the files will never change.

 

Is there a better way to approach this and avoid an increasingly long refresh time?

 

Is it possible to set a flag/update a setting  to prevent the refresh of the 'Archived' excel files in the instances where I really want to only refresh the 'Current' files and not the 'Archived' ones ?

 

Could I automatically/periodically migrate the 'archived' table into a separate database somehow to save repeatedly re-opening archived excel files?


Hope the questions makes sense - thanks for any assistance,

Cheers,

Richard

 

1 ACCEPTED SOLUTION
lbendlin
Super User
Super User

Use file based incremental refresh. Microsoft may tell you it cannot be done but that's incorrect. It's just that some Power Query commands are not possible as they mess with the "query folding" aspect.

 

https://www.poweredsolutions.co/2020/01/19/incremental-refresh-for-files-in-a-folder-or-sharepoint-p...

 

Oh, and a biased opinionated advice: Don't use Excel files. Use CSV files instead. Much faster to work with.

View solution in original post

2 REPLIES 2
lbendlin
Super User
Super User

Use file based incremental refresh. Microsoft may tell you it cannot be done but that's incorrect. It's just that some Power Query commands are not possible as they mess with the "query folding" aspect.

 

https://www.poweredsolutions.co/2020/01/19/incremental-refresh-for-files-in-a-folder-or-sharepoint-p...

 

Oh, and a biased opinionated advice: Don't use Excel files. Use CSV files instead. Much faster to work with.

RichardJ
Responsive Resident
Responsive Resident

Thanks @lbendlin. I didn't realise that incremental refresh could be used in this instance so i'll give it a go.

Appreciate your opinion on csv versus excel also.

Cheers,

Richard

Helpful resources

Announcements
Las Vegas 2025

Join us at the Microsoft Fabric Community Conference

March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount!

Dec Fabric Community Survey

We want your feedback!

Your insights matter. That’s why we created a quick survey to learn about your experience finding answers to technical questions.

ArunFabCon

Microsoft Fabric Community Conference 2025

Arun Ulag shares exciting details about the Microsoft Fabric Conference 2025, which will be held in Las Vegas, NV.

December 2024

A Year in Review - December 2024

Find out what content was popular in the Fabric community during 2024.