March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Early bird discount ends December 31.
Register NowBe one of the first to start using Fabric Databases. View on-demand sessions with database experts and the Microsoft product team to learn just how easy it is to get started. Watch now
Hi,
Up until now I have dealt with small data sets. For the next job I had intended to use the same approach but I’m asking for advice on whether there is a better way to structure or store the source data.
A machine takes several measurements and saves the measurements into an excel file. I'm importing each Excel file and extracting the key information (Date, Component Number, Measured Point, Measurement Value, etc).
I'm using Power BI to look into the Folder where the files are stored to retrieve and import the excel files.
Over time there are going to be tens of thousands of excel files and it will take increasingly longer for Power BI to read each Excel file and import the results.
The Folder structure is currently as shown
which enables me to extract from the File Path whether the Measurement is to be included in the 'Current' analysis or is 'Archived'.
Sometimes we would want to include Archived data in the analysis but most of the time it will be current values only (perhaps the last couple of months files).
My concern is that when new measurement files are put into the Current Folder, upon refresh the archived folder will read all the files in there too. This will become time consuming for every refresh.
The Excel files are the only available data source (can’t connect to the machine or write straight to a database from the machine).
Once the files are in the archived directory the contents of the files will never change.
Is there a better way to approach this and avoid an increasingly long refresh time?
Is it possible to set a flag/update a setting to prevent the refresh of the 'Archived' excel files in the instances where I really want to only refresh the 'Current' files and not the 'Archived' ones ?
Could I automatically/periodically migrate the 'archived' table into a separate database somehow to save repeatedly re-opening archived excel files?
Hope the questions makes sense - thanks for any assistance,
Cheers,
Richard
Solved! Go to Solution.
Use file based incremental refresh. Microsoft may tell you it cannot be done but that's incorrect. It's just that some Power Query commands are not possible as they mess with the "query folding" aspect.
Oh, and a biased opinionated advice: Don't use Excel files. Use CSV files instead. Much faster to work with.
Use file based incremental refresh. Microsoft may tell you it cannot be done but that's incorrect. It's just that some Power Query commands are not possible as they mess with the "query folding" aspect.
Oh, and a biased opinionated advice: Don't use Excel files. Use CSV files instead. Much faster to work with.
Thanks @lbendlin. I didn't realise that incremental refresh could be used in this instance so i'll give it a go.
Appreciate your opinion on csv versus excel also.
Cheers,
Richard
March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount!
Your insights matter. That’s why we created a quick survey to learn about your experience finding answers to technical questions.
Arun Ulag shares exciting details about the Microsoft Fabric Conference 2025, which will be held in Las Vegas, NV.
User | Count |
---|---|
124 | |
89 | |
84 | |
70 | |
51 |
User | Count |
---|---|
206 | |
146 | |
97 | |
79 | |
69 |