Hi, we have roughly 100 Excel files per month, all with exactly the same format. Each file is less than 2 MB in size. We are saving these files into a SharePoint folder. (These files are the output of a simulation model that runs as an Excel plug-in. We make sure to overwrite everything as text before saving to SharePoint, which reduces the file size.)
We have created a Power BI report that connects to this folder and combines all the data into one table. We want this table to act as a core database instead of having stranded data in Excel files, and then we can create reports from this Power BI dataset.
There are a few transformations in the helper query to bring in only the relevant columns, etc. We have tried to keep the transformations to a minimum (is there a way to measure and share this information?). The size of the Power BI Desktop file is 8 MB.
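For reference, the "combine files from a folder" step can be sketched outside Power Query. Below is a minimal Python/pandas equivalent of what the helper query does; the column names `run_id` and `value` are hypothetical stand-ins for the real "relevant columns", and CSV is used here for simplicity (for the actual .xlsx files, `pd.read_excel` would replace `pd.read_csv`):

```python
from pathlib import Path
import pandas as pd

# Hypothetical "relevant columns"; the real list is whatever the helper query keeps.
RELEVANT = ["run_id", "value"]

def combine_folder(folder: str) -> pd.DataFrame:
    """Stack every file in the folder into one table, keeping only relevant columns."""
    frames = []
    for path in sorted(Path(folder).glob("*.csv")):   # "*.xlsx" + pd.read_excel for Excel
        df = pd.read_csv(path, usecols=RELEVANT)      # drop irrelevant columns at read time
        df["source_file"] = path.name                 # provenance, like Folder.Files adds
        frames.append(df)
    return pd.concat(frames, ignore_index=True)
```

Note that every refresh re-reads every file, which is exactly why the refresh time grows with the file count.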
Currently we have 124 Excel files, and the Power BI scheduled refresh history shows that the last refresh took 36 minutes.
We will be adding roughly 100 files every month.
1. Is this a sustainable solution - as in, will the refresh time get longer as more files are added to SharePoint? At what point do we need to start worrying about scalability?
2. If we move to a Premium capacity and schedule refreshes 48 times a day, what happens if any one refresh takes more than half an hour to complete? Will we ever see refreshed data?
I've tried to share all the details; this is the first question I'm posting here. Please ask for any other details that might help.
Thank you very much for trying to help me!
1) My suspicion is you'll be OK for a few months but will then start to hit issues. I think on Premium the maximum refresh time is 5 hours. Most of the issue will be SharePoint throughput rather than the number of files.
The fact that the .pbix file is only 8 MB isn't an indication of the amount of data processed, because of the compression in use.
Personally I would look at ingesting the files into a SQL database as they arrive, using SSIS
or Azure Data Factory. Alternatively, have a look at saving them as CSV in Azure Data Lake Storage: https://docs.microsoft.com/en-us/azure/storage/blobs/create-data-lake-storage-account
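The "ingest as they arrive" pattern, whether done with SSIS, Data Factory, or anything else, amounts to loading each file into the database exactly once. A minimal Python/SQLite sketch of that idea (column names `run_id` and `value` and the CSV inputs are assumptions, not the poster's actual schema):

```python
import csv
import sqlite3
from pathlib import Path

def ingest_new_files(folder: str, db_path: str) -> int:
    """Load each file into SQL once, remembering which files were already ingested."""
    con = sqlite3.connect(db_path)
    con.execute("CREATE TABLE IF NOT EXISTS ingested (name TEXT PRIMARY KEY)")
    con.execute("CREATE TABLE IF NOT EXISTS results (run_id TEXT, value REAL, source TEXT)")
    loaded = 0
    for path in sorted(Path(folder).glob("*.csv")):
        seen = con.execute("SELECT 1 FROM ingested WHERE name = ?", (path.name,)).fetchone()
        if seen:
            continue  # already in the database; never re-read old files
        with open(path, newline="") as f:
            rows = [(r["run_id"], float(r["value"]), path.name) for r in csv.DictReader(f)]
        con.executemany("INSERT INTO results VALUES (?, ?, ?)", rows)
        con.execute("INSERT INTO ingested VALUES (?)", (path.name,))
        con.commit()
        loaded += 1
    con.close()
    return loaded
```

With this in place, Power BI refreshes read one database table instead of re-opening every file, so refresh time stays roughly flat as files accumulate.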
You could also look at https://www.fourmoo.com/2020/06/10/how-you-can-incrementally-refresh-any-power-bi-data-source-this-e...
2) The service won't let a new refresh start while one is already running, so you should be OK.
Thanks @bcdobbs ,
A co-worker tells me I should consider dataflows coupled with loading only hot/new files on each refresh, so the cold/old files don't need to be read in again and again.
I'm not familiar with how to do either, so it's going to be a try-and-learn scenario. I'm thinking the last resort will be a SQL database.
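The hot/cold split boils down to filtering the file list by modification date before combining, so each refresh only re-reads recent files. A minimal sketch of that filter (the 31-day window and the .xlsx glob are assumptions):

```python
import time
from pathlib import Path

def hot_files(folder: str, days: int = 31) -> list[Path]:
    """Return only the files modified within the last `days` days."""
    cutoff = time.time() - days * 86400
    return [p for p in sorted(Path(folder).glob("*.xlsx"))
            if p.stat().st_mtime >= cutoff]
```

In Power Query the same idea would be a filter on the `Date modified` column of `Folder.Files` before the combine step, with the cold files' data held somewhere that isn't re-read (a dataflow or incremental-refresh partitions).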