The ultimate Microsoft Fabric, Power BI, Azure AI, and SQL learning event: Join us in Stockholm, September 24-27, 2024.
Save €200 with code MSCUST on top of early bird pricing!
Find everything you need to get certified on Fabric—skills challenges, live sessions, exam prep, role guidance, and more. Get started
Dear Community,
I would like to know how you would load the following data into PowerQuery and finally into PowerBI. I am provided with eight different exports that cannot be changed. The column names of the different exports are the same with different values and have to summed up after they have been loaded. Below is an overview of the data.
xlsx-files | no. of columns | no. of rows |
file 1 | 600 | 35 000 |
.. | ... | ... |
file 8 | 600 | 35 000 |
Unfortunately, the 15-minute time resolution must be retained so that the number of rows is not reduced. The same applies to the number of columns. The Excel files are stored on a SharePoint and are only updated once a year.
The transformation of the files is not a problem in itself, the problem is the unacceptable waiting times when performing transformations such as groub by etc.
My current solution is that I have written a Python script that takes over the transformations from Power Query.
But is there also a solution that I can implement only with the help of PowerQuery and Power BI? Maybe customize the import mode of the Excel files or something similar?
I look forward to your thoughts
Best regards
jb
Solved! Go to Solution.
I ran into something similar recently. The lessons I learned were
- don't try to be cute and use List.Generate or something like that. Create eight (yes, eight) separate Power Query queries, one for each export. This will yield maximum possible performance (don't forget to set parallelism to 8 or more).
- Use Binary.Buffer when fetching the files from SharePoint. That will tell the SharePoint server that you are impatient and not willing to wait for the chunking.
- Do as few transforms as possible. Maybe change column types, but that's it.
- load all queries into Power BI and do a UNION calculated table there.
I ran into something similar recently. The lessons I learned were
- don't try to be cute and use List.Generate or something like that. Create eight (yes, eight) separate Power Query queries, one for each export. This will yield maximum possible performance (don't forget to set parallelism to 8 or more).
- Use Binary.Buffer when fetching the files from SharePoint. That will tell the SharePoint server that you are impatient and not willing to wait for the chunking.
- Do as few transforms as possible. Maybe change column types, but that's it.
- load all queries into Power BI and do a UNION calculated table there.
Wow, those are some interesting and new findings. Generally, you prefer to keep the data model in Power BI small and clear, but obviously the performance of Power BI (Measures and Calculated Tables) seems to be much more effective. Thank you very much!
It is important to note that this is a corner case. Usually you want code that is more concise, easier to maintain, even if it is a little slower. But in your scenario when brute force power is required you need to compromise on the code maintainability.
Join the community in Stockholm for expert Microsoft Fabric learning including a very exciting keynote from Arun Ulag, Corporate Vice President, Azure Data.
Check out the August 2024 Power BI update to learn about new features.