I'm using dataflows in our Premium embedded workspace to pull various files and tables for reporting in Power BI and to make them available across multiple datasets. This works well for all the dataflows I've built so far. Refreshes take 5-10 minutes for dataflows in the 500 MB to 2 GB range, with the sources being sales data in CSV and txt files in our ADLS Gen2 data lake, along with the Spark tables in our Azure Synapse workspace.

However, I've tried to create a dataflow based on an "Archive" structure of hourly files in our data lake. Dataflows based on data lake folders with one file per day run fine (about 15 minutes to refresh). The Archive structure has 24 files per day, many of which are empty, and this dataflow takes over 2 hours to refresh. I should note that the daily single files and the Archive files contain VERY similar data (sales-type data with 5-10 string columns and 5-10 numeric columns).
The lake file path structure looks like this:
Customer / marketplace / Archive or snapshot / Year/month/day (depending on archive or snapshot)
I've looked at our Tenant Metrics in the Azure Portal, and the 2-hour refresh finishes with the same memory and QPU load whether we're on tier A2 or A4. I'm at a loss, and assume it's something to do with the backend of Power Query and how it reads files. I'm also worried that as our data lake grows by 24 new files every day for each customer, the dataflow's refresh time will skyrocket and make it unusable. Any ideas?
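For a rough sense of scale on the growth concern above, here is a minimal Python sketch of the file-count arithmetic. The only figure taken from the post is 24 hourly files per day per customer; the function name, the `customers` parameter, and the example inputs are hypothetical, and the sketch says nothing about how refresh time actually scales with file count.

```python
HOURLY_FILES_PER_DAY = 24  # from the post: the Archive structure has 24 files per day


def archive_file_count(days: int, customers: int = 1) -> int:
    """Number of hourly Archive files accumulated after `days` days.

    `customers` is a hypothetical parameter; the post does not state how many
    customers there are.
    """
    if days < 0 or customers < 0:
        raise ValueError("days and customers must be non-negative")
    return HOURLY_FILES_PER_DAY * days * customers


if __name__ == "__main__":
    # Illustrative inputs only, not figures from the post.
    print(archive_file_count(days=365))
    print(archive_file_count(days=30, customers=10))
```

By the same arithmetic, a one-file-per-day folder holds 1/24 as many files over the same period, which is the main structural difference between the two layouts described.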
Hi @Anonymous
Have you had a look at the updated dataflows connector in the Aug 2021 version of Power BI Desktop, together with the enhanced compute engine for dataflows?
Here is a blog post which I think will help: Chris Webb's BI Blog: "How Query Folding And The New Power BI Dataflows Connector Can Help Dataset Refresh Performance" (crossjoin.co.uk)
I have enhanced compute on for this dataflow, but this is all in the service, with no Desktop involved. Everything runs within a dataflow in my tenant on app.powerbi.com.