The ultimate Microsoft Fabric, Power BI, Azure AI, and SQL learning event: Join us in Stockholm, September 24-27, 2024.
Save €200 with code MSCUST on top of early bird pricing!
Find everything you need to get certified on Fabric—skills challenges, live sessions, exam prep, role guidance, and more. Get started
Hi guys, I am new to Power BI so sorry in advance if this is a silly question.
I have a Power BI dataflow as my data source which is about 4 million rows and 15 columns.
I can connect successfully to the dataflow however this is extremely slow and I'm not sure if this is to be expected or a problem with my settings.
1. Power BI Desktop takes ages to import the data which is approx 1.6 GB and that takes around 10-15 mins to load, also every time I do something in Power Query once I press close and apply it downloads that whole data again.
2. Power Query is really slow on simple queries such as filtering, adding, and removing columns ( each query takes around 2-3 mins to process)
Thank you in advance 😁
Solved! Go to Solution.
Hi @georgec96,
Another option to consider, if you need all that data, is to use direct query for dataflows.
https://docs.microsoft.com/en-us/power-bi/transform-model/dataflows/dataflows-directquery
I've had to use this recently for a large dataset and it works well. It then gives you the option to build aggregation tables in your model to improve performance but also keep the detailed data when necessary.
Caveat, this will only work with premium or premium per user as you need to be able to turn on the 'Enhanced compute engine settings' for the Dataflow.
Hope this helps.
Have I solved your problem? Please click Accept as Solution so I don't keep coming back to this post, oh yeah, others may find it useful also ;). |
If you found this post helpful, please give Kudos. It gives me a sense of instant gratification and, if you give me Kudos enough times, magical unicorns will appear on your screen. If you find my signature vaguely amusing, please give Kudos. | Proud to be a Super User! |
Did you find a soultion to this?
I am the same as you - Using Dataflows is extremly slow. Every single transformation results in all the data redownloading for each applied step, very slowly.
I think it relates to API download limits, but thats just a guess.
Only solution I can think of, is to do all your transformations in the PowerBi service prior to completing the Dataflow. Then you only have to download it once in your PBIX file
These suggestions are helpful, but I would like to understand better what PBI Desktop is re-downloading the data from my dataflow every time I apply a transformation. If I am performing a simple operation such as adding two columns together to create a new column, why is PBI re-downloading the data?
When you create a calculated column or a measure in Power BI, the data is not re-downloaded because these operations are performed at the Data Model level rather than in Power Query. Since the calculated column is created at the Data Model level, it does not require Power Query to re-download the data because it has not made changes to the source data as such.
In contrast, when adding a new column using Power Query, access to all the data to perform transformations / manipulations is required. Therefore, with each new change in Power Query, once you hit "Apply" or "Close & Apply," Power Query will need to re-download all of the data again.
Importantly, the benefit of making transformations in Power Query is that they do not slow down the Data Model view. If anything, the transformations / manipulations via Power Query often lead to much faster processing in the front end and far greater system efficiencies / speed.
In summary, when you create a calculated column or a measure, the operation is performed at the Data Model level and does not require Power Query to re-download the data. However, it will slow down the speed / efficiencies in the front end (subject to how much data your Data Model has). Conversley, when you use Power Query to do the changes, you have much greater speeds / efficiences, but whenever you make these changes, you then need to "re-download" the data.
Hope this helps.
Theo 🙂
If I have posted a response that resolves your question, please accept it as a solution to formally close the post.
Also, if you are as passionate about Power BI, DAX and data as I am, please feel free to reach out if you have any questions, queries, or if you simply want to connect and talk to another data geek!
Want to connect?www.linkedin.com/in/theoconias
Thank you for your response Theo,
I do undertand that changes to things like measures in Power BI do not require a re-download and that makes sense to me. What I am having trouble understanding is, when I am applying transformations in Power Query, why is the existing data in my preview being re-downloaded? Today, I created a new column in Power Query by combining two columns, which triggered a re-download of the dataflow. When I realized that I didn't like the column name that I chose, I simply changed the name of the column, and it re-downloaded all the data again. This is before I have even clicked "Apply" or "Close and Apply". This makes for a very slow and frustrating design experience no matter how small I shrink my dataset. I don't understand why it is necessary, and it leads me to think that I am either doing something wrong, or perhaps the technology (dataflows) is just new, and they haven't had a chance to optimize it yet.
Hi @georgec96,
Another option to consider, if you need all that data, is to use direct query for dataflows.
https://docs.microsoft.com/en-us/power-bi/transform-model/dataflows/dataflows-directquery
I've had to use this recently for a large dataset and it works well. It then gives you the option to build aggregation tables in your model to improve performance but also keep the detailed data when necessary.
Caveat, this will only work with premium or premium per user as you need to be able to turn on the 'Enhanced compute engine settings' for the Dataflow.
Hope this helps.
Have I solved your problem? Please click Accept as Solution so I don't keep coming back to this post, oh yeah, others may find it useful also ;). |
If you found this post helpful, please give Kudos. It gives me a sense of instant gratification and, if you give me Kudos enough times, magical unicorns will appear on your screen. If you find my signature vaguely amusing, please give Kudos. | Proud to be a Super User! |
@TheoC, Thanks alot for your input, that was extremely helpful, I have one more question though.
Would a dataset (either as an import or direct query) be quicker than a dataflow when it comes to loading and refreshing times?
Hi @georgec96 I think the theoretical view is that using data flows will be quicker because the transformations occur in the data load process. However, if there is limited transformation and the data itself is quite clean, I would say this theory may be challenged. Take a look here at this link from Reza. He goes into it a bit but the comments on the blog are brilliant too: https://radacad.com/how-to-use-dataflow-to-make-the-refresh-of-power-bi-solution-faster
I hope this has been of some.use!
Theo 🙂
If I have posted a response that resolves your question, please accept it as a solution to formally close the post.
Also, if you are as passionate about Power BI, DAX and data as I am, please feel free to reach out if you have any questions, queries, or if you simply want to connect and talk to another data geek!
Want to connect?www.linkedin.com/in/theoconias
Hi @georgec96
A few things that can help:
Check out this link as well: https://docs.microsoft.com/en-us/power-bi/guidance/power-bi-optimization
There are a lot of ways you can reduce the load times and refresh rates. Hopefully the above will help guide you in the right direction.
All the best.
Theo
If I have posted a response that resolves your question, please accept it as a solution to formally close the post.
Also, if you are as passionate about Power BI, DAX and data as I am, please feel free to reach out if you have any questions, queries, or if you simply want to connect and talk to another data geek!
Want to connect?www.linkedin.com/in/theoconias
Join the community in Stockholm for expert Microsoft Fabric learning including a very exciting keynote from Arun Ulag, Corporate Vice President, Azure Data.
Check out the August 2024 Power BI update to learn about new features.
Learn from experts, get hands-on experience, and win awesome prizes.
User | Count |
---|---|
114 | |
79 | |
78 | |
47 | |
39 |
User | Count |
---|---|
147 | |
115 | |
65 | |
64 | |
53 |