Hi,
I'm loading data from an Azure Data Lake Gen2 folder (10 CSV files totalling around 5 GB) using Power Query (I started in Desktop and am now doing this in Dataflows). I have a GroupBy transformation which reduces the data to around 10 MB after aggregation. It takes around 15 minutes to load the data and perform the GroupBy transformation. Can anyone tell me if Power BI is doing all the work here? That is, is it true that Power BI can't fold any transformations down to the Data Lake (it's just storage, right?), so the raw data is loaded into the dataflow engine and transformed there?
Thanks
Hello @AndyDDC
There is definitely no way to fold a query back to a CSV file. Folding might be possible with a different setup, for example if another engine loaded the CSV files into a database that accepts native queries.
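To illustrate the idea of letting a database engine do the grouping, here is a minimal sketch in Python (not Power Query) using the standard-library sqlite3 module. The table name, column names, and sample rows are all made up for illustration; in practice the staging database and its load process would be whatever your environment provides.

```python
import csv
import io
import sqlite3

# Hypothetical sample standing in for one CSV file from the lake.
SAMPLE_CSV = """region,product,amount
North,A,10
North,B,5
South,A,7
South,A,3
"""

def load_csv_into_db(conn, csv_text):
    """Stage the raw CSV rows in a database table (assumed schema)."""
    conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [(r["region"], r["product"], float(r["amount"])) for r in reader]
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)

def totals_by_region(conn):
    """Run the GROUP BY inside the database; only the small result comes back."""
    cur = conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    )
    return cur.fetchall()

conn = sqlite3.connect(":memory:")
load_csv_into_db(conn, SAMPLE_CSV)
result = totals_by_region(conn)
```

Once the data sits in a database, the aggregation runs there and the client only receives the grouped rows, which is the effect query folding would give you.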
If this post helps or solves your problem, please mark it as the solution (to help other users find useful content and to acknowledge the work of users who helped you)
Kudos are nice too
Have fun
Jimmy
Thanks Jimmy, that makes sense about CSV files.
How about Parquet files? I'm wondering if they are able to have some processing folded?
Hello @AndyDDC
Sorry, but I have never heard of Parquet files. When you load from files, it is always best to reduce the data in the first steps, because this affects loading time. So put the filter steps and remove-columns steps first, and then the group steps.
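The ordering can be sketched like this in Python (not Power Query; just an illustration of the idea). The column names, the sample rows, and the `aggregate` helper are all hypothetical. Each row is filtered first, only the needed columns are kept, and the grouping runs on what is left.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample standing in for one CSV file from the lake.
SAMPLE_CSV = """region,product,year,amount,comment
North,A,2023,10,x
North,B,2024,5,y
South,A,2024,7,z
South,A,2024,3,w
"""

def aggregate(csv_text, year):
    """Filter rows, keep only the needed columns, then group and sum."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Step 1: filter as early as possible.
        if row["year"] != year:
            continue
        # Step 2: keep only the columns the aggregation needs.
        region, amount = row["region"], float(row["amount"])
        # Step 3: group by region and sum the amounts.
        totals[region] += amount
    return dict(totals)

totals_2024 = aggregate(SAMPLE_CSV, "2024")
```

Because rows are processed one at a time, the unused columns and filtered-out rows never have to be held in memory before the grouping.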
If this post helps or solves your problem, please mark it as the solution (to help other users find useful content and to acknowledge the work of users who helped you)
Kudos are nice too
Have fun
Jimmy