Hi. I'd like to ask what might be a really dumb, obvious question. But after Googling and reviewing documentation, I haven't found a definitive answer. So here's the question: why would I want to connect dataflows to Azure Data Lake? What are the advantages of doing so?
My best guess comes down to this: a capacity has finite resources for storage and query processing. Without ADLS, the data from dataflows is stored in internal storage in the Power BI capacity, so the resources available for query processing would be total capacity less anything consumed by datasets and dataflows. ADLS would allow offloading dataflows onto the data lake, leaving more resources for query processing than would otherwise be possible. Am I wrong? Did I miss anything?
Thank you! 🙂
@littlemojopuppy So the short answer is that with the internal storage, you don't really have any way to connect to the data. With ADLS Gen2 your data is stored as CDM and includes metadata:
"Power BI stores the data in the common data model (CDM) format, which captures metadata about your data in addition to the actual data generated by the dataflow itself. This feature unlocks many powerful capabilities and enables your data and the associated metadata in CDM format to now serve extensibility, automation, monitoring, and backup scenarios. When you make this data available and widely accessible in your own environment, it enables you to democratize the insights and data created within your organization. It also unlocks the ability for you to create further solutions with a wide range of complexity. Your solutions can be CDM aware custom applications and solutions in Power Platform, Azure, and those available through partner and independent software vendor (ISV) ecosystems. Or you can create an application to read a CSV. Your data engineers, data scientists, and analysts can now work with, use, and reuse a common set of data that is curated in ADLS Gen 2."
Configuring dataflow storage to use Azure Data Lake Gen 2 - Power BI | Microsoft Learn
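To make the "read a CSV" point concrete, here is a minimal Python sketch of how an outside application might consume a dataflow's CDM folder. Treat it as an illustration under assumptions, not the documented way to do it: the `model.json` content below is a hypothetical, trimmed-down example, the entity and file names are made up, the helper `read_entity` is my own, and I'm assuming the partition CSV files carry no header row so the column names come from the metadata. In a real lake the partition `location` would be a full storage URL and you would fetch it with a storage client instead of the in-memory dictionary used here.

```python
import csv
import io
import json

# Hypothetical, trimmed-down model.json for one dataflow (names are made up).
SAMPLE_MODEL_JSON = """
{
  "name": "SalesDataflow",
  "version": "1.0",
  "entities": [
    {
      "$type": "LocalEntity",
      "name": "DimCustomer",
      "attributes": [
        {"name": "CustomerId", "dataType": "int64"},
        {"name": "CustomerName", "dataType": "string"}
      ],
      "partitions": [
        {"name": "Part001", "location": "DimCustomer/Part001.csv"}
      ]
    }
  ]
}
"""

# Stand-in for files in the lake: location -> CSV text (assumed headerless).
SAMPLE_FILES = {"DimCustomer/Part001.csv": "1,Contoso\n2,Fabrikam\n"}


def read_entity(model, entity_name, fetch_text):
    """Return an entity's rows as dicts, using model.json for column names.

    fetch_text is any callable that maps a partition location to CSV text.
    """
    entity = next(e for e in model["entities"] if e["name"] == entity_name)
    columns = [attr["name"] for attr in entity["attributes"]]
    rows = []
    for partition in entity.get("partitions", []):
        text = fetch_text(partition["location"])
        for record in csv.reader(io.StringIO(text)):
            # Pair each positional CSV value with its attribute name.
            rows.append(dict(zip(columns, record)))
    return rows


model = json.loads(SAMPLE_MODEL_JSON)
customers = read_entity(model, "DimCustomer", SAMPLE_FILES.__getitem__)
print(customers)
```

The point of the sketch is that the metadata file tells a consumer which entities exist, what their columns are, and where the data files live, which is what you don't get with the internal storage.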
Thanks for taking the time to reply. I probably should have included more about our environment/platform and use case up front.
We have a bunch of source systems, all of which get pulled into a data lake. The data goes through staging and domain layers and is exposed in the final layer as tables in our data warehouse. We'd create dataflows (one for each fact and dimension) pointing at the data warehouse for users to build and refresh their reports against, so they're not constantly hitting the data warehouse/data lake itself. And we could park the output of the dataflows on the lake (again), but in CDM format. With the data originating in the lake, wouldn't it already be available to other applications?
I feel like I'm missing something 🙂
@littlemojopuppy No, in your case I don't think you would get much benefit.