March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Early bird discount ends December 31.
Hi,
I have some questions regarding dataflow gen1 vs. dataflow gen2 on a Fabric capacity.
Q1:
If I have the same M code in a dataflow gen1 and a dataflow gen2 (and the dataflow gen2 is writing its output to a Lakehouse destination), should I expect the dataflow gen1 or the dataflow gen2 to take the longest refresh time and consume the most CUs?
Q2:
Same as Q1, but this time the dataflow gen2 does not write to a destination, i.e. I am just using dataflow gen2 in the same way as we use dataflow gen1. Should I expect the dataflow gen1 or the dataflow gen2 to take the longest refresh time and consume the most CUs?
Q3:
In order to minimize CU consumption, should I use Dataflow Gen2 or Dataflow Gen1 as a rule of thumb?
Assume I don't need a Lakehouse, I just want to use dataflows and connect Power BI to the dataflows using Import mode in Power BI.
Thanks in advance!
Hello,
Data Volume and Complexity: For very small datasets or simple transformations, Dataflow Gen1 might be sufficient. However, for larger datasets and complex transformations, Dataflow Gen2 is the clear choice.
Specific Use Cases: There might be specific use cases where Dataflow Gen1 is still a viable option, such as when working with legacy systems or for certain types of data processing.
Hi @frithjof_v ,
Q1:
I think you should expect dataflow Gen2 to take longer to refresh and to consume more capacity units (CUs). This is because writing to a Lakehouse destination involves additional processing and storage operations.
Q2:
If the dataflow Gen2 is not writing to a destination and is used in the same way as dataflow Gen1, the performance and CU consumption should be more comparable. However, dataflow Gen2 might still have a slight edge in terms of efficiency due to its improved architecture and high-scale compute capabilities.
Q3:
To minimize CU consumption, you might want to use dataflow Gen1 if you don’t need the advanced features of Gen2, such as Lakehouse integration or enhanced monitoring. Dataflow Gen1 is generally more lightweight and might be more efficient for simpler use cases.
That said, dataflow Gen2 offers features such as the new incremental refresh, AI insights, and better integration with data pipelines. While these consume more CUs, Gen2 is improved in efficiency and usability. I think you can look at this document: Differences between Dataflow Gen1 and Dataflow Gen2 - Microsoft Fabric | Microsoft Learn
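If you want to check this for your own workload rather than rely on a rule of thumb, one rough approach is to refresh both dataflows with the same M code and compare the duration and CU seconds reported in the Fabric Capacity Metrics app. Below is a minimal Python sketch of that comparison. The `RefreshSample` type, the `compare` helper, and all numbers are hypothetical placeholders for illustration, not real measurements and not part of any Microsoft API; you would type in the values you read from the metrics app yourself.

```python
from dataclasses import dataclass


@dataclass
class RefreshSample:
    """One refresh observation, copied by hand from the metrics app (hypothetical shape)."""
    name: str
    duration_s: float   # refresh duration in seconds
    cu_seconds: float   # CU seconds reported for that refresh


def compare(baseline: RefreshSample, candidate: RefreshSample) -> dict:
    """Compare candidate against baseline as simple ratios."""
    return {
        "duration_ratio": candidate.duration_s / baseline.duration_s,
        "cu_ratio": candidate.cu_seconds / baseline.cu_seconds,
        # Name of whichever sample consumed fewer (or equal) CU seconds.
        "cheaper": baseline.name
        if baseline.cu_seconds <= candidate.cu_seconds
        else candidate.name,
    }


# Placeholder numbers only: replace with your own observations.
gen1 = RefreshSample("gen1", duration_s=120.0, cu_seconds=1500.0)
gen2 = RefreshSample("gen2_lakehouse", duration_s=180.0, cu_seconds=3000.0)

result = compare(gen1, gen2)
print(result)
```

A single refresh can be noisy, so it is probably worth averaging several runs per dataflow before drawing a conclusion.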
Best Regards
Yilong Zhou
If this post helps, then please consider accepting it as the solution to help other members find it more quickly.