frithjof_v
Community Champion

Dataflow gen1 vs. Dataflow gen2

Hi, 

 

I have some questions regarding dataflow gen1 vs. dataflow gen2 on a Fabric capacity.

 

Q1:

If I have the same M code in a dataflow gen1 and a dataflow gen2 (and the dataflow gen2 is writing its output to a Lakehouse destination), which of the two should I expect to take longer to refresh and to consume more CUs?

 

Q2:

Same as Q1, but this time the dataflow gen2 does not write to a destination, i.e. I am just using the dataflow gen2 in the same way as we use a dataflow gen1. Which of the two should I expect to take longer to refresh and to consume more CUs?

 

Q3:

In order to minimize CU consumption, should I use Dataflow Gen1 or Dataflow Gen2 as a rule of thumb?

Assume I don't need a Lakehouse, I just want to use dataflows and connect Power BI to the dataflows using Import mode in Power BI.

 

 

Thanks in advance!

2 REPLIES
david2658
New Member

Hello,

Data volume and complexity: For very small datasets or simple transformations, Dataflow Gen1 might be sufficient. However, for larger datasets and complex transformations, Dataflow Gen2 is the clear choice.

Specific use cases: There might be specific use cases where Dataflow Gen1 is still a viable option, such as when working with legacy systems or for certain types of data processing.

v-yilong-msft
Community Support

Hi @frithjof_v ,

Q1:

I think you should expect dataflow Gen2 to spend more refresh time and consume more capacity units. This is because writing to a Lakehouse destination involves additional processing and storage operations.


 

Q2:

If the dataflow Gen2 is not writing to a destination and is used in the same way as dataflow Gen1, the performance and CU consumption should be more comparable. However, dataflow Gen2 might still have a slight edge in terms of efficiency due to its improved architecture and high-scale compute capabilities.

 

Q3:

To minimize CU consumption, you might want to use dataflow Gen1 if you don’t need the advanced features of Gen2, such as Lakehouse integration or enhanced monitoring. Dataflow Gen1 is generally more lightweight and might be more efficient for simpler use cases.

However, dataflow Gen2 offers features such as the new incremental refresh, AI insights, and better integration with data pipelines. These may consume more CUs, but they bring improvements in efficiency and usability. I think you can look at this document: Differences between Dataflow Gen1 and Dataflow Gen2 - Microsoft Fabric | Microsoft Learn
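Since the answers above are expectations rather than measurements, it may be worth testing with your own M code: refresh both dataflows a few times and compare the duration and CU (s) values reported for each item in the Fabric Capacity Metrics app. Below is a minimal Python sketch of that comparison. The function names are hypothetical, and the numbers in the usage example are placeholders typed in by hand, not real measurements.

```python
from statistics import mean


def summarize(runs):
    """Average duration and CU seconds for one dataflow.

    runs: list of (duration_seconds, cu_seconds) tuples, one per refresh,
    copied by hand from your own capacity metrics (assumed input format).
    """
    if not runs:
        raise ValueError("need at least one refresh measurement")
    return {
        "avg_duration_s": mean(d for d, _ in runs),
        "avg_cu_s": mean(c for _, c in runs),
    }


def compare(gen1_runs, gen2_runs):
    """Compare average CU consumption of a Gen1 and a Gen2 dataflow."""
    g1 = summarize(gen1_runs)
    g2 = summarize(gen2_runs)
    if g1["avg_cu_s"] == 0:
        raise ValueError("Gen1 average CU is zero; ratio is undefined")
    return {
        "gen1": g1,
        "gen2": g2,
        "cu_ratio_gen2_to_gen1": g2["avg_cu_s"] / g1["avg_cu_s"],
        "cheaper": "gen1" if g1["avg_cu_s"] < g2["avg_cu_s"] else "gen2",
    }


if __name__ == "__main__":
    # Placeholder values only; replace with your own measured refreshes.
    gen1 = [(100, 1000), (120, 1200)]
    gen2 = [(150, 2000), (170, 2400)]
    print(compare(gen1, gen2))
```

Averaging over several refreshes helps smooth out run-to-run variation. Running the test once with a Lakehouse destination and once without would cover both Q1 and Q2.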


 

 

 

Best Regards

Yilong Zhou

If this post helps, please consider accepting it as the solution to help other members find it more quickly.
