Nathan_Mosher
Frequent Visitor

Do Dataflow Gen2 and Data Pipelines Consume More CUs than Dataflow Gen1?

I've read in many places that data ingestion with the new Fabric items (such as Dataflow Gen2 and Pipelines) is more efficient than the traditional Dataflow Gen1 method. I was wondering if anyone has seen an improvement after switching to Gen2. However, I'm starting to suspect that Microsoft is penalizing CU usage for new Fabric items, as I’ve noticed our CU consumption increasing as we migrate to Fabric.

To explore this further, I set up a small experiment. In my testing, I created seven independent test cases, each ingesting the same data from the same endpoint using different methods. I then compared them against the legacy Dataflow Gen1 as a baseline. While I’m still conducting further testing, my preliminary results indicate that CU consumption is 5 to 9 times higher, depending on the method used.

For clarity, I'm using Dataflow Gen2 CICD items, while Pipelines perform a single copy action per table. The OData endpoint downloads 18 flat tables of varying dimensions. Warehouse/Lakehouse staging CU usage is excluded, as I couldn't differentiate between them, but it appears that each test utilizing staging consumed approximately 700 CUs. No transformations are being performed—this is strictly raw data ingestion. Additionally, I have not yet tested FastCopy, as the documentation states it is only beneficial for big data.

Now, here’s an interesting finding: I also tested chaining Dataflow Gen1 to Dataflow Gen2 before writing to storage, and believe it or not, this approach consumes fewer CUs than using Dataflow Gen2 directly. The refresh time is also comparable, with Gen1 averaging slightly faster execution. The results show that the most efficient Dataflow Gen2 configuration averages 28.7k CUs, while the most efficient Pipeline configuration averages 24.6k CUs. However, a Gen1-to-Gen2 chained approach averages 18k CUs (5k from Gen1 + 13k from Gen2), which is roughly 37% less than the best Gen2 configuration and about 27% less than the best Pipeline configuration.
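For anyone who wants to check the savings arithmetic, here is a minimal Python sketch using only the average figures quoted in this post (in thousands of CUs). The variable and function names are my own and purely illustrative; nothing here comes from the Capacity Metrics app or any Fabric API.

```python
# Average CU figures quoted in the post, in thousands of CUs.
best_gen2_k = 28.7       # most efficient Dataflow Gen2 configuration
best_pipeline_k = 24.6   # most efficient Pipeline configuration
chained_k = 5.0 + 13.0   # Gen1 stage + Gen2 stage of the chained approach


def savings_pct(baseline: float, candidate: float) -> float:
    """Percentage reduction of `candidate` relative to `baseline`."""
    return (baseline - candidate) / baseline * 100.0


vs_gen2 = savings_pct(best_gen2_k, chained_k)
vs_pipeline = savings_pct(best_pipeline_k, chained_k)

print(f"Chained vs best Gen2:     {vs_gen2:.1f}% fewer CUs")
print(f"Chained vs best Pipeline: {vs_pipeline:.1f}% fewer CUs")
```

With these inputs the reduction comes out at about 37.3% against the best Gen2 configuration and about 26.8% against the best Pipeline configuration.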

Is anyone else seeing similar results? I would appreciate hearing about other experiences converting Dataflow Gen1 items to Dataflow Gen2 or Data Pipelines for ingestion.

 

Results:

[Image: Nathan_Mosher_0-1740796216065.png]

 

 

 

Test Case Items:

| ItemName | ItemType | TestCase | Destination | Step |
|---|---|---|---|---|
| A2025_01_A_DFG1 | Dataflow Gen1 | 01 | | A |
| A2025_01_C_DFG1 | Dataflow Gen1 | 01 | | C |
| A2025_01_D_DFG1 | Dataset | 01 | | D |
| A2025_02_A_DFG2 | Dataflow Gen2 CICD | 02 | | A |
| A2025_02_C_DFG1 | Dataflow Gen1 | 02 | | C |
| A2025_02_D_DS | Dataset | 02 | | D |
| A2025_03_A_DFG2 | Dataflow Gen2 CICD | 03 | A2025_03_B_LH | A |
| A2025_03_B_LH | Lakehouse | 03 | | B |
| A2025_03_C_DFG1 | Dataflow Gen1 | 03 | | C |
| A2025_03_D_DS | Dataset | 03 | | D |
| A2025_04_A_DFG2 | Dataflow Gen2 CICD | 04 | A2025_04_B_WH | A |
| A2025_04_B_WH | Warehouse | 04 | | B |
| A2025_04_C_DFG1 | Dataflow Gen1 | 04 | | C |
| A2025_04_D_DS | Dataset | 04 | | D |
| A2025_05_A_PL | Data Pipeline | 05 | A2025_05_B_LH | A |
| A2025_05_B_LH | Lakehouse | 05 | | B |
| A2025_05_C_DFG1 | Dataflow Gen1 | 05 | | C |
| A2025_05_D_DS | Dataset | 05 | | D |
| A2025_06_A_PL | Data Pipeline | 06 | A2025_06_B_WH | A |
| A2025_06_B_WH | Warehouse | 06 | | B |
| A2025_06_C_DFG1 | Dataflow Gen1 | 06 | | C |
| A2025_06_D_DS | Dataset | 06 | | D |
| A2025_07_A_PL | Data Pipeline | 07 | A2025_07_B_LH | A |
| A2025_07_B_LH | Lakehouse | 07 | | B |
| A2025_07_C_DFG1 | Dataflow Gen1 | 07 | | C |
| A2025_07_D_DS | Dataset | 07 | | D |
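The item names above appear to follow a `prefix_testcase_step_type` pattern. That is an inference from the list, not something stated in the post, and the suffix does not always match the ItemType column (A2025_01_D_DFG1 is listed as a Dataset). Under that assumption, a small Python sketch for grouping items by test case could look like this; the `Item` type and `parse_item_name` helper are hypothetical names.

```python
from collections import defaultdict
from typing import NamedTuple


class Item(NamedTuple):
    prefix: str     # e.g. "A2025"
    test_case: str  # e.g. "03"
    step: str       # e.g. "A"
    suffix: str     # e.g. "DFG2" (assumed to hint at the item type)


def parse_item_name(name: str) -> Item:
    """Split a name such as 'A2025_03_A_DFG2' into its four assumed parts."""
    prefix, test_case, step, suffix = name.split("_")
    return Item(prefix, test_case, step, suffix)


# A few names copied from the list above.
names = ["A2025_03_A_DFG2", "A2025_03_B_LH", "A2025_03_C_DFG1", "A2025_03_D_DS"]

by_case = defaultdict(list)
for name in names:
    item = parse_item_name(name)
    by_case[item.test_case].append(item)

summary = {
    case: ", ".join(f"{i.step}={i.suffix}" for i in sorted(items, key=lambda i: i.step))
    for case, items in by_case.items()
}
for case in sorted(summary):
    print(f"{case}: {summary[case]}")
```

Grouping this way makes it easier to line up each test case's items against the per-item CU figures when comparing methods.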

 

 

 

 

1 ACCEPTED SOLUTION
v-vpabbu
Community Support

Hi @Nathan_Mosher,

 

Thank you for reaching out to Microsoft Fabric Community Forum.

 

It's true that Dataflow Gen2 and Data Pipelines generally consume more CUs than Dataflow Gen1.
Dataflow Gen1 consumes fewer CUs due to its sequential execution, lazy evaluation, and minimal staging, processing only necessary data on demand. In contrast, Gen2 utilizes parallel processing, batch execution, and persistent staging, leading to higher CU consumption. Additionally, metadata tracking and lineage features in Gen2 add further overhead, making it more resource-intensive but scalable.

 

If this post helps, please consider accepting it as the solution so other members can find it more quickly, and don't forget to give a "Kudos". I’d truly appreciate it!


Regards,
Vinay Pabbu


5 REPLIES
andrewsommer
Super User

Brunner BI published a good breakdown of the differences between Gen1 and Gen2 dataflows. Basically, you are trading speed for compute.

en.brunner.bi/post/comparing-cost-of-dataflows-gen1-vs-gen2-in-power-bi-and-fabric-1

 

Please mark this post as solution if it helps you. Appreciate Kudos.

v-vpabbu
Community Support

Hi @Nathan_Mosher,

 

As we haven’t heard back from you, we wanted to follow up and check whether the solution provided resolved the issue. Let us know if you need any further assistance.
If our response addressed your question, please mark it as Accept as solution and click Yes if you found it helpful.

 

Regards,
Vinay Pabbu

Hi @Nathan_Mosher,

 

May I ask whether this issue has been resolved?

If it has, please mark the helpful reply, or share your own solution and accept it as the solution. That helps other community members with similar problems solve them faster.

 

Regards,
Vinay Pabbu


