Don't miss out! 2025 Microsoft Fabric Community Conference, March 31 - April 2, Las Vegas, Nevada. Use code MSCUST for a $150 discount. Prices go up February 11th. Register now.

Ananth_Bhirappa
Regular Visitor

Huge Data Volume: 490 Million Records

Hi Folks,

We are trying to pull data from on-premises SQL Server into a Fabric Lakehouse: around 490 million records with 24 columns.

I have two options:

1. Pipeline - I tried this scenario; it takes around 8 hours.
2. Dataflow Gen2 with Fast Copy.

I need your suggestions on Dataflow with Fast Copy: roughly how long will it take, and how much will it cost?

1 ACCEPTED SOLUTION
nilendraFabric
Solution Supplier

Hi @Ananth_Bhirappa 

It is difficult to give an exact estimate of how long a Dataflow with Fast Copy would take, or its precise cost, for your specific scenario. That said, here are some insights on the potential benefits and considerations of using Dataflow with Fast Copy:

  1. Performance Improvement: Fast Copy can significantly reduce ingestion time compared to standard Dataflow or Pipeline operations. In some cases it has been reported to process billions of rows in minutes rather than hours.
  2. Cost Efficiency: Fast Copy generally consumes fewer Capacity Units (CUs) than standard Dataflow operations, which can lead to lower costs, especially for large data volumes.
  3. Limitations: Fast Copy has some restrictions, such as limited support for transformations and specific file formats (.csv and .parquet).

 

Cost Considerations

The cost for Dataflow Gen2 with Fast Copy is calculated based on the following: 

  • Standard Compute: 16 CUs per hour
  • High Scale Dataflows Compute: 6 CUs per hour
  • Data movement: 1.5 CUs per hour
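To turn these meter rates into a rough cost figure, you can multiply the rate by the runtime. A minimal sketch, assuming the per-hour rates listed above are current and that Fast Copy is billed under the data-movement meter (check the Fabric pricing docs for your region and SKU):

```python
# Rough CU-seconds estimate for a Dataflow Gen2 run, based on the
# published per-hour meter rates quoted above (assumed current).
RATES_CU_PER_HOUR = {
    "standard_compute": 16.0,
    "high_scale_compute": 6.0,
    "data_movement": 1.5,  # assumption: Fast Copy bills on this meter
}

def estimate_cu_seconds(meter: str, runtime_hours: float) -> float:
    """CU consumption = meter rate (CU/h) x runtime, in CU-seconds."""
    return RATES_CU_PER_HOUR[meter] * runtime_hours * 3600

# Example: if the Fast Copy data movement ran for 2 hours:
print(estimate_cu_seconds("data_movement", 2.0))  # 1.5 * 2 * 3600 = 10800.0
```

The Fabric Capacity Metrics app reports actual CU-seconds consumed, so you can compare a real run against this back-of-the-envelope number.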

Points to consider:

  1. Test with a Subset: Before running the full 490 million records, test Fast Copy with a smaller subset to gauge performance and cost.
  2. Monitor with Fabric Metrics App: Use the Fabric Metrics App to accurately measure CU consumption and duration for your specific scenario.
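Once a subset test has run, a simple linear extrapolation gives a first estimate for the full 490 million rows. This is only a sketch: it assumes throughput scales roughly linearly with row count and ignores warm-up overhead, gateway limits, and source-side contention, so treat the result as a ballpark, not a commitment:

```python
# Extrapolate full-load duration from a subset test run.
# Assumption: throughput is roughly linear in row count.
def extrapolate_runtime(subset_rows: int, subset_minutes: float,
                        total_rows: int = 490_000_000) -> float:
    """Return estimated minutes for the full load."""
    rows_per_minute = subset_rows / subset_minutes
    return total_rows / rows_per_minute

# Example: a hypothetical 10M-row test that finished in 12 minutes:
est = extrapolate_runtime(10_000_000, 12)
print(f"~{est:.0f} minutes (~{est / 60:.1f} hours)")  # ~588 minutes (~9.8 hours)
```

Pairing this estimate with the CU-seconds reported by the Metrics app for the same test run lets you project both duration and cost before committing to the full load.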

If this post helps, please consider accepting it as the solution to help other members find it more quickly.

Thanks

Nilendra


2 REPLIES 2

v-kongfanf-msft
Community Support

Hi @Ananth_Bhirappa,

Using Dataflow with Fast Copy in Microsoft Fabric may be more efficient than a traditional pipeline: it uses the Fabric compute engine to significantly reduce the time required for data ingestion and transformation.

 

For more details, you can refer to the document below:

Fast copy in Dataflows Gen2 - Microsoft Fabric | Microsoft Learn

 

Best Regards,
Adamk Kong

 

If this post helps, please consider accepting it as the solution to help other members find it more quickly.
