In this blog, you'll learn how the Fast Copy feature improves the performance and cost efficiency of your Dataflows Gen2.
In March, we announced the Public Preview of Fast Copy in Dataflows Gen2 within Microsoft Fabric. This feature lets you ingest large amounts of data efficiently by using the same backend as the Copy activity in data pipelines, which helps reduce data processing time and improve cost efficiency. Detailed instructions on enabling Fast Copy are also available.
Let's explore a real-world example to showcase the benefits of enabling Fast Copy in Dataflows Gen2 within Microsoft Fabric. We used Dataflow Gen2 to load four CSV files, totaling 6 GB, into a Lakehouse table. By comparing the performance and cost before and after enabling Fast Copy, we'll demonstrate the improvements you can achieve.
To accomplish this scenario, create a Dataflow Gen2 by following these steps:
Figure: Dataflow Gen2 refresh using the native Dataflow Gen2 engine, without Fast Copy
The Dataflow Gen2 Refresh operation took about 30 minutes and consumed 28,816 CU seconds.
| Metric | Compute consumption |
| --- | --- |
| Dataflow Gen2 Refresh CU seconds | 28,816 CU seconds |
Total run cost at $0.18/CU hour = (28,816) / (60*60) CU-hours * ($0.18/CU hour) ~= $1.44
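The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name `run_cost` is hypothetical, and the $0.18 per CU hour rate is simply the one used in this post, so substitute the rate that applies to your capacity.

```python
# Rate used in this post; the actual rate for your capacity may differ.
CU_PRICE_PER_HOUR = 0.18  # USD per CU hour


def run_cost(cu_seconds, price_per_cu_hour=CU_PRICE_PER_HOUR):
    """Convert CU seconds to CU hours, then multiply by the hourly rate."""
    cu_hours = cu_seconds / (60 * 60)
    return cu_hours * price_per_cu_hour


# Refresh without Fast Copy: 28,816 CU seconds
cost = run_cost(28_816)
print(f"${cost:.2f}")  # prints $1.44
```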
To accomplish this scenario, create a dataflow with the same steps as in the previous test case. The only difference is that you enable the Fast Copy feature, as shown below:
Figure: Enabling Fast Copy in Dataflow Gen2
Figure: Dataflow Gen2 refresh powered by Fast Copy
The Dataflow Gen2 Refresh operation took about 4 minutes and consumed 3,696 CU seconds on Dataflow Gen2 Refresh and 5,448 CU seconds on Data movement.
| Metric | Compute consumption |
| --- | --- |
| Dataflow Gen2 Refresh CU seconds | 3,696 CU seconds |
| Data movement CU seconds | 5,448 CU seconds |
Total run cost at $0.18/CU hour = (3,696 + 5,448) / (60*60) CU-hours * ($0.18/CU hour) ~= $0.46
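With Fast Copy enabled, the run is billed on two meters, so it can help to see how each one contributes to the total. The sketch below is a rough illustration under the same $0.18 per CU hour assumption; the dictionary keys and variable names are made up for this example and are not part of any Fabric API.

```python
PRICE_PER_CU_HOUR = 0.18  # USD per CU hour, the rate assumed in this post

# CU seconds reported for the Fast Copy run, by meter
cu_seconds_by_meter = {
    "Dataflow Gen2 Refresh": 3_696,
    "Data movement": 5_448,
}

# Cost per meter: CU seconds -> CU hours -> USD
cost_by_meter = {
    meter: seconds / 3600 * PRICE_PER_CU_HOUR
    for meter, seconds in cu_seconds_by_meter.items()
}
total_cost = sum(cost_by_meter.values())

for meter, cost in cost_by_meter.items():
    print(f"{meter}: ${cost:.2f}")
print(f"Total: ${total_cost:.2f}")
```

In this run, Data movement accounts for more of the cost than the Dataflow Gen2 Refresh meter, yet the combined total is still well below the run without Fast Copy.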
| Feature | Performance | Cost | Conclusion |
| --- | --- | --- | --- |
| Dataflow Gen2 without Fast Copy | Copy Duration: 29:56 | $1.44 | |
| Dataflow Gen2 powered by Fast Copy | Copy Duration: 3:47 | $0.46 | 8x increase in performance, 3x decrease in cost |
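The rounded 8x and 3x figures can be checked directly from the durations and costs in the table. The following is a minimal sketch; the helper `to_seconds` is a hypothetical name introduced here for illustration.

```python
def to_seconds(mm_ss):
    """Parse a 'minutes:seconds' duration such as '29:56' into seconds."""
    minutes, seconds = mm_ss.split(":")
    return int(minutes) * 60 + int(seconds)


baseline_s = to_seconds("29:56")   # without Fast Copy
fast_copy_s = to_seconds("3:47")   # with Fast Copy

speedup = baseline_s / fast_copy_s
cost_ratio = 1.44 / 0.46           # run costs in USD from the table

print(f"{speedup:.1f}x faster, {cost_ratio:.1f}x cheaper")
# prints 7.9x faster, 3.1x cheaper
```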
With Fast Copy in Dataflow Gen2, you can significantly reduce data processing time and improve cost efficiency. In the example above, loading four CSV files totaling 6 GB into a Lakehouse table in Microsoft Fabric was about 8x faster and about 3x cheaper.