
DennesTorres
Impactful Individual

Dataflow performance and more

Hi,

I ran some performance comparisons of dataflow executions, and I'm not sure I completely understand the results.

 

The data to be loaded was this: https://azuresynapsestorage.blob.core.windows.net/sampledata/WideWorldImportersDW/tables/fact_sale.p...

 

First execution

 

Loading to a data warehouse

Adding 3 calculated fields

 

Execution time: 58m 49s

 

Second Execution

 

Loading to a data warehouse

No calculated fields

 

Execution time: 39m 59s

 

Conclusion: Simple calculations over a large amount of data can add about 19 minutes to the run.

 

Third Execution

 

Staging disabled

Loading to a lakehouse

Execution time: 13m 12s

 

Conclusion: I understand why it's faster without staging. What I don't understand is why the data warehouse doesn't work as a destination without staging. How do I explain to someone that if they choose a data warehouse, Dataflows Gen 2 will be 26 minutes slower, because the operations can't be done in memory when the target is a data warehouse?

Is any improvement planned for this?

Fourth Execution

 

Data Pipeline to a lakehouse

 

Execution time: 16m 58s

 

Conclusion: I don't know how to explain the difference between a pipeline and a Dataflow Gen 2 without staging. Are there any configurations I should be checking?

 

Fifth Execution

 

Data pipeline to a data warehouse

 

Execution time: 4m 17s

 

Conclusion: Is the data warehouse really that much faster than the lakehouse for data ingestion? Why can't this power be used in Dataflows Gen 2?

 

Sixth Execution

COPY INTO in the data warehouse

Execution time: 1m 32s

 

Conclusion: I'm not sure where to start with this last one. So Polaris has all this power, but we can't use any of it for data ingestion (dataflows/pipelines)?
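For reference, the sixth execution was roughly the following statement, run directly in the warehouse. This is a sketch: the table name and storage path are placeholders, not the exact ones used (the original source URL is truncated above), and the WITH options depend on your file and credentials:

```sql
-- Sketch of the COPY INTO run; table name and path are placeholders.
COPY INTO dbo.fact_sale
FROM 'https://<storage-account>.blob.core.windows.net/<container>/<path-to-parquet-file>'
WITH (
    FILE_TYPE = 'PARQUET'  -- the sample data is a parquet file
);
```

The point of the comparison stands either way: this path hands the load straight to the warehouse engine, with no dataflow or pipeline layer in between.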


All these differences make it difficult to choose when to use each of these solutions. We may end up choosing based on the technical limitations around data transformations and having to accept the performance loss when moving from one solution to another.

Am I missing something? Are there additional guidelines in relation to this?

Kind Regards,

 

Dennes

1 REPLY
Anonymous
Not applicable

Hello @DennesTorres,

Thanks for using Fabric Community.
At this time, we are reaching out to the internal team to get some help on this.
We will update you once we hear back from them.
