Hi,
I ran a performance comparison of dataflow executions and I'm not sure I completely understand the results.
The data to be loaded was this: https://azuresynapsestorage.blob.core.windows.net/sampledata/WideWorldImportersDW/tables/fact_sale.p...
First execution
Loading to a data warehouse
Adding 3 calculated fields
Execution time: 58m49sec
Second Execution
Loading to a data warehouse
No calculation
Execution time: 39m59sec
Conclusion: Simple calculations on a large amount of data can add about 19 minutes to the run.
Third Execution
Disable Staging
Load to a lakehouse
Execution Time: 13m12s
Conclusion: I understand why it's faster without staging. What I don't understand is why the data warehouse doesn't work as a destination without staging. How do I explain to someone that if they choose a data warehouse, Dataflows Gen2 will be about 26 minutes slower because the operations can't be done in memory when the target is a data warehouse?
Is there any improvement planned for this?
Fourth Execution
Data Pipeline to a lakehouse
Execution time: 16m 58s
Conclusion: I don't know how to explain the difference between a pipeline and a dataflow gen 2 without staging. Are there some configurations I should be checking?
Fifth Execution
Data pipeline to a data warehouse
Execution Time: 4m17s
Conclusion: Is the data warehouse really that much faster than the lakehouse for data ingestion? Why can't this power be used in Dataflows Gen2?
Sixth Execution
COPY INTO in the data warehouse
Execution time: 1m32sec
Conclusion: I'm not sure where to start with this last one. So Polaris has all this power, but we can't use any of it for data ingestion through dataflows or pipelines?
All these differences make it difficult to decide when to use each solution. We may end up choosing according to the technical limitations of each option for data transformations, and then having to accept the performance loss when moving from one solution to another.
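To put the six timings side by side, here is a small Python sketch that converts the durations reported above into seconds and shows each run relative to the fastest one. The labels and the helper names (`RUNS`, `to_seconds`, `fmt`) are only illustrative, and the durations are normalized to an `XmYYs` format; the numbers themselves are the ones from the runs above.

```python
import re

# Durations as reported in the six runs above, normalized to "XmYYs".
# The labels are shorthand for this sketch, not official product names.
RUNS = {
    "Dataflow Gen2 -> warehouse, 3 calculated fields": "58m49s",
    "Dataflow Gen2 -> warehouse, no calculation": "39m59s",
    "Dataflow Gen2 -> lakehouse, staging disabled": "13m12s",
    "Pipeline -> lakehouse": "16m58s",
    "Pipeline -> warehouse": "4m17s",
    "COPY INTO in the warehouse": "1m32s",
}


def to_seconds(text):
    """Convert a duration such as '58m49s' into a number of seconds."""
    match = re.fullmatch(r"(\d+)m\s*(\d+)s", text)
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    return int(match.group(1)) * 60 + int(match.group(2))


def fmt(seconds):
    """Format a number of seconds back into 'XmYYs'."""
    return f"{seconds // 60}m{seconds % 60:02d}s"


SECONDS = {name: to_seconds(value) for name, value in RUNS.items()}
fastest = min(SECONDS.values())

if __name__ == "__main__":
    # One line per run: label, duration, and slowdown relative to the fastest run.
    for name, secs in SECONDS.items():
        print(f"{name:<50} {fmt(secs):>7}  {secs / fastest:5.1f}x")
```

With these figures, the three calculated fields account for 18m50s of the first run, and the dataflow to the warehouse without calculations takes 26m47s longer than the dataflow to the lakehouse with staging disabled.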
Am I missing something? Are there additional guidelines in relation to this?
Kind Regards,
Dennes
Hello @DennesTorres ,
Thanks for using Fabric Community.
At this time, we are reaching out to the internal team to get some help on this.
We will update you once we hear back from them.