Is there any difference in performance between the following two scenarios?
1- Running a pipeline every 15 minutes that reads data from a lakehouse table and writes it into a warehouse table, using a Copy data activity with the upsert write method (the "EventID" column is used as the upsert key).
2- Triggering a notebook every 15 minutes that performs the same upsert using the MERGE command.
Thanks
Hi @Hamidr,
This really depends on the type of workload and how complex your logic is.
Both options will get the job done, but they behave a bit differently in practice.
Thanks,
prashanth
Hi @Hamidr,
Generally speaking, the less code you write the more expensive the operation.
In an ideal world, a notebook will be more efficient than a copy activity, but that depends on writing efficient code and having a properly sized Spark cluster, or, for small workloads, dropping Spark entirely and using a pure Python notebook.
If you're skilled at Python, then use a notebook. If you're not, stick with the pipeline and copy activity.
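To make the comparison concrete, here is a minimal pure Python sketch of the upsert semantics both options implement when keyed on "EventID": rows whose key already exists are updated, and the rest are inserted. It is only an illustration of the logic, not Fabric's copy activity or the warehouse MERGE itself; the function name `upsert_rows`, the in-memory dict standing in for the warehouse table, and the `Status` column are all made up for the example.

```python
from typing import Dict, Iterable, Tuple

Row = Dict[str, object]


def upsert_rows(
    target: Dict[object, Row],
    source: Iterable[Row],
    key: str = "EventID",
) -> Tuple[int, int]:
    """Merge source rows into target in place, matching on `key`.

    Returns a tuple of (inserted, updated) row counts.
    """
    inserted = 0
    updated = 0
    for row in source:
        row_key = row[key]
        if row_key in target:
            # Matched: overwrite the existing columns with the incoming values.
            target[row_key] = {**target[row_key], **row}
            updated += 1
        else:
            # Not matched: insert a copy of the incoming row.
            target[row_key] = dict(row)
            inserted += 1
    return inserted, updated


# Hypothetical data: a dict stands in for the warehouse table.
warehouse = {1: {"EventID": 1, "Status": "open"}}
batch = [
    {"EventID": 1, "Status": "closed"},  # existing key, so it is updated
    {"EventID": 2, "Status": "open"},    # new key, so it is inserted
]

counts = upsert_rows(warehouse, batch)
print(counts)  # (1, 1)
```

In a real notebook the same matched/not-matched split is what the MERGE command expresses, so the cost difference between the two options comes mostly from the engine and cluster startup rather than from the logic itself.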