Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Get Fabric Certified for FREE during AI Skills Fest. This week only. Secure your voucher now.

Reply
Hamidr
Frequent Visitor

Upsert to Warehouse using copy activity vs notebook

Is there any differences in terms of performance between following two scenarios : 
1- Running a pipeline every 15 minutes to read data from a lakehouse table and write into a warehouse table with using upsert write method with using copy data activity (there is a column "EventID" to use for upsert)

2- Every 15 minutes trigger a notebook which is doing same upsert action with using merge command.

Thanks

1 ACCEPTED SOLUTION
v-prasare
Community Support
Community Support

Hi @Hamidr,

this really depends on the type of workload and how complex your logic is.

Both options will get the job done, but they behave a bit differently in practice.

 Using Copy Activity (Upsert)

  • This approach relies on Fabric’s built-in data movement engine, which is generally optimized for performance.
  • It works best when your logic is straightforward, such as a simple key-based upsert using a column like  EventID.
  • Since it has less execution overhead, it’s usually a good fit for frequent runs (like every 15 minutes).

Using Notebook with MERGE

  • A notebook gives you more flexibility, especially if your upsert logic involves additional conditions or transformations.
  • You can fully control how records are inserted or updated using SQL MERGE.
  • That said, it may introduce a bit more overhead due to compute and execution setup.

 

 

Thanks,
prashanth

View solution in original post

2 REPLIES 2
tayloramy
Super User
Super User

Hi @Hamidr

 

Generally speaking, the less code you write the more expensive the operation. 

 

Notebooks will be more efficient than a copy activity in an ideal world, but that depends on writing efficient code and having a properly sized spark cluster, or for small workloads dropping spark entirely and using pure python notebooks. 

 

If you're skilled at python, then use a notebook. If you're not, stick with the pipeline and copy activity. 





If you found this helpful, consider giving some Kudos.
If I answered your question or solved your problem, mark this post as the solution!

Join the Fabric Discord!

Proud to be a Super User!





v-prasare
Community Support
Community Support

Hi @Hamidr,

this really depends on the type of workload and how complex your logic is.

Both options will get the job done, but they behave a bit differently in practice.

 Using Copy Activity (Upsert)

  • This approach relies on Fabric’s built-in data movement engine, which is generally optimized for performance.
  • It works best when your logic is straightforward, such as a simple key-based upsert using a column like  EventID.
  • Since it has less execution overhead, it’s usually a good fit for frequent runs (like every 15 minutes).

Using Notebook with MERGE

  • A notebook gives you more flexibility, especially if your upsert logic involves additional conditions or transformations.
  • You can fully control how records are inserted or updated using SQL MERGE.
  • That said, it may introduce a bit more overhead due to compute and execution setup.

 

 

Thanks,
prashanth

Helpful resources

Announcements
June Fabric Update Carousel

Fabric Monthly Update - June 2026

Check out the June 2026 Fabric update to learn about new features.

Fabric SQL PBI Data Days

Data Days 2026 coming soon!

Sign up to receive a private message when registration opens and key events begin.

New to Fabric survey Carousel

New to Fabric Survey

If you have recently started exploring Fabric, we'd love to hear how it's going. Your feedback can help with product improvements.