Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Data Days is here! Join us now for 60+ days of learning, challenges, and connection. Learn more

Reply
Hamidr
Frequent Visitor

Upsert to Warehouse using copy activity vs notebook

Is there any differences in terms of performance between following two scenarios : 
1- Running a pipeline every 15 minutes to read data from a lakehouse table and write into a warehouse table with using upsert write method with using copy data activity (there is a column "EventID" to use for upsert)

2- Every 15 minutes trigger a notebook which is doing same upsert action with using merge command.

Thanks

1 ACCEPTED SOLUTION
v-prasare
Community Support
Community Support

Hi @Hamidr,

this really depends on the type of workload and how complex your logic is.

Both options will get the job done, but they behave a bit differently in practice.

 Using Copy Activity (Upsert)

  • This approach relies on Fabric’s built-in data movement engine, which is generally optimized for performance.
  • It works best when your logic is straightforward, such as a simple key-based upsert using a column like  EventID.
  • Since it has less execution overhead, it’s usually a good fit for frequent runs (like every 15 minutes).

Using Notebook with MERGE

  • A notebook gives you more flexibility, especially if your upsert logic involves additional conditions or transformations.
  • You can fully control how records are inserted or updated using SQL MERGE.
  • That said, it may introduce a bit more overhead due to compute and execution setup.

 

 

Thanks,
prashanth

View solution in original post

2 REPLIES 2
tayloramy
Super User
Super User

Hi @Hamidr

 

Generally speaking, the less code you write the more expensive the operation. 

 

Notebooks will be more efficient than a copy activity in an ideal world, but that depends on writing efficient code and having a properly sized spark cluster, or for small workloads dropping spark entirely and using pure python notebooks. 

 

If you're skilled at python, then use a notebook. If you're not, stick with the pipeline and copy activity. 





If you found this helpful, consider giving some Kudos.
If I answered your question or solved your problem, mark this post as the solution!

Join the Fabric Discord!

Proud to be a Super User!





v-prasare
Community Support
Community Support

Hi @Hamidr,

this really depends on the type of workload and how complex your logic is.

Both options will get the job done, but they behave a bit differently in practice.

 Using Copy Activity (Upsert)

  • This approach relies on Fabric’s built-in data movement engine, which is generally optimized for performance.
  • It works best when your logic is straightforward, such as a simple key-based upsert using a column like  EventID.
  • Since it has less execution overhead, it’s usually a good fit for frequent runs (like every 15 minutes).

Using Notebook with MERGE

  • A notebook gives you more flexibility, especially if your upsert logic involves additional conditions or transformations.
  • You can fully control how records are inserted or updated using SQL MERGE.
  • That said, it may introduce a bit more overhead due to compute and execution setup.

 

 

Thanks,
prashanth

Helpful resources

Announcements
Fabric Data Days is here Carousel

Fabric Data Days 2026

Don't miss out on Data Days, June 15 through August 7. Learn Fabric, Power BI, SQL, AI and more.

June Fabric Update Carousel

Fabric Monthly Update - June 2026

Check out the June 2026 Fabric update to learn about new features.