Mathan1787
Helper I

Upsert in copy activity

Hi,

 

Could you let me know how I can do an upsert in the copy activity of a Data pipeline, so that I can load data from an Azure SQL table into a lakehouse and store it as a Delta table? Or is there a way to do something similar with Dataflow Gen2?

4 REPLIES
carloseparra
Frequent Visitor

That screenshot is very misleading, Dennes. That is Azure Data Factory; we need Fabric.

Mathan1787
Helper I

I am planning to handle the upsert with a notebook, as copying the entire 2 billion records would be very costly. It would be nice if there were incremental loading of the data, or CDC.
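
For reference, a minimal sketch of what that notebook upsert could look like with the Delta Lake merge API in a Fabric notebook. The table paths ("Tables/staging_orders", "Tables/orders") and the key column ("OrderId") are placeholders, not names from this thread:

```python
from delta.tables import DeltaTable

# "spark" is the session a Fabric notebook provides.
# Hypothetical names: the staged incremental batch and the target Delta table.
updates_df = spark.read.format("delta").load("Tables/staging_orders")
target = DeltaTable.forPath(spark, "Tables/orders")

# Upsert: update rows whose key already exists, insert the rest.
(target.alias("t")
    .merge(updates_df.alias("s"), "t.OrderId = s.OrderId")
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute())
```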

DennesTorres
Impactful Individual

Hi,

If you can identify an "UpdatedDate" and/or "InsertedDate" column, or an ascending field such as a date or an ascending numeric key, you can build an incremental load based on them and still keep an image of the records you load every day.
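
As an illustration, a watermark-based incremental read from Azure SQL could look roughly like this. The control table "load_control" and the column "UpdatedDate" are assumptions for the sketch, not part of the original thread:

```python
# Fetch the last successful watermark from a small control table
# (assumed to exist in the lakehouse for this sketch).
last_watermark = spark.sql(
    "SELECT MAX(watermark_value) AS wm FROM load_control WHERE table_name = 'orders'"
).collect()[0]["wm"]

# Read only the rows changed since the last load; authentication
# options are omitted for brevity.
incremental_df = (spark.read.format("jdbc")
    .option("url", "jdbc:sqlserver://<server>.database.windows.net;database=<db>")
    .option("query", f"SELECT * FROM dbo.Orders WHERE UpdatedDate > '{last_watermark}'")
    .load())
```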

Keeping this image is important; if you really can't find any solution other than the upsert, you lose this image. But yes, I understand that at this volume, if an incremental load is not possible, the upsert may at least be needed.

Kind Regards,

Dennes

DennesTorres
Impactful Individual

Hi!

Some connectors allow an upsert, as you can see in the image below. I don't remember whether it's available for Delta in pipelines.

However, it makes sense to do it in dataflows (although, of course, every situation is unique).

Usually, in the pipeline, you are bringing external (or "external") data into your lake. I believe it's good practice to keep each load as an individual set, in a specific folder in the initial layers. For example, every day you load a set of records; in the first layer of the lake, you keep that original set together and leave the upsert for later.

That way, if you need to find out "where/when was this loaded from?" or "when did this value become this?", you can trace the information back to the specific date it was loaded into the lake.

So you can just use a copy activity in the pipeline, creating a new folder each day, and perform the upsert later, from those daily folders into the table itself, using a dataflow.
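
Dennes suggests a dataflow for the later upsert; for completeness, a notebook-based sketch of the same daily-folder pattern might look like this. The folder layout "Files/raw/orders/<date>/", the table "orders", and the key "OrderId" are all made-up placeholders:

```python
from datetime import date

# Hypothetical layout: the copy activity lands each day's extract under
# Files/raw/orders/<yyyy-mm-dd>/ in the lakehouse.
load_date = date.today().isoformat()
spark.read.parquet(f"Files/raw/orders/{load_date}/") \
     .createOrReplaceTempView("daily_orders")

# The same upsert expressed in Spark SQL, merging one day's folder
# into the Delta table.
spark.sql("""
    MERGE INTO orders AS t
    USING daily_orders AS s
    ON t.OrderId = s.OrderId
    WHEN MATCHED THEN UPDATE SET *
    WHEN NOT MATCHED THEN INSERT *
""")
```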


[Screenshot: copy activity sink settings with the Upsert option (from Azure Data Factory)]

 

Kind Regards,

 

Dennes
