HamidBee
Power Participant

Best Approaches to Avoid Duplicates When Using Copy Activity to Load Data into an Existing Table

Hi All,

 

I'm using the Copy Activity in Microsoft Fabric to load data into an existing table, but I’m concerned about the risk of duplicate entries. For example, if the copy operation is interrupted and restarted, or if I unintentionally include data from previous periods, it could lead to duplicate records in my target table.

 

What are some recommended settings or strategies to ensure only unique records are loaded, even when there’s a chance of re-running the data or having overlapping entries from a prior load? Any insights on best practices for managing this kind of scenario would be really helpful.

1 ACCEPTED SOLUTION
v-yilong-msft
Community Support

Hi @HamidBee ,

Here are some of my personal thoughts on your question:

 

1. Use an upsert instead of a plain insert. An upsert combines the insert and update operations: it checks whether a record already exists in the target table based on a unique key, updates the record if it does, and inserts a new one if it does not. Re-running a load then overwrites existing rows rather than duplicating them. Note that Mapping Data Flow is an Azure Data Factory feature; in Microsoft Fabric the corresponding tool is Dataflow Gen2, which is what the guide below covers.

You can look at this document:

A guide to Fabric Dataflows for Azure Data Factory Mapping Data Flow users - Microsoft Fabric | Micr...
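
To make the upsert behaviour concrete, here is a minimal sketch in Python. It is not Fabric code: it uses an in-memory SQLite database as a stand-in for the target table, and the `orders` table, its columns, and the helper names are hypothetical. It only shows why re-running the same batch, or loading an overlapping one, does not create duplicates when the load is keyed on a unique column.

```python
import sqlite3


def upsert_rows(conn, rows):
    """Insert each (order_id, amount) row, or update amount if order_id already exists."""
    conn.executemany(
        """
        INSERT INTO orders (order_id, amount)
        VALUES (?, ?)
        ON CONFLICT(order_id) DO UPDATE SET amount = excluded.amount
        """,
        rows,
    )
    conn.commit()


def demo_upsert():
    # Hypothetical target table; order_id plays the role of the unique key.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, amount REAL)")

    batch = [(1, 10.0), (2, 20.0)]
    upsert_rows(conn, batch)
    upsert_rows(conn, batch)                   # simulate an interrupted load being re-run
    upsert_rows(conn, [(2, 25.0), (3, 30.0)])  # overlapping data from a later period

    rows = conn.execute(
        "SELECT order_id, amount FROM orders ORDER BY order_id"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(demo_upsert())
```

The table ends up with one row per key, and the overlapping row for key 2 carries the latest amount. The same idea applies whatever tool performs the load, as long as the target has a reliable unique key to match on.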

 

2. Implement an incremental load strategy where only new or changed records are loaded. This can be achieved by maintaining a watermark, such as the latest timestamp loaded so far, and filtering the source on it at the start of each run.

You can take a look at this document:

Pattern to incrementally amass data with Dataflow Gen2 - Microsoft Fabric | Microsoft Learn
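
The watermark idea can be sketched in a few lines of Python. This is only an illustration under assumptions: the rows are plain dictionaries, `modified_at` is a hypothetical timestamp column, and `incremental_load` is a made-up helper rather than anything provided by Fabric.

```python
from datetime import datetime


def incremental_load(source_rows, target_rows, watermark):
    """Append source rows newer than the watermark and return the updated watermark."""
    new_rows = [r for r in source_rows if r["modified_at"] > watermark]
    target_rows.extend(new_rows)
    if new_rows:
        # Advance the watermark to the newest timestamp just loaded.
        watermark = max(r["modified_at"] for r in new_rows)
    return watermark


def demo_incremental():
    source = [
        {"id": 1, "modified_at": datetime(2025, 1, 1)},
        {"id": 2, "modified_at": datetime(2025, 1, 2)},
    ]
    target = []
    watermark = datetime.min  # nothing loaded yet

    watermark = incremental_load(source, target, watermark)
    watermark = incremental_load(source, target, watermark)  # re-run: nothing new

    source.append({"id": 3, "modified_at": datetime(2025, 1, 3)})
    watermark = incremental_load(source, target, watermark)  # only id 3 is loaded

    return [r["id"] for r in target], watermark


if __name__ == "__main__":
    print(demo_incremental())
```

One caveat with the strict `>` comparison: a row that arrives later with a timestamp equal to the stored watermark would be skipped. Using `>=` avoids that but can reload rows, so it is usually paired with an upsert on a unique key.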

 

 

Best Regards

Yilong Zhou

If this post helps, please consider accepting it as the solution so that other members can find it more quickly.


Thanks for sharing.
