Hi All,
I'm using the Copy Activity in Microsoft Fabric to load data into an existing table, but I’m concerned about the risk of duplicate entries. For example, if the copy operation is interrupted and restarted, or if I unintentionally include data from previous periods, it could lead to duplicate records in my target table.
What are some recommended settings or strategies to ensure only unique records are loaded, even when there’s a chance of re-running the data or having overlapping entries from a prior load? Any insights on best practices for managing this kind of scenario would be really helpful.
Hi @HamidBee ,
Here are some of my personal thoughts on your question:
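1. Load with an upsert instead of a plain insert wherever your destination supports it. Upsert combines the insert and update operations: it checks whether a record already exists in the target table based on a unique key. If it exists, it updates the record; if not, it inserts a new one. Note that Mapping Data Flow is an Azure Data Factory feature and its Fabric counterpart is Dataflow Gen2, so in Fabric you would check whether the Copy Activity's destination offers an upsert write option, or otherwise land the data in a staging table and merge it into the target on the key.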
You can look at this document below:
2. Implement an incremental load strategy where only new or changed records are loaded. This can be achieved by maintaining a watermark or timestamp column to track the last loaded record.
You can take a look at this document:
Pattern to incrementally amass data with Dataflow Gen2 - Microsoft Fabric | Microsoft Learn
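To make the two ideas above concrete, here is a minimal sketch of how a watermark filter and a keyed upsert work together. It is plain Python over an in-memory dictionary, not Fabric code: the function name `incremental_upsert` and the column names `id` and `modified_at` are assumptions made up for illustration. In Fabric the same logic would more likely be a watermark lookup followed by an upsert or merge step in the pipeline.

```python
from datetime import datetime


def incremental_upsert(target, source_rows, watermark, key="id", ts="modified_at"):
    """Upsert rows newer than `watermark` into `target` (a dict keyed by `key`).

    Returns the new watermark, i.e. the latest timestamp seen so far.
    All names here are hypothetical and only illustrate the pattern.
    """
    new_watermark = watermark
    for row in source_rows:
        if row[ts] <= watermark:
            continue  # incremental filter: this row was covered by an earlier load
        target[row[key]] = row  # upsert: insert a new key, overwrite an existing one
        new_watermark = max(new_watermark, row[ts])
    return new_watermark


target = {}
batch = [
    {"id": 1, "amount": 10, "modified_at": datetime(2025, 1, 1)},
    {"id": 2, "amount": 20, "modified_at": datetime(2025, 1, 2)},
]

# First load: both rows are new.
wm = incremental_upsert(target, batch, datetime.min)

# Re-running the same batch with the saved watermark loads nothing.
wm = incremental_upsert(target, batch, wm)

# Simulated interrupted run: the watermark was never saved, so the batch is
# replayed from the start. The key-based upsert still prevents duplicates.
incremental_upsert(target, batch, datetime.min)
rows_after_replay = len(target)

# Overlapping batch: one old row, one update to id 2, one new row.
overlap = [
    batch[1],
    {"id": 2, "amount": 25, "modified_at": datetime(2025, 1, 3)},
    {"id": 3, "amount": 30, "modified_at": datetime(2025, 1, 3)},
]
wm = incremental_upsert(target, overlap, wm)

print(rows_after_replay, len(target), target[2]["amount"])  # 2 3 25
```

The watermark keeps each run from re-reading old periods, while the key-based upsert is the safety net for the cases in the question: a restart after an interruption or an accidental overlap with a prior load.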
Best Regards
Yilong Zhou
If this post helps, please consider accepting it as the solution to help other members find it more quickly.
Thanks for sharing.