MisterSmith
Helper I

Constraints on delta parquet file

All

Is it possible to put constraints on a delta parquet file? I'm experiencing scenarios where data is being duplicated. On SQL Server I would use a primary key so that a duplicate raises an error. I have adapted my code to try to prevent duplication, but in some cases the duplicates come from incorrect data in the source system. I don't want to hide this; I want the error to surface so it can be fixed in the source system. So, can I apply a constraint to a parquet file, or do I have to manage this with code?

 

Thanks

1 ACCEPTED SOLUTION
Anonymous
Not applicable

Hi @MisterSmith ,

 

You can use Delta Lake's merge operation to de-duplicate data as it is written. The operation merges new data into an existing Delta table and lets you specify, through a match condition, how rows that already exist in the table are handled.


You can try SQL like the following:

MERGE INTO logs
USING newDedupedLogs
ON logs.uniqueId = newDedupedLogs.uniqueId
WHEN NOT MATCHED
  THEN INSERT *
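
Note that the MERGE above inserts only unmatched rows, so an incoming row whose uniqueId already exists in logs is skipped silently rather than reported. If the goal is to surface bad source data instead of hiding it, one option is to check the incoming batch before merging. The following is a minimal sketch, not a definitive implementation: newLogs is a hypothetical name for the raw incoming batch before any de-duplication (it does not appear in the original post), while logs and uniqueId are taken from the MERGE above.

```sql
-- Assumption: newLogs is a hypothetical table or view holding the raw
-- incoming batch; logs and uniqueId come from the MERGE statement above.

-- 1. Keys that occur more than once within the incoming batch.
SELECT uniqueId, COUNT(*) AS occurrences
FROM newLogs
GROUP BY uniqueId
HAVING COUNT(*) > 1;

-- 2. Incoming keys that already exist in the target table.
SELECT DISTINCT n.uniqueId
FROM newLogs AS n
INNER JOIN logs AS l
  ON l.uniqueId = n.uniqueId;
```

If either query returns rows, the calling notebook or pipeline can fail the load and report the offending keys, rather than running the MERGE and dropping them quietly.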

 

For more details, refer to the following document:

Upsert into a Delta Lake table using merge - Azure Databricks | Microsoft Learn

 

Best Regards,
Adamk Kong

 

If this post helps, please consider accepting it as the solution so that other members can find it more quickly.

 


