Solved: Copy data into lakehouse and warehouse - append or...

DuncanJ · ‎07-14-2023

When creating a data pipeline that copies data into a lakehouse, I can choose whether to Append or Overwrite data. If I choose to Overwrite, repeated runs of the pipeline do not create duplicate records.

However, there is no option to Append or Overwrite when creating a data pipeline that copies data into a warehouse. As a result, repeated runs of the pipeline create duplicate records in the warehouse.

Shouldn't the option to Append or Overwrite be present in the warehouse experience too?

Or have I misunderstood the intent? Any wisdom appreciated.

Thank you

Duncan

Majid_ · ‎07-14-2023

Duncan,

I agree with you ) That's why it is in preview mode for now.

I hope the Microsoft team sees your post and fixes this small but very important issue! !

To be honest, I didn't notice it before.)

View solution in original post

NR85 · ‎10-18-2024

You could add a TRUNCATE TABLE (TRUNCATE TABLE (Transact-SQL) - SQL Server | Microsoft Learn) statement to the Pre-copy script in the Advanced drop-down of the Destination tab on the Copy data activity. This would remove all rows from the table before writing the new data which would be functionally similar to using an overwrite operation.

Anonymous · ‎01-09-2025

This worked perfectly for me, had the same problem, I even thought i'd have to create a dataflow for a simple copy and replace job from an excel file to a warehouse table

DuncanJ · ‎07-15-2023

Continuing to explore this issue by reading

Tables in data warehousing in Microsoft Fabric

Primary keys, foreign keys, and unique keys in Warehouse in Microsoft Fabric

I thought it may be that the copy data into warehouse activity would automatically upsert based on keys in the target, but multiple runs of the pipeline result in duplicated data. May be because the constraints have to be declared as NOT ENFORCED. Makes sense, becaue the Limitations say unique indexes are not currently supported.

Perhaps when the need to declare a primary key as NOT ENFORCED is removed in due course, upsert will happen automatically.

In the meantime, using the copy activity into a lakehouse works as expected. And I can PBI off that.

So much to learn!

DuncanJ · ‎07-15-2023

Thank you Majid!

Majid_ · ‎07-15-2023

You welcome, Duncan !

Majid_ · ‎07-14-2023

Hi Duncan !

Yes you absolutely right. There is no option to overwrite data like in Data Lakehouse.

In alternative way you can use staging tables to do all transformations.

DuncanJ · ‎07-14-2023

Thank you Majid.

I don't understand why there is no option to overwrite data though. If it makes sense to have the option with a lakehouse, doesn't it makse sense to have the option with a warehouse? Then it would be trivial to keep the data in sync between e.g. Azure SQL Databases and a warehouse.

In other words, I think it would greaet to have the copy data activity be consistent irrespective of destination ... lakehouse or warehouse ... or am I completely missing the basic concept here? 🙂

Thanks

Duncan

WLY168 · ‎03-13-2025

My work around is to use the "pre-copy script". Since my pipeline activity copies multiple tables at the same time, the following works nicely. A truncate might also work.

@concat('delete from dbo.', item().destination.table )

Majid_ · ‎07-14-2023

Duncan,

I agree with you ) That's why it is in preview mode for now.

I hope the Microsoft team sees your post and fixes this small but very important issue! !

To be honest, I didn't notice it before.)

Copy data into lakehouse and warehouse - append or overwrite

Helpful resources

Join our Fabric User Panel

Fabric Monthly Update - May 2025

Fabric Community Update - June 2025

Join the #PBI10 DataViz contest

Copy data into lakehouse and warehouse - append or overwrite

Helpful resources

Join our Fabric User Panel

Fabric Monthly Update - May 2025

Fabric Community Update - June 2025