Jerome22
Resolver I

Updating records using Copy job?

Hi,

 

When will Copy job support updates, and not just adding rows?

I have a lot of tables whose records can be updated, so I want to update my destination, not just add new rows.

 

thanks.


11 REPLIES
smeetsh
Helper V

We often do a clear-out of tables, or of certain rows in tables, but we have a few datasets that are so large that it slows things down.
This can be done with just pipeline activities (Copy and Script activities); notebooks are not always needed. I would only use a notebook when I need to do something with a lakehouse. A rough sketch of the kind of clean-up a Script activity can run is below.
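For illustration only, a minimal sketch of the sort of clean-up statement a pipeline Script activity could run before the Copy activity; the names dbo.StagingTable and load_date are hypothetical:

-- Clear out rows older than 30 days before the copy runs
-- (dbo.StagingTable and load_date are made-up names)
DELETE FROM dbo.StagingTable
WHERE load_date < DATEADD(day, -30, GETDATE());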

@smeetsh We always use notebooks, with no issues. Instead of Data Factory, we use notebooks wherever we can. Databricks implemented Delta Lake, and Microsoft follows Databricks. (Delta Lake is based on Python and Scala.)

Thanks & Regards,
Bhavesh

Love the Self Service BI.
Please use the 'Mark as answer' link to mark a post that answers your question. If you find a reply helpful, please remember to give Kudos.

I could be wrong here, but I don't think a notebook can write to a warehouse, which for us at least would be the issue, and it may be for the topic starter as well.

BhaveshPatel
Community Champion

With copy jobs in Data Factory there is no need for updates. Overwrite wherever possible (data lake and data lakehouse; Apache Spark and Delta Lake), use Merge in Microsoft Fabric (a sketch follows at the end of this post), or use incremental refresh with the code below (Append in Microsoft Fabric).

 

# Variables
LakehouseName = "Demo"
TableName = "DimTables"
ColName = "Surrogate_Key_Demo"  # must be an integer column for this to work
NumberOfRowsToRemove = 10

# Work out the reload boundary: the current maximum key minus the window size
Reload = spark.sql(
    "SELECT MAX({0}) - {1} AS ReLoadValue FROM {2}.{3}".format(
        ColName, NumberOfRowsToRemove, LakehouseName, TableName
    )
).collect()
Reload = Reload[0].ReLoadValue
display(Reload)

# Remove the tail of the table so those rows can be re-appended incrementally
spark.sql(
    "DELETE FROM {0}.{1} WHERE {2} > {3}".format(
        LakehouseName, TableName, ColName, Reload
    )
)
 
So it means you should always use notebooks (the concept of the data lake and data lakehouse).
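As a minimal sketch of the Merge option mentioned above, run as Spark SQL from a notebook against a Delta table in a lakehouse; the staging table Demo.DimTables_Staging is a hypothetical name for wherever the changed rows land:

-- Upsert staged rows into the target Delta table
-- (Demo.DimTables_Staging is a made-up staging table)
MERGE INTO Demo.DimTables AS target
USING Demo.DimTables_Staging AS source
  ON target.Surrogate_Key_Demo = source.Surrogate_Key_Demo
WHEN MATCHED THEN UPDATE SET *   -- overwrite existing rows with staged values
WHEN NOT MATCHED THEN INSERT *;  -- append rows that are new to the target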

 

Thanks & Regards,
Bhavesh

Love the Self Service BI.
Please use the 'Mark as answer' link to mark a post that answers your question. If you find a reply helpful, please remember to give Kudos.
Srisakthi
Super User

Hi @Jerome22 ,

 

Copy job now supports Upsert!

There has been an update; please check this:

https://blog.fabric.microsoft.com/en-in/blog/simplifying-data-ingestion-with-copy-job-incremental-co...

 

 

Regards,

Srisakthi

ACCEPTED SOLUTION

OK... unfortunately it doesn't support deletes.

And the update appears to update all the rows matching the ID, with no option to select a hash or timestamp column.

Well... I'll stick with SQL statements for now. They're still faster, with more control, than relying on external tools like the Copy job.

 

What you could do is incorporate a WHERE clause in the update statement, where you do something like:

WHERE hash IN (SELECT hash FROM TABLE) AND time_stamp_column > [your timestamp]

The above is rough and probably has some syntax errors, but I hope you get the gist of the logic behind it 🙂 A slightly fuller sketch is below.
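For illustration, a rough T-SQL sketch of that idea; every name here (DestinationTable, SourceTable, id, some_column, row_hash, time_stamp_column) is hypothetical:

-- Update only rows whose content hash changed and which are newer than the last load
UPDATE dest
SET    dest.some_column = src.some_column
FROM   DestinationTable AS dest
JOIN   SourceTable AS src
  ON   src.id = dest.id
WHERE  src.row_hash <> dest.row_hash          -- only rows whose content changed
  AND  src.time_stamp_column > '2025-01-01';  -- substitute your last-load timestamp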

Cheers

Hans

v-sdhruv
Community Support

Hi @Jerome22 ,
Hope you were able to get some clarity on the issue.
If any of the responses have addressed your query, kindly mark it as the accepted solution so that other members can find it easily.
Thank you

smeetsh
Helper V

There is no option for warehouse for this yet, though I did see an Upsert (preview) button appear on a lakehouse. The best way for now is to do it using SQL code, where you check for an existing record, which you then update (UPDATE TABLE SET column = ...), and when a record does not exist you do an insert. A sketch of that pattern is below.
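For illustration, a minimal T-SQL sketch of that update-then-insert pattern; the names dbo.DimCustomer, stg.DimCustomer, CustomerId, and Name are all hypothetical:

-- 1) Update records that already exist in the destination
UPDATE dest
SET    dest.Name = src.Name
FROM   dbo.DimCustomer AS dest
JOIN   stg.DimCustomer AS src
  ON   src.CustomerId = dest.CustomerId;

-- 2) Insert records that do not exist yet
INSERT INTO dbo.DimCustomer (CustomerId, Name)
SELECT src.CustomerId, src.Name
FROM   stg.DimCustomer AS src
WHERE  NOT EXISTS (
    SELECT 1
    FROM   dbo.DimCustomer AS d
    WHERE  d.CustomerId = src.CustomerId
);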

v-sdhruv
Community Support

Hi @Jerome22 ,
Hope you were able to get some clarity on the issue. Let us know if you need any help.
If the response has addressed your query, kindly mark it as the accepted solution so that other members can find it easily.
Thank you

v-sdhruv
Community Support

Hi @Jerome22 ,

Unfortunately, in Microsoft Fabric, the Copy Data activity does not support an "Upsert" behavior. You can, however, use workarounds like Dataflows or SQL logic.
You can refer-
upsert option missing in copy activity of microsoft fabric data pipeline

Hope this helps!
If the response has addressed your query, please accept it as a solution so other members can easily find it.
Thank You
