Jerome22
Resolver I

Updating records using Copy job?

Hi,

 

When will Copy job support updates, and not just adding rows?

I have a lot of tables whose records can be updated, so I want to update my destination, not just add new rows.

 

thanks.


11 REPLIES
smeetsh
Helper V

We often do a clear-out of tables, or of certain rows in tables, but we have a few datasets that are so large that it slows things down.
This can be done with just pipeline activities (Copy and Script activities); notebooks are not always needed. I would only use a notebook when I need to do something with a lakehouse. A rough sketch of the kind of clean-up a Script activity can run is below.
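For illustration only, a minimal sketch of the sort of clean-up statement a pipeline Script activity could run before the Copy activity; the names dbo.StagingTable and load_date are hypothetical:

-- Clear out rows older than 30 days before the copy runs
-- (dbo.StagingTable and load_date are made-up names)
DELETE FROM dbo.StagingTable
WHERE load_date < DATEADD(day, -30, GETDATE());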

@smeetsh We always use notebooks, with no issues. Instead of Data Factory, we use notebooks wherever we can. Databricks implemented Delta Lake, and Microsoft follows Databricks. (Delta Lake is based on Python and Scala.)

Thanks & Regards,
Bhavesh

Love the Self Service BI.
Please use the 'Mark as answer' link to mark a post that answers your question. If you find a reply helpful, please remember to give Kudos.

I could be wrong here, but I don't think a notebook can write to a warehouse, which for us at least would be the issue, and it may be for the topic starter as well.

BhaveshPatel
Community Champion

With copy jobs in Data Factory there is no need for updates. Overwrite wherever possible (data lake and data lakehouse; Apache Spark and Delta Lake), use Merge in Microsoft Fabric (a sketch follows at the end of this post), or use incremental refresh with the code below (Append in Microsoft Fabric).

 

# Variables
LakehouseName = "Demo"
TableName = "DimTables"
ColName = "Surrogate_Key_Demo"  # must be an integer column for this to work
NumberOfRowsToRemove = 10

# Work out the reload boundary: the current maximum key minus the window size
Reload = spark.sql(
    "SELECT MAX({0}) - {1} AS ReLoadValue FROM {2}.{3}".format(
        ColName, NumberOfRowsToRemove, LakehouseName, TableName
    )
).collect()
Reload = Reload[0].ReLoadValue
display(Reload)

# Remove the tail of the table so those rows can be re-appended incrementally
spark.sql(
    "DELETE FROM {0}.{1} WHERE {2} > {3}".format(
        LakehouseName, TableName, ColName, Reload
    )
)
 
So it means you should always use notebooks (the concept of the data lake and data lakehouse).
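As a minimal sketch of the Merge option mentioned above, run as Spark SQL from a notebook against a Delta table in a lakehouse; the staging table Demo.DimTables_Staging is a hypothetical name for wherever the changed rows land:

-- Upsert staged rows into the target Delta table
-- (Demo.DimTables_Staging is a made-up staging table)
MERGE INTO Demo.DimTables AS target
USING Demo.DimTables_Staging AS source
  ON target.Surrogate_Key_Demo = source.Surrogate_Key_Demo
WHEN MATCHED THEN UPDATE SET *   -- overwrite existing rows with staged values
WHEN NOT MATCHED THEN INSERT *;  -- append rows that are new to the target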

 

Thanks & Regards,
Bhavesh

Love the Self Service BI.
Please use the 'Mark as answer' link to mark a post that answers your question. If you find a reply helpful, please remember to give Kudos.
Srisakthi
Super User

Hi @Jerome22 ,

 

Copy job now supports Upsert!

There has been an update; please check this:

https://blog.fabric.microsoft.com/en-in/blog/simplifying-data-ingestion-with-copy-job-incremental-co...

 

 

Regards,

Srisakthi

ACCEPTED SOLUTION

OK... unfortunately it doesn't support deletes.

And the update appears to update all the rows matching the ID, with no option to select a hash or timestamp column.

Well... I'll stick with SQL statements for now. They're still faster, with more control, than relying on external tools like the Copy job.

 

What you could do is incorporate a WHERE clause in the update statement, where you do something like:

WHERE hash IN (SELECT hash FROM TABLE) AND time_stamp_column > [your timestamp]

The above is rough and probably has some syntax errors, but I hope you get the gist of the logic behind it 🙂 A slightly fuller sketch is below.
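For illustration, a rough T-SQL sketch of that idea; every name here (DestinationTable, SourceTable, id, some_column, row_hash, time_stamp_column) is hypothetical:

-- Update only rows whose content hash changed and which are newer than the last load
UPDATE dest
SET    dest.some_column = src.some_column
FROM   DestinationTable AS dest
JOIN   SourceTable AS src
  ON   src.id = dest.id
WHERE  src.row_hash <> dest.row_hash          -- only rows whose content changed
  AND  src.time_stamp_column > '2025-01-01';  -- substitute your last-load timestamp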

Cheers

Hans

v-sdhruv
Community Support

Hi @Jerome22 ,
Hope you were able to get some clarity on the issue.
If any of the responses have addressed your query, kindly mark it as the accepted solution so that other members can find it easily.
Thank you

smeetsh
Helper V

There is no option for warehouse for this yet, though I did see an Upsert (preview) button appear on a lakehouse. The best way for now is to do it using SQL code, where you check for an existing record, which you then update (UPDATE TABLE SET column = ...), and when a record does not exist you do an insert. A sketch of that pattern is below.
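For illustration, a minimal T-SQL sketch of that update-then-insert pattern; the names dbo.DimCustomer, stg.DimCustomer, CustomerId, and Name are all hypothetical:

-- 1) Update records that already exist in the destination
UPDATE dest
SET    dest.Name = src.Name
FROM   dbo.DimCustomer AS dest
JOIN   stg.DimCustomer AS src
  ON   src.CustomerId = dest.CustomerId;

-- 2) Insert records that do not exist yet
INSERT INTO dbo.DimCustomer (CustomerId, Name)
SELECT src.CustomerId, src.Name
FROM   stg.DimCustomer AS src
WHERE  NOT EXISTS (
    SELECT 1
    FROM   dbo.DimCustomer AS d
    WHERE  d.CustomerId = src.CustomerId
);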

v-sdhruv
Community Support

Hi @Jerome22 ,
Hope you were able to get some clarity on the issue. Let us know if you need any help.
If the response has addressed your query, kindly mark it as the accepted solution so that other members can find it easily.
Thank you

v-sdhruv
Community Support

Hi @Jerome22 ,

Unfortunately, in Microsoft Fabric, the Copy Data activity does not support an "Upsert" behavior. You can, however, use workarounds like Dataflows or SQL logic.
You can refer-
upsert option missing in copy activity of microsoft fabric data pipeline

Hope this helps!
If the response has addressed your query, please accept it as a solution so other members can easily find it.
Thank You
