Hello,
Currently I have two tables in a lakehouse and I am trying to complete some updates to the main table. Let's call them metadata and d_metadata.
Metadata contains all data to date and is the main table our teams use for reporting. d_metadata contains the data loaded from source for records that were updated in the previous day.
Both tables are currently in a lakehouse, and right now we are using a Dataflow Gen2 to load the rows from the metadata table that do not match any rows in the d_metadata table (joined on a unique id).
This returns the records from metadata that were not updated; I then append the updated records from d_metadata and push the final result back to the metadata table.
I would like to approach this differently by simply deleting the updated records from metadata and then appending d_metadata, to avoid doing a full overwrite of what is a very large table.
I cannot find a way to achieve this within a Fabric lakehouse, and other Spark SQL suggestions have not worked.
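Something along these lines is what I'm trying to achieve (a sketch only, I haven't got this working in my setup; it assumes both are Delta tables and the unique id column is called unique_id, which is a placeholder name):

```python
# Rough sketch of the delete-then-append I'm after.
# spark is the SparkSession that comes pre-defined in a Fabric notebook.

# 1. Delete the rows in metadata that have an updated version in d_metadata
spark.sql("""
    MERGE INTO metadata AS tgt
    USING d_metadata AS src
    ON tgt.unique_id = src.unique_id
    WHEN MATCHED THEN DELETE
""")

# 2. Append the updated records from d_metadata
spark.sql("INSERT INTO metadata SELECT * FROM d_metadata")
```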
Can someone help guide me in the right direction? Thanks in advance 🙂
Dataflows only offer two data destination operations: Append and Replace.
In that sense, they don't support any sort of delete operation or "upsert" to update existing records.
If your destination were a SQL database, you could leverage a combination of a Data Pipeline, a Dataflow Gen2, and a SQL stored procedure: load the data into a "staging table" in your SQL database, then run a stored procedure that takes that table and, based on your own logic, figures out which records to delete, update, or add to your main table.
I'm not sure how feasible that is with a Lakehouse as a destination, but you could try asking more questions specific to the Lakehouse in the Lakehouse community forum (link below).
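That said, if adding a notebook step to your pipeline is an option, the delete/update/insert logic can be expressed directly against the lakehouse table with a Delta Lake MERGE. A minimal sketch, assuming metadata and d_metadata are Delta tables and the key column is called unique_id (placeholder names based on your post):

```python
# Minimal sketch: single-statement upsert of d_metadata into metadata.
# Assumes both are Delta tables in the attached lakehouse, keyed on unique_id.
spark.sql("""
    MERGE INTO metadata AS tgt
    USING d_metadata AS src
    ON tgt.unique_id = src.unique_id
    WHEN MATCHED THEN UPDATE SET *      -- replace the old version of an updated record
    WHEN NOT MATCHED THEN INSERT *      -- add brand-new records
""")
```

Delta only rewrites the data files that contain matched rows, so this avoids a full overwrite of the large metadata table.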
Hi @elavery95 ,
Thanks for using Fabric Community.
Can you please check this: Pattern to incrementally amass data with Dataflow Gen2 - Microsoft Fabric | Microsoft Learn
I suggest you use incremental refresh rather than maintaining two tables.
FYI: We will have more features in upcoming releases.
Docs to refer to: What's new and planned for Data Factory in Microsoft Fabric - Microsoft Fabric | Microsoft Learn
Hope this is helpful. Please let me know in case of further queries.
Hi @elavery95 ,
We haven't heard from you since the last response and were just checking back to see if you have a resolution yet.
If you have found a resolution, please do share it with the community, as it can be helpful to others.
Otherwise, please respond with more details and we will try to help.
Hi @elavery95 ,
We haven't heard from you since the last response and were just checking back to see if your query was answered.
Otherwise, please respond with more details and we will try to help.
Thanks