𝗠𝗶𝗰𝗿𝗼𝘀𝗼𝗳𝘁 𝗙𝗮𝗯𝗿𝗶𝗰 𝗖𝗼𝗽𝘆 𝗔𝗰𝘁𝗶𝘃𝗶𝘁𝘆 + 𝗗𝗲𝗹𝘁𝗮 𝗨𝗽𝘀𝗲𝗿𝘁 is honestly one of the most underrated features in Fabric for high-performance operational data loading.
We can now:
• Read directly from sources such as SQL Server / SQL DB • Write directly into Lakehouse Delta Tables • Use native Upsert mode • Avoid Spark notebooks completely • Avoid Parquet staging/intermediate layers • Achieve very fast sync performance with only ONE Copy Activity
I’m currently using this approach to dynamically load and merge data from multiple SQL Server store databases into consolidated Delta tables in Microsoft Fabric Lakehouse.
However, there is one very important feature still missing from Copy Activity Upsert.
𝗜 𝘄𝗼𝘂𝗹𝗱 𝗹𝗼𝘃𝗲 𝘁𝗼 𝘀𝗲𝗲 𝗻𝗮𝘁𝗶𝘃𝗲 𝘀𝘂𝗽𝗽𝗼𝗿𝘁 𝗳𝗼𝗿 𝗮𝘂𝘁𝗼-𝗺𝗮𝗻𝗮𝗴𝗲𝗱 𝗮𝘂𝗱𝗶𝘁 / 𝗗𝗪 𝗰𝗼𝗹𝘂𝗺𝗻𝘀 during Upsert operations.
My requirement is:
In the destination Delta table, I need these columns:
• DW_DATE_INSERT → Datetime when a NEW record is inserted through Upsert
• DW_DATE_UPDATE → Datetime when an EXISTING record is updated through Upsert
• DW_LOAD_DATE → Maximum of DW_DATE_INSERT and DW_DATE_UPDATE (meaning the latest insert/update datetime)
Expected behavior:
• If INSERT occurs:
DW_DATE_INSERT should be populated
DW_DATE_UPDATE should remain NULL
• If UPDATE occurs:
DW_DATE_UPDATE should be updated
DW_DATE_INSERT should remain unchanged with its original value
Currently, adding these columns dynamically in the Source → Additional Columns section is NOT a solution.
Why?
Because these datetime values change every execution, the Upsert process detects every row as changed and re-updates all records again and again, even when business data has not changed.
Of course, I know this can be handled using Spark / Notebook MERGE logic.
But the whole beauty of this approach is:
𝗡𝗼 𝗻𝗼𝘁𝗲𝗯𝗼𝗼𝗸𝘀. 𝗡𝗼 𝗦𝗽𝗮𝗿𝗸 𝘀𝘁𝗮𝗿𝘁𝘂𝗽 𝗹𝗮𝘁𝗲𝗻𝗰𝘆. 𝗡𝗼 𝗲𝘅𝘁𝗿𝗮 𝗺𝗲𝗿𝗴𝗲 𝗹𝗮𝘆𝗲𝗿. 𝗝𝘂𝘀𝘁 𝗼𝗻𝗲 𝗳𝗮𝘀𝘁 𝗖𝗼𝗽𝘆 𝗔𝗰𝘁𝗶𝘃𝗶𝘁𝘆.
It would be amazing if Microsoft could add native insert/update audit column handling directly inside Copy Activity Upsert for Delta Lakehouse tables.
This would make Copy Activity even more powerful for very fast enterprise operational sync workloads.
If you think this feature would be useful, please vote/support this idea so we can hopefully get it added to Fabric 🙌
#MicrosoftFabric #DataEngineering #DeltaLake #Lakehouse #ETL #ELT #OneLake #SQLServer #Fabric #Microsoft #AnalyticsEngineering
... View more