March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Early bird discount ends December 31.
Register NowBe one of the first to start using Fabric Databases. View on-demand sessions with database experts and the Microsoft product team to learn just how easy it is to get started. Watch now
Greetings, community. I have a pipeline I'm using to insert data into a data warehouse table. Since Fabric's DW doesn't yet provide an identity column, I want to add a unique identifier to the data. I know I can run a script after the Copy Activity completes that assigns a GUID, but is it possible to do this from within the Copy Activity itself so I don't have to first insert and then update?
Solved! Go to Solution.
@arpost - I had this same issue/requirement. I was able to figure out a way to do this though:
Stage and Prod Tables:
I setup a staging table to load the data I wanted to upsert. In the Prod Table I added what would typically have been my Identity column.
Stored Proc/SQL Query Load:
In the SQL Query that would load the data from Stage to Prod; I created a variable called "MaxKey". That would get the current maximum value currently in my identity column. That would be the starting point for any new records I needed to insert.
DECLARE @MaxIdentity bigint;
SET @MaxIdentity = (SELECT MAX(productrowkey) from dimProduct);
Then in my query to actual generate the rows to update you can use the ROW_Number() function:
SELECT
ROW_NUMBER() OVER (ORDER BY stg.ProductName ASC) + @MaxIdentity as productkey,
stg.ProductName
FROM stage_Product stg
LEFT JOIN dimProduct prd on stg.ProductName = stg.ProductName
WHERE prd.productkey is null
So far everything appears to be working as expected for me. The downside later is that once the Identity function is supported you will have to migrate data to a new table as I don't think you can convert an existing column to an Identity.
@arpost - I had this same issue/requirement. I was able to figure out a way to do this though:
Stage and Prod Tables:
I setup a staging table to load the data I wanted to upsert. In the Prod Table I added what would typically have been my Identity column.
Stored Proc/SQL Query Load:
In the SQL Query that would load the data from Stage to Prod; I created a variable called "MaxKey". That would get the current maximum value currently in my identity column. That would be the starting point for any new records I needed to insert.
DECLARE @MaxIdentity bigint;
SET @MaxIdentity = (SELECT MAX(productrowkey) from dimProduct);
Then in my query to actual generate the rows to update you can use the ROW_Number() function:
SELECT
ROW_NUMBER() OVER (ORDER BY stg.ProductName ASC) + @MaxIdentity as productkey,
stg.ProductName
FROM stage_Product stg
LEFT JOIN dimProduct prd on stg.ProductName = stg.ProductName
WHERE prd.productkey is null
So far everything appears to be working as expected for me. The downside later is that once the Identity function is supported you will have to migrate data to a new table as I don't think you can convert an existing column to an Identity.
No, unfortunately this is no such way inside Copy. But feel free to submit an idea on Fabric Ideas page.
Thank you @ajarora for your insights.
@arpost ,
Appreciate if you could share the feedback on our feedback channel . Which would be open for the user community to upvote & comment on. This allows our product teams to effectively prioritize your request against our existing feature backlog and gives insight into the potential impact of implementing the suggested feature.
Hope this helps. Please let me know if you have any further queries.
Hi @arpost ,
We haven’t heard from you on the last response and was just checking back to see if you have a resolution yet. Otherwise, will respond back with the more details and we will try to help .
March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount!
Arun Ulag shares exciting details about the Microsoft Fabric Conference 2025, which will be held in Las Vegas, NV.
User | Count |
---|---|
6 | |
2 | |
2 | |
1 | |
1 |