Solved: Re: Remove duplicate values in Fabric Data Warehou...

Yggdrasill · ‎11-03-2023

After successfully creating a pipeline in Data Factory that fetches data via a REST API into Azure SQL Database, I wanted to see whether I could do the same within Microsoft Fabric, using the new (Synapse) Data Warehouse feature.

All steps of my original pipeline work until the last step, where I call a script that removes duplicate rows from the SQL table.

I moved this to Fabric like so

The error I receive from Fabric is on the last step:
The query processor could not produce a query plan because the target DML table is not hash partitioned.

I know why this is happening, technically, but as things stand we can't add hashing to the SQL tables "stored in Fabric".

I've tried running scripts and stored procedures, and I've also created another table as an anchor to replicate the hashing method, but with no luck, so I'm asking the community for advice.

How do you remove duplicate rows in (Synapse) Data Warehouse within Fabric?

Yggdrasill · ‎11-05-2023

This SP would work if I wanted to keep all unique rows, but in my case I want to remove duplicates on my key column [id], keeping only the row with the highest value in [LastSyncedDate].

Your method will still return duplicate ids.

I "solved" this by creating a view in the data warehouse which removes the duplicates and then I just removed the last step of the pipeline and I query the view instead of the table...
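The view itself is not shown in the thread. A minimal sketch of what such a view could look like follows, assuming the table is dbo.MyTable with the columns [id] and [LastSyncedDate] mentioned above. The view name dbo.MyTable_Latest is hypothetical, and the column list is a placeholder to be extended with the table's real columns.

```sql
-- Hypothetical view name; dbo.MyTable, id and LastSyncedDate are taken from the thread.
CREATE VIEW dbo.MyTable_Latest
AS
SELECT
    id,
    LastSyncedDate          -- add the remaining columns of the table here
FROM (
    SELECT
        id,
        LastSyncedDate,
        -- Number the rows within each id, newest LastSyncedDate first.
        ROW_NUMBER() OVER (PARTITION BY id ORDER BY LastSyncedDate DESC) AS RowNum
    FROM dbo.MyTable
) AS ranked
WHERE RowNum = 1;           -- keep only the newest row per id
```

Because the view only reads from the table, it avoids the DELETE that triggers the query plan error. If two rows share the same id and the same LastSyncedDate, ROW_NUMBER still returns exactly one of them, but which one is not deterministic unless a tie-breaking column is added to the ORDER BY.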

Anonymous · ‎11-03-2023

Hi @Yggdrasill ,
Thanks for using Fabric Community.
Can you please explain how you are removing the duplicate rows? Are you using the DROP command?
If you are using the DROP command, currently this is not supported in Fabric Warehouse.

Hope this helps. Please let us know if you have any further questions.

Yggdrasill · ‎11-03-2023

I created a stored procedure

SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE PROCEDURE RemoveDuplicates
AS
BEGIN
    WITH CTE AS (
        SELECT
            *,
            ROW_NUMBER() OVER (PARTITION BY id ORDER BY LastDate DESC) AS RowNum
        FROM
            dbo.MyTable
    )

    DELETE FROM CTE
    WHERE RowNum > 1;
END;
GO

Anonymous · ‎11-03-2023

Hi @Yggdrasill ,
I tried to create a repro with a workaround that uses CTAS and the DISTINCT keyword in the stored procedure. I have attached the screenshots for your reference.

1) Created a stored procedure removeDuplicates.

2) The data in Allotment table is as follows:

3) Executed the stored procedure.

Try using this workaround in your stored procedure.

Hope this helps. Please let me know if you have any further questions.
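The screenshots from this reply did not survive, so the exact procedure is unknown. The following is only a sketch of the general CTAS-plus-DISTINCT approach described in the steps above, assuming a source table dbo.Allotment; the target table name dbo.Allotment_Distinct is hypothetical.

```sql
-- Sketch only: the original procedure body is not available in the thread.
CREATE PROCEDURE dbo.removeDuplicates
AS
BEGIN
    -- CTAS writes the de-duplicated rows into a new table
    -- instead of deleting rows from the existing one.
    -- dbo.Allotment_Distinct is a hypothetical name and must not already exist.
    CREATE TABLE dbo.Allotment_Distinct
    AS
    SELECT DISTINCT *
    FROM dbo.Allotment;
END;
```

DISTINCT compares entire rows, so this only collapses rows that are identical in every column. Rows that share an id but differ in any other column are all kept.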
