Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Duplicate found error when there is no duplicate

Hi,

 

I keep on getting the same errors:

Data source error: Column '[column name]' in Table '[table name]' contains a duplicate value '<pii>630872972</pii>' and this is not allowed for columns on the one side of a many-to-one relationship or for columns that are used as the primary key of a table.

 

There are two similar tables in this report, both linked to the main table through a unique reference number column. This error names either table as the problem, and it's not always the same value that is alledgedly duplicated. Both tables are made up of the URN column and a datetime column. They both come from a view in a postgresql database, and the view does not contain duplicates as it returns the max datetime for each URN. Nevertheless, I added a 'remove duplicates' step, just in case.

 

Neither table contains a duplicate, nor do they contain blank rows.

 

I also tried changing the relationship between those tables and the main one, which didn't make a difference. At the moment they are both set up as a one-to-one rather than many-to-one as stated in the error message, so I thought maybe it had to do with being the primary key. So I created an index column, duplicated the URN column, then deleted the original URN col, hoping it would pick up the index col as the primary key instead. Obviously didn't work.

 

I have tried deleting the queries and re-creating them, didn't work.

 

The last time I published it it refreshed fine, then the next scheduled refresh also worked, then the next 4 did not. I just did a manual refresh, also didn't work.

I forgot to mention, this is only in Service, in Desktop it refreshes fine and correctly finds no duplicates.

 

I don't know what else to do.

Thank you

 

 

 

 

Status: New
Comments
v-chuncz-msft
Community Support

@amyWMFS 

 

You may check if it is caused by incremental refresh.

Verify queries have an equal to (=) on either RangeStart or RangeEnd, but not both. If the equal to (=) exists on both parameters, a row could satisfy the conditions for two partitions, which could lead to duplicate data in the model.

amyWMFS
Helper I

@v-chuncz-msft 

 

Thank you. 

I always set my incremental refresh to have only one = because of that; I did check though just in case and both tables only have one =