Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
enzo_garcia20
Frequent Visitor

Incremental refresh failing due duplicate records

Hi,

 

I connected Power BI on many Azure SQL Server tables and created a model. Then I uploaded the model to the Premium capacity configured for incremental refresh. The first service refresh (wich load historic data) loads fine, but the next day incremental refresh is giving me an error of duplicate records on a table that is a dimension (and therefore has a lot of relationships 1 to *).

 

I've already duble checked the origin table and there isn't any duplicates records there. I checked my incremental rule and it's "cod_date < RangeEnd" too and I don't do any transformation on Power Query, I just conect to the table.

 

I wonder if it's something related with Power BI doesn't let us define a primary key when programing incremental refresh.

 

Any help will be nice.

 

Thanks!

 

1 ACCEPTED SOLUTION
v-eqin-msft
Community Support
Community Support

Hi @enzo_garcia20 ,

 

According to the official document, the error causud by the Post dates have changed.

 

Detail: With a refresh operation, only data that has changed at the data source is refreshed in the dataset. As the data is divided by a date, it’s recommended post (transaction) dates are not changed.

If a date is changed accidentally, then two issues can occur: Users notice some totals changed in the historical data (that is not supposed to happen), or during a refresh an error is returned indicating a unique value is not in fact unique. For the latter, this can happen when the table with incremental refresh configured is used in a 1:N relationship with another table as the 1 side and should have unique values. When the data is changed (for a specific ID), that ID then appears in another partition and the engine will detect the value is not unique.

 

Best Regards,
Eyelyn Qin
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

 

View solution in original post

5 REPLIES 5
v-eqin-msft
Community Support
Community Support

Hi @enzo_garcia20 ,

 

According to the official document, the error causud by the Post dates have changed.

 

Detail: With a refresh operation, only data that has changed at the data source is refreshed in the dataset. As the data is divided by a date, it’s recommended post (transaction) dates are not changed.

If a date is changed accidentally, then two issues can occur: Users notice some totals changed in the historical data (that is not supposed to happen), or during a refresh an error is returned indicating a unique value is not in fact unique. For the latter, this can happen when the table with incremental refresh configured is used in a 1:N relationship with another table as the 1 side and should have unique values. When the data is changed (for a specific ID), that ID then appears in another partition and the engine will detect the value is not unique.

 

Best Regards,
Eyelyn Qin
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

 

Hi,

 

Thanks for the answers guys and @v-eqin-msft.

 

I don't know if this is my case because none post dates were changed. But in the end, I rebuild my model from scratch and it worked flawless so maybe it was some kind of bug.

 

But I have one more question: in the documentation you sent it says "only data that has changed at the data source is refreshed in the dataset". How does Power BI know wich data changed or not if it doesn't ask me for a primary key when programing a refresh?

 

I programed my incremental refresh to evaluate two months back, so how does it know when to refresh an existing row ou add a new row?

 

If you can recommend me some documentation it will be awersome.

 

But thanks for all the support.

 

Best Regards,
Enzo.

 

 

@enzo_garcia20 PBI knows what data to refresh because it does a max(date_col) for every partition range.  Trace your Sql (if Sql source) and you'll see it.  So it collects the names of all partitions that had a change during your refresh period (last 2 months) with a timestamp since the last refresh.  It then just does a complete reload of those partitions. 

 

In other words, it doesn't need to know which row changed, only what partitions had changes. Then it wipes and out rebuilds those partitions.  The incremental doesn't do rows at a time - it does entire partitions. 

GilbertQ
Super User
Super User

Hi @enzo_garcia20 

 

Do you have an incremental refresh on your dimension tables?

 

The query failing is right in that on a table where you have relationships on the 1 side of the relationship there will be duplicate records for some reason.

 

What happens when you refresh this PBIX file, do you get the same error?





Did I answer your question? Mark my post as a solution!

Proud to be a Super User!







Power BI Blog

arvindsingh802
Super User
Super User

It will be becasue of your records are getting updated beyond the delta range and because of that there are two recorss with same key in your dataset/dataflow.

 

To throublehsoot better, create Dataflow/dataset with only one dimension table and implemnet incremental load, validate for duplicay after loading the data (try after 1-2 delta load)

 

If this post helps, then please consider Accept it as the solution and give it a thumbs up


If this post helps, then please consider Accept it as the solution, Appreciate your Kudos!!
Proud to be a Super User!!

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors