Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Find everything you need to get certified on Fabric—skills challenges, live sessions, exam prep, role guidance, and more. Get started

Reply
enzo_garcia20
Frequent Visitor

Incremental refresh failing due duplicate records

Hi,

 

I connected Power BI on many Azure SQL Server tables and created a model. Then I uploaded the model to the Premium capacity configured for incremental refresh. The first service refresh (wich load historic data) loads fine, but the next day incremental refresh is giving me an error of duplicate records on a table that is a dimension (and therefore has a lot of relationships 1 to *).

 

I've already duble checked the origin table and there isn't any duplicates records there. I checked my incremental rule and it's "cod_date < RangeEnd" too and I don't do any transformation on Power Query, I just conect to the table.

 

I wonder if it's something related with Power BI doesn't let us define a primary key when programing incremental refresh.

 

Any help will be nice.

 

Thanks!

 

1 ACCEPTED SOLUTION
v-eqin-msft
Community Support
Community Support

Hi @enzo_garcia20 ,

 

According to the official document, the error causud by the Post dates have changed.

 

Detail: With a refresh operation, only data that has changed at the data source is refreshed in the dataset. As the data is divided by a date, it’s recommended post (transaction) dates are not changed.

If a date is changed accidentally, then two issues can occur: Users notice some totals changed in the historical data (that is not supposed to happen), or during a refresh an error is returned indicating a unique value is not in fact unique. For the latter, this can happen when the table with incremental refresh configured is used in a 1:N relationship with another table as the 1 side and should have unique values. When the data is changed (for a specific ID), that ID then appears in another partition and the engine will detect the value is not unique.

 

Best Regards,
Eyelyn Qin
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

 

View solution in original post

5 REPLIES 5
v-eqin-msft
Community Support
Community Support

Hi @enzo_garcia20 ,

 

According to the official document, the error causud by the Post dates have changed.

 

Detail: With a refresh operation, only data that has changed at the data source is refreshed in the dataset. As the data is divided by a date, it’s recommended post (transaction) dates are not changed.

If a date is changed accidentally, then two issues can occur: Users notice some totals changed in the historical data (that is not supposed to happen), or during a refresh an error is returned indicating a unique value is not in fact unique. For the latter, this can happen when the table with incremental refresh configured is used in a 1:N relationship with another table as the 1 side and should have unique values. When the data is changed (for a specific ID), that ID then appears in another partition and the engine will detect the value is not unique.

 

Best Regards,
Eyelyn Qin
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

 

Hi,

 

Thanks for the answers guys and @v-eqin-msft.

 

I don't know if this is my case because none post dates were changed. But in the end, I rebuild my model from scratch and it worked flawless so maybe it was some kind of bug.

 

But I have one more question: in the documentation you sent it says "only data that has changed at the data source is refreshed in the dataset". How does Power BI know wich data changed or not if it doesn't ask me for a primary key when programing a refresh?

 

I programed my incremental refresh to evaluate two months back, so how does it know when to refresh an existing row ou add a new row?

 

If you can recommend me some documentation it will be awersome.

 

But thanks for all the support.

 

Best Regards,
Enzo.

 

 

@enzo_garcia20 PBI knows what data to refresh because it does a max(date_col) for every partition range.  Trace your Sql (if Sql source) and you'll see it.  So it collects the names of all partitions that had a change during your refresh period (last 2 months) with a timestamp since the last refresh.  It then just does a complete reload of those partitions. 

 

In other words, it doesn't need to know which row changed, only what partitions had changes. Then it wipes and out rebuilds those partitions.  The incremental doesn't do rows at a time - it does entire partitions. 

GilbertQ
Super User
Super User

Hi @enzo_garcia20 

 

Do you have an incremental refresh on your dimension tables?

 

The query failing is right in that on a table where you have relationships on the 1 side of the relationship there will be duplicate records for some reason.

 

What happens when you refresh this PBIX file, do you get the same error?





Did I answer your question? Mark my post as a solution!

Proud to be a Super User!







Power BI Blog

arvindsingh802
Super User
Super User

It will be becasue of your records are getting updated beyond the delta range and because of that there are two recorss with same key in your dataset/dataflow.

 

To throublehsoot better, create Dataflow/dataset with only one dimension table and implemnet incremental load, validate for duplicay after loading the data (try after 1-2 delta load)

 

If this post helps, then please consider Accept it as the solution and give it a thumbs up


If this post helps, then please consider Accept it as the solution, Appreciate your Kudos!!
Proud to be a Super User!!

Helpful resources

Announcements
Europe Fabric Conference

Europe’s largest Microsoft Fabric Community Conference

Join the community in Stockholm for expert Microsoft Fabric learning including a very exciting keynote from Arun Ulag, Corporate Vice President, Azure Data.

Power BI Carousel June 2024

Power BI Monthly Update - June 2024

Check out the June 2024 Power BI update to learn about new features.

RTI Forums Carousel3

New forum boards available in Real-Time Intelligence.

Ask questions in Eventhouse and KQL, Eventstream, and Reflex.

Top Solution Authors