jberkers
Frequent Visitor

Duplicate rows in Lakehouse Table with Primary Key with DataFlow Gen2

Hi All,

 

Wondering if this is a bug or if this is by design.

 

I have a DataFlow Gen2 operation that retrieves ticket data from our ITSM that has been modified in the last 7 days. In the Query within the DataFlow I have set the Ticket ID column as a Primary Key, with the intent that this prevents duplicate records from being added during append operations. Instead, what I am finding is that if an entry with the same ticket id already exists, a new one is appended.

 

There doesn't appear to be any specific indication in the Lakehouse's table definition that a primary key is set.

 

Is this a feature that has not yet been implemented for LakeHouse?

 

The data is being retrieved from a REST API via an on-prem data gateway (3000.186.16).

 

Any suggestions or ideas?

 

Regards,

JohnB

3 REPLIES
jberkers
Frequent Visitor

Hi @lbendlin ,

 

Thanks for responding.

 

Due to the nature of the data, I cannot use an Incremental refresh in the way documented, since a "ticket" in the ITSM may last across several refresh cycles and change state during that time. For reporting purposes I need to make sure that I have the latest state of the ticket, rather than freezing it on initial import, if that makes sense.

 

From reading several articles on Power BI/Fabric (not specific to DataFlow Gen2), I understood that setting a Primary Key would result in an UPSERT if a matching entry was found. Is this not the case for Lakehouse/DataFlow Gen2?

 

For now I am attempting to deduplicate the data via other means, however, if this process can be simplified, that would be much preferred.
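
To make the intended behaviour concrete, here is a minimal sketch of the "keep only the latest state per ticket" logic in plain Python. It is not Fabric or DataFlow Gen2 code, and it does not describe what the Primary Key setting does. The function name `upsert_latest` and the column names `ticket_id`, `state` and `modified_at` are hypothetical and chosen only for illustration.

```python
from datetime import datetime


def upsert_latest(existing, incoming, key="ticket_id", modified="modified_at"):
    """Merge incoming rows into existing rows, keeping one row per key.

    Hypothetical helper: the column names are assumptions, not taken
    from the actual ITSM data.
    """
    latest = {}
    for row in list(existing) + list(incoming):
        current = latest.get(row[key])
        # Keep the row with the most recent modification time. On a tie,
        # the row seen later (the incoming one) replaces the earlier one.
        if current is None or row[modified] >= current[modified]:
            latest[row[key]] = row
    return list(latest.values())


# Rows already in the table (illustrative data only).
existing = [
    {"ticket_id": 101, "state": "Open", "modified_at": datetime(2025, 1, 6)},
    {"ticket_id": 102, "state": "Open", "modified_at": datetime(2025, 1, 7)},
]

# Rows returned by the latest refresh: 101 changed state, 103 is new.
incoming = [
    {"ticket_id": 101, "state": "Resolved", "modified_at": datetime(2025, 1, 9)},
    {"ticket_id": 103, "state": "Open", "modified_at": datetime(2025, 1, 9)},
]

merged = upsert_latest(existing, incoming)
```

With a plain append, ticket 101 would appear twice. With the merge above, it appears once with its most recent state, which is the result I was expecting the Primary Key to give.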

 

Thanks.

JohnB

lbendlin
Super User

Did you set up incremental refresh for that dataflow?

I am having a similar problem. Was there a resolution to this? Thank you.
