tomperro
Helper V

Duplicate rows in dataflow

I have created a dataflow to pull data from a salesforce object into my warehouse but when I do a refresh, it appends to the existing warehouse table. I have added an index column but this did not solve the issue.

How can I refresh the data and not have duplicates?

1 ACCEPTED SOLUTION
v-csrikanth
Community Support

Hi @tomperro 
Thanks for reaching out to Fabric Community.

Your dataflow is currently appending new rows each time, which is why you end up with duplicates.

Here are the best approaches; pick the one that best fits your scenario:

Choose the method that aligns with your performance and audit requirements.


If this answer solves your issue, please give us a Kudos and mark it as Accepted Solution.
Kind regards,
Community Support Team _ C Srikanth.

4 REPLIES
v-csrikanth
Community Support

Hi @tomperro 
I wanted to follow up since I haven't heard from you in a while. Have you had a chance to try the suggested solutions?
If your issue is resolved, please consider marking the post as solved. However, if you're still facing challenges, feel free to share the details, and we'll be happy to assist you further.
Looking forward to your response!


Best Regards,
Community Support Team _ C Srikanth.

rohit1991
Super User

Hi @tomperro ,
It sounds like the issue you're encountering is due to the dataflow performing an append operation rather than a full refresh or upsert into your warehouse table. Adding an index column alone won't prevent duplicates unless you're using it as part of a deduplication step. To avoid duplicates, you’ll need to implement logic in your dataflow that either deletes existing data before each refresh or identifies and removes duplicates based on a unique key, such as a Salesforce record ID.
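
To make the "remove duplicates based on a unique key" idea concrete, here is a minimal sketch in plain Python. It is not dataflow code; it only illustrates the logic. The field names `Id` and `LastModifiedDate`, the sample rows, and the helper `dedupe_by_key` are assumptions for illustration, and the dates are assumed to be ISO-formatted strings so that string comparison matches chronological order.

```python
def dedupe_by_key(rows, key="Id", version="LastModifiedDate"):
    """Keep one row per key, preferring the most recently modified row."""
    latest = {}
    for row in rows:
        current = latest.get(row[key])
        # First time we see this key, or this row is newer than the one we hold
        if current is None or row[version] > current[version]:
            latest[row[key]] = row
    return list(latest.values())


# Hypothetical sample: record 001A arrives twice across two refreshes
rows = [
    {"Id": "001A", "Name": "Acme", "LastModifiedDate": "2025-07-01"},
    {"Id": "001B", "Name": "Globex", "LastModifiedDate": "2025-07-01"},
    {"Id": "001A", "Name": "Acme Corp", "LastModifiedDate": "2025-07-15"},
]
deduped = dedupe_by_key(rows)
```

In this sketch the index column plays no role: the record ID decides which rows count as the same record, which is why adding an index alone does not remove duplicates.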

 

Another approach is to stage the incoming data in a temporary table and then use a transformation (like a merge or deduplicate step) before writing to the final destination. If you're using Microsoft Fabric or a similar platform with dataflows, you might also consider enabling incremental refresh or configuring the destination settings to overwrite the table on refresh, if that option is available.
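
As a rough illustration of the staging idea, the sketch below uses Python's built-in `sqlite3` module as a stand-in for the warehouse. The table names `staging` and `target`, the columns, and the `refresh` helper are all hypothetical, and it assumes each incoming batch holds at most one row per `id`. A real warehouse would use its own SQL dialect and load mechanism, so treat this as the pattern only.

```python
import sqlite3


def refresh(conn, incoming):
    """Load a batch into staging, then merge it into target by id."""
    cur = conn.cursor()
    # 1. Empty the staging table and load the new batch
    cur.execute("DELETE FROM staging")
    cur.executemany("INSERT INTO staging (id, name) VALUES (?, ?)", incoming)
    # 2. Remove target rows that the new batch will replace
    cur.execute("DELETE FROM target WHERE id IN (SELECT id FROM staging)")
    # 3. Insert the fresh rows
    cur.execute("INSERT INTO target (id, name) SELECT id, name FROM staging")
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staging (id TEXT, name TEXT)")
conn.execute("CREATE TABLE target (id TEXT PRIMARY KEY, name TEXT)")

# Two refreshes that deliver the same records; the second carries an update
refresh(conn, [("001A", "Acme"), ("001B", "Globex")])
refresh(conn, [("001A", "Acme Corp"), ("001B", "Globex")])
result = conn.execute("SELECT id, name FROM target ORDER BY id").fetchall()
```

After both refreshes the target still holds one row per record, with the updated name for 001A. The simpler alternative, where the destination allows it, is to overwrite the whole table on every refresh, which trades load time for not needing a merge step at all.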


Did it work? ✔ Give a Kudo • Mark as Solution – help others too!

OK, those make sense, but how do I do that? 😊
