tomperro
Helper IV

Duplicate rows in dataflow

I have created a dataflow to pull data from a Salesforce object into my warehouse, but each refresh appends to the existing warehouse table instead of replacing it. I added an index column, but this did not solve the issue.

How can I refresh the data and not have duplicates?

1 ACCEPTED SOLUTION
v-csrikanth
Community Support

Hi @tomperro 
Thanks for reaching out to Fabric Community.

Your dataflow is currently appending new rows each time, which is why you end up with duplicates.

Here are the best approaches; pick the one that best fits your scenario:

Choose the method that aligns with your performance and audit requirements.


If this answer solves your issue, please give us a Kudos and mark it as Accepted Solution.
Kind regards,
Community Support Team _ C Srikanth.


4 REPLIES
v-csrikanth
Community Support

Hi @tomperro 
I wanted to follow up since I haven't heard from you in a while. Have you had a chance to try the suggested solutions?
If your issue is resolved, please consider marking the post as solved. However, if you're still facing challenges, feel free to share the details, and we'll be happy to assist you further.
Looking forward to your response!


Best Regards,
Community Support Team _ C Srikanth.

rohit1991
Super User

Hi @tomperro ,
It sounds like the issue you're encountering is due to the dataflow performing an append operation rather than a full refresh or upsert into your warehouse table. Adding an index column alone won't prevent duplicates unless you're using it as part of a deduplication step. To avoid duplicates, you’ll need to implement logic in your dataflow that either deletes existing data before each refresh or identifies and removes duplicates based on a unique key, such as a Salesforce record ID.
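To make the deduplication idea concrete, here is a minimal Python sketch. It is not Fabric or Power Query code; it only illustrates the logic. It assumes each row carries a stable unique `Id` (standing in for the Salesforce record ID) and a `LastModifiedDate` in ISO format. The function name, field names, and sample rows are all illustrative assumptions.

```python
def dedupe_latest(rows, key="Id", order_by="LastModifiedDate"):
    """Keep one row per key, preferring the most recently modified one."""
    latest = {}
    for row in rows:
        current = latest.get(row[key])
        # ISO-8601 date strings compare correctly as plain strings.
        if current is None or row[order_by] >= current[order_by]:
            latest[row[key]] = row
    return list(latest.values())


# Hypothetical rows: record 001A was loaded by two successive refreshes.
rows = [
    {"Id": "001A", "Name": "Acme", "LastModifiedDate": "2025-06-01"},
    {"Id": "001B", "Name": "Globex", "LastModifiedDate": "2025-06-01"},
    {"Id": "001A", "Name": "Acme Corp", "LastModifiedDate": "2025-06-15"},
]

deduped = dedupe_latest(rows)
for row in deduped:
    print(row["Id"], row["Name"])
```

The key has to identify the underlying record. An index column generated at refresh time generally cannot do that, because it numbers rows by position rather than by record, which is likely why adding one did not help. In a dataflow, the closest equivalent of this sketch would be a "Remove duplicates" step on the record ID column.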

Another approach is to stage the incoming data in a temporary table and then use a transformation (like a merge or deduplicate step) before writing to the final destination. If you're using Microsoft Fabric or a similar platform with dataflows, you might also consider enabling incremental refresh or configuring the destination settings to overwrite the table on refresh, if that option is available.
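The stage-then-merge pattern can be sketched as follows. This uses Python's built-in `sqlite3` with an in-memory database purely as a stand-in for the warehouse, so the SQL dialect and the table names `account` and `account_staging` are assumptions, not Fabric specifics. Each refresh loads into the staging table, deletes the final rows that are about to be replaced, and then inserts the fresh copies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE account (id TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE account_staging (id TEXT, name TEXT);
""")


def refresh(conn, incoming):
    """Load rows into staging, then merge them into the final table by id."""
    with conn:  # one transaction: commits on success, rolls back on error
        conn.execute("DELETE FROM account_staging")
        conn.executemany(
            "INSERT INTO account_staging (id, name) VALUES (?, ?)", incoming
        )
        # Remove final rows that the staged data will replace...
        conn.execute(
            "DELETE FROM account WHERE id IN (SELECT id FROM account_staging)"
        )
        # ...then insert the fresh copies.
        conn.execute(
            "INSERT INTO account (id, name) SELECT id, name FROM account_staging"
        )


refresh(conn, [("001A", "Acme"), ("001B", "Globex")])
refresh(conn, [("001A", "Acme Corp"), ("001C", "Initech")])  # second refresh

final_rows = conn.execute("SELECT id, name FROM account ORDER BY id").fetchall()
print(final_rows)
```

After the second refresh the final table holds one row per `id`, with `001A` updated rather than duplicated. For a full overwrite instead of a merge, the first `DELETE` on `account` would simply drop its `WHERE` clause. The sketch assumes the staged batch itself has no repeated ids; if it might, deduplicate it first.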


OK, those make sense, but how do I actually do that? 😊
