<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Dataflow Incremental Refresh Causing Duplicate Records in Data Engineering</title>
    <link>https://community.fabric.microsoft.com/t5/Data-Engineering/Dataflow-Incremental-Refresh-Causing-Duplicate-Records/m-p/4858672#M13102</link>
    <description>&lt;P&gt;Hello &lt;a href="https://community.fabric.microsoft.com/t5/user/viewprofilepage/user-id/1296713"&gt;@Thorns&lt;/a&gt;,&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;We hope you're doing well. Could you please confirm whether your issue has been resolved or if you're still facing challenges? Your update will be valuable to the community and may assist others with similar concerns.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;Thank you.&lt;/P&gt;</description>
    <pubDate>Sun, 26 Oct 2025 15:17:32 GMT</pubDate>
    <dc:creator>v-ssriganesh</dc:creator>
    <dc:date>2025-10-26T15:17:32Z</dc:date>
    <item>
      <title>Dataflow Incremental Refresh Causing Duplicate Records</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/Dataflow-Incremental-Refresh-Causing-Duplicate-Records/m-p/4855225#M13015</link>
      <description>&lt;P&gt;I created a Gen2 Dataflow that sources its data from a Salesforce object and loads it into my warehouse.&amp;nbsp; I then set up incremental refresh to review everything created in the past 3 years with a bucket size of Month, using the last modified date to determine what data to extract.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Thorns_0-1761080043373.png" style="width: 400px;"&gt;&lt;img src="https://community.fabric.microsoft.com/t5/image/serverpage/image-id/1304732i5C1892DE0DE91FEF/image-size/medium?v=v2&amp;amp;px=400" role="button" title="Thorns_0-1761080043373.png" alt="Thorns_0-1761080043373.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I noticed a problem today where I had duplicate records in my table (only showing limited data for anonymity).&amp;nbsp; There were two records with the same Id, CreatedDate, and LastModifiedDate.&amp;nbsp; For some reason the incremental refresh is causing a duplicate record to be created.&amp;nbsp; This has happened with hundreds of records.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Thorns_1-1761080233917.png" style="width: 400px;"&gt;&lt;img src="https://community.fabric.microsoft.com/t5/image/serverpage/image-id/1304734iAF0DC182DCF5EC0E/image-size/medium?v=v2&amp;amp;px=400" role="button" title="Thorns_1-1761080233917.png" alt="Thorns_1-1761080233917.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;What is happening with the incremental refresh that is causing duplicate records to be created in the table?&lt;/P&gt;</description>
      <pubDate>Tue, 21 Oct 2025 20:59:07 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/Dataflow-Incremental-Refresh-Causing-Duplicate-Records/m-p/4855225#M13015</guid>
      <dc:creator>Thorns</dc:creator>
      <dc:date>2025-10-21T20:59:07Z</dc:date>
    </item>
    <item>
      <title>Re: Dataflow Incremental Refresh Causing Duplicate Records</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/Dataflow-Incremental-Refresh-Causing-Duplicate-Records/m-p/4855758#M13040</link>
      <description>&lt;P&gt;&lt;SPAN&gt;Hello &lt;a href="https://community.fabric.microsoft.com/t5/user/viewprofilepage/user-id/1296713"&gt;@Thorns&lt;/a&gt;,&lt;BR /&gt;Thank you for reaching out to the Microsoft Fabric Community Forum. &lt;BR /&gt;&lt;BR /&gt;&lt;/SPAN&gt;The duplicate records in your warehouse table are likely due to the incremental refresh feature in Dataflow Gen2 not supporting query folding for Salesforce objects. This causes the full dataset to be pulled on each bucket refresh, leading to improper replacement and duplicates, especially for modified records.&lt;/P&gt;
&lt;P&gt;To fix this:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;Disable incremental refresh to stop further duplicates.&lt;/LI&gt;
&lt;LI&gt;Manually implement incremental load:
&lt;UL&gt;
&lt;LI&gt;Add a reference query to get the max LastModifiedDate from your warehouse table.&lt;/LI&gt;
&lt;LI&gt;Filter Salesforce data where LastModifiedDate &amp;gt; that value in your main query.&lt;/LI&gt;
&lt;LI&gt;Use "upsert" (with Id as primary key) or "append" as the destination update method.&lt;/LI&gt;
&lt;/UL&gt;
&lt;/LI&gt;
&lt;LI&gt;If handling deletes, include the IsDeleted field and add logic to remove corresponding rows.&lt;/LI&gt;
&lt;LI&gt;Test with a manual refresh, then schedule.&lt;/LI&gt;
&lt;/OL&gt;
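&lt;P&gt;The load logic in the steps above can be sketched in plain Python. This is only an illustration of the watermark-and-merge idea, not Dataflow Gen2 or Warehouse code: rows are modeled as dictionaries, the function names max_watermark and incremental_load are hypothetical, and the Id, LastModifiedDate, and IsDeleted keys are assumed to mirror the Salesforce fields mentioned in this thread.&lt;/P&gt;

```python
from datetime import datetime


def max_watermark(warehouse_rows):
    """Latest LastModifiedDate already loaded, or None for an empty table."""
    return max((row["LastModifiedDate"] for row in warehouse_rows), default=None)


def incremental_load(warehouse_rows, source_rows):
    """Merge source rows changed since the watermark into the warehouse rows."""
    watermark = max_watermark(warehouse_rows)
    # Index existing rows by Id so each Id can appear only once in the result.
    by_id = {row["Id"]: row for row in warehouse_rows}
    for row in source_rows:
        if watermark is not None and not row["LastModifiedDate"] > watermark:
            continue  # not modified since the last load
        if row.get("IsDeleted"):
            by_id.pop(row["Id"], None)  # deleted in the source: drop the row
        else:
            by_id[row["Id"]] = row  # new or modified: insert or overwrite
    return list(by_id.values())


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only.
    warehouse = [{"Id": "A", "LastModifiedDate": datetime(2025, 10, 1)}]
    source = [
        {"Id": "A", "LastModifiedDate": datetime(2025, 10, 5)},
        {"Id": "B", "LastModifiedDate": datetime(2025, 10, 6)},
    ]
    for loaded in incremental_load(warehouse, source):
        print(loaded["Id"], loaded["LastModifiedDate"].date())
```

&lt;P&gt;Keying the merge on Id is what prevents duplicates: a modified record overwrites its earlier copy instead of being appended next to it. One caveat of the strict "greater than" filter is that a row sharing the exact watermark timestamp but written after the previous load would be skipped.&lt;/P&gt;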
&lt;P&gt;For existing duplicates, run a one-time deduplication before applying the new setup. This should resolve the issue effectively.&lt;/P&gt;
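&lt;P&gt;As a rough illustration of the one-time deduplication, the sketch below keeps a single row per Id, preferring the most recent LastModifiedDate. It is plain Python over in-memory dictionaries rather than Warehouse code, the function name deduplicate is hypothetical, and timestamps are assumed to be ISO 8601 UTC strings so that string comparison matches chronological order.&lt;/P&gt;

```python
def deduplicate(rows):
    """Keep one row per Id, preferring the latest LastModifiedDate."""
    latest = {}
    for row in rows:
        kept = latest.get(row["Id"])
        # Exact duplicates (same Id and same timestamp) keep the first copy.
        if kept is None or row["LastModifiedDate"] > kept["LastModifiedDate"]:
            latest[row["Id"]] = row
    return list(latest.values())


if __name__ == "__main__":
    # Hypothetical sample data: "A" was loaded twice with identical values.
    table = [
        {"Id": "A", "LastModifiedDate": "2025-10-01T08:00:00Z"},
        {"Id": "A", "LastModifiedDate": "2025-10-01T08:00:00Z"},
        {"Id": "B", "LastModifiedDate": "2025-10-03T09:30:00Z"},
    ]
    for row in deduplicate(table):
        print(row["Id"], row["LastModifiedDate"])
```

&lt;P&gt;The same rule (one surviving row per Id) would need to be expressed in whatever tool actually rewrites the warehouse table.&lt;/P&gt;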
&lt;P&gt;&lt;BR /&gt;Best regards,&lt;BR /&gt;Ganesh Singamshetty.&lt;/P&gt;</description>
      <pubDate>Wed, 22 Oct 2025 11:50:59 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/Dataflow-Incremental-Refresh-Causing-Duplicate-Records/m-p/4855758#M13040</guid>
      <dc:creator>v-ssriganesh</dc:creator>
      <dc:date>2025-10-22T11:50:59Z</dc:date>
    </item>
    <item>
      <title>Re: Dataflow Incremental Refresh Causing Duplicate Records</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/Dataflow-Incremental-Refresh-Causing-Duplicate-Records/m-p/4856029#M13054</link>
      <description>&lt;P&gt;Thanks for the information!&amp;nbsp; I will stop using incremental refresh.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Where in the Gen2 Dataflow does it allow for upsert?&amp;nbsp; I do not see that as an option.&lt;/P&gt;</description>
      <pubDate>Wed, 22 Oct 2025 15:55:41 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/Dataflow-Incremental-Refresh-Causing-Duplicate-Records/m-p/4856029#M13054</guid>
      <dc:creator>Thorns</dc:creator>
      <dc:date>2025-10-22T15:55:41Z</dc:date>
    </item>
    <item>
      <title>Re: Dataflow Incremental Refresh Causing Duplicate Records</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/Dataflow-Incremental-Refresh-Causing-Duplicate-Records/m-p/4856375#M13057</link>
      <description>&lt;P&gt;Hello &lt;a href="https://community.fabric.microsoft.com/t5/user/viewprofilepage/user-id/1296713"&gt;@Thorns&lt;/a&gt;,&lt;/P&gt;
&lt;P&gt;Thanks for the update. Glad you're pausing incremental refresh to avoid more duplicates.&lt;/P&gt;
&lt;P&gt;I apologize for the confusion in my earlier suggestion. Upon double-checking the Dataflow Gen2 documentation, "upsert" isn't directly available as an update method for any destination, including Warehouse. For Warehouse specifically, the only supported option is replace, with no append or upsert in the UI.&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;
&lt;P&gt;To achieve an incremental upsert-like behavior (adding new records and updating changed ones without duplicates):&lt;/P&gt;
&lt;UL dir="auto"&gt;
&lt;LI&gt;Switch your destination to a Fabric Lakehouse table (which supports both append and replace). This allows more flexibility.&lt;/LI&gt;
&lt;/UL&gt;</description>
      <pubDate>Thu, 23 Oct 2025 04:36:22 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/Dataflow-Incremental-Refresh-Causing-Duplicate-Records/m-p/4856375#M13057</guid>
      <dc:creator>v-ssriganesh</dc:creator>
      <dc:date>2025-10-23T04:36:22Z</dc:date>
    </item>
    <item>
      <title>Re: Dataflow Incremental Refresh Causing Duplicate Records</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/Dataflow-Incremental-Refresh-Causing-Duplicate-Records/m-p/4858672#M13102</link>
      <description>&lt;P&gt;Hello &lt;a href="https://community.fabric.microsoft.com/t5/user/viewprofilepage/user-id/1296713"&gt;@Thorns&lt;/a&gt;,&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;We hope you're doing well. Could you please confirm whether your issue has been resolved or if you're still facing challenges? Your update will be valuable to the community and may assist others with similar concerns.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;Thank you.&lt;/P&gt;</description>
      <pubDate>Sun, 26 Oct 2025 15:17:32 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/Dataflow-Incremental-Refresh-Causing-Duplicate-Records/m-p/4858672#M13102</guid>
      <dc:creator>v-ssriganesh</dc:creator>
      <dc:date>2025-10-26T15:17:32Z</dc:date>
    </item>
    <item>
      <title>Re: Dataflow Incremental Refresh Causing Duplicate Records</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/Dataflow-Incremental-Refresh-Causing-Duplicate-Records/m-p/4861458#M13152</link>
      <description>&lt;P&gt;Hello &lt;a href="https://community.fabric.microsoft.com/t5/user/viewprofilepage/user-id/1296713"&gt;@Thorns&lt;/a&gt;,&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Hope everything’s going great with you. Just checking in: has the issue been resolved, or are you still running into problems? Sharing an update can really help others facing the same thing.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Thank you.&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 29 Oct 2025 09:50:30 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/Dataflow-Incremental-Refresh-Causing-Duplicate-Records/m-p/4861458#M13152</guid>
      <dc:creator>v-ssriganesh</dc:creator>
      <dc:date>2025-10-29T09:50:30Z</dc:date>
    </item>
    <item>
      <title>Re: Dataflow Incremental Refresh Causing Duplicate Records</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/Dataflow-Incremental-Refresh-Causing-Duplicate-Records/m-p/4867205#M13264</link>
      <description>&lt;P&gt;I decided to go with a different route since this feature is not working as expected.&amp;nbsp; Thank you.&lt;/P&gt;</description>
      <pubDate>Wed, 05 Nov 2025 16:34:31 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/Dataflow-Incremental-Refresh-Causing-Duplicate-Records/m-p/4867205#M13264</guid>
      <dc:creator>Thorns</dc:creator>
      <dc:date>2025-11-05T16:34:31Z</dc:date>
    </item>
    <item>
      <title>Re: Dataflow Incremental Refresh Causing Duplicate Records</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/Dataflow-Incremental-Refresh-Causing-Duplicate-Records/m-p/4870813#M13338</link>
      <description>&lt;P&gt;Hello&amp;nbsp;&lt;a href="https://community.fabric.microsoft.com/t5/user/viewprofilepage/user-id/1296713"&gt;@Thorns&lt;/a&gt;,&lt;BR /&gt;Thank you for the update. I completely understand your decision.&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;
&lt;P&gt;If this capability would be useful in your scenario, we encourage you to submit it as a feature suggestion in the &lt;A href="https://ideas.fabric.microsoft.com/" target="_blank"&gt;Microsoft Fabric Ideas Forum&lt;/A&gt;.&lt;BR /&gt;This helps the product team gauge community demand and prioritize it for future releases.&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;
&lt;P&gt;You can also keep an eye on upcoming enhancements in the official Microsoft Fabric Roadmap: &lt;A href="https://learn.microsoft.com/en-us/fabric/fundamentals/whats-new?tabs=items" target="_blank"&gt;https://learn.microsoft.com/en-us/fabric/fundamentals/whats-new?tabs=items&lt;/A&gt;. It is updated regularly with new and planned capabilities for Dataflows, Lakehouse, and Warehouse.&lt;BR /&gt;&lt;BR /&gt;Thanks,&lt;BR /&gt;Ganesh Singamshetty.&lt;/P&gt;</description>
      <pubDate>Mon, 10 Nov 2025 09:45:17 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/Dataflow-Incremental-Refresh-Causing-Duplicate-Records/m-p/4870813#M13338</guid>
      <dc:creator>v-ssriganesh</dc:creator>
      <dc:date>2025-11-10T09:45:17Z</dc:date>
    </item>
  </channel>
</rss>

