<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: insert into warehouse table has missing rows, but the source script has all. in Data Engineering</title>
    <link>https://community.fabric.microsoft.com/t5/Data-Engineering/insert-into-warehouse-table-has-missing-rows-but-the-source/m-p/4999930#M15010</link>
    <description>&lt;P&gt;Hi all,&lt;/P&gt;&lt;P&gt;After digging deeper, this doesn’t look like a Copy activity or mapping issue. The key point is that &lt;STRONG&gt;rowsRead = rowsWritten&lt;/STRONG&gt; in the Copy activity output, and the activity completes successfully without any warnings or errors. This rules out sink-side rejections, data type issues, or constraint violations in the Warehouse.&lt;/P&gt;&lt;P&gt;What we are actually seeing is a &lt;STRONG&gt;data freshness / synchronization issue caused by the SQL Endpoint&lt;/STRONG&gt;.&lt;/P&gt;&lt;P&gt;Although the source query returns all expected rows when executed manually, the Copy activity reads from the &lt;STRONG&gt;SQL Endpoint snapshot&lt;/STRONG&gt;, which can lag behind the Lakehouse data. This explains why:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;&lt;P&gt;The latest rows are missing in the Warehouse&lt;/P&gt;&lt;/LI&gt;&lt;LI&gt;&lt;P&gt;The issue is intermittent&lt;/P&gt;&lt;/LI&gt;&lt;LI&gt;&lt;P&gt;The pipeline succeeds with correct row counts from the snapshot it sees&lt;/P&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;Increasing the wait time helps, but it is not deterministic, especially when multiple pipelines and parallel copy activities are running.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Confirmed solution / recommendation:&lt;/STRONG&gt;&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;&lt;P&gt;Avoid using the SQL Endpoint as a source for time-sensitive ingestion.&lt;/P&gt;&lt;/LI&gt;&lt;LI&gt;&lt;P&gt;Read &lt;STRONG&gt;directly from the Lakehouse (Files/Delta tables)&lt;/STRONG&gt; instead.&lt;/P&gt;&lt;/LI&gt;&lt;LI&gt;&lt;P&gt;If SQL Endpoint must be used, apply a much larger delay and treat it as eventually consistent, not real-time.&lt;/P&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;Once we switched to reading directly from the Lakehouse, the missing rows issue disappeared completely.&lt;/P&gt;&lt;P&gt;This confirms the root cause is &lt;STRONG&gt;SQL Endpoint snapshot lag&lt;/STRONG&gt;, not the Copy activity 
itself.&lt;/P&gt;&lt;P&gt;Hope this helps anyone running into similar “silent missing rows” behavior.&lt;/P&gt;&lt;P&gt;Cheers&lt;BR /&gt;Barış&lt;/P&gt;</description>
    <pubDate>Sat, 07 Feb 2026 10:48:34 GMT</pubDate>
    <dc:creator>bariscihan</dc:creator>
    <dc:date>2026-02-07T10:48:34Z</dc:date>
    <item>
      <title>insert into warehouse table has missing rows, but the source script has all.</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/insert-into-warehouse-table-has-missing-rows-but-the-source/m-p/4994194#M14947</link>
      <description>&lt;P&gt;Hi All,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have something strange going on with one of my pipelines since the 28th of January. This pipeline has been running fine for a year now, and it still completes without any errors.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The pipeline pulls data from a Snowflake connection. Next we use the SQL endpoint of the lakehouse (I have implemented a 6-minute delay before querying the SQL endpoint with a copy activity, to make sure the SQL endpoint data is refreshed; MS bug #1092, if I recall correctly).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The source for the copy job is a pretty standard SQL script, with joins to our dim tables, data type conversions, some CASE WHEN, etc. When I run just the script I see all the data that I expect to see. The number of rows is 4549, the number of columns 65. So it is not a huge data set at all.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;If I compare my data in the lakehouse (bronze) with the data in the warehouse (gold), a number of rows are missing. They are all the latest rows; for instance, I checked this morning (for us it is the 4th) and no data from the 3rd was in the warehouse table.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;At first I thought this could be caused by the SQL endpoint delay, but that doesn't make too much sense, since when I first noticed this on the 2nd of February, the latest data was from the 28th. Since then the pipeline had run successfully about 8 times (it runs 2x a day).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The fact that my script displays all rows, but somehow the copy job does not ingest all rows, puzzles me; it is a copy job after all, with no user input besides the source script and column mapping.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The pipeline does have a number of copy jobs in parallel, however. Could that be the cause of the issue? Is there a setting I can tweak? I am stumped: if the source query has all the rows, the mapping hasn't changed and the pipeline runs without error, why would the copy job not insert all rows?&lt;BR /&gt;&lt;BR /&gt;Below is a screenshot of the pipeline, so you can see how it is laid out. It does have 8 parallel steps; the one where I noticed it is step 2000_copy etc.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="smeetsh_0-1770165141830.png" style="width: 400px;"&gt;&lt;img src="https://community.fabric.microsoft.com/t5/image/serverpage/image-id/1324473iA58CC5319240629B/image-size/medium?v=v2&amp;amp;px=400" role="button" title="smeetsh_0-1770165141830.png" alt="smeetsh_0-1770165141830.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Cheers&lt;/P&gt;&lt;P&gt;Hans&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 04 Feb 2026 00:35:37 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/insert-into-warehouse-table-has-missing-rows-but-the-source/m-p/4994194#M14947</guid>
      <dc:creator>smeetsh</dc:creator>
      <dc:date>2026-02-04T00:35:37Z</dc:date>
    </item>
    <item>
      <title>Re: insert into warehouse table has missing rows, but the source script has all.</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/insert-into-warehouse-table-has-missing-rows-but-the-source/m-p/4995185#M14955</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.fabric.microsoft.com/t5/user/viewprofilepage/user-id/718450"&gt;@smeetsh&lt;/a&gt;&amp;nbsp;,&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Thank you for reaching out to the Microsoft Community Forum.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Please try the steps below to fix the issue.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;1. Check the Output JSON of one affected run: open the run --&amp;gt; the specific Copy activity, such as "2000_copy_incident" --&amp;gt; Output.&lt;/P&gt;
&lt;P&gt;Capture rowsRead, rowsWritten, any error or skip counts, and the reject log path, if present. If rowsRead = 4549 and rowsWritten &amp;lt; 4549, you have sink rejections.&lt;/P&gt;
&lt;P&gt;2. In the Copy activity --&amp;gt; Fault tolerance: set Maximum errors = 0 and disable Skip incompatible rows.&lt;/P&gt;
&lt;P&gt;3. Re-run. If the activity now fails, you will get the exact cause, such as a PK violation, conversion error, string truncation, or date overflow.&lt;/P&gt;
&lt;P&gt;4. Check data types &amp;amp; lengths for every column: VARCHAR/NVARCHAR lengths, DECIMAL(p,s) ranges, and DATETIME/DATE conversions such as TIMESTAMP_TZ from Snowflake --&amp;gt; SQL.&lt;/P&gt;
&lt;P&gt;Note: The newest rows often introduce longer values or unexpected nullability.&lt;/P&gt;
&lt;P&gt;5. Check Warehouse constraints &amp;amp; indexes: does the target table have PRIMARY KEY or UNIQUE constraints, or computed columns with constraints? Try inserting one of the “missing” rows manually into the Warehouse using the same values.&lt;/P&gt;
&lt;P&gt;6. Check that there is no hidden filter, parameter substitution, pipeline variable, or expression like @{formatDateTime(utcNow(), 'yyyy-MM-dd')}. Run the exact command text from the activity using the same linked service and count the rows. If the count matches the manual run (4549) but Copy lands fewer, it’s a sink issue.&lt;BR /&gt;If the counts differ, it’s a source query/session issue.&lt;/P&gt;
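&lt;P&gt;As a rough illustration of the checks in steps 1 and 6, the comparison could be scripted along these lines. This is only a minimal Python sketch, not part of any Fabric tooling: the function name diagnose_copy_output is hypothetical, and it assumes the activity's Output JSON has been copied out as text and exposes the written-row count under either rowsCopied or rowsWritten.&lt;/P&gt;

```python
import json


def diagnose_copy_output(output_json, expected_rows):
    """Classify a Copy activity run from its Output JSON text (illustrative sketch)."""
    out = json.loads(output_json)
    rows_read = out.get("rowsRead")
    # Assumption: the written-row count is exposed as rowsCopied or rowsWritten.
    rows_written = out.get("rowsCopied", out.get("rowsWritten"))

    if rows_read is None or rows_written is None:
        return "unknown: row counts not found in the output"
    if rows_read != rows_written:
        # Rows were read from the source but did not all land in the sink.
        return "sink issue: rows read and rows written differ"
    if rows_read != expected_rows:
        # The activity saw a different row set than the manual run of the query.
        return "source issue: the activity read a different row count than the manual run"
    return "ok: counts match the manual run"


if __name__ == "__main__":
    sample = '{"rowsRead": 4400, "rowsCopied": 4400}'
    print(diagnose_copy_output(sample, expected_rows=4549))
```

&lt;P&gt;In the sample call, the activity read and wrote the same number of rows, but fewer than the manual run returned, so the sketch points at the source side rather than the sink.&lt;/P&gt;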
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I hope this information helps. Please do let us know if you have any further queries.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Regards,&lt;/P&gt;
&lt;P&gt;Dinesh&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 04 Feb 2026 12:49:07 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/insert-into-warehouse-table-has-missing-rows-but-the-source/m-p/4995185#M14955</guid>
      <dc:creator>v-dineshya</dc:creator>
      <dc:date>2026-02-04T12:49:07Z</dc:date>
    </item>
    <item>
      <title>Re: insert into warehouse table has missing rows, but the source script has all.</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/insert-into-warehouse-table-has-missing-rows-but-the-source/m-p/4995625#M14957</link>
      <description>&lt;P&gt;&lt;FONT size="2"&gt;Hi&amp;nbsp;&lt;a href="https://community.fabric.microsoft.com/t5/user/viewprofilepage/user-id/718450"&gt;@smeetsh&lt;/a&gt;,&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&lt;FONT size="2"&gt;I have sometimes observed silent record loss in copy jobs due to SQL Endpoint snapshot lag. I would rule out schema/data-type changes or a copy logic issue, since the same setup has been running successfully for a year.&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&lt;FONT size="2"&gt;I suggest either reading directly from the Lakehouse (keeping a delay of 6 minutes) or increasing the wait time if you want to keep using the SQL Endpoint.&lt;/FONT&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 04 Feb 2026 18:22:28 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/insert-into-warehouse-table-has-missing-rows-but-the-source/m-p/4995625#M14957</guid>
      <dc:creator>stoic-harsh</dc:creator>
      <dc:date>2026-02-04T18:22:28Z</dc:date>
    </item>
    <item>
      <title>Re: insert into warehouse table has missing rows, but the source script has all.</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/insert-into-warehouse-table-has-missing-rows-but-the-source/m-p/4995681#M14958</link>
      <description>&lt;P&gt;The activity doesn't fail; it runs without any errors. If I manually run the one step, it seems to work as well.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Edit: I had a look at the specific copy step and the number of rows read = the number of rows copied, so we are experiencing a sync issue, I reckon?&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="smeetsh_0-1770235053294.png" style="width: 400px;"&gt;&lt;img src="https://community.fabric.microsoft.com/t5/image/serverpage/image-id/1324669i627F27F291AAC3A0/image-size/medium?v=v2&amp;amp;px=400" role="button" title="smeetsh_0-1770235053294.png" alt="smeetsh_0-1770235053294.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 04 Feb 2026 20:00:13 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/insert-into-warehouse-table-has-missing-rows-but-the-source/m-p/4995681#M14958</guid>
      <dc:creator>smeetsh</dc:creator>
      <dc:date>2026-02-04T20:00:13Z</dc:date>
    </item>
    <item>
      <title>Re: insert into warehouse table has missing rows, but the source script has all.</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/insert-into-warehouse-table-has-missing-rows-but-the-source/m-p/4995683#M14959</link>
      <description>&lt;P&gt;It certainly feels like the SQL endpoint delay issue. I just don't understand why it has suddenly become an issue since the 28th; the delay is already at 400 seconds and has been for ages. I will increase it even more (600 seconds). We have dozens of pipelines built this way, but only this one seems affected.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 05 Feb 2026 03:25:21 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/insert-into-warehouse-table-has-missing-rows-but-the-source/m-p/4995683#M14959</guid>
      <dc:creator>smeetsh</dc:creator>
      <dc:date>2026-02-05T03:25:21Z</dc:date>
    </item>
    <item>
      <title>Re: insert into warehouse table has missing rows, but the source script has all.</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/insert-into-warehouse-table-has-missing-rows-but-the-source/m-p/4996495#M14973</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.fabric.microsoft.com/t5/user/viewprofilepage/user-id/718450"&gt;@smeetsh&lt;/a&gt;&amp;nbsp;,&lt;/P&gt;
&lt;P&gt;As mentioned by&amp;nbsp;&lt;a href="https://community.fabric.microsoft.com/t5/user/viewprofilepage/user-id/1505413"&gt;@stoic-harsh&lt;/a&gt;, could you please read directly from the Lakehouse instead of the &lt;SPAN&gt;SQL Endpoint?&lt;/SPAN&gt;&lt;/P&gt;
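&lt;P&gt;If switching the source is not possible straight away, one alternative to a fixed delay is to poll until the source shows the date you expect before starting the copy. The following is a minimal Python sketch under stated assumptions: wait_until_fresh and get_latest_date are hypothetical names, get_latest_date stands for whatever query returns the newest date currently visible through the SQL Endpoint, and dates are ISO strings (yyyy-MM-dd) so they compare correctly as text.&lt;/P&gt;

```python
import time


def wait_until_fresh(get_latest_date, expected_date, max_attempts=10,
                     wait_seconds=60, sleep=time.sleep):
    """Poll until the newest visible date reaches expected_date (illustrative sketch).

    Returns True once the source is fresh, False if every attempt saw stale data.
    """
    for attempt in range(max_attempts):
        latest = get_latest_date()
        # True when latest is at or past expected_date (ISO strings sort by date).
        if max(latest, expected_date) == latest:
            return True
        # Wait before the next attempt, but not after the last one.
        if attempt != max_attempts - 1:
            sleep(wait_seconds)
    return False


if __name__ == "__main__":
    # Simulated source that becomes fresh on the third check.
    seen = iter(["2026-02-01", "2026-02-02", "2026-02-03"])
    fresh = wait_until_fresh(lambda: next(seen), "2026-02-03",
                             max_attempts=5, wait_seconds=0)
    print("fresh" if fresh else "still stale")
```

&lt;P&gt;The sleep function is passed in as a parameter so the retry logic can be tried out without actually waiting.&lt;/P&gt;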
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Regards,&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Dinesh&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 05 Feb 2026 10:51:07 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/insert-into-warehouse-table-has-missing-rows-but-the-source/m-p/4996495#M14973</guid>
      <dc:creator>v-dineshya</dc:creator>
      <dc:date>2026-02-05T10:51:07Z</dc:date>
    </item>
    <item>
      <title>Re: insert into warehouse table has missing rows, but the source script has all.</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/insert-into-warehouse-table-has-missing-rows-but-the-source/m-p/4999930#M15010</link>
      <description>&lt;P&gt;Hi all,&lt;/P&gt;&lt;P&gt;After digging deeper, this doesn’t look like a Copy activity or mapping issue. The key point is that &lt;STRONG&gt;rowsRead = rowsWritten&lt;/STRONG&gt; in the Copy activity output, and the activity completes successfully without any warnings or errors. This rules out sink-side rejections, data type issues, or constraint violations in the Warehouse.&lt;/P&gt;&lt;P&gt;What we are actually seeing is a &lt;STRONG&gt;data freshness / synchronization issue caused by the SQL Endpoint&lt;/STRONG&gt;.&lt;/P&gt;&lt;P&gt;Although the source query returns all expected rows when executed manually, the Copy activity reads from the &lt;STRONG&gt;SQL Endpoint snapshot&lt;/STRONG&gt;, which can lag behind the Lakehouse data. This explains why:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;&lt;P&gt;The latest rows are missing in the Warehouse&lt;/P&gt;&lt;/LI&gt;&lt;LI&gt;&lt;P&gt;The issue is intermittent&lt;/P&gt;&lt;/LI&gt;&lt;LI&gt;&lt;P&gt;The pipeline succeeds with correct row counts from the snapshot it sees&lt;/P&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;Increasing the wait time helps, but it is not deterministic, especially when multiple pipelines and parallel copy activities are running.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Confirmed solution / recommendation:&lt;/STRONG&gt;&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;&lt;P&gt;Avoid using the SQL Endpoint as a source for time-sensitive ingestion.&lt;/P&gt;&lt;/LI&gt;&lt;LI&gt;&lt;P&gt;Read &lt;STRONG&gt;directly from the Lakehouse (Files/Delta tables)&lt;/STRONG&gt; instead.&lt;/P&gt;&lt;/LI&gt;&lt;LI&gt;&lt;P&gt;If SQL Endpoint must be used, apply a much larger delay and treat it as eventually consistent, not real-time.&lt;/P&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;Once we switched to reading directly from the Lakehouse, the missing rows issue disappeared completely.&lt;/P&gt;&lt;P&gt;This confirms the root cause is &lt;STRONG&gt;SQL Endpoint snapshot lag&lt;/STRONG&gt;, not the Copy activity 
itself.&lt;/P&gt;&lt;P&gt;Hope this helps anyone running into similar “silent missing rows” behavior.&lt;/P&gt;&lt;P&gt;Cheers&lt;BR /&gt;Barış&lt;/P&gt;</description>
      <pubDate>Sat, 07 Feb 2026 10:48:34 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/insert-into-warehouse-table-has-missing-rows-but-the-source/m-p/4999930#M15010</guid>
      <dc:creator>bariscihan</dc:creator>
      <dc:date>2026-02-07T10:48:34Z</dc:date>
    </item>
    <item>
      <title>Re: insert into warehouse table has missing rows, but the source script has all.</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/insert-into-warehouse-table-has-missing-rows-but-the-source/m-p/5000614#M15020</link>
      <description>&lt;P&gt;Reading directly from a lakehouse is sadly not an option at this time.&lt;/P&gt;</description>
      <pubDate>Sun, 08 Feb 2026 22:18:22 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/insert-into-warehouse-table-has-missing-rows-but-the-source/m-p/5000614#M15020</guid>
      <dc:creator>smeetsh</dc:creator>
      <dc:date>2026-02-08T22:18:22Z</dc:date>
    </item>
    <item>
      <title>Re: insert into warehouse table has missing rows, but the source script has all.</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/insert-into-warehouse-table-has-missing-rows-but-the-source/m-p/5000617#M15021</link>
      <description>&lt;P&gt;Thanks for the help, all. I suspected it was the SQL endpoint delay and I am glad I wasn't going mad :D. I have now increased it to 15 minutes; the data is not that time sensitive. Since our gold medallion is a warehouse, and as far as I know I cannot write from a lakehouse to a warehouse, changing the architecture is not an option right now. I may have to start using the sempy labs SQL endpoint refresh API if this gets any worse.&lt;/P&gt;</description>
      <pubDate>Sun, 08 Feb 2026 22:27:11 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/insert-into-warehouse-table-has-missing-rows-but-the-source/m-p/5000617#M15021</guid>
      <dc:creator>smeetsh</dc:creator>
      <dc:date>2026-02-08T22:27:11Z</dc:date>
    </item>
  </channel>
</rss>

