<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Replicating Move behavior in Fabric pipelines in Data Engineering</title>
    <link>https://community.fabric.microsoft.com/t5/Data-Engineering/Replicating-Move-behavior-in-Fabric-pipelines/m-p/5142997#M15684</link>
    <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.fabric.microsoft.com/t5/user/viewprofilepage/user-id/307993"&gt;@arpost&lt;/a&gt;&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;
&lt;P&gt;We have not received a response from you regarding the query and were following up to check if you had the opportunity to review the information provided. Please feel free to contact us if you have any further questions.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thank You.&lt;/P&gt;</description>
    <pubDate>Thu, 02 Apr 2026 11:02:35 GMT</pubDate>
    <dc:creator>v-karpurapud</dc:creator>
    <dc:date>2026-04-02T11:02:35Z</dc:date>
    <item>
      <title>Replicating Move behavior in Fabric pipelines</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/Replicating-Move-behavior-in-Fabric-pipelines/m-p/5139580#M15594</link>
      <description>&lt;P&gt;Greetings, community. Mainly posting this thread to create awareness around this idea (&lt;A href="https://community.fabric.microsoft.com/t5/Fabric-Ideas/Add-Move-behavior-for-Copy-activity-in-Fabric-pipelines/idi-p/4844775" target="_blank"&gt;Add Move behavior for Copy activity in Fabric pipe... - Microsoft Fabric Community&lt;/A&gt;) asking for a metadata-only Move operation that can be performed on Azure Storage, Lakehouse, and so on. I know these capabilities exist; they just aren't available in pipelines, which is unfortunate.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I'll share the scenario we're facing. We deal with large volumes of files each day. Each file may have 2-4+ location changes over the lifetime of the file:&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;Copy #1: Copy from Azure Storage to Lakehouse&lt;/LI&gt;&lt;LI&gt;Move #1: Archive SFTP file in archive folder&lt;/LI&gt;&lt;LI&gt;Move #2: Archive Lakehouse file after processing&lt;/LI&gt;&lt;LI&gt;Move #3: Archive exported file after processing&lt;/LI&gt;&lt;/OL&gt;&lt;P&gt;Currently, all of these steps consist of Copy + Delete operations, which means we incur significant capacity cost at scale just because we have to literally duplicate and then delete files. If there were a simple metadata-only Move option, that would significantly help, but there isn't without doing a decent amount of custom coding in a Notebook to tap into file-system utilities.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Has anyone else faced this issue? If so, how have you approached a Move-only behavior in your pipelines? Also, be sure to upvote so &lt;A href="https://community.fabric.microsoft.com/t5/Fabric-Ideas/Add-Move-behavior-for-Copy-activity-in-Fabric-pipelines/idi-p/4844775" target="_self"&gt;this idea&lt;/A&gt; gets on the Fabric team's radar.&lt;/P&gt;</description>
      <pubDate>Thu, 26 Mar 2026 16:26:39 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/Replicating-Move-behavior-in-Fabric-pipelines/m-p/5139580#M15594</guid>
      <dc:creator>arpost</dc:creator>
      <dc:date>2026-03-26T16:26:39Z</dc:date>
    </item>
    <item>
      <title>Re: Replicating Move behavior in Fabric pipelines</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/Replicating-Move-behavior-in-Fabric-pipelines/m-p/5139813#M15598</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.fabric.microsoft.com/t5/user/viewprofilepage/user-id/307993"&gt;@arpost&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;When we implemented this sort of behaviour, I typically used the mv operation in a Spark notebook.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Link to the library documentation:&lt;/P&gt;&lt;P&gt;&lt;A href="https://learn.microsoft.com/en-us/fabric/data-engineering/microsoft-spark-utilities" target="_blank"&gt;https://learn.microsoft.com/en-us/fabric/data-engineering/microsoft-spark-utilities&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I can't guarantee how it works under the hood, but the "mv" command seems to do exactly what you want.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;We also found notebooks very convenient for this sort of operation.&amp;nbsp;This is a slightly different approach from the Data Pipeline you mention, but we didn't have any problems with it. Also, because a notebook is code, we found it much easier to configure exception handling for these operations.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;In conclusion: if the notebook route works for you, I strongly suggest trying it.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks for the good question and the proactive stance. I hope this helps. Giving kudos (a like) and marking the answer as a solution will help me and others to contribute and to use those contributions more effectively.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;BR, Yurri&lt;/P&gt;</description>
      <pubDate>Fri, 27 Mar 2026 02:27:08 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/Replicating-Move-behavior-in-Fabric-pipelines/m-p/5139813#M15598</guid>
      <dc:creator>4iurchenko</dc:creator>
      <dc:date>2026-03-27T02:27:08Z</dc:date>
    </item>
    <item>
      <title>Re: Replicating Move behavior in Fabric pipelines</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/Replicating-Move-behavior-in-Fabric-pipelines/m-p/5139999#M15603</link>
      <description>&lt;P&gt;Hello&amp;nbsp;&lt;a href="https://community.fabric.microsoft.com/t5/user/viewprofilepage/user-id/307993"&gt;@arpost&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;DIV&gt;&lt;P&gt;I’d recommend moving each file &lt;STRONG&gt;only once&lt;/STRONG&gt; into a Lakehouse landing folder (for example via SFTP/FTP) and then treating that landing zone as &lt;STRONG&gt;immutable&lt;/STRONG&gt;. After this initial move, we avoid any further physical file movement.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Incremental ingestion using Spark&lt;/STRONG&gt;&lt;BR /&gt;From the landing area, I’d use Spark Structured Streaming with checkpointing (Auto Loader pattern) to ingest files into the Bronze layer. The checkpoint becomes the source of truth for which files have already been processed.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;How this avoids reprocessing&lt;/STRONG&gt;&lt;BR /&gt;Rather than moving or deleting files to signal progress, Spark uses the checkpoint to automatically skip files it has already ingested. This allows landing files to remain in place without any risk of duplicate processing.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;DIV&gt;&lt;STRONG&gt;Removing old files with retention policies&lt;/STRONG&gt;&lt;BR /&gt;To control storage growth, I’d apply &lt;STRONG&gt;retention policies&lt;/STRONG&gt; on the landing folders to automatically delete files after a defined number of days. This provides clean-up and compliance without introducing archive moves or extra compute cost.&lt;/DIV&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;DIV&gt;&lt;P&gt;&lt;STRONG&gt;Why this is better than a move-based approach&lt;/STRONG&gt;&lt;BR /&gt;Compared to copy‑and‑delete “moves”, this significantly reduces Fabric capacity and IO cost by eliminating repeated data duplication. 
State is tracked logically via checkpoints, not physically by moving files around.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Operational benefits&lt;/STRONG&gt;&lt;BR /&gt;Overall, this keeps the architecture simpler, more reliable, and easier to scale. It reduces custom filesystem code, lowers operational risk, and aligns well with modern Lakehouse ingestion best practices.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;/DIV&gt;&lt;/DIV&gt;</description>
      <pubDate>Fri, 27 Mar 2026 11:16:22 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/Replicating-Move-behavior-in-Fabric-pipelines/m-p/5139999#M15603</guid>
      <dc:creator>deborshi_nag</dc:creator>
      <dc:date>2026-03-27T11:16:22Z</dc:date>
    </item>
    <item>
      <title>Re: Replicating Move behavior in Fabric pipelines</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/Replicating-Move-behavior-in-Fabric-pipelines/m-p/5140846#M15621</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.fabric.microsoft.com/t5/user/viewprofilepage/user-id/307993"&gt;@arpost&lt;/a&gt;&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;Thank you for reaching out to the Microsoft Fabric Community Forum. Also, thanks to&amp;nbsp;&lt;a href="https://community.fabric.microsoft.com/t5/user/viewprofilepage/user-id/1420385"&gt;@4iurchenko&lt;/a&gt;&amp;nbsp;and &lt;a href="https://community.fabric.microsoft.com/t5/user/viewprofilepage/user-id/1445078"&gt;@deborshi_nag&lt;/a&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;for their input on this thread.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN data-teams="true"&gt;Could you let us know if the suggested solution resolved your issue? If not, please share any additional details so we can assist further.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN data-teams="true"&gt;Best regards,&lt;BR /&gt;Community Support Team.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 30 Mar 2026 06:20:21 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/Replicating-Move-behavior-in-Fabric-pipelines/m-p/5140846#M15621</guid>
      <dc:creator>v-karpurapud</dc:creator>
      <dc:date>2026-03-30T06:20:21Z</dc:date>
    </item>
    <item>
      <title>Re: Replicating Move behavior in Fabric pipelines</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/Replicating-Move-behavior-in-Fabric-pipelines/m-p/5142997#M15684</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.fabric.microsoft.com/t5/user/viewprofilepage/user-id/307993"&gt;@arpost&lt;/a&gt;&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;
&lt;P&gt;We have not received a response from you regarding the query and were following up to check if you had the opportunity to review the information provided. Please feel free to contact us if you have any further questions.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thank You.&lt;/P&gt;</description>
      <pubDate>Thu, 02 Apr 2026 11:02:35 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/Replicating-Move-behavior-in-Fabric-pipelines/m-p/5142997#M15684</guid>
      <dc:creator>v-karpurapud</dc:creator>
      <dc:date>2026-04-02T11:02:35Z</dc:date>
    </item>
  </channel>
</rss>

