<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Gen2 df external table in Data Engineering</title>
    <link>https://community.fabric.microsoft.com/t5/Data-Engineering/Gen2-df-external-table/m-p/4683550#M9129</link>
    <description>&lt;P&gt;[Edited by admin for unnecessary tagging without context]&lt;/P&gt;
&lt;P&gt;A widely preferred pattern for lakehouse data engineering, for us, has been the creation of external Delta tables. This is only possible for data sources that can be consumed from a notebook.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;However, some data sources exist beyond that reach, and the only alternative for them is a Gen2 dataflow. But a Gen2 dataflow only inserts into a lakehouse table; is there any way to insert into a chosen lakehouse subfolder instead of a table?&lt;/P&gt;
&lt;P&gt;I don’t think this is doable now. If that is the case, is it on the cards?&lt;/P&gt;</description>
    <pubDate>Wed, 07 May 2025 20:55:27 GMT</pubDate>
    <dc:creator>smpa01</dc:creator>
    <dc:date>2025-05-07T20:55:27Z</dc:date>
    <item>
      <title>Gen2 df external table</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/Gen2-df-external-table/m-p/4683550#M9129</link>
      <description>&lt;P&gt;[Edited by admin for unnecessary tagging without context]&lt;/P&gt;
&lt;P&gt;A widely preferred pattern for lakehouse data engineering, for us, has been the creation of external Delta tables. This is only possible for data sources that can be consumed from a notebook.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;However, some data sources exist beyond that reach, and the only alternative for them is a Gen2 dataflow. But a Gen2 dataflow only inserts into a lakehouse table; is there any way to insert into a chosen lakehouse subfolder instead of a table?&lt;/P&gt;
&lt;P&gt;I don’t think this is doable now. If that is the case, is it on the cards?&lt;/P&gt;</description>
      <pubDate>Wed, 07 May 2025 20:55:27 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/Gen2-df-external-table/m-p/4683550#M9129</guid>
      <dc:creator>smpa01</dc:creator>
      <dc:date>2025-05-07T20:55:27Z</dc:date>
    </item>
    <item>
      <title>Re: Gen2 df external table</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/Gen2-df-external-table/m-p/4683709#M9132</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.fabric.microsoft.com/t5/user/viewprofilepage/user-id/24978"&gt;@smpa01&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;You are right, Dataflow Gen2 currently supports writing data only to Lakehouse tables, not to specific subfolders.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;One thing you can try is to use Dataflow Gen2 to land the data in a staging table, then read that staging table from a notebook and write it to your desired location.&lt;/P&gt;</description>
      <pubDate>Wed, 07 May 2025 21:42:32 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/Gen2-df-external-table/m-p/4683709#M9132</guid>
      <dc:creator>nilendraFabric</dc:creator>
      <dc:date>2025-05-07T21:42:32Z</dc:date>
    </item>
    <item>
      <title>Re: Gen2 df external table</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/Gen2-df-external-table/m-p/4683719#M9133</link>
      <description>&lt;P&gt;Honestly, that's too much to maintain: pre_bronze -&amp;gt; bronze -&amp;gt; silver, and so on.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Dataflows have an advantage over notebooks when it comes to connecting to certain sources that don't have equivalent connectors available in notebooks — for example, on-premises SQL Server, SharePoint, etc. In such cases, there is no alternative but to use a dataflow.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Currently, dataflows remain relevant largely because of this connector gap. So, for writing to a destination, it only makes sense that Dataflow Gen2 provides the same options as a notebook does.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;To keep up with the norm, Dataflow Gen2 must offer the ability to write to subfolders. After all, any bronze data should land in Files for audit trailing.&lt;/P&gt;</description>
      <pubDate>Wed, 07 May 2025 22:04:46 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/Gen2-df-external-table/m-p/4683719#M9133</guid>
      <dc:creator>smpa01</dc:creator>
      <dc:date>2025-05-07T22:04:46Z</dc:date>
    </item>
    <item>
      <title>Re: Gen2 df external table</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/Gen2-df-external-table/m-p/4683739#M9134</link>
      <description>&lt;P&gt;I'm curious, what are the benefits of writing to files instead of just appending to a lakehouse bronze delta table?&lt;/P&gt;</description>
      <pubDate>Wed, 07 May 2025 22:52:35 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/Gen2-df-external-table/m-p/4683739#M9134</guid>
      <dc:creator>frithjof_v</dc:creator>
      <dc:date>2025-05-07T22:52:35Z</dc:date>
    </item>
    <item>
      <title>Re: Gen2 df external table</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/Gen2-df-external-table/m-p/4683772#M9135</link>
      <description>&lt;H3&gt;&lt;STRONG&gt;Why External Tables Are Ideal for the Bronze Layer in Production Data Lakes (based on my practical experience in data engineering and serving BI)&lt;/STRONG&gt;&lt;/H3&gt;
&lt;P&gt;In a well-architected Data Lake, data flows through three layers:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;Bronze&lt;/STRONG&gt; (Raw Ingestion),&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Silver&lt;/STRONG&gt; (Cleaned &amp;amp; Enriched),&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Gold&lt;/STRONG&gt; (Curated Business Data with Semantic Models).&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;The &lt;STRONG&gt;Bronze Layer&lt;/STRONG&gt; is where raw data from various sources like on-prem SQL, SharePoint, Azure SQL, Oracle, APIs, and Databricks is ingested. Using &lt;STRONG&gt;external tables&lt;/STRONG&gt; for this layer is highly advantageous for the following reasons:&lt;/P&gt;
&lt;HR /&gt;
&lt;H3&gt;&lt;STRONG&gt;1. Data Persists Beyond Table Lifetime&lt;/STRONG&gt;&lt;/H3&gt;
&lt;UL&gt;
&lt;LI&gt;External tables store data separately from the metadata, so &lt;STRONG&gt;dropping the table does not delete the data&lt;/STRONG&gt;.&lt;/LI&gt;
&lt;LI&gt;This ensures raw ingested data is always available for reprocessing or auditing.&lt;/LI&gt;
&lt;/UL&gt;
&lt;HR /&gt;
&lt;H3&gt;&lt;STRONG&gt;2. Easy Table Rebuilds Without Re-ingestion&lt;/STRONG&gt;&lt;/H3&gt;
&lt;UL&gt;
&lt;LI&gt;Since the data remains in the storage layer, you can &lt;STRONG&gt;recreate the table schema at any time&lt;/STRONG&gt; without fetching the source data again.&lt;/LI&gt;
&lt;LI&gt;This is crucial for schema adjustments or optimization without risking data loss.&lt;/LI&gt;
&lt;/UL&gt;
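&lt;P&gt;As a minimal sketch of the rebuild step above (the table and path names are hypothetical; in a Fabric notebook the resulting DDL would be passed to spark.sql):&lt;/P&gt;

```python
# Sketch: re-register an external Delta table over data that already
# lives under the lakehouse Files area. Names are illustrative only.

def external_table_ddl(table_name: str, files_path: str) -> str:
    """Build DDL that recreates the table's metadata without touching
    or re-ingesting the underlying Delta files."""
    return (
        f"CREATE TABLE IF NOT EXISTS {table_name} "
        f"USING DELTA LOCATION '{files_path}'"
    )

ddl = external_table_ddl("bronze_orders", "Files/bronze/orders")
# In a Fabric notebook: spark.sql(ddl)
```

&lt;P&gt;Because the data stays in place, this DDL can be rerun after a drop without any re-ingestion.&lt;/P&gt;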
&lt;HR /&gt;
&lt;H3&gt;&lt;STRONG&gt;3. Multiple Silver/Gold Views from the Same Data&lt;/STRONG&gt;&lt;/H3&gt;
&lt;UL&gt;
&lt;LI&gt;External tables allow you to build &lt;STRONG&gt;multiple transformations&lt;/STRONG&gt; (Silver/Gold) from the same Bronze data.&lt;/LI&gt;
&lt;LI&gt;This eliminates redundancy and maintains a &lt;STRONG&gt;single source of truth&lt;/STRONG&gt; for different business units like Finance, Procurement, Leasing, and Engineering.&lt;/LI&gt;
&lt;/UL&gt;
&lt;HR /&gt;
&lt;H3&gt;&lt;STRONG&gt;4. Flexible Backfills and Schema Evolutions&lt;/STRONG&gt;&lt;/H3&gt;
&lt;UL&gt;
&lt;LI&gt;Adding new columns, adjusting schemas, and performing historical backfills are seamless.&lt;/LI&gt;
&lt;LI&gt;You can introduce new attributes for all past, present, and future data without re-ingesting or dropping the table.&lt;/LI&gt;
&lt;/UL&gt;
&lt;HR /&gt;
&lt;H3&gt;&lt;STRONG&gt;5. Enhanced Audit Traceability&lt;/STRONG&gt;&lt;/H3&gt;
&lt;UL&gt;
&lt;LI&gt;Every row can be traced back to its &lt;STRONG&gt;original source file&lt;/STRONG&gt; or API batch.&lt;/LI&gt;
&lt;LI&gt;This provides clear visibility into when and where data was ingested — critical for regulatory compliance and debugging.&lt;/LI&gt;
&lt;/UL&gt;
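&lt;P&gt;A minimal sketch of such lineage tagging (the column names _source_file and _ingested_at are assumptions, not a Fabric convention; adapt them to your schema):&lt;/P&gt;

```python
# Sketch: stamp every ingested row with its originating file and an
# ingestion timestamp so it can be traced back to its source batch.
from datetime import datetime, timezone

def tag_with_lineage(rows, source_file):
    """Return copies of rows with audit columns attached."""
    stamp = datetime.now(timezone.utc).isoformat()
    return [
        {**row, "_source_file": source_file, "_ingested_at": stamp}
        for row in rows
    ]

tagged = tag_with_lineage([{"id": 1}], "orders_2025-05-07.csv")
```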
&lt;HR /&gt;
&lt;H3&gt;&lt;STRONG&gt;Conclusion&lt;/STRONG&gt;&lt;/H3&gt;
&lt;P&gt;External tables in the Bronze layer offer:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;STRONG&gt;Data safety&lt;/STRONG&gt; beyond table lifecycle&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Rebuild flexibility&lt;/STRONG&gt; without re-fetching data&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Multi-view capability&lt;/STRONG&gt; for different business requirements&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Smooth schema evolution&lt;/STRONG&gt; and backfills&lt;/LI&gt;
&lt;LI&gt;&lt;STRONG&gt;Full audit traceability&lt;/STRONG&gt; for compliance and debugging&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;This design pattern forms the backbone of a resilient, scalable, and auditable Data Lake architecture.&lt;/P&gt;</description>
      <pubDate>Thu, 08 May 2025 00:37:05 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/Gen2-df-external-table/m-p/4683772#M9135</guid>
      <dc:creator>smpa01</dc:creator>
      <dc:date>2025-05-08T00:37:05Z</dc:date>
    </item>
    <item>
      <title>Re: Gen2 df external table</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/Gen2-df-external-table/m-p/4683878#M9136</link>
      <description>&lt;P&gt;At the moment, Dataflow Gen2 only loads data to tables. Do feel free to suggest new destinations (and formats) in the Fabric Ideas portal (&lt;A href="https://aka.ms/FabricIdeas" target="_blank"&gt;https://aka.ms/FabricIdeas&lt;/A&gt;)&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;An alternative is to leverage the copy activity or a copy job. This fits especially well because the bronze layer typically holds files in their raw state; no transformation should be performed at that layer, so a simple copy activity should be good enough. If a connector is missing from the copy job / copy activity, would you mind letting us know what the source is? You can also post a new idea for such a connector in the Ideas Portal.&lt;/P&gt;</description>
      <pubDate>Thu, 08 May 2025 03:53:03 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/Gen2-df-external-table/m-p/4683878#M9136</guid>
      <dc:creator>miguel</dc:creator>
      <dc:date>2025-05-08T03:53:03Z</dc:date>
    </item>
  </channel>
</rss>

