<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Save a spark dataframe in a Lakehouse with Schema Support enabled (Feasible?) in Data Engineering</title>
    <link>https://community.fabric.microsoft.com/t5/Data-Engineering/Save-a-spark-dataframe-in-a-Lakehouse-with-Schema-Support/m-p/4860956#M13143</link>
    <description>&lt;P&gt;When you partition a table, Spark creates folders/directories based on the partition field(s); the partition field is no longer stored in the data files but becomes a directory instead.&amp;nbsp; If you want to repartition a table, you must first create a new table with the new partition field(s), then run an INSERT INTO ... SELECT * FROM the old table statement (making sure the partition field is the first field in the SELECT), or use your PySpark code from above to load the new table.&lt;/P&gt;</description>
    <pubDate>Tue, 28 Oct 2025 18:43:39 GMT</pubDate>
    <dc:creator>jaymac210</dc:creator>
    <dc:date>2025-10-28T18:43:39Z</dc:date>
    <item>
      <title>Save a spark dataframe in a Lakehouse with Schema Support enabled (Feasible?)</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/Save-a-spark-dataframe-in-a-Lakehouse-with-Schema-Support/m-p/4745319#M10443</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;We have created a Lakehouse with schema support enabled, and developed a notebook to save a PySpark dataframe as a Delta table:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;EM&gt;spark_df.write.mode("append").format("delta").option("overwriteSchema", "true").partitionBy("MonthKey").saveAsTable("staffestablishmentplan")&lt;/EM&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;but the following error occurs:&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;&lt;DIV class=""&gt;&lt;DIV class=""&gt;&lt;SPAN class=""&gt;IllegalArgumentException&lt;/SPAN&gt;&lt;SPAN&gt;: requirement failed: The provided partitioning does not match of the table. - provided: identity(MonthKey) - table: &lt;/SPAN&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;DIV class=""&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV class=""&gt;It looks like it is not able to resolve the table. I have tried replacing the table name with dbo.&amp;lt;TableName&amp;gt;, but the error remains.&lt;/DIV&gt;&lt;DIV class=""&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV class=""&gt;Is it not possible to save tables using notebooks on a Lakehouse with schema support?&lt;/DIV&gt;&lt;DIV class=""&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV class=""&gt;I don't see this in the list of current limitations:&lt;/DIV&gt;&lt;DIV class=""&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV class=""&gt;&lt;A href="https://learn.microsoft.com/en-us/fabric/data-engineering/lakehouse-schemas#public-preview-limitations" target="_blank" rel="noopener"&gt;https://learn.microsoft.com/en-us/fabric/data-engineering/lakehouse-schemas#public-preview-limitations&lt;/A&gt;&lt;/DIV&gt;&lt;DIV class=""&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV class=""&gt;Regards,&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;</description>
      <pubDate>Thu, 26 Jun 2025 14:39:16 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/Save-a-spark-dataframe-in-a-Lakehouse-with-Schema-Support/m-p/4745319#M10443</guid>
      <dc:creator>alfBI</dc:creator>
      <dc:date>2025-06-26T14:39:16Z</dc:date>
    </item>
    <item>
      <title>Re: Save a spark dataframe in a Lakehouse with Schema Support enabled (Feasible?)</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/Save-a-spark-dataframe-in-a-Lakehouse-with-Schema-Support/m-p/4745867#M10448</link>
      <description>&lt;P&gt;It should be possible; from the error, this looks like a partition mismatch issue, not a problem with schema support itself.&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;Once a Delta table is created (for example,&amp;nbsp;staffestablishmentplan), its partitioning columns are fixed unless the table is dropped and recreated. So, if the table already exists without partitions or with different partitions, this code:&amp;nbsp;&lt;EM&gt;spark_df.write.mode("append").format("delta").option("overwriteSchema", "true").partitionBy("MonthKey").saveAsTable("staffestablishmentplan")&lt;/EM&gt; will throw an error.&lt;/LI&gt;&lt;LI&gt;Check the existing table's partitioning:&amp;nbsp;&lt;EM&gt;spark.sql("DESCRIBE DETAIL staffestablishmentplan").select("partitionColumns").show(truncate=False)&lt;/EM&gt;&lt;BR /&gt;&lt;P&gt;If the output is [], the table was created without partitions. In that case:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;&lt;P&gt;You cannot append with a new partitioning scheme unless you drop and recreate the table.&lt;/P&gt;&lt;/LI&gt;&lt;LI&gt;&lt;P&gt;You need to either:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;&lt;P&gt;remove .partitionBy("MonthKey") when appending, or&lt;/P&gt;&lt;/LI&gt;&lt;LI&gt;&lt;P&gt;drop and recreate the table with the desired partitioning.&lt;/P&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;/LI&gt;&lt;LI&gt;&lt;P&gt;If you are still early in development, dropping and recreating the table is usually the simpler option.&lt;/P&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;Final recommendation:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;Check whether the table already exists.&lt;/LI&gt;&lt;LI&gt;If it exists without the MonthKey partition:&lt;UL&gt;&lt;LI&gt;drop it, then rerun your saveAsTable with partitionBy, or&lt;/LI&gt;&lt;LI&gt;append without partitionBy if you cannot afford to drop it.&lt;/LI&gt;&lt;/UL&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;&lt;STRONG&gt;Please 'Kudos' and 'Accept as Solution' if this answered your query.&lt;/STRONG&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 27 Jun 2025 01:12:04 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/Save-a-spark-dataframe-in-a-Lakehouse-with-Schema-Support/m-p/4745867#M10448</guid>
      <dc:creator>Vinodh247</dc:creator>
      <dc:date>2025-06-27T01:12:04Z</dc:date>
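      <!--
      Editor's note: the checks in the reply above can be consolidated into one notebook cell. This is a sketch only: it assumes a Fabric notebook where the session `spark` and the DataFrame `spark_df` are predefined, and it reuses the table and column names from this thread.

      ```python
      # Sketch: choose the write strategy based on the existing table's
      # partition columns, following the steps in the reply above.
      table_name = "staffestablishmentplan"

      if spark.catalog.tableExists(table_name):
          # Inspect the partition columns of the existing Delta table.
          detail = spark.sql(f"DESCRIBE DETAIL {table_name}").collect()[0]
          existing_parts = list(detail["partitionColumns"])

          if existing_parts == ["MonthKey"]:
              # Partitioning already matches: append without partitionBy.
              spark_df.write.mode("append").format("delta").saveAsTable(table_name)
          else:
              # Partitioning differs (e.g. []): either append without partitionBy,
              # or drop and recreate with the desired partitioning. Dropping loses
              # the old rows unless they are reloaded, so only do this in development.
              spark.sql(f"DROP TABLE {table_name}")
              (spark_df.write.mode("overwrite").format("delta")
                  .partitionBy("MonthKey").saveAsTable(table_name))
      else:
          # Table does not exist yet: create it partitioned by MonthKey.
          (spark_df.write.mode("append").format("delta")
              .partitionBy("MonthKey").saveAsTable(table_name))
      ```
      -->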
    </item>
    <item>
      <title>Re: Save a spark dataframe in a Lakehouse with Schema Support enabled (Feasible?)</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/Save-a-spark-dataframe-in-a-Lakehouse-with-Schema-Support/m-p/4860956#M13143</link>
      <description>&lt;P&gt;When you partition a table, Spark creates folders/directories based on the partition field(s); the partition field is no longer stored in the data files but becomes a directory instead.&amp;nbsp; If you want to repartition a table, you must first create a new table with the new partition field(s), then run an INSERT INTO ... SELECT * FROM the old table statement (making sure the partition field is the first field in the SELECT), or use your PySpark code from above to load the new table.&lt;/P&gt;</description>
      <pubDate>Tue, 28 Oct 2025 18:43:39 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/Save-a-spark-dataframe-in-a-Lakehouse-with-Schema-Support/m-p/4860956#M13143</guid>
      <dc:creator>jaymac210</dc:creator>
      <dc:date>2025-10-28T18:43:39Z</dc:date>
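      <!--
      Editor's note: the new-table-plus-copy approach described above can be sketched with a CREATE TABLE AS SELECT. This is a sketch only: the new table name is hypothetical, `spark` is assumed to be the session predefined in a Fabric notebook, and the source table from this thread is assumed to exist.

      ```python
      # Sketch: repartition an existing Delta table by creating a new table
      # with the desired partition column and copying the data across (CTAS).
      spark.sql("""
          CREATE TABLE staffestablishmentplan_new
          USING DELTA
          PARTITIONED BY (MonthKey)
          AS SELECT * FROM staffestablishmentplan
      """)

      # Verify the new partition layout before switching readers over.
      spark.sql("DESCRIBE DETAIL staffestablishmentplan_new") \
          .select("partitionColumns").show(truncate=False)
      ```
      -->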
    </item>
  </channel>
</rss>

