<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Save a spark dataframe in a Lakehouse with Schema Support enabled (Feasible?) in Data Engineering</title>
    <link>https://community.fabric.microsoft.com/t5/Data-Engineering/Save-a-spark-dataframe-in-a-Lakehouse-with-Schema-Support/m-p/4860956#M13143</link>
    <description>&lt;P&gt;When you partition a table, Spark creates folders/directories based on the partition field(s); the partition field is no longer stored in the data files but becomes a directory instead.&amp;nbsp; If you want to repartition a table, you must first create a new table with the new partition field(s), then run an INSERT INTO ... SELECT * FROM the old table statement (making sure the partition field is the first field in the SELECT), or use your PySpark code from above to load the new table.&lt;/P&gt;</description>
    <pubDate>Tue, 28 Oct 2025 18:43:39 GMT</pubDate>
    <dc:creator>jaymac210</dc:creator>
    <dc:date>2025-10-28T18:43:39Z</dc:date>
    <item>
      <title>Save a spark dataframe in a Lakehouse with Schema Support enabled (Feasible?)</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/Save-a-spark-dataframe-in-a-Lakehouse-with-Schema-Support/m-p/4745319#M10443</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;We have created a Lakehouse with schema support enabled, and developed a notebook to save a PySpark dataframe as a Delta table:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;EM&gt;spark_df.write.mode("append").format("delta").option("overwriteSchema", "true").partitionBy("MonthKey").saveAsTable("staffestablishmentplan")&lt;/EM&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;but the following error occurs:&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;&lt;DIV class=""&gt;&lt;DIV class=""&gt;&lt;SPAN class=""&gt;IllegalArgumentException&lt;/SPAN&gt;&lt;SPAN&gt;: requirement failed: The provided partitioning does not match of the table. - provided: identity(MonthKey) - table: &lt;/SPAN&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;DIV class=""&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV class=""&gt;It looks like it is not able to resolve the table. I have tried replacing the table name with dbo.&amp;lt;TableName&amp;gt;, but the error remains.&lt;/DIV&gt;&lt;DIV class=""&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV class=""&gt;Is it not possible to save tables using notebooks on a Lakehouse with schema support?&lt;/DIV&gt;&lt;DIV class=""&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV class=""&gt;I don't see this in the list of current limitations:&lt;/DIV&gt;&lt;DIV class=""&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV class=""&gt;&lt;A href="https://learn.microsoft.com/en-us/fabric/data-engineering/lakehouse-schemas#public-preview-limitations" target="_blank" rel="noopener"&gt;https://learn.microsoft.com/en-us/fabric/data-engineering/lakehouse-schemas#public-preview-limitations&lt;/A&gt;&lt;/DIV&gt;&lt;DIV class=""&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV class=""&gt;Regards,&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;</description>
      <pubDate>Thu, 26 Jun 2025 14:39:16 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/Save-a-spark-dataframe-in-a-Lakehouse-with-Schema-Support/m-p/4745319#M10443</guid>
      <dc:creator>alfBI</dc:creator>
      <dc:date>2025-06-26T14:39:16Z</dc:date>
    </item>
    <item>
      <title>Re: Save a spark dataframe in a Lakehouse with Schema Support enabled (Feasible?)</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/Save-a-spark-dataframe-in-a-Lakehouse-with-Schema-Support/m-p/4745867#M10448</link>
      <description>&lt;P&gt;It should be possible; from the error, this looks like a partition mismatch issue, not a problem with schema support itself.&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;Once a Delta table is created (for example,&amp;nbsp;staffestablishmentplan), its partitioning columns are fixed unless the table is dropped and recreated. So, if the table already exists without partitions or with different partitions, this code:&amp;nbsp;&lt;EM&gt;spark_df.write.mode("append").format("delta").option("overwriteSchema", "true").partitionBy("MonthKey").saveAsTable("staffestablishmentplan")&lt;/EM&gt; will throw an error.&lt;/LI&gt;&lt;LI&gt;Check the existing table's partitioning:&amp;nbsp;&lt;EM&gt;spark.sql("DESCRIBE DETAIL staffestablishmentplan").select("partitionColumns").show(truncate=False)&lt;/EM&gt;&lt;BR /&gt;&lt;P&gt;If the output is [], the table was created without partitions. In that case:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;&lt;P&gt;You cannot append with a new partitioning scheme unless you drop and recreate the table.&lt;/P&gt;&lt;/LI&gt;&lt;LI&gt;&lt;P&gt;You need to either:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;&lt;P&gt;remove .partitionBy("MonthKey") when appending, or&lt;/P&gt;&lt;/LI&gt;&lt;LI&gt;&lt;P&gt;drop and recreate the table with the desired partitioning.&lt;/P&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;/LI&gt;&lt;LI&gt;&lt;P&gt;If you are still early in development, dropping and recreating the table is usually the simpler option.&lt;/P&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;Final recommendation:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;Check whether the table already exists.&lt;/LI&gt;&lt;LI&gt;If it exists without the MonthKey partition:&lt;UL&gt;&lt;LI&gt;drop it, then rerun your saveAsTable with partitionBy, or&lt;/LI&gt;&lt;LI&gt;append without partitionBy if you cannot afford to drop it.&lt;/LI&gt;&lt;/UL&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;&lt;STRONG&gt;Please 'Kudos' and 'Accept as Solution' if this answered your query.&lt;/STRONG&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 27 Jun 2025 01:12:04 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/Save-a-spark-dataframe-in-a-Lakehouse-with-Schema-Support/m-p/4745867#M10448</guid>
      <dc:creator>Vinodh247</dc:creator>
      <dc:date>2025-06-27T01:12:04Z</dc:date>
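      <!--
      Editor's note: the checks in the reply above can be consolidated into one notebook cell. This is a sketch only: it assumes a Fabric notebook where the session `spark` and the DataFrame `spark_df` are predefined, and it reuses the table and column names from this thread.

      ```python
      # Sketch: choose the write strategy based on the existing table's
      # partition columns, following the steps in the reply above.
      table_name = "staffestablishmentplan"

      if spark.catalog.tableExists(table_name):
          # Inspect the partition columns of the existing Delta table.
          detail = spark.sql(f"DESCRIBE DETAIL {table_name}").collect()[0]
          existing_parts = list(detail["partitionColumns"])

          if existing_parts == ["MonthKey"]:
              # Partitioning already matches: append without partitionBy.
              spark_df.write.mode("append").format("delta").saveAsTable(table_name)
          else:
              # Partitioning differs (e.g. []): either append without partitionBy,
              # or drop and recreate with the desired partitioning. Dropping loses
              # the old rows unless they are reloaded, so only do this in development.
              spark.sql(f"DROP TABLE {table_name}")
              (spark_df.write.mode("overwrite").format("delta")
                  .partitionBy("MonthKey").saveAsTable(table_name))
      else:
          # Table does not exist yet: create it partitioned by MonthKey.
          (spark_df.write.mode("append").format("delta")
              .partitionBy("MonthKey").saveAsTable(table_name))
      ```
      -->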
    </item>
    <item>
      <title>Re: Save a spark dataframe in a Lakehouse with Schema Support enabled (Feasible?)</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/Save-a-spark-dataframe-in-a-Lakehouse-with-Schema-Support/m-p/4860956#M13143</link>
      <description>&lt;P&gt;When you partition a table, Spark creates folders/directories based on the partition field(s); the partition field is no longer stored in the data files but becomes a directory instead.&amp;nbsp; If you want to repartition a table, you must first create a new table with the new partition field(s), then run an INSERT INTO ... SELECT * FROM the old table statement (making sure the partition field is the first field in the SELECT), or use your PySpark code from above to load the new table.&lt;/P&gt;</description>
      <pubDate>Tue, 28 Oct 2025 18:43:39 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/Save-a-spark-dataframe-in-a-Lakehouse-with-Schema-Support/m-p/4860956#M13143</guid>
      <dc:creator>jaymac210</dc:creator>
      <dc:date>2025-10-28T18:43:39Z</dc:date>
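      <!--
      Editor's note: the new-table-plus-copy approach described above can be sketched with a CREATE TABLE AS SELECT. This is a sketch only: the new table name is hypothetical, `spark` is assumed to be the session predefined in a Fabric notebook, and the source table from this thread is assumed to exist.

      ```python
      # Sketch: repartition an existing Delta table by creating a new table
      # with the desired partition column and copying the data across (CTAS).
      spark.sql("""
          CREATE TABLE staffestablishmentplan_new
          USING DELTA
          PARTITIONED BY (MonthKey)
          AS SELECT * FROM staffestablishmentplan
      """)

      # Verify the new partition layout before switching readers over.
      spark.sql("DESCRIBE DETAIL staffestablishmentplan_new") \
          .select("partitionColumns").show(truncate=False)
      ```
      -->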
    </item>
  </channel>
</rss>

