Hi guys,
Does anyone know how to delete a Delta table column (field) in a Lakehouse? I tried using a PySpark script like this, but it's not working:
delta_table_path = "abfss://gold@onelake.dfs.fabric.microsoft.com/gold_datawarehouse.Lakehouse/Tables/gold_notebook_runs"

# Load the Delta table
df = spark.read.format("delta").load(delta_table_path)

# Drop the unwanted columns
df = df.drop("paret_start_time", "paret_end_time")

# Write the DataFrame back to the Delta table, overwriting with the new schema
df.write.format("delta") \
    .mode("overwrite") \
    .option("overwriteSchema", "true") \
    .save(delta_table_path)
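For reference, this is how I check whether the drop actually took effect (a quick sketch, using the same delta_table_path as above):

# Reload the table and inspect the columns; the dropped fields should
# no longer appear once the rewrite succeeds
df_check = spark.read.format("delta").load(delta_table_path)
df_check.printSchema()
print("paret_start_time" in df_check.columns, "paret_end_time" in df_check.columns)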
I'm wondering why we can't just delete the column by simply clicking the field in the Lakehouse, like in SQL Server.
Thanks
Hello @VoltesDev
Could you please confirm whether your query has been resolved by the solution provided by @nilendraFabric? If it has, kindly mark the helpful response and accept it as the solution. This will assist other community members in resolving similar issues more efficiently.
Thank you
For some weird reason, the first script from my own post works when I try it again.
Hello @VoltesDev
Thank you for your response! Glad to hear that your issue is resolved! Kindly accept the solution so that it can help others easily find and apply the fix 😊.
Regards!
Hello @VoltesDev
May I ask if you have resolved this issue? If so, please mark the helpful reply and accept it as the solution. This will help other community members with similar problems solve them faster.
Thank you.
Hello @VoltesDev
Try with these settings:

# Enable column mapping and upgrade the table protocol so the table
# supports column drops and renames
spark.sql(f"""
    ALTER TABLE delta.`{delta_table_path}`
    SET TBLPROPERTIES (
        'delta.columnMapping.mode' = 'name',
        'delta.minReaderVersion' = '2',
        'delta.minWriterVersion' = '5'
    )
""")
df.write.format("delta") \
.mode("overwrite") \
.option("mergeSchema", "true") \
.option("overwriteSchema", "true") \
.save(delta_table_path)
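As an alternative to rewriting the table: once column mapping is enabled by the properties above, the drop can be done as a metadata-only operation in Spark SQL, without rewriting the data files. A sketch (the table name gold_notebook_runs is assumed from the path in the original post):

# Metadata-only column drop; requires delta.columnMapping.mode = 'name'
spark.sql("ALTER TABLE gold_notebook_runs DROP COLUMNS (paret_start_time, paret_end_time)")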
Will this overwrite my existing data in the other columns?
Yes. Using `overwriteSchema` combined with `mode("overwrite")` will overwrite the existing data in all columns.
Is there any other way?
I mean, what is the exact requirement?
I want to delete the unwanted columns, but I don't want to lose the existing data.
Try this:

# Read the table and drop the unwanted columns
df = spark.read.format("delta").table("your_table_name")
df = df.drop("column1", "column2")

# Write the result to a temporary table, then swap it in
df.write.format("delta").mode("overwrite").saveAsTable("temp_table")
spark.sql("DROP TABLE your_table_name")
spark.sql("ALTER TABLE temp_table RENAME TO your_table_name")
It seems this is the only way to fulfill your requirement.
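If the worry is data loss, a cheap sanity check is to compare row counts around the swap (a sketch, using the hypothetical your_table_name from above):

# Count rows before the swap...
before = spark.table("your_table_name").count()

# ... run the drop / saveAsTable / DROP / RENAME steps shown above ...

# ...and after: dropping columns should not change the row count
after = spark.table("your_table_name").count()
assert before == after, "row count changed during the column swap"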