VoltesDev
Helper V

How to delete Delta Table column in Lakehouse

Hi guys,

 

Does anyone know how to delete a Delta table column (field) in a Lakehouse? I tried a PySpark script like this, but it's not working:

 

delta_table_path = "abfss://gold@onelake.dfs.fabric.microsoft.com/gold_datawarehouse.Lakehouse/Tables/gold_notebook_runs"

# Load the Delta table
df = spark.read.format("delta").load(delta_table_path)

# Drop the unwanted columns from the DataFrame
df = df.drop("paret_start_time", "paret_end_time")

# Write the DataFrame back to the Delta table, overwriting with the new schema
df.write.format("delta") \
    .mode("overwrite") \
    .option("overwriteSchema", "true") \
    .save(delta_table_path)

 

I'm wondering why we can't just delete a column by clicking the field in the Lakehouse, like in SQL Server.

 

Thanks

1 ACCEPTED SOLUTION

Yes. Using `overwriteSchema` combined with `mode("overwrite")` will overwrite the existing data in all columns.


11 REPLIES
v-karpurapud
Community Support

Hello @VoltesDev 

Could you please confirm whether your query has been resolved by the solution provided by @nilendraFabric? If it has, kindly mark the helpful response and accept it as the solution. This will help other community members resolve similar issues more efficiently.

Thank you

 

For some weird reason, the first script from my own post works when I try it again.

 

Hello @VoltesDev 

Thank you for your response! Glad to hear that your issue is resolved! Kindly accept the solution so that others can easily find and apply the fix 😊.


Regards!

Hello @VoltesDev 

May I ask if you have resolved this issue? If so, please mark the helpful reply and accept it as the solution. This will help other community members with similar problems solve them faster.

Thank you.

nilendraFabric
Community Champion

Hello @VoltesDev 

 

Try with these settings:

 

# Enable column mapping so columns can be renamed/dropped without rewriting data.
# The Python DeltaTable API has no alter()/setProperty(), so set the table
# properties with Spark SQL instead (requires reader version 2, writer version 5).
spark.sql(f"""
    ALTER TABLE delta.`{delta_table_path}` SET TBLPROPERTIES (
        'delta.columnMapping.mode' = 'name',
        'delta.minReaderVersion' = '2',
        'delta.minWriterVersion' = '5'
    )
""")

 

df.write.format("delta") \
.mode("overwrite") \
.option("mergeSchema", "true") \
.option("overwriteSchema", "true") \
.save(delta_table_path)
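As an aside: once those table properties are set, Delta Lake's `ALTER TABLE ... DROP COLUMNS` becomes available, and it drops a column as a metadata-only change, leaving the data in the remaining columns untouched. A minimal sketch; the helper `drop_columns_sql` is my own name, not a Delta Lake API, and in a Fabric notebook you would pass the resulting string to the implicit `spark` session:

```python
# Sketch: build a metadata-only DROP COLUMNS statement for a Delta table.
# Requires 'delta.columnMapping.mode' = 'name' (set by the properties above).
# drop_columns_sql is a hypothetical helper, not part of any library.
def drop_columns_sql(table_path: str, columns: list[str]) -> str:
    cols = ", ".join(columns)
    return f"ALTER TABLE delta.`{table_path}` DROP COLUMNS ({cols})"

delta_table_path = (
    "abfss://gold@onelake.dfs.fabric.microsoft.com/"
    "gold_datawarehouse.Lakehouse/Tables/gold_notebook_runs"
)
stmt = drop_columns_sql(delta_table_path, ["paret_start_time", "paret_end_time"])
print(stmt)
# In a Fabric notebook: spark.sql(stmt)
```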

Will this overwrite my existing data in the other columns?

Yes. Using `overwriteSchema` combined with `mode("overwrite")` will overwrite the existing data in all columns.

Is there any other way?

I mean, what is the exact requirement?

I want to delete the unwanted columns, but I don't want to lose the existing data.

Try this:

 

# Read the table and drop the unwanted columns
df = spark.read.format("delta").table("your_table_name")
df = df.drop("column1", "column2")

# Write the reduced DataFrame to a temporary table
df.write.format("delta").mode("overwrite").saveAsTable("temp_table")

# Swap the temporary table in place of the original
spark.sql("DROP TABLE your_table_name")
spark.sql("ALTER TABLE temp_table RENAME TO your_table_name")

 

It seems this is the only way to fulfil your requirement.
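The drop-and-swap above can be sketched as a small helper that emits the two SQL statements in order. `swap_statements` is a hypothetical helper of mine, not a Spark API; the table names are the placeholders from the reply, and each statement would be run with `spark.sql` in a notebook. Note that dropping and recreating the table this way discards the original table's Delta history:

```python
# Sketch of the drop-and-swap from the reply as ordered SQL statements.
# Assumes temp_table has already been written with the reduced schema.
def swap_statements(original: str, temp: str) -> list[str]:
    return [
        f"DROP TABLE {original}",
        f"ALTER TABLE {temp} RENAME TO {original}",
    ]

for stmt in swap_statements("your_table_name", "temp_table"):
    print(stmt)
    # In a notebook: spark.sql(stmt)
```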

 

 
