Hi guys,
Does anyone know how to delete a Delta table column (field) in a Lakehouse? I tried using a PySpark script like this, but it's not working:
delta_table_path = "abfss://gold@onelake.dfs.fabric.microsoft.com/gold_datawarehouse.Lakehouse/Tables/gold_notebook_runs"

# Load the Delta table
df = spark.read.format("delta").load(delta_table_path)

# Drop the unwanted columns
df = df.drop("paret_start_time", "paret_end_time")

# Write the DataFrame back to the Delta table, overwriting with the new schema
df.write.format("delta") \
    .mode("overwrite") \
    .option("overwriteSchema", "true") \
    .save(delta_table_path)
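For reference, this is how I check whether the drop actually took effect (a quick sketch, using the same delta_table_path as above):

# Reload the table and inspect the columns; the dropped fields should
# no longer appear once the rewrite succeeds
df_check = spark.read.format("delta").load(delta_table_path)
df_check.printSchema()
print("paret_start_time" in df_check.columns, "paret_end_time" in df_check.columns)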
I'm wondering why we can't just delete the column by simply clicking the field in the Lakehouse, like in SQL Server.
Thanks
Hello @VoltesDev
Could you please confirm whether your query has been resolved by the solution provided by @nilendraFabric? If it has, kindly mark the helpful response and accept it as the solution. This will assist other community members in resolving similar issues more efficiently.
Thank you
For some weird reason, the first script from my own post works when I try it again.
Hello @VoltesDev
Thank you for your response! Glad to hear that your issue is resolved! Kindly accept the solution so that it can help others easily find and apply the fix 😊.
Regards!
Hello @VoltesDev
May I ask if you have resolved this issue? If so, please mark the helpful reply and accept it as the solution. This will help other community members with similar problems solve them faster.
Thank you.
Hello @VoltesDev
Try with these settings:

# Enable column mapping and upgrade the table protocol so the table
# supports column drops and renames
spark.sql(f"""
    ALTER TABLE delta.`{delta_table_path}`
    SET TBLPROPERTIES (
        'delta.columnMapping.mode' = 'name',
        'delta.minReaderVersion' = '2',
        'delta.minWriterVersion' = '5'
    )
""")
df.write.format("delta") \
.mode("overwrite") \
.option("mergeSchema", "true") \
.option("overwriteSchema", "true") \
.save(delta_table_path)
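As an alternative to rewriting the table: once column mapping is enabled by the properties above, the drop can be done as a metadata-only operation in Spark SQL, without rewriting the data files. A sketch (the table name gold_notebook_runs is assumed from the path in the original post):

# Metadata-only column drop; requires delta.columnMapping.mode = 'name'
spark.sql("ALTER TABLE gold_notebook_runs DROP COLUMNS (paret_start_time, paret_end_time)")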
Will this overwrite my existing data in the other columns?
Yes. Using `overwriteSchema` combined with `mode("overwrite")` will overwrite the existing data in all columns.
Is there any other way?
I mean, what is the exact requirement?
I want to delete the unwanted columns, but I don't want to lose the existing data.
Try this:

# Read the table and drop the unwanted columns
df = spark.read.format("delta").table("your_table_name")
df = df.drop("column1", "column2")

# Write the result to a temporary table, then swap it in
df.write.format("delta").mode("overwrite").saveAsTable("temp_table")
spark.sql("DROP TABLE your_table_name")
spark.sql("ALTER TABLE temp_table RENAME TO your_table_name")
It seems this is the only way to fulfill your requirement.
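If the worry is data loss, a cheap sanity check is to compare row counts around the swap (a sketch, using the hypothetical your_table_name from above):

# Count rows before the swap...
before = spark.table("your_table_name").count()

# ... run the drop / saveAsTable / DROP / RENAME steps shown above ...

# ...and after: dropping columns should not change the row count
after = spark.table("your_table_name").count()
assert before == after, "row count changed during the column swap"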