Hi,
I'm trying to use Spark SQL to alter the data type and length of a column in an existing lakehouse table.
This syntax raises an error:
%%sql
ALTER TABLE MYTABLE ALTER COLUMN mycolumn varchar(10)
Any help, please? Thanks.
Unfortunately, I cannot use an equivalent T-SQL statement against a lakehouse.
This worked for me when I tried it a couple of months ago (note the import of col, which the snippet needs):
from pyspark.sql.functions import col

spark.read.table("tableName") \
    .withColumn("columnName", col("columnName").cast("columnType")) \
    .write \
    .mode("overwrite") \
    .option("overwriteSchema", "true") \
    .saveAsTable("tableName")
Hi @pmscorca ,
In Apache Spark SQL, you cannot directly change the data type of an existing column using the ALTER TABLE command.
You need to create a new DataFrame with the changed column type and then overwrite the existing table with the new DataFrame.
Here is a sample PySpark code:
from pyspark.sql.functions import col
# Load the existing table into a DataFrame
df = spark.table("MYTABLE")
# Cast the column to the desired type
df = df.withColumn("mycolumn", col("mycolumn").cast("string"))
# Overwrite the existing table
df.write.mode("overwrite").option("overwriteSchema", "true").saveAsTable("MYTABLE")
Replace "MYTABLE" and "mycolumn" with your table and column names, and "string" with the desired data type. Note that Spark does not enforce a length parameter when casting to a string.
One small note, remember to back up your data before proceeding.
Best Regards,
Yang
Community Support Team
If any post helps, please consider accepting it as the solution to help other members find it more quickly.
If I have misunderstood your needs or you still have problems, please feel free to let us know. Thanks a lot!
Hi,
I've tried the cast method and noticed that if I try to go from a string data type to an int data type on an empty table, I get a data type incompatibility error.
I'd like to know which data types the cast method supports.
I'd like to cast to a varchar with a specified length.
I'd also like the official Microsoft documentation to include more useful, reusable Spark code for implementing notebooks.
Thanks
Hi @pmscorca ,
Thanks for the reply from @frithjof_v .
The types supported by the Cast() method include: "string", "boolean", "byte", "short", "int", "long", "float", "double", "decimal", "date", "timestamp".
Here is an official document about Cast() method:
Column.Cast(String) Method (Microsoft.Spark.Sql) - .NET for Apache Spark | Microsoft Learn
In Spark SQL, there is no direct way to cast a string to a varchar with a specified length.
Here is an alternative:
You can use withColumn and cast:
from pyspark.sql.functions import col
df = spark.sql("SELECT * FROM MYTABLE")
df = df.withColumn("mycolumn", col("mycolumn").cast("new_data_type"))
If you have any other questions, please feel free to contact me.
Best Regards,
Yang
Community Support Team
If any post helps, please consider accepting it as the solution to help other members find it more quickly.
If I have misunderstood your needs or you still have problems, please feel free to let us know. Thanks a lot!
Hi,
I can use this PySpark code successfully:
sql_sample_alter_table = """
ALTER TABLE sql_sample_table ALTER COLUMN descr TYPE VARCHAR(100)
"""
spark.sql(sql_sample_alter_table)
Hi,
However, is there a way to cast to a varchar with a specified length using PySpark code, Scala code, etc.? Thanks