<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Using Spark SQL to alter a column of an existing lakehouse table in Data Engineering</title>
    <link>https://community.fabric.microsoft.com/t5/Data-Engineering/Using-Spark-SQL-to-alter-a-column-of-an-existing-lakehouse-table/m-p/4036728#M2965</link>
    <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I can use this PySpark code successfully:&lt;/P&gt;&lt;LI-CODE lang="python"&gt;sql_sample_alter_table = """
ALTER TABLE sql_sample_table ALTER COLUMN descr TYPE VARCHAR(100)
"""
spark.sql(sql_sample_alter_table)&lt;/LI-CODE&gt;</description>
    <pubDate>Thu, 11 Jul 2024 09:22:21 GMT</pubDate>
    <dc:creator>pmscorca</dc:creator>
    <dc:date>2024-07-11T09:22:21Z</dc:date>
    <item>
      <title>Using Spark SQL to alter a column of an existing lakehouse table</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/Using-Spark-SQL-to-alter-a-column-of-an-existing-lakehouse-table/m-p/4032526#M2916</link>
<description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I'm trying to use Spark SQL to alter the data type and the length of a column of an existing lakehouse table.&lt;/P&gt;&lt;P&gt;This syntax raises an error:&lt;/P&gt;&lt;LI-CODE lang="python"&gt;%%sql
ALTER TABLE MYTABLE ALTER COLUMN mycolumn varchar(10)&lt;/LI-CODE&gt;&lt;P&gt;Any help, please? Thanks.&lt;/P&gt;&lt;P&gt;Unfortunately, I cannot use an equivalent T-SQL statement against a lakehouse.&lt;/P&gt;</description>
      <pubDate>Tue, 09 Jul 2024 12:28:01 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/Using-Spark-SQL-to-alter-a-column-of-an-existing-lakehouse-table/m-p/4032526#M2916</guid>
      <dc:creator>pmscorca</dc:creator>
      <dc:date>2024-07-09T12:28:01Z</dc:date>
    </item>
    <item>
      <title>Re: Using Spark SQL to alter a column of an existing lakehouse table</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/Using-Spark-SQL-to-alter-a-column-of-an-existing-lakehouse-table/m-p/4033641#M2927</link>
      <description>&lt;P&gt;Hi &lt;a href="https://community.fabric.microsoft.com/t5/user/viewprofilepage/user-id/719515"&gt;@pmscorca&lt;/a&gt;&amp;nbsp;,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;In Apache Spark SQL, you generally cannot change the data type of an existing column with the ALTER TABLE command; Delta Lake only permits a limited set of compatible type changes.&lt;/P&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="vhuijieymsft_0-1720584541838.png" style="width: 400px;"&gt;&lt;img src="https://community.fabric.microsoft.com/t5/image/serverpage/image-id/1129479i23EA0D9E9AEB3EC1/image-size/medium?v=v2&amp;amp;px=400" role="button" title="vhuijieymsft_0-1720584541838.png" alt="vhuijieymsft_0-1720584541838.png" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You need to create a new DataFrame with the changed column type and then overwrite the existing table with the new DataFrame.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Here is a sample PySpark code:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;LI-CODE lang="markup"&gt;from pyspark.sql.functions import col

# Load the existing table into a DataFrame
df = spark.table("MYTABLE")

# Cast the column to the desired type
df = df.withColumn("mycolumn", col("mycolumn").cast("string"))

# Overwrite the existing table
df.write.mode('overwrite').saveAsTable("MYTABLE")&lt;/LI-CODE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Replace "MYTABLE" and "mycolumn" with the name of the table and column, and "string" with the desired data type. Note that the length parameter does not apply when casting to a string in Spark.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;One small note: remember to back up your data before proceeding.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Best Regards,&lt;BR /&gt;Yang&lt;BR /&gt;Community Support Team&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If any post&amp;nbsp;&lt;STRONG&gt;&lt;EM&gt;helps&lt;/EM&gt;&lt;/STRONG&gt;, then please consider&amp;nbsp;&lt;STRONG&gt;&lt;EM&gt;accepting it as the solution&lt;/EM&gt;&lt;/STRONG&gt;&amp;nbsp;to help other members find it more quickly.&lt;BR /&gt;If I have misunderstood your needs or you still have problems, please feel free to let us know.&amp;nbsp;&lt;STRONG&gt;&lt;EM&gt;Thanks a lot!&lt;/EM&gt;&lt;/STRONG&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 10 Jul 2024 04:09:16 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/Using-Spark-SQL-to-alter-a-column-of-an-existing-lakehouse-table/m-p/4033641#M2927</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2024-07-10T04:09:16Z</dc:date>
    </item>
    <item>
      <title>Re: Using Spark SQL to alter a column of an existing lakehouse table</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/Using-Spark-SQL-to-alter-a-column-of-an-existing-lakehouse-table/m-p/4033894#M2931</link>
<description>&lt;P&gt;This worked for me when I tried it a couple of months ago:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="python"&gt;from pyspark.sql.functions import col

spark.read.table("tableName")\
    .withColumn("columnName", col("columnName").cast("columnType"))\
    .write\
    .mode("overwrite")\
    .option("overwriteSchema", "true")\
    .saveAsTable("tableName")&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;A href="https://community.fabric.microsoft.com/t5/General-Discussion/Dropping-and-recreating-lakehouse-table/m-p/3835426/highlight/true#M6312" target="_blank" rel="noopener"&gt;https://community.fabric.microsoft.com/t5/General-Discussion/Dropping-and-recreating-lakehouse-table/m-p/3835426/highlight/true#M6312&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 10 Jul 2024 07:15:59 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/Using-Spark-SQL-to-alter-a-column-of-an-existing-lakehouse-table/m-p/4033894#M2931</guid>
      <dc:creator>frithjof_v</dc:creator>
      <dc:date>2024-07-10T07:15:59Z</dc:date>
    </item>
    <item>
      <title>Re: Using Spark SQL to alter a column of an existing lakehouse table</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/Using-Spark-SQL-to-alter-a-column-of-an-existing-lakehouse-table/m-p/4035001#M2941</link>
<description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I've tried the cast method and noticed that if I try to go from a string data type to an int data type on an empty table, I get a data type incompatibility error.&lt;BR /&gt;I'd like to know which data types the cast method supports.&lt;/P&gt;&lt;P&gt;I'd also like to cast to a varchar with a specified length.&lt;/P&gt;&lt;P&gt;I'd like the official MS documentation to provide more useful, reusable Spark code for implementing notebooks.&lt;/P&gt;&lt;P&gt;Thanks&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 10 Jul 2024 15:07:01 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/Using-Spark-SQL-to-alter-a-column-of-an-existing-lakehouse-table/m-p/4035001#M2941</guid>
      <dc:creator>pmscorca</dc:creator>
      <dc:date>2024-07-10T15:07:01Z</dc:date>
    </item>
    <item>
      <title>Re: Using Spark SQL to alter a column of an existing lakehouse table</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/Using-Spark-SQL-to-alter-a-column-of-an-existing-lakehouse-table/m-p/4035803#M2949</link>
      <description>&lt;P&gt;Hi &lt;a href="https://community.fabric.microsoft.com/t5/user/viewprofilepage/user-id/719515"&gt;@pmscorca&lt;/a&gt;&amp;nbsp;,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks for the reply from &lt;a href="https://community.fabric.microsoft.com/t5/user/viewprofilepage/user-id/437984"&gt;@frithjof_v&lt;/a&gt;&amp;nbsp;.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The types supported by the cast() method include: "string", "boolean", "byte", "short", "int", "long", "float", "double", "decimal", "date", "timestamp".&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Here is an official document about the cast() method:&lt;/P&gt;
&lt;P&gt;&lt;A href="https://learn.microsoft.com/en-us/dotnet/api/microsoft.spark.sql.column.cast?view=spark-dotnet" target="_blank"&gt;Column.Cast(String) Method (Microsoft.Spark.Sql) - .NET for Apache Spark | Microsoft Learn&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;In Spark SQL, there is no direct way to cast a string to a varchar with a specified length.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Here is an alternative:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You can use withColumn and cast:&lt;/P&gt;
&lt;LI-CODE lang="markup"&gt;from pyspark.sql.functions import col

df = spark.sql("SELECT * FROM MYTABLE")
df = df.withColumn("mycolumn", col("mycolumn").cast("new_data_type"))&lt;/LI-CODE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If you have any other questions, please feel free to contact me.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Best Regards,&lt;BR /&gt;Yang&lt;BR /&gt;Community Support Team&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
</description>
      <pubDate>Thu, 11 Jul 2024 01:29:26 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/Using-Spark-SQL-to-alter-a-column-of-an-existing-lakehouse-table/m-p/4035803#M2949</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2024-07-11T01:29:26Z</dc:date>
    </item>
    <item>
      <title>Re: Using Spark SQL to alter a column of an existing lakehouse table</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/Using-Spark-SQL-to-alter-a-column-of-an-existing-lakehouse-table/m-p/4036110#M2957</link>
<description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;However, is there a way to cast to a varchar with a specified length using PySpark code, Scala code, etc.? Thanks&lt;/P&gt;</description>
      <pubDate>Thu, 11 Jul 2024 05:09:06 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/Using-Spark-SQL-to-alter-a-column-of-an-existing-lakehouse-table/m-p/4036110#M2957</guid>
      <dc:creator>pmscorca</dc:creator>
      <dc:date>2024-07-11T05:09:06Z</dc:date>
    </item>
    <item>
      <title>Re: Using Spark SQL to alter a column of an existing lakehouse table</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/Using-Spark-SQL-to-alter-a-column-of-an-existing-lakehouse-table/m-p/4036728#M2965</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I can use this PySpark code successfully:&lt;/P&gt;&lt;LI-CODE lang="python"&gt;sql_sample_alter_table = """
ALTER TABLE sql_sample_table ALTER COLUMN descr TYPE VARCHAR(100)
"""
spark.sql(sql_sample_alter_table)&lt;/LI-CODE&gt;</description>
      <pubDate>Thu, 11 Jul 2024 09:22:21 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/Using-Spark-SQL-to-alter-a-column-of-an-existing-lakehouse-table/m-p/4036728#M2965</guid>
      <dc:creator>pmscorca</dc:creator>
      <dc:date>2024-07-11T09:22:21Z</dc:date>
    </item>
  </channel>
</rss>

