Hi All,
Column name changes made to Delta tables in a lakehouse are not reflected in the SQL analytics endpoint. Has anyone faced similar issues? Is this a known issue or a Fabric limitation? Is a schema overwrite the only option?
Thanks,
When renaming columns in Delta tables within a lakehouse, the changes reflect in the SQL analytics endpoint after a brief delay of 1-2 minutes.
The following steps worked for me:
# Read the existing Delta table into a DataFrame
df = spark.sql("SELECT * FROM lakehouse_name.delta_table_name")

# Rename the column
df = df.withColumnRenamed("old_col_name", "new_col_name")

# Overwrite the table, replacing the old schema with the new one
df.write.format("delta").mode("overwrite").option("overwriteSchema", "true").saveAsTable("lakehouse_name.delta_table_name")
NOTE: overwriteSchema overwrites the entire schema and data of the table; mergeSchema merges the new schema with the existing schema, ensuring that all columns (old and new) are included.
After these steps, the column name change will be applied, but it may take 1-2 minutes to appear in the SQL analytics endpoint due to the metadata refresh.
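For contrast, a minimal sketch of the mergeSchema variant (table and column names are placeholders). Note that mergeSchema does not replace the old column: the renamed column is simply appended, as discussed further below.

df = spark.sql("SELECT * FROM lakehouse_name.delta_table_name")
df = df.withColumnRenamed("old_col_name", "new_col_name")

# mergeSchema keeps the existing columns and appends any new ones,
# so old_col_name remains in the schema and new_col_name is added at the end
df.write.format("delta").mode("overwrite").option("mergeSchema", "true").saveAsTable("lakehouse_name.delta_table_name")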
Thanks @Shreya_Barhate. Changes to Delta tables are still not reflecting in the SQL endpoint. I have tried with both Spark 3.4 and Spark 3.5 (public preview) in a notebook, still with no results in the last hour.
If you have already enabled column mapping on your tables, i.e. something like this:
ALTER TABLE table_name SET TBLPROPERTIES (
  'delta.columnMapping.mode' = 'name',
  'delta.minReaderVersion' = '2',
  'delta.minWriterVersion' = '5'
)
then I think you will need to create a new table (or a new lakehouse) and not use the enable-column-mapping command. Instead, just use overwriteSchema or mergeSchema if you need to edit the schema of your table (see the sketch below). You can also use ALTER TABLE ADD COLUMN, but you cannot use other ALTER TABLE commands like RENAME COLUMN or DROP COLUMN.
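A minimal sketch of that workaround, assuming placeholder names (old_table has column mapping enabled, new_table is a fresh table without it):

# Read the data out of the column-mapped table
df = spark.read.table("lakehouse_name.old_table")

# Write it to a fresh table; no column mapping properties are set,
# so the SQL analytics endpoint can sync the metadata
df.write.format("delta").mode("overwrite").saveAsTable("lakehouse_name.new_table")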
Thanks @Shreya_Barhate and @frithjof_v. I am able to make column name and data type changes using a DataFrame in the lakehouse rather than ALTER TABLE in T-SQL.
When writing the dataframe with changed columns to a Lakehouse table, are you using .option("overwriteSchema", "true") or .option("mergeSchema", "true")? Or have you found another method?
I'm curious about this topic 😃
It's .option("overwriteSchema", "true"). The mergeSchema option will append the renamed column to the end of the schema.
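A quick way to see that behavior (an illustrative sketch with placeholder names): inspect the schema after a mergeSchema write.

# After a mergeSchema overwrite, both columns are present in the schema;
# the retained old column holds nulls for the rewritten rows,
# and the renamed column appears at the end
spark.read.table("lakehouse_name.delta_table_name").printSchema()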
Hi @Debasish_p ,
Thanks for the reply from @frithjof_v .
In addition to using overwrite mode, you'll have to make sure the schemas match when saving. My test was successful; here are my test steps:
Here's my raw tabular data:
When I change the column names and then save the changes using the following statement, I get a schema mismatch error:
Use the following statement to save your changes without any problems:
# Overwrite the table, merging the renamed column into the schema
df.write.format("delta").option("mergeSchema", "true").mode("overwrite").save("Tables/products")
You can also verify that the changes were successful:
# Read the table back to verify the change
df = spark.read.format("delta").load("Tables/products")
df.show()
The resulting table at the SQL analytics endpoint looks like this:
If you have any other questions please feel free to contact me.
Best Regards,
Yang
Community Support Team
If any post helps, then please consider accepting it as the solution to help other members find it more quickly.
If I have misunderstood your needs or you still have problems, please feel free to let us know. Thanks a lot!
Nice, so mergeSchema is also an option, in addition to overwriteSchema 😃
As far as I know, overwriteSchema is the only option, yes.
EDIT: mergeSchema is another option.