rblevi01
New Member

Fabric Spark Job Definition Create Lakehouse Table pyspark

Should df.write.format("delta").saveAsTable("test2") be executed from a Fabric Spark Job Definition?  Or does it run on multiple nodes and attempt to create the table many times?


I ask because when I execute the code below, it fails with [TABLE_OR_VIEW_ALREADY_EXISTS].


I am sure the table does not exist in the Lakehouse associated with the Spark Job Definition.  


The eventual goal is to create a job that creates thousands of tables from files.  If saveAsTable can only be run from a notebook, is there an alternative way to create tables in a Lakehouse from a job?  Something like the loop below is what I have in mind.
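This is only a rough sketch - the Files/raw folder, the mssparkutils file listing, and naming each table after its source file are assumptions on my part:

# Rough sketch: one Lakehouse table per CSV file.
# Assumes mssparkutils is available in the Spark Job Definition runtime
# and that spark is the active SparkSession from the code further below.
from notebookutils import mssparkutils

for f in mssparkutils.fs.ls("Files/raw"):  # hypothetical folder of source files
    if not f.name.endswith(".csv"):
        continue
    table_name = f.name.rsplit(".", 1)[0]  # table named after the file
    df = spark.read.format("csv").option("header", "true").load(f.path)
    df.write.format("delta").saveAsTable(table_name)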


Here is the simple code that, no matter the table name, always returns an error that the table already exists.


from pyspark.sql import SparkSession

if __name__ == "__main__":
    # Initialize a SparkSession
    spark = SparkSession.builder.appName("TextToTable").getOrCreate()

    # df now is a Spark DataFrame containing CSV data from "Files/test.txt"
    df = spark.read.format("csv").option("header", "true").load("Files/test.txt")

    # show() prints the DataFrame itself and returns None, so no print() wrapper
    df.show()

    # Create a new table - THIS IS WHERE THE ERROR HAPPENS
    df.write.format("delta").saveAsTable("test2")

    # Stop the SparkSession
    spark.stop()

1 ACCEPTED SOLUTION
Anonymous
Not applicable

Hi @rblevi01 ,

Thanks for using Fabric Community.
Apologies for the issue you are facing. 

Please try the modified code below - you need to set the write mode to overwrite:

df.write.mode('overwrite').format('delta').saveAsTable('new_table11')
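
For reference, here is a sketch of the full job script with that change applied (the table name new_table11 and the path Files/test.txt are just the examples from this thread):

from pyspark.sql import SparkSession

if __name__ == "__main__":
    spark = SparkSession.builder.appName("TextToTable").getOrCreate()

    df = spark.read.format("csv").option("header", "true").load("Files/test.txt")

    # mode('overwrite') creates the table if it is missing and replaces it
    # if it already exists, so the ALREADY_EXISTS error no longer applies
    df.write.mode("overwrite").format("delta").saveAsTable("new_table11")

    spark.stop()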
I was able to execute the Spark Job Definition without any issues.

Hope this is helpful. Please let me know in case of further queries.


2 REPLIES

Thank you, that worked.  I am not sure why overwrite has to be used for a table that does not exist, but it works and I really appreciate the response.
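
If it helps anyone hitting the same thing, the catalog can be queried before the write to see whether a table with that name really exists (a sketch; spark.catalog.tableExists is standard PySpark from 3.3 onward, which I am assuming the Fabric runtime includes):

# Ask the catalog whether the current schema already has the table;
# False here alongside an ALREADY_EXISTS error on write would suggest
# the conflict is not with a genuine pre-existing table
print(spark.catalog.tableExists("test2"))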
