During development & testing, I regularly have to delete lakehouse tables through the Explorer side panel and then recreate them with Python code, but this often fails because leftover files remain in the table's folder on the filesystem (probably the _delta_log folder).
Is there a better way to create lakehouse tables from a Python notebook that doesn't raise an error when recreating a table with the same name?
write_deltalake(
    delta_table_path,
    df,
    mode="overwrite",
    schema_mode="overwrite",
    engine="rust",
    storage_options={"allow_unsafe_rename": "true"}
)
---------------------------------------------------------------------------
DeltaError Traceback (most recent call last)
Cell In[40], line 12
7 df = df.apply(lambda col: col if col.name in ["revision","_ETL_Time"]
8 else col.apply(lambda x: json.dumps(x)
9 if not isinstance(x, str) else x))
10 #df.info()
---> 12 write_deltalake(
13 delta_table_path,
14 df,
15 mode="overwrite",
16 schema_mode="overwrite",
17 engine="rust",
18 storage_options={"allow_unsafe_rename": "true"}
19 )
21 display(df)
File ~/jupyter-env/python3.11/lib/python3.11/site-packages/deltalake/writer.py:304, in write_deltalake(table_or_uri, data, schema, partition_by, mode, file_options, max_partitions, max_open_files, max_rows_per_file, min_rows_per_group, max_rows_per_group, name, description, configuration, schema_mode, storage_options, partition_filters, predicate, large_dtypes, engine, writer_properties, custom_metadata)
301 return
303 data = RecordBatchReader.from_batches(schema, (batch for batch in data))
--> 304 write_deltalake_rust(
305 table_uri=table_uri,
306 data=data,
307 partition_by=partition_by,
308 mode=mode,
309 table=table._table if table is not None else None,
310 schema_mode=schema_mode,
311 predicate=predicate,
312 name=name,
313 description=description,
314 configuration=configuration,
315 storage_options=storage_options,
316 writer_properties=(
317 writer_properties._to_dict() if writer_properties else None
318 ),
319 custom_metadata=custom_metadata,
320 )
321 if table:
322 table.update_incremental()
DeltaError: Delta transaction failed, version 0 already exists.
Hi @MangoMagic ,
I hope the information provided is helpful. I wanted to check whether you were able to resolve the issue with the provided solutions. Please let us know if you need any further assistance.
Thank you.
Hi @MangoMagic ,
Thank you for reaching out to Microsoft Fabric Community.
Thank you @BhaveshPatel @Element115 for the prompt response.
May I ask if you have resolved this issue? Please let us know if you have any further issues; we are happy to help.
Thank you.
Let's first programmatically delete the table(s) properly, and thus let the underlying system perform all the necessary cleanup.
Single table
PySpark
spark.sql("DROP TABLE IF EXISTS schema_name.table_name")
If there are multiple tables in a given schema:
PySpark
schema = "your_schema_name"
# List all tables in schema
tables = spark.catalog.listTables(schema)
# Drop each table
for table in tables:
table_name = table.name
print(f"Dropping table {schema}.{table_name}")
spark.sql(f"DROP TABLE IF EXISTS {schema}.{table_name}")
Combining these, let's look at a full PySpark solution:
schema = "your_schema_name"
# Step 1: List all tables in the schema
tables = spark.catalog.listTables(schema)
# Step 2: Drop all tables
for t in tables:
print(f"Dropping table {schema}.{t.name}")
spark.sql(f"DROP TABLE IF EXISTS {schema}.{t.name}")
# Step 3: Recreate all tables with a sample schema
# WARNING: Replace sample_schema with actual schemas if you want original structures back
sample_schema = "id INT, name STRING"
for t in tables:
table_full_name = f"{schema}.{t.name}"
print(f"Recreating table {table_full_name}")
spark.sql(f"""
CREATE TABLE {table_full_name} ({sample_schema})
USING DELTA
LOCATION '/mnt/datalake/{schema}/{t.name}'
""")
If you still get complaints about the table names, it means the lakehouse hasn't updated its state yet.
Hence, what I would do:
0_create a pipeline
1_create a Notebook activity to run the delete PySpark code above
2_use a Web activity with the new lakehouse REST API to force the lakehouse to sync (instead of a Wait activity, because we can't know how long it will take)
3_on success, the next activity is a Notebook that recreates the tables (you would need to save the table names somewhere, maybe in another lakehouse dedicated only to storing metadata)
I had the same issue with lakehouse tables whose write I/O wasn't finished by the time the next activity was ready to run. That was before the new REST API, so at the time I used a PySpark script to force the lakehouse to refresh and sync all metadata, and it worked (a rough sketch of that refresh approach is below).
I haven't used the Web activity for this, so it's just a suggestion and would need to be tested.
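For reference, the metadata-refresh script mentioned above could look roughly like this (a minimal sketch, assuming a Spark session in a Fabric notebook; the schema and table names are placeholders):
schema = "your_schema_name"

# Ask Spark to drop any cached metadata and re-read the current state
# of every table in the schema before the next activity uses them.
for t in spark.catalog.listTables(schema):
    spark.catalog.refreshTable(f"{schema}.{t.name}")

# Equivalent SQL form for a single table:
# spark.sql(f"REFRESH TABLE {schema}.your_table_name")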
Hi @MangoMagic
You can delete a single table from the notebook's Explorer side panel by right-clicking the Delta table and choosing Delete.
You can only delete one Delta table at a time from the Explorer side panel; there is no alternative way there.
Alternatively, you can drop and recreate the Delta table:
DROP TABLE IF EXISTS DimTable;

-- Placeholder columns: replace with your actual schema
CREATE TABLE IF NOT EXISTS DimTable (id INT, name STRING)
USING DELTA;
This creates a managed Delta table (a table whose metadata is managed by the lakehouse).
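Since the question is about a Python notebook, here is a hedged sketch of the same drop-and-recreate pattern using the DataFrame API (assumes a Spark session; the sample data is illustrative and in practice would come from your pandas frame via spark.createDataFrame(df)):
# Drop the managed table if it exists; Fabric cleans up both metadata and files
spark.sql("DROP TABLE IF EXISTS DimTable")

# Illustrative data only; replace with your real DataFrame
spark_df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "name"])

# Recreate DimTable as a managed Delta table, overwriting data and schema
spark_df.write.format("delta") \
    .mode("overwrite") \
    .option("overwriteSchema", "true") \
    .saveAsTable("DimTable")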
Hi @MangoMagic ,
Thank you for reaching out to Microsoft Fabric Community.
Thank you @lbendlin for the prompt response.
I wanted to check if you had the opportunity to review the information provided and resolve the issue. Please let us know if you need any further assistance. We are happy to help.
Thank you.
In your notebook code, include a "remove if exists" step that clears up the remaining artifacts before you attempt to create the new artifact (a minimal sketch follows below).
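Applied to the write_deltalake snippet from the question, that "remove if exists" step could look like this (a sketch only, assuming a Fabric notebook where notebookutils is available and delta_table_path and df are the same variables used in the question):
from deltalake import write_deltalake
import notebookutils  # preinstalled in Fabric notebooks (older alias: mssparkutils)

# Clear out any leftover table folder (data files plus the _delta_log directory)
# so the Delta writer can start again from version 0 on a clean directory.
try:
    notebookutils.fs.rm(delta_table_path, True)  # True = remove recursively
except Exception:
    pass  # nothing to clean up if the folder does not exist yet

write_deltalake(
    delta_table_path,
    df,
    mode="overwrite",
    schema_mode="overwrite",
    engine="rust",
    storage_options={"allow_unsafe_rename": "true"}
)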