Hello,
I'm trying to write a Spark DataFrame into a Warehouse from a notebook, following this tutorial: Spark connector for Microsoft Fabric Data Warehouse - Microsoft Fabric | Microsoft Learn
I can read a table without any issue, but writing gives me an error:
import com.microsoft.spark.fabric  # registers synapsesql() on readers/writers (per the connector docs)
from com.microsoft.spark.fabric.Constants import Constants

database_name = "warehouse_name"
df = spark.read.option(Constants.DatabaseName, database_name).synapsesql("SELECT TOP 1 * FROM [sys_data].[statistic]")
final_table = f"{database_name}.schema_name.new_table_stat"
df.write.mode("append").synapsesql(final_table)
I get this error (I've modified the warehouse URL before sharing it).
I pinpointed this:
Path 'https://i-api.onelake.fabric.microsoft.com/warehouse_id/_system/artifacts/synapse-notebook-id/id/user/trusted-service-user/statistic378ec4926caf4943994e3ab45bc57c8e/*.parquet' has URL suffix which is not allowed.
From the full error:
---------------------------------------------------------------------------
Py4JJavaError Traceback (most recent call last)
Cell In[56], line 3
1 df = spark.read.option(Constants.DatabaseName, database_name).synapsesql("SELECT TOP 1 * FROM [sys_data].[statistic]")
2 final_table = rf"{database_name}.schema_name.new_table_stat"
----> 3 df.write.mode("append").synapsesql(final_table)
File ~/cluster-env/trident_env/lib/python3.11/site-packages/com/microsoft/spark/fabric/FabricDWWriter.py:15, in synapsesql(self, table_name)
13 connector.synapsesql(table_name)
14 except Exception as e:
---> 15 raise e
File ~/cluster-env/trident_env/lib/python3.11/site-packages/com/microsoft/spark/fabric/FabricDWWriter.py:13, in synapsesql(self, table_name)
10 try:
11 connector = sqlcontext._jvm.com.microsoft.spark.fabric.tds.implicits.write.FabricSparkTDSImplicits.FabricSparkTDSWrite(
12 self._jwrite)
---> 13 connector.synapsesql(table_name)
14 except Exception as e:
15 raise e
File ~/cluster-env/trident_env/lib/python3.11/site-packages/py4j/java_gateway.py:1322, in JavaMember.__call__(self, *args)
1316 command = proto.CALL_COMMAND_NAME +\
1317 self.command_header +\
1318 args_command +\
1319 proto.END_COMMAND_PART
1321 answer = self.gateway_client.send_command(command)
-> 1322 return_value = get_return_value(
1323 answer, self.gateway_client, self.target_id, self.name)
1325 for temp_arg in temp_args:
1326 if hasattr(temp_arg, "_detach"):
File /opt/spark/python/lib/pyspark.zip/pyspark/errors/exceptions/captured.py:179, in capture_sql_exception.<locals>.deco(*a, **kw)
177 def deco(*a: Any, **kw: Any) -> Any:
178 try:
--> 179 return f(*a, **kw)
180 except Py4JJavaError as e:
181 converted = convert_exception(e.java_exception)
File ~/cluster-env/trident_env/lib/python3.11/site-packages/py4j/protocol.py:326, in get_return_value(answer, gateway_client, target_id, name)
324 value = OUTPUT_CONVERTER[type](answer[2:], gateway_client)
325 if answer[1] == REFERENCE_TYPE:
--> 326 raise Py4JJavaError(
327 "An error occurred while calling {0}{1}{2}.\n".
328 format(target_id, ".", name), value)
329 else:
330 raise Py4JError(
331 "An error occurred while calling {0}{1}{2}. Trace:\n{3}\n".
332 format(target_id, ".", name, value))
Py4JJavaError: An error occurred while calling o7092.synapsesql.
: com.microsoft.spark.fabric.tds.write.error.FabricSparkTDSWriteError: Write orchestration failed.
at com.microsoft.spark.fabric.tds.write.processor.FabricTDSWritePreProcessor.orchestrateWrite(FabricTDSWritePreProcessor.scala:66)
at com.microsoft.spark.fabric.tds.implicits.write.FabricSparkTDSImplicits$FabricSparkTDSWrite.synapsesql(FabricSparkTDSImplicits.scala:84)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:374)
at py4j.Gateway.invoke(Gateway.java:282)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:238)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: com.microsoft.spark.fabric.tds.error.FabricSparkTDSSQLExecQryError: Failed to commit to the data warehouse. Client Connection ID - e71fdc3b-9572-48c8-91b8-0f0db1d9f144
at com.microsoft.spark.fabric.tds.write.processor.FabricSparkTDSSQLCommitHandler$.commitDataToWarehouse(FabricSparkTDSSQLCommitHandler.scala:100)
at com.microsoft.spark.fabric.tds.write.processor.FabricSparkTDSSQLCommitHandler$.$anonfun$commitToWarehouse$6(FabricSparkTDSSQLCommitHandler.scala:61)
at scala.util.Success.flatMap(Try.scala:251)
at scala.util.Try$WithFilter.flatMap(Try.scala:142)
at com.microsoft.spark.fabric.tds.write.processor.FabricSparkTDSSQLCommitHandler$.$anonfun$commitToWarehouse$4(FabricSparkTDSSQLCommitHandler.scala:57)
at scala.util.Success.flatMap(Try.scala:251)
at scala.util.Try$WithFilter.flatMap(Try.scala:142)
at com.microsoft.spark.fabric.tds.write.processor.FabricSparkTDSSQLCommitHandler$.$anonfun$commitToWarehouse$2(FabricSparkTDSSQLCommitHandler.scala:54)
at scala.util.Success.flatMap(Try.scala:251)
at scala.util.Try$WithFilter.flatMap(Try.scala:142)
at com.microsoft.spark.fabric.tds.write.processor.FabricSparkTDSSQLCommitHandler$.commitToWarehouse(FabricSparkTDSSQLCommitHandler.scala:53)
at com.microsoft.spark.fabric.tds.write.processor.FabricTDSWritePreProcessor.$anonfun$orchestrateWrite$1(FabricTDSWritePreProcessor.scala:57)
at scala.util.Success.flatMap(Try.scala:251)
at com.microsoft.spark.fabric.tds.write.processor.FabricTDSWritePreProcessor.orchestrateWrite(FabricTDSWritePreProcessor.scala:56)
... 12 more
Caused by: com.microsoft.spark.fabric.tds.error.FabricSparkTDSSQLExecQryError: Error executing query-COPY INTO [warehouse_name].[schema_name].[new_table_stat]
FROM 'abfss://658adfb8-a617-43d3-be4c-37a58448bc32@i-api.onelake.fabric.microsoft.com/workspace_id/_system/artifacts/synapse-notebook-id/id/user/trusted-service-user/statistic378ec4926caf4943994e3ab45bc57c8e/*.parquet'
WITH(
FILE_TYPE = 'PARQUET',
CREDENTIAL = (
IDENTITY='Shared Access Signature',
SECRET='?[REDACTED]')). A retry attempt for error code - 13832 isn't expected to change outcome.
at com.microsoft.spark.fabric.tds.utility.FabricTDSSQLUtility$.executeSQLQuery(FabricTDSSQLUtility.scala:397)
at com.microsoft.spark.fabric.tds.utility.FabricTDSSQLUtility$.execQuery(FabricTDSSQLUtility.scala:249)
at com.microsoft.spark.fabric.tds.write.processor.FabricSparkTDSSQLCommitHandler$.commitDataToWarehouse(FabricSparkTDSSQLCommitHandler.scala:116)
... 25 more
Caused by: com.microsoft.sqlserver.jdbc.SQLServerException: Path 'https://i-api.onelake.fabric.microsoft.com/workspace_id/_system/artifacts/synapse-notebook-id/id/user/trusted-service-user/statistic378ec4926caf4943994e3ab45bc57c8e/*.parquet' has URL suffix which is not allowed.
at com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDatabaseError(SQLServerException.java:259)
at com.microsoft.sqlserver.jdbc.SQLServerStatement.getNextResult(SQLServerStatement.java:1695)
at com.microsoft.sqlserver.jdbc.SQLServerStatement.doExecuteStatement(SQLServerStatement.java:920)
at com.microsoft.sqlserver.jdbc.SQLServerStatement$StmtExecCmd.doExecute(SQLServerStatement.java:814)
at com.microsoft.sqlserver.jdbc.TDSCommand.execute(IOBuffer.java:7675)
at com.microsoft.sqlserver.jdbc.SQLServerConnection.executeCommand(SQLServerConnection.java:4137)
at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeCommand(SQLServerStatement.java:272)
at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeStatement(SQLServerStatement.java:246)
at com.microsoft.sqlserver.jdbc.SQLServerStatement.execute(SQLServerStatement.java:787)
at com.microsoft.spark.fabric.tds.utility.FabricTDSSQLUtility$.executeSQLQuery(FabricTDSSQLUtility.scala:314)
... 27 more
Hi @Ard
Confirming that I'm receiving this identical error as well, on my first attempt at using this feature. I was also following the tutorial you linked, as well as this article:
Using Spark to Read and Write Data in Microsoft Fabric Data Warehouse
I've tried the overwrite option with the table both existing and not existing, but the error is identical. I also tried the append option, with no luck.
The dataframe columns and target table columns are identical.
The account I'm running the notebook with is a workspace admin, and the Warehouse is in the same workspace as the Notebook.
Hello @CMJJ
Thanks for testing this case as well.
I found a solution, even if it's not ideal, because I really think the write side of the Fabric Spark connector doesn't work.
So I used this instead:
warehouse_url = "my_url"
jdbc_url = f"jdbc:sqlserver://{warehouse_url}:1433;database={database_name}"
final_table = f"[schema].[table]"
token = mssparkutils.credentials.getToken("pbi")
df_joined.write \
.format("jdbc") \
.option("url", jdbc_url) \
.option("dbtable", final_table) \
.option("accessToken", token) \
.option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver") \
.mode("overwrite") \
.save()
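Two caveats on this workaround (general Spark JDBC behavior, not specific to Fabric): mode("overwrite") drops and recreates the target table by default, so add .option("truncate", "true") if you want to keep the existing table definition; and the JDBC path loads rows through batched INSERTs rather than the bulk COPY INTO route, so it will be slower for large DataFrames. The token returned by getToken("pbi") is also short-lived, so very long-running jobs may need to refresh it.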
Hi @Ard ,
Thank you for using Microsoft Community Forum.
@Akash_Varuna's explanation is correct: schema mismatches often cause this error. Differences in column names, data types, or column count between the DataFrame and the target table can lead to a write failure. Using overwrite can fix this by recreating the table, but it deletes existing data, so use it cautiously.
In addition to schema mismatches, the error mentioning "URL suffix not allowed" points to a possible issue with how Spark writes temporary files to OneLake storage, especially when wildcards like *.parquet appear in the path. This can happen if the Fabric connector generates an invalid path for staging data. To work around this, consider writing the DataFrame to a Lakehouse first, then loading it into the Warehouse from there.
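For illustration, a minimal sketch of that staging route, assuming a default Lakehouse is attached to the notebook (the table and schema names here are placeholders, not from this thread):

# Hedged sketch: stage the DataFrame as a Delta table in the attached Lakehouse.
df.write.mode("overwrite").format("delta").saveAsTable("staged_stat")
# Then load it into the Warehouse with a cross-database T-SQL statement run
# against the Warehouse, e.g. (lakehouse/schema/table names are placeholders):
# INSERT INTO [schema_name].[new_table_stat]
# SELECT * FROM [staging_lakehouse].[dbo].[staged_stat];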
Lastly, permissions on both the Warehouse and OneLake storage are critical. Ensure the notebook identity (user, service principal, or managed identity) has the necessary write permissions for the Warehouse table and read/write access to OneLake.
If the problem continues, double-check that you’re using the latest version of the Microsoft Fabric Spark connector, and enable detailed logging to get clearer insights into the failure.
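For the logging suggestion, one generic Spark knob (not specific to the Fabric connector) is to raise the session's log level so the connector's intermediate steps appear in the driver log:

# Increase driver log verbosity for this session; set back to "WARN" when done.
spark.sparkContext.setLogLevel("DEBUG")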
I hope this resolves your issue. If you need any further assistance, feel free to reach out.
If this post helps, please give us Kudos and consider accepting it as a solution to help other members find it more quickly.
Thank you.
Hello,
Thanks for your help.
I'd rather not use the Lakehouse staging because I don't need it in my architecture, so I handled the write with this solution:
warehouse_url = "my_url"
jdbc_url = f"jdbc:sqlserver://{warehouse_url}:1433;database={database_name}"
final_table = f"[schema].[table]"
token = mssparkutils.credentials.getToken("pbi")
df_joined.write \
.format("jdbc") \
.option("url", jdbc_url) \
.option("dbtable", final_table) \
.option("accessToken", token) \
.option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver") \
.mode("overwrite") \
.save()
Hello @Akash_Varuna,
Thanks for your answer; I've tried several things.
First, I tried "overwrite" with a schema that doesn't exist:
import com.microsoft.spark.fabric
from com.microsoft.spark.fabric.Constants import Constants

database_name = "warehouse_name"
df = spark.read.option(Constants.DatabaseName, database_name).synapsesql("SELECT TOP 1 * FROM [sys_data].[statistic]")
final_table = f"{database_name}.000_test.new_table_stat"
df.write.mode("overwrite").synapsesql(final_table)
I got this error because "000_test" does not exist, which is expected:
The specified schema name "000_test" either does not exist or you do not have permission to use it.
Then I created the schema and tried again, but I got the same error as in my original post, even with "overwrite", something like:
Caused by: com.microsoft.sqlserver.jdbc.SQLServerException: Path 'https://i-api.onelake.fabric.microsoft.com/workspace_id/_system/artifacts/synapse-notebook-id/id/user/trusted-service-user/statistic378ec4926caf4943994e3ab45bc57c8e/*.parquet' has URL suffix which is not allowed.
So that doesn't solve my error.
Hello @Akash_Varuna,
The notebook is in the same workspace as the warehouse, and I'm an admin of the workspace.
Also, if I run a T-SQL query in a T-SQL notebook, I can create a table.
So yes, the permissions are right.
Hi @Ard , it might indeed be due to a schema mismatch. When using append, if there are schema differences between the source DataFrame and the target table (column names, types, or counts), Spark will not be able to write and will throw an error. In that case, you may need to use overwrite instead: it drops and recreates the table with the new schema, resolving the mismatch, but it also deletes all existing data, so use it cautiously. If it's not a schema mismatch, the error might be related to the URL path or storage permissions.
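As a quick check for the schema-mismatch theory, here is a minimal sketch that reuses the read pattern from this thread (the schema/table names are placeholders): fetch zero rows from the target table and diff its schema against the DataFrame you want to write.

# Hedged sketch: read an empty result from the target table to get its schema.
target_df = spark.read.option(Constants.DatabaseName, database_name) \
    .synapsesql("SELECT TOP 0 * FROM [schema_name].[new_table_stat]")
if len(df.schema.fields) != len(target_df.schema.fields):
    print("Column count differs")
for src, tgt in zip(df.schema.fields, target_df.schema.fields):
    if (src.name, src.dataType) != (tgt.name, tgt.dataType):
        print(f"Mismatch: {src.name} {src.dataType} vs {tgt.name} {tgt.dataType}")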