Hello,
I'm trying to write a Spark DataFrame into a Warehouse from a notebook, following this tutorial: Spark connector for Microsoft Fabric Data Warehouse - Microsoft Fabric | Microsoft Learn
I can read a table without any issue, but writing gives me an error:
import com.microsoft.spark.fabric  # registers synapsesql() on readers/writers (per the connector docs)
from com.microsoft.spark.fabric.Constants import Constants

database_name = "warehouse_name"
df = spark.read.option(Constants.DatabaseName, database_name).synapsesql("SELECT TOP 1 * FROM [sys_data].[statistic]")
final_table = f"{database_name}.schema_name.new_table_stat"
df.write.mode("append").synapsesql(final_table)
I get this error (I've modified the warehouse URL before sharing it).
I pinpointed this:
Path 'https://i-api.onelake.fabric.microsoft.com/warehouse_id/_system/artifacts/synapse-notebook-id/id/user/trusted-service-user/statistic378ec4926caf4943994e3ab45bc57c8e/*.parquet' has URL suffix which is not allowed.
From the full error:
---------------------------------------------------------------------------
Py4JJavaError Traceback (most recent call last)
Cell In[56], line 3
1 df = spark.read.option(Constants.DatabaseName, database_name).synapsesql("SELECT TOP 1 * FROM [sys_data].[statistic]")
2 final_table = rf"{database_name}.schema_name.new_table_stat"
----> 3 df.write.mode("append").synapsesql(final_table)
File ~/cluster-env/trident_env/lib/python3.11/site-packages/com/microsoft/spark/fabric/FabricDWWriter.py:15, in synapsesql(self, table_name)
13 connector.synapsesql(table_name)
14 except Exception as e:
---> 15 raise e
File ~/cluster-env/trident_env/lib/python3.11/site-packages/com/microsoft/spark/fabric/FabricDWWriter.py:13, in synapsesql(self, table_name)
10 try:
11 connector = sqlcontext._jvm.com.microsoft.spark.fabric.tds.implicits.write.FabricSparkTDSImplicits.FabricSparkTDSWrite(
12 self._jwrite)
---> 13 connector.synapsesql(table_name)
14 except Exception as e:
15 raise e
File ~/cluster-env/trident_env/lib/python3.11/site-packages/py4j/java_gateway.py:1322, in JavaMember.__call__(self, *args)
1316 command = proto.CALL_COMMAND_NAME +\
1317 self.command_header +\
1318 args_command +\
1319 proto.END_COMMAND_PART
1321 answer = self.gateway_client.send_command(command)
-> 1322 return_value = get_return_value(
1323 answer, self.gateway_client, self.target_id, self.name)
1325 for temp_arg in temp_args:
1326 if hasattr(temp_arg, "_detach"):
File /opt/spark/python/lib/pyspark.zip/pyspark/errors/exceptions/captured.py:179, in capture_sql_exception.<locals>.deco(*a, **kw)
177 def deco(*a: Any, **kw: Any) -> Any:
178 try:
--> 179 return f(*a, **kw)
180 except Py4JJavaError as e:
181 converted = convert_exception(e.java_exception)
File ~/cluster-env/trident_env/lib/python3.11/site-packages/py4j/protocol.py:326, in get_return_value(answer, gateway_client, target_id, name)
324 value = OUTPUT_CONVERTER[type](answer[2:], gateway_client)
325 if answer[1] == REFERENCE_TYPE:
--> 326 raise Py4JJavaError(
327 "An error occurred while calling {0}{1}{2}.\n".
328 format(target_id, ".", name), value)
329 else:
330 raise Py4JError(
331 "An error occurred while calling {0}{1}{2}. Trace:\n{3}\n".
332 format(target_id, ".", name, value))
Py4JJavaError: An error occurred while calling o7092.synapsesql.
: com.microsoft.spark.fabric.tds.write.error.FabricSparkTDSWriteError: Write orchestration failed.
at com.microsoft.spark.fabric.tds.write.processor.FabricTDSWritePreProcessor.orchestrateWrite(FabricTDSWritePreProcessor.scala:66)
at com.microsoft.spark.fabric.tds.implicits.write.FabricSparkTDSImplicits$FabricSparkTDSWrite.synapsesql(FabricSparkTDSImplicits.scala:84)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:374)
at py4j.Gateway.invoke(Gateway.java:282)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:238)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: com.microsoft.spark.fabric.tds.error.FabricSparkTDSSQLExecQryError: Failed to commit to the data warehouse. Client Connection ID - e71fdc3b-9572-48c8-91b8-0f0db1d9f144
at com.microsoft.spark.fabric.tds.write.processor.FabricSparkTDSSQLCommitHandler$.commitDataToWarehouse(FabricSparkTDSSQLCommitHandler.scala:100)
at com.microsoft.spark.fabric.tds.write.processor.FabricSparkTDSSQLCommitHandler$.$anonfun$commitToWarehouse$6(FabricSparkTDSSQLCommitHandler.scala:61)
at scala.util.Success.flatMap(Try.scala:251)
at scala.util.Try$WithFilter.flatMap(Try.scala:142)
at com.microsoft.spark.fabric.tds.write.processor.FabricSparkTDSSQLCommitHandler$.$anonfun$commitToWarehouse$4(FabricSparkTDSSQLCommitHandler.scala:57)
at scala.util.Success.flatMap(Try.scala:251)
at scala.util.Try$WithFilter.flatMap(Try.scala:142)
at com.microsoft.spark.fabric.tds.write.processor.FabricSparkTDSSQLCommitHandler$.$anonfun$commitToWarehouse$2(FabricSparkTDSSQLCommitHandler.scala:54)
at scala.util.Success.flatMap(Try.scala:251)
at scala.util.Try$WithFilter.flatMap(Try.scala:142)
at com.microsoft.spark.fabric.tds.write.processor.FabricSparkTDSSQLCommitHandler$.commitToWarehouse(FabricSparkTDSSQLCommitHandler.scala:53)
at com.microsoft.spark.fabric.tds.write.processor.FabricTDSWritePreProcessor.$anonfun$orchestrateWrite$1(FabricTDSWritePreProcessor.scala:57)
at scala.util.Success.flatMap(Try.scala:251)
at com.microsoft.spark.fabric.tds.write.processor.FabricTDSWritePreProcessor.orchestrateWrite(FabricTDSWritePreProcessor.scala:56)
... 12 more
Caused by: com.microsoft.spark.fabric.tds.error.FabricSparkTDSSQLExecQryError: Error executing query-COPY INTO [warehouse_name].[schema_name].[new_table_stat]
FROM 'abfss://658adfb8-a617-43d3-be4c-37a58448bc32@i-api.onelake.fabric.microsoft.com/workspace_id/_system/artifacts/synapse-notebook-id/id/user/trusted-service-user/statistic378ec4926caf4943994e3ab45bc57c8e/*.parquet'
WITH(
FILE_TYPE = 'PARQUET',
CREDENTIAL = (
IDENTITY='Shared Access Signature',
SECRET='?[REDACTED]')). A retry attempt for error code - 13832 isn't expected to change outcome.
at com.microsoft.spark.fabric.tds.utility.FabricTDSSQLUtility$.executeSQLQuery(FabricTDSSQLUtility.scala:397)
at com.microsoft.spark.fabric.tds.utility.FabricTDSSQLUtility$.execQuery(FabricTDSSQLUtility.scala:249)
at com.microsoft.spark.fabric.tds.write.processor.FabricSparkTDSSQLCommitHandler$.commitDataToWarehouse(FabricSparkTDSSQLCommitHandler.scala:116)
... 25 more
Caused by: com.microsoft.sqlserver.jdbc.SQLServerException: Path 'https://i-api.onelake.fabric.microsoft.com/workspace_id/_system/artifacts/synapse-notebook-id/id/user/trusted-service-user/statistic378ec4926caf4943994e3ab45bc57c8e/*.parquet' has URL suffix which is not allowed.
at com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDatabaseError(SQLServerException.java:259)
at com.microsoft.sqlserver.jdbc.SQLServerStatement.getNextResult(SQLServerStatement.java:1695)
at com.microsoft.sqlserver.jdbc.SQLServerStatement.doExecuteStatement(SQLServerStatement.java:920)
at com.microsoft.sqlserver.jdbc.SQLServerStatement$StmtExecCmd.doExecute(SQLServerStatement.java:814)
at com.microsoft.sqlserver.jdbc.TDSCommand.execute(IOBuffer.java:7675)
at com.microsoft.sqlserver.jdbc.SQLServerConnection.executeCommand(SQLServerConnection.java:4137)
at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeCommand(SQLServerStatement.java:272)
at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeStatement(SQLServerStatement.java:246)
at com.microsoft.sqlserver.jdbc.SQLServerStatement.execute(SQLServerStatement.java:787)
at com.microsoft.spark.fabric.tds.utility.FabricTDSSQLUtility$.executeSQLQuery(FabricTDSSQLUtility.scala:314)
... 27 more
Hi @Ard
Confirming that I'm receiving this identical error as well, on my first attempt at using this feature. I was also following the tutorial you linked, as well as this article:
Using Spark to Read and Write Data in Microsoft Fabric Data Warehouse
I've tried the overwrite option with the table both existing and not existing, but the error is identical. I also tried the append option, with no luck.
The dataframe columns and target table columns are identical.
The account I'm running the notebook with is a workspace admin, and the Warehouse is in the same workspace as the Notebook.
Hello @CMJJ
Thanks for testing this case as well.
I found a solution, even if it's not ideal, because I really think the write side of the Fabric Spark connector doesn't work.
So I used this instead:
warehouse_url = "my_url"
jdbc_url = f"jdbc:sqlserver://{warehouse_url}:1433;database={database_name}"
final_table = f"[schema].[table]"
token = mssparkutils.credentials.getToken("pbi")
df_joined.write \
.format("jdbc") \
.option("url", jdbc_url) \
.option("dbtable", final_table) \
.option("accessToken", token) \
.option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver") \
.mode("overwrite") \
.save()
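Two caveats on this workaround (general Spark JDBC behavior, not specific to Fabric): mode("overwrite") drops and recreates the target table by default, so add .option("truncate", "true") if you want to keep the existing table definition; and the JDBC path loads rows through batched INSERTs rather than the bulk COPY INTO route, so it will be slower for large DataFrames. The token returned by getToken("pbi") is also short-lived, so very long-running jobs may need to refresh it.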
Hi @Ard ,
Thank you for using Microsoft Community Forum.
@Akash_Varuna's explanation is correct: schema mismatches often cause this error. Differences in column names, data types, or column count between the DataFrame and the target table can lead to a write failure. Using overwrite can fix this by recreating the table, but it deletes existing data, so use it cautiously.
In addition to schema mismatches, the error mentioning "URL suffix not allowed" points to a possible issue with how Spark writes temporary files to OneLake storage, especially when wildcards like *.parquet appear in the path. This can happen if the Fabric connector generates an invalid path for staging data. To work around this, consider writing the DataFrame to a Lakehouse first, then loading it into the Warehouse from there.
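For illustration, a minimal sketch of that staging route, assuming a default Lakehouse is attached to the notebook (the table and schema names here are placeholders, not from this thread):

# Hedged sketch: stage the DataFrame as a Delta table in the attached Lakehouse.
df.write.mode("overwrite").format("delta").saveAsTable("staged_stat")
# Then load it into the Warehouse with a cross-database T-SQL statement run
# against the Warehouse, e.g. (lakehouse/schema/table names are placeholders):
# INSERT INTO [schema_name].[new_table_stat]
# SELECT * FROM [staging_lakehouse].[dbo].[staged_stat];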
Lastly, permissions on both the Warehouse and OneLake storage are critical. Ensure the notebook identity (user, service principal, or managed identity) has the necessary write permissions for the Warehouse table and read/write access to OneLake.
If the problem continues, double-check that you’re using the latest version of the Microsoft Fabric Spark connector, and enable detailed logging to get clearer insights into the failure.
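For the logging suggestion, one generic Spark knob (not specific to the Fabric connector) is to raise the session's log level so the connector's intermediate steps appear in the driver log:

# Increase driver log verbosity for this session; set back to "WARN" when done.
spark.sparkContext.setLogLevel("DEBUG")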
I hope this resolves your issue. If you need any further assistance, feel free to reach out.
If this post helps, please give us Kudos and consider accepting it as a solution to help other members find it more quickly.
Thank you.
Hello,
Thanks for your help.
I'd rather not use the Lakehouse staging because I don't need it in my architecture, so I handled the write with this solution:
warehouse_url = "my_url"
jdbc_url = f"jdbc:sqlserver://{warehouse_url}:1433;database={database_name}"
final_table = f"[schema].[table]"
token = mssparkutils.credentials.getToken("pbi")
df_joined.write \
.format("jdbc") \
.option("url", jdbc_url) \
.option("dbtable", final_table) \
.option("accessToken", token) \
.option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver") \
.mode("overwrite") \
.save()
Hello @Akash_Varuna,
Thanks for your answer; I've tried several things.
First, I tried "overwrite" with a schema that doesn't exist:
import com.microsoft.spark.fabric
from com.microsoft.spark.fabric.Constants import Constants

database_name = "warehouse_name"
df = spark.read.option(Constants.DatabaseName, database_name).synapsesql("SELECT TOP 1 * FROM [sys_data].[statistic]")
final_table = f"{database_name}.000_test.new_table_stat"
df.write.mode("overwrite").synapsesql(final_table)
I got this error because "000_test" does not exist, which is expected:
The specified schema name "000_test" either does not exist or you do not have permission to use it.
Then I created the schema and tried again, but I got the same error as in my original post, even with "overwrite", something like:
Caused by: com.microsoft.sqlserver.jdbc.SQLServerException: Path 'https://i-api.onelake.fabric.microsoft.com/workspace_id/_system/artifacts/synapse-notebook-id/id/user/trusted-service-user/statistic378ec4926caf4943994e3ab45bc57c8e/*.parquet' has URL suffix which is not allowed.
So that doesn't solve my error.
Hello @Akash_Varuna,
The notebook is in the same workspace as the warehouse, and I'm an admin of the workspace.
Also, if I run a T-SQL query in a T-SQL notebook, I can create a table.
So yes, the permissions are right.
Hi @Ard , it might indeed be due to a schema mismatch. When using append, if there are schema differences between the source DataFrame and the target table (column names, types, or counts), Spark will not be able to write and will throw an error. In that case, you may need to use overwrite instead: it drops and recreates the table with the new schema, resolving the mismatch, but it also deletes all existing data, so use it cautiously. If it's not a schema mismatch, the error might be related to the URL path or storage permissions.
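As a quick check for the schema-mismatch theory, here is a minimal sketch that reuses the read pattern from this thread (the schema/table names are placeholders): fetch zero rows from the target table and diff its schema against the DataFrame you want to write.

# Hedged sketch: read an empty result from the target table to get its schema.
target_df = spark.read.option(Constants.DatabaseName, database_name) \
    .synapsesql("SELECT TOP 0 * FROM [schema_name].[new_table_stat]")
if len(df.schema.fields) != len(target_df.schema.fields):
    print("Column count differs")
for src, tgt in zip(df.schema.fields, target_df.schema.fields):
    if (src.name, src.dataType) != (tgt.name, tgt.dataType):
        print(f"Mismatch: {src.name} {src.dataType} vs {tgt.name} {tgt.dataType}")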