Hi Community,
Hope you are all doing well.
I need your help with reading a table in a custom schema using Spark SQL.
Below are the details of the error I'm getting while reading the table:
I have created a lakehouse in my workspace with Lakehouse schemas (Public Preview) enabled.
I have added a new schema in the lakehouse, named wind.
I have created a table using below pyspark code:
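# (df_write is a DataFrame prepared earlier in the notebook; its definition isn't shown here)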
df_write.write.format("delta") \
.mode("overwrite") \
.option("mergeSchema", "true") \
.save('abfss://RWECEDAP_DataEngineering@onelake.dfs.fabric.microsoft.com/rwece_dap_silver.Lakehouse/Tables/wind/bbii')
Please also suggest if there is a better way of writing data to a custom schema in a lakehouse.
My read code is as follows; it throws an exception when run. (I got this code by dragging and dropping the table into a code cell, which generates a snippet that refers to the schema where the table is located.)
df = spark.sql("SELECT * FROM rwece_dap_silver.wind.bbii LIMIT 1000")
display(df)
Error I'm getting while reading the above table:
---------------------------------------------------------------------------
AnalysisException Traceback (most recent call last)
Cell In[167], line 1
----> 1 df = spark.sql("SELECT * FROM rwece_dap_silver.wind.bbii LIMIT 1000")
2 display(df)
File /opt/spark/python/lib/pyspark.zip/pyspark/sql/session.py:1631, in SparkSession.sql(self, sqlQuery, args, **kwargs)
1627 assert self._jvm is not None
1628 litArgs = self._jvm.PythonUtils.toArray(
1629 [_to_java_column(lit(v)) for v in (args or [])]
1630 )
-> 1631 return DataFrame(self._jsparkSession.sql(sqlQuery, litArgs), self)
1632 finally:
1633 if len(kwargs) > 0:
File ~/cluster-env/trident_env/lib/python3.11/site-packages/py4j/java_gateway.py:1322, in JavaMember.__call__(self, *args)
1316 command = proto.CALL_COMMAND_NAME +\
1317 self.command_header +\
1318 args_command +\
1319 proto.END_COMMAND_PART
1321 answer = self.gateway_client.send_command(command)
-> 1322 return_value = get_return_value(
1323 answer, self.gateway_client, self.target_id, self.name)
1325 for temp_arg in temp_args:
1326 if hasattr(temp_arg, "_detach"):
File /opt/spark/python/lib/pyspark.zip/pyspark/errors/exceptions/captured.py:185, in capture_sql_exception.<locals>.deco(*a, **kw)
181 converted = convert_exception(e.java_exception)
182 if not isinstance(converted, UnknownException):
183 # Hide where the exception came from that shows a non-Pythonic
184 # JVM exception message.
--> 185 raise converted from None
186 else:
187 raise
AnalysisException: [REQUIRES_SINGLE_PART_NAMESPACE] spark_catalog requires a single-part namespace, but got `rwece_dap_silver`.`wind`.
Solutions Tried:
With this I'm also getting a similar issue:
df = spark.sql("SELECT * FROM RWECEDAP_DataEngineering.rwece_dap_silver.wind.bbii LIMIT 1000")
display(df)
# Output:
---------------------------------------------------------------------------
AnalysisException Traceback (most recent call last)
Cell In[173], line 1
----> 1 df = spark.sql("SELECT * FROM RWECEDAP_DataEngineering.rwece_dap_silver.wind.bbii LIMIT 1000")
2 display(df)
File /opt/spark/python/lib/pyspark.zip/pyspark/sql/session.py:1631, in SparkSession.sql(self, sqlQuery, args, **kwargs)
1627 assert self._jvm is not None
1628 litArgs = self._jvm.PythonUtils.toArray(
1629 [_to_java_column(lit(v)) for v in (args or [])]
1630 )
-> 1631 return DataFrame(self._jsparkSession.sql(sqlQuery, litArgs), self)
1632 finally:
1633 if len(kwargs) > 0:
File ~/cluster-env/trident_env/lib/python3.11/site-packages/py4j/java_gateway.py:1322, in JavaMember.__call__(self, *args)
1316 command = proto.CALL_COMMAND_NAME +\
1317 self.command_header +\
1318 args_command +\
1319 proto.END_COMMAND_PART
1321 answer = self.gateway_client.send_command(command)
-> 1322 return_value = get_return_value(
1323 answer, self.gateway_client, self.target_id, self.name)
1325 for temp_arg in temp_args:
1326 if hasattr(temp_arg, "_detach"):
File /opt/spark/python/lib/pyspark.zip/pyspark/errors/exceptions/captured.py:185, in capture_sql_exception.<locals>.deco(*a, **kw)
181 converted = convert_exception(e.java_exception)
182 if not isinstance(converted, UnknownException):
183 # Hide where the exception came from that shows a non-Pythonic
184 # JVM exception message.
--> 185 raise converted from None
186 else:
187 raise
AnalysisException: [REQUIRES_SINGLE_PART_NAMESPACE] spark_catalog requires a single-part namespace, but got `RWECEDAP_DataEngineering`.`rwece_dap_silver`.`wind`.
Hi,
I'm guessing you've not attached the lakehouse to the notebook. If I don't attach the lakehouse I get exactly that error. In my case the following;
AnalysisException: [REQUIRES_SINGLE_PART_NAMESPACE] spark_catalog requires a single-part namespace, but got `test_schema`.`dim`.
Your options are either to attach the lakehouse to the notebook as the default lakehouse (so the schema-qualified Spark SQL works), or to read the table via its abfss path and register a temp view, as below;
source_df = spark.read.format('delta').load('abfss://test@onelake.dfs.fabric.microsoft.com/test_schema.Lakehouse/Tables/dim/user_behavior_dataset')
source_df.createTempView("view_i_created")
df = spark.sql("SELECT * FROM view_i_created LIMIT 1000")
display(df)
Note that just doing a select * here will net you the same original dataframe. The real power of approach 2 is you can connect to tables in different lakehouses dynamically without having to attach them first *and* you can join/merge on different tables using the full power of Spark SQL. That said, I normally prefer to use Python/pySpark myself.
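To make that second approach concrete, here's a minimal sketch of joining tables that live in two different, unattached lakehouses via temp views. The workspace, lakehouse, schema, and table names and the join column are all placeholders for illustration, not names from this thread:

# Hypothetical sketch: join tables from two different (unattached) lakehouses.
# All workspace/lakehouse/schema/table names and the join key are placeholders.
orders_df = spark.read.format('delta').load(
    'abfss://MyWorkspace@onelake.dfs.fabric.microsoft.com/sales_lh.Lakehouse/Tables/retail/orders'
)
customers_df = spark.read.format('delta').load(
    'abfss://MyWorkspace@onelake.dfs.fabric.microsoft.com/crm_lh.Lakehouse/Tables/dim/customers'
)

# Register temp views so both tables are queryable from Spark SQL.
orders_df.createOrReplaceTempView('orders')
customers_df.createOrReplaceTempView('customers')

joined = spark.sql("""
    SELECT o.order_id, c.customer_name, o.amount
    FROM orders o
    JOIN customers c ON o.customer_id = c.customer_id
""")
display(joined)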
If this helps, please consider Accepting as a Solution to help others find it more easily 
I believe saveAsTable requires a lakehouse to be attached - same as using Spark SQL. So the following works when you've attached a default lakehouse;
df.write.saveAsTable('dim.user_behavior_dataset2')
Using the abfss path (needed if you have no attached lakehouse) requires you to use the full save functionality (or use createTempView and write to that).
The following works;
df.write.format('delta').mode('overwrite').save('abfss://test@onelake.dfs.fabric.microsoft.com/test_schema.Lakehouse/Tables/dim/user_behavior_dataset3')
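For completeness, a hedged sketch of reading that table back under the two scenarios. The names reuse the placeholders above; the assumption that spark.read.table works with a schema-qualified name once a default lakehouse is attached follows from the saveAsTable/Spark SQL behaviour described here, not from anything confirmed in this thread:

# With the lakehouse attached as the default, the schema-qualified name should work
# (assumption based on the behaviour described above):
df_read = spark.read.table('dim.user_behavior_dataset3')

# Without an attached lakehouse, fall back to loading by the abfss path:
df_read = spark.read.format('delta').load(
    'abfss://test@onelake.dfs.fabric.microsoft.com/test_schema.Lakehouse/Tables/dim/user_behavior_dataset3'
)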
Thanks @spencer_sa, the solution was helpful; reading the table via its path worked.
So do other ways of reading and writing the delta table, like spark.read.table and saveAsTable respectively, not work with a custom schema in a lakehouse?