I have created a package which builds the endpoint for the lakehouse (correctly, as I have validated the output) and reads the data from files in the lakehouse using Spark, but unfortunately the same code keeps failing with the following error:
======================
2025-06-25 07:53:33,693 ERROR - mcp_edp_pipeline - mcp.edp.ingestion_framework.data_io.lakehouse_reader - read:55 - Lakehouse read failed.
Traceback (most recent call last):
  File "/home/trusted-service-user/cluster-env/trident_env/lib/python3.11/site-packages/ingestion_framework/data_io/lakehouse_reader.py", line 53, in read
    return self._read_files()
  File "/home/trusted-service-user/cluster-env/trident_env/lib/python3.11/site-packages/ingestion_framework/data_io/lakehouse_reader.py", line 90, in _read_files
    reader = reader.schema(config["schema"])
  File "/opt/spark/python/lib/pyspark.zip/pyspark/sql/readwriter.py", line 307, in load
    return self._df(self._jreader.load(path))
  File "/home/trusted-service-user/cluster-env/trident_env/lib/python3.11/site-packages/py4j/java_gateway.py", line 1322, in __call__
    return_value = get_return_value(
  File "/opt/spark/python/lib/pyspark.zip/pyspark/errors/exceptions/captured.py", line 179, in deco
    return f(*a, **kw)
  File "/home/trusted-service-user/cluster-env/trident_env/lib/python3.11/site-packages/py4j/protocol.py", line 326, in get_return_value
    raise Py4JJavaError(
py4j.protocol.Py4JJavaError: An error occurred while calling o9381.load.
: Operation failed: "Bad Request", 400, GET, http://onelake.dfs.fabric.microsoft.com/ef4a09c9-61b6-45ca-b99f-1d8ea1cee548?upn=false&resource=file... FriendlyNameSupportDisabled, "Request Failed with WorkspaceId and ArtifactId should be either valid Guids or valid Names"
    at org.apache.hadoop.fs.azurebfs.services.AbfsRestOperation.completeExecute(AbfsRestOperation.java:231)
    at org.apache.hadoop.fs.azurebfs.services.AbfsRestOperation.lambda$execute$0(AbfsRestOperation.java:191)
    at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDurationOfInvocation(IOStatisticsBinding.java:464)
    at org.apache.hadoop.fs.azurebfs.services.AbfsRestOperation.execute(AbfsRestOperation.java:189)
    at org.apache.hadoop.fs.azurebfs.services.AbfsClient.listPath(AbfsClient.java:311)
    at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.listStatus(AzureBlobFileSystemStore.java:1173)
    at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.listStatus(AzureBlobFileSystemStore.java:1143)
    at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.listStatus(AzureBlobFileSystemStore.java:1125)
    at org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.listStatus(AzureBlobFileSystem.java:513)
    at org.apache.hadoop.fs.Globber.listStatus(Globber.java:128)
    at org.apache.hadoop.fs.Globber.doGlob(Globber.java:291)
    at org.apache.hadoop.fs.Globber.glob(Globber.java:202)
    at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:2123)
    at org.apache.spark.deploy.SparkHadoopUtil.globPath(SparkHadoopUtil.scala:301)
    at org.apache.spark.sql.execution.datasources.DataSource$.$anonfun$checkAndGlobPathIfNecessary$3(DataSource.scala:740)
    at org.apache.spark.util.ThreadUtils$.$anonfun$parmap$2(ThreadUtils.scala:384)
    at scala.concurrent.Future$.$anonfun$apply$1(Future.scala:659)
    at scala.util.Success.$anonfun$map$1(Try.scala:255)
    at scala.util.Success.map(Try.scala:213)
    at scala.concurrent.Future.$anonfun$map$1(Future.scala:292)
    at scala.concurrent.impl.Promise.liftedTree1$1(Promise.scala:33)
    at scala.concurrent.impl.Promise.$anonfun$transform$1(Promise.scala:33)
    at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:64)
    at java.base/java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(ForkJoinTask.java:1426)
    at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290)
    at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1020)
    at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1656)
    at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1594)
    at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:183)
---------------------------------------------------------------------------
Py4JJavaError                             Traceback (most recent call last)
File ~/cluster-env/trident_env/lib/python3.11/site-packages/ingestion_framework/data_io/lakehouse_reader.py:53, in LakehouseReader.read(self)
     52 else:
---> 53     return self._read_files()
     54 except Exception as e:
File ~/cluster-env/trident_env/lib/python3.11/site-packages/ingestion_framework/data_io/lakehouse_reader.py:90, in LakehouseReader._read_files(self)
     88     reader = reader.schema(config["schema"])
---> 90 df = reader.load(base_path)
     92 # optional row skipping
File /opt/spark/python/lib/pyspark.zip/pyspark/sql/readwriter.py:307, in DataFrameReader.load(self, path, format, schema, **options)
    306 if isinstance(path, str):
--> 307     return self._df(self._jreader.load(path))
    308 elif path is not None:
File ~/cluster-env/trident_env/lib/python3.11/site-packages/py4j/java_gateway.py:1322, in JavaMember.__call__(self, *args)
   1321 answer = self.gateway_client.send_command(command)
-> 1322 return_value = get_return_value(
   1323     answer, self.gateway_client, self.target_id, self.name)
   1325 for temp_arg in temp_args:
File /opt/spark/python/lib/pyspark.zip/pyspark/errors/exceptions/captured.py:179, in capture_sql_exception.<locals>.deco(*a, **kw)
    178 try:
--> 179     return f(*a, **kw)
    180 except Py4JJavaError as e:
File ~/cluster-env/trident_env/lib/python3.11/site-packages/py4j/protocol.py:326, in get_return_value(answer, gateway_client, target_id, name)
    325 if answer[1] == REFERENCE_TYPE:
--> 326     raise Py4JJavaError(
    327         "An error occurred while calling {0}{1}{2}.\n".
    328         format(target_id, ".", name), value)
    329 else:
Py4JJavaError: An error occurred while calling o9381.load.
: Operation failed: "Bad Request", 400, GET, http://onelake.dfs.fabric.microsoft.com/ef4a09c9-61b6-45ca-b99f-1d8ea1cee548?upn=false&resource=file... FriendlyNameSupportDisabled,
===================================
Here is the code that reads it:
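Roughly, the read path boils down to the pattern below (a simplified sketch pieced together from the frames in the traceback above, run in a Fabric Spark session where spark is predefined; the placeholder values and config keys are illustrative, not the package's actual code):

# Illustrative placeholders; the real values come from the Fabric API and the pipeline config.
workspace_id = "<workspace-guid>"
lakehouse_id = "<lakehouse-guid>"
config = {"format": "csv", "schema": "id INT, name STRING"}

base_path = f"abfss://{workspace_id}@onelake.dfs.fabric.microsoft.com/{lakehouse_id}/Files/raw/"

reader = spark.read.format(config["format"])
if config.get("schema"):
    reader = reader.schema(config["schema"])

df = reader.load(base_path)  # this .load() call is what fails with the 400 response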
I have tested all the logic and confirmed the GUID is fine, since I am using Fabric's API to extract it. I am pretty sure it is some backend issue, which I do not have the time to sit and debug, so I have worked around it and it is fixed.
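For reference, a rough sketch of looking up lakehouse GUIDs through the Fabric REST API from a notebook; the token audience ("pbi"), the endpoint shape, and the response fields are assumptions to verify against the current API docs:

import requests

# Assumption: mssparkutils.credentials.getToken("pbi") yields a token the Fabric API accepts.
token = mssparkutils.credentials.getToken("pbi")
headers = {"Authorization": f"Bearer {token}"}

workspace_id = "<workspace-guid>"  # placeholder
resp = requests.get(
    f"https://api.fabric.microsoft.com/v1/workspaces/{workspace_id}/lakehouses",
    headers=headers,
)
resp.raise_for_status()

# Print the GUID and display name of each lakehouse in the workspace.
for lh in resp.json().get("value", []):
    print(lh["id"], lh["displayName"])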
Thanks @BhaveshPatel for addressing the issue.
Hi @rookie111 ,
We wanted to kindly follow up and check whether the solution provided for the issue worked. Let us know if you need any further assistance.
Regards,
Chaithanya
Hi there,
Locally, the Spark DataFrame (i.e. the Python DataFrame) does not work. For that to work, you need to move the data to a public cloud source (the Azure environment).
For example, you can use the code below:
from pyspark.sql.functions import *
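A minimal sketch of that idea, assuming the files already sit under a lakehouse Files folder in OneLake and the code runs in a Fabric Spark session where spark is predefined; "MyWorkspace" and "MyLakehouse" are placeholder names:

# Placeholder OneLake path; replace the workspace and lakehouse names with your own.
base_path = "abfss://MyWorkspace@onelake.dfs.fabric.microsoft.com/MyLakehouse.Lakehouse/Files/raw/"

df = (
    spark.read
    .format("csv")
    .option("header", "true")
    .load(base_path)
)

# col comes from the star import above.
df.filter(col("id").isNotNull()).show(5)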
Hi @rookie111 ,
Thank you for reaching out to the Microsoft Fabric Community Forum.
Below are a few observations and debug points to help resolve your issue. Please try them and let us know if you need any further assistance.
As a temporary workaround, if schema binding is causing issues, try removing .schema(config["schema"]) and letting Spark infer the schema, just to see whether .load() succeeds. That helps isolate whether the path is invalid or the reader is misconfigured (see the sketch after these points).
Lastly, ensure your workspace_id, lakehouse_id, and storage path have not changed recently due to renaming or redeployment, as that can invalidate the previously used GUIDs.
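A minimal debugging sketch along those lines, assuming it runs in a Fabric notebook session where spark and mssparkutils are available; the GUIDs, file format, and schema string are placeholders:

# Placeholders: substitute your real workspace and lakehouse GUIDs (or valid names).
workspace_id = "00000000-0000-0000-0000-000000000000"
lakehouse_id = "11111111-1111-1111-1111-111111111111"
base_path = f"abfss://{workspace_id}@onelake.dfs.fabric.microsoft.com/{lakehouse_id}/Files/raw/"

# 1. Check that the path resolves at all; a 400 here points at the IDs/path, not the reader.
print(mssparkutils.fs.ls(base_path))

# 2. Load with schema inference only, to rule out schema binding as the culprit.
df_inferred = spark.read.format("csv").option("header", "true").load(base_path)
df_inferred.printSchema()

# 3. Re-apply an explicit schema once the path itself is known to be good.
df = spark.read.format("csv").schema("id INT, name STRING").load(base_path)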
Regards,
Chaithanya.
Thanks, all, for your time and for looking into this.