xefere
New Member

Fabric tutorial failing on files path

Hi

 

I am doing this Fabric tutorial (Lakehouse tutorial - Prepare and transform data in the lakehouse - Microsoft Fabric | Microsoft Lear...)

 

When I use the code as provided,

from pyspark.sql.functions import col, year, month, quarter

table_name = 'fact_sale'

df = spark.read.parquet('Files/wwi-raw-data/full/fact_sale_1y_full')
df = df.withColumn('Year', year(col("InvoiceDateKey")))
df = df.withColumn('Quarter', quarter(col("InvoiceDateKey")))
df = df.withColumn('Month', month(col("InvoiceDateKey")))

df.write.mode("overwrite").format("delta").partitionBy("Year","Quarter").save("Tables/" + table_name)

 

I get the following error:

---------------------------------------------------------------------------
AnalysisException                         Traceback (most recent call last)
Cell In[104], line 5
      1 from pyspark.sql.functions import col, year, month, quarter
      3 table_name = 'fact_sale'
----> 5 df = spark.read.parquet('Files/wwi-raw-data/full/fact_sale_1y_full')
      6 df = df.withColumn('Year', year(col("InvoiceDateKey")))
      7 df = df.withColumn('Quarter', quarter(col("InvoiceDateKey")))

File /opt/spark/python/lib/pyspark.zip/pyspark/sql/readwriter.py:531, in DataFrameReader.parquet(self, *paths, **options)
    520 int96RebaseMode = options.get("int96RebaseMode", None)
    521 self._set_opts(
    522     mergeSchema=mergeSchema,
    523     pathGlobFilter=pathGlobFilter,
    (...)
    528     int96RebaseMode=int96RebaseMode,
    529 )
--> 531 return self._df(self._jreader.parquet(_to_seq(self._spark._sc, paths)))

File ~/cluster-env/trident_env/lib/python3.10/site-packages/py4j/java_gateway.py:1322, in JavaMember.__call__(self, *args)
   1316 command = proto.CALL_COMMAND_NAME +\
   1317     self.command_header +\
   1318     args_command +\
   1319     proto.END_COMMAND_PART
   1321 answer = self.gateway_client.send_command(command)
-> 1322 return_value = get_return_value(
   1323     answer, self.gateway_client, self.target_id, self.name)
   1325 for temp_arg in temp_args:
   1326     if hasattr(temp_arg, "_detach"):

File /opt/spark/python/lib/pyspark.zip/pyspark/errors/exceptions/captured.py:175, in capture_sql_exception.<locals>.deco(*a, **kw)
    171 converted = convert_exception(e.java_exception)
    172 if not isinstance(converted, UnknownException):
    173     # Hide where the exception came from that shows a non-Pythonic
    174     # JVM exception message.
--> 175     raise converted from None
    176 else:
    177     raise

AnalysisException: [PATH_NOT_FOUND] Path does not exist: abfss://8c1fa0f9-27f5-4bd6-9266-e6dfccd1cf2f@onelake.dfs.fabric.microsoft.com/99c8f3be-4e9e-4f83-83f1-cc325343cf6b/Files/wwi-raw-data/full/fact_sale_1y_full.

 

(screenshot: xefere_0-1710114140024.png)

 

I've now noticed that the abfss path in the error is not the same as my lakehouse's path: the workspace GUID matches, but the lakehouse GUID in the error (99c8f3be-4e9e-4f83-83f1-cc325343cf6b) differs from my lakehouse's (cbbd6d1f-0ac3-402a-ab8f-fbc7093b6ccc).
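
A quick way to check what a relative path resolves against is to list the default lakehouse's Files area from the notebook — a minimal sketch, assuming the Fabric Spark runtime where mssparkutils is preinstalled:

# List the Files area of the notebook's default lakehouse; relative paths
# like 'Files/...' resolve against this lakehouse. If no default lakehouse
# is attached, this fails the same way as the spark.read above.
from notebookutils import mssparkutils

for item in mssparkutils.fs.ls("Files/wwi-raw-data/full"):
    print(item.path)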

 

When I run this code with the abfss path copied from my lakehouse, it works perfectly:

 

from pyspark.sql.functions import col, year, month, quarter

table_name = 'fact_sale'


df = spark.read.parquet('abfss://8c1fa0f9-27f5-4bd6-9266-e6dfccd1cf2f@onelake.dfs.fabric.microsoft.com/cbbd6d1f-0ac3-402a-ab8f-fbc7093b6ccc/Files/wwi-raw-data/full/fact_sale_1y_full')
df = df.withColumn('Year', year(col("InvoiceDateKey")))
df = df.withColumn('Quarter', quarter(col("InvoiceDateKey")))
df = df.withColumn('Month', month(col("InvoiceDateKey")))
 
df.write.mode("overwrite").format("delta").partitionBy("Year","Quarter").save("abfss://8c1fa0f9-27f5-4bd6-9266-e6dfccd1cf2f@onelake.dfs.fabric.microsoft.com/cbbd6d1f-0ac3-402a-ab8f-fbc7093b6ccc/Tables/" + table_name)
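
As an aside, one way to keep the two hard-coded URLs from drifting apart is to build the OneLake root once and reuse it for both the read and the write — a sketch using the workspace and lakehouse GUIDs from the working path above:

from pyspark.sql.functions import col, year, month, quarter

# Assemble the OneLake root once so the read and write paths stay consistent.
# The two GUIDs are the workspace and lakehouse IDs from the working abfss path.
workspace_id = "8c1fa0f9-27f5-4bd6-9266-e6dfccd1cf2f"
lakehouse_id = "cbbd6d1f-0ac3-402a-ab8f-fbc7093b6ccc"
onelake_root = f"abfss://{workspace_id}@onelake.dfs.fabric.microsoft.com/{lakehouse_id}"

table_name = 'fact_sale'

df = spark.read.parquet(f"{onelake_root}/Files/wwi-raw-data/full/fact_sale_1y_full")
df = df.withColumn('Year', year(col("InvoiceDateKey")))
df = df.withColumn('Quarter', quarter(col("InvoiceDateKey")))
df = df.withColumn('Month', month(col("InvoiceDateKey")))
df.write.mode("overwrite").format("delta").partitionBy("Year", "Quarter").save(f"{onelake_root}/Tables/{table_name}")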
 
What am I doing wrong, or what is wrong in my setup?
1 ACCEPTED SOLUTION
v-gchenna-msft
Community Support

Hi @xefere ,

Apologies for the delay in replying from our side.
Based on the screenshot you provided, I can see that the lakehouse is not set as the default lakehouse in your case.
Once you set it as the default lakehouse, you will be able to use the relative file path, i.e. 'Files/wwi-raw-data/full/fact_sale_1y_full'.

(screenshot: vgchennamsft_0-1710313804660.png)


Hope this is helpful. Please let me know in case of further queries.
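
Besides attaching the lakehouse in the notebook UI, the default lakehouse can also be pinned in the first cell of the notebook with the %%configure magic — a sketch, assuming the Fabric runtime's defaultLakehouse option; the name below is the tutorial's lakehouse and the IDs are the GUIDs from this thread:

%%configure
{
    "defaultLakehouse": {
        "name": "wwilakehouse",
        "id": "cbbd6d1f-0ac3-402a-ab8f-fbc7093b6ccc",
        "workspaceId": "8c1fa0f9-27f5-4bd6-9266-e6dfccd1cf2f"
    }
}

With that in place, relative 'Files/...' and 'Tables/...' paths resolve against that lakehouse without hard-coding any abfss URL.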

 


4 REPLIES
xefere
New Member

Thank you, I've added the Lakehouse in the Sources panel of the notebook and the relative path worked perfectly.

Glad to know that your query is resolved. Please continue using the Fabric community for your further queries.

v-gchenna-msft
Community Support

Hi @xefere ,

We haven't heard from you since our last response and wanted to check whether you have found a resolution yet.
If you have, please share it with the community, as it can be helpful to others.
Otherwise, we will respond with more details and try to help.
