jcamilo1985
Helper III

Read Azure Data Lake from a Fabric notebook

Good morning,

The question is simple; I fear the solution is not. Has anyone been able to connect from a Fabric notebook to an Azure Data Lake Storage Gen2 account without having to build a data pipeline or a dataflow, that is, directly with PySpark, Spark (Scala), or SparkR?

If possible, I would greatly appreciate a code fragment in these three languages.

Thank you very much in advance for any guidance.

 

[screenshot: connection.png]

@Anonymous 

1 ACCEPTED SOLUTION
govindarajan_d
Super User

Hi @jcamilo1985,

 

You need to go to the Lakehouse view and add a new shortcut:

[screenshot: New shortcut dialog]

1. Specify the DFS path of the storage account (you can get it from the endpoint properties of the storage account).

2. Authentication method - you can use your own organizational account, a SAS token, or a service principal. I would suggest a service principal as the best practice.

3. In the next screen, add a specific container as the path and then click Create.

4. Once added, the shortcut is shown under the Unidentified folder with the storage account name, and you will be able to see all the files underneath it.

[screenshot: shortcut and its files in the lakehouse]

 

5. To use it in a Spark notebook, add the corresponding lakehouse to the notebook, navigate to the file you want to load, and click the '...' menu; you will see the option to load the data using Spark or Pandas. Either way, the code for loading the file into a dataframe is generated automatically.

[screenshot: 'Load data' option in the notebook file menu]

 

This is the PySpark code that was generated for the 1.csv file shown above:

spark.read.format("csv").option("header","true").load("Tables/satrainingdc/week3/files/1.csv")
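
For completeness, a minimal sketch of using that generated snippet in a notebook cell; the relative path is the example shortcut path from the screenshot, so adjust it to your own shortcut:

# Load the CSV exposed through the lakehouse shortcut into a Spark dataframe.
# The relative path below is the example shortcut path; replace it with yours.
df = spark.read.format("csv").option("header", "true").load("Tables/satrainingdc/week3/files/1.csv")

# Render the dataframe in the notebook output.
display(df)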

 

 


6 REPLIES
Thank you very much for the reply @govindarajan_d 

Hi @jcamilo1985 ,

 

Happy to help!

AndyDDC
Super User

Hi @jcamilo1985, you could use a shortcut created in a lakehouse to work with the data in the Azure Data Lake Gen2 account: https://learn.microsoft.com/en-us/fabric/onelake/onelake-shortcuts

 

Or you could just reference the data lake location using the abfss URL in the notebook.
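
For reference, here is a minimal PySpark sketch of that direct-abfss approach. The <storageaccount>, <container>, <tenant-id>, <client-id>, and <client-secret> values and the file path are placeholders, not anything from this thread. By default the notebook authenticates with your own organizational identity, which then needs a data-access role such as Storage Blob Data Reader on the storage account; the commented-out lines show the standard Hadoop ABFS OAuth settings if you prefer a service principal.

# Direct read from ADLS Gen2 using the abfss URL of the file.
# abfss://<container>@<account>.dfs.core.windows.net/<path> is the general form.
adls_path = "abfss://<container>@<storageaccount>.dfs.core.windows.net/week3/files/1.csv"

# Optional: authenticate with a service principal instead of your own identity.
# These are the standard Hadoop ABFS OAuth settings; keep the secret in a vault,
# not hard-coded in the notebook.
# spark.conf.set("fs.azure.account.auth.type.<storageaccount>.dfs.core.windows.net", "OAuth")
# spark.conf.set("fs.azure.account.oauth.provider.type.<storageaccount>.dfs.core.windows.net",
#                "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
# spark.conf.set("fs.azure.account.oauth2.client.id.<storageaccount>.dfs.core.windows.net", "<client-id>")
# spark.conf.set("fs.azure.account.oauth2.client.secret.<storageaccount>.dfs.core.windows.net", "<client-secret>")
# spark.conf.set("fs.azure.account.oauth2.client.endpoint.<storageaccount>.dfs.core.windows.net",
#                "https://login.microsoftonline.com/<tenant-id>/oauth2/token")

df = spark.read.format("csv").option("header", "true").load(adls_path)
display(df)

The same abfss path works unchanged from Scala and SparkR (for example read.df(adls_path, source = "csv", header = "true")), since the URL is resolved by the underlying Hadoop filesystem rather than by the language binding.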

@AndyDDC First of all, thank you for coming to my help.
Is there a link that explains how this reference is made?

Hi @jcamilo1985
We haven't heard back from you on the last response and just wanted to check whether you have a resolution yet. If not, please reply with more details and we will try to help.
Thanks
