Quique
Frequent Visitor

Run Script for all Lakehouse Files / Lakehouse files storage account key

I would like to use a notebook to run the same script against every file stored in the Lakehouse Files area. How can this be done? If a storage account key is needed for the Lakehouse files, where can it be obtained?

1 ACCEPTED SOLUTION
Anonymous
Not applicable

You can easily list all files in a Lakehouse folder from your notebook. Just make sure to connect the lakehouse in the left pane first; it then becomes the default lakehouse for the notebook.
 
import os
# Mounted path of the default lakehouse; replace the subfolder name with your own.
file_path = "/lakehouse/default/Files/yoursubfoldername_here"
lst = os.listdir(file_path)
 
Then you can iterate through the files and apply the same logic to each file in a loop.
 
for file in lst:
    full_path = os.path.join(file_path, file)  # os.listdir returns names only
    # do stuff with full_path
 
So the trick is not to run the same notebook once per file, but to list the files inside the notebook and then run the same logic on each one.
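
Putting the listing and the loop together, a minimal sketch might look like the following. The function names `process_file` and `run_for_all_files` are hypothetical, and the per-file logic (returning the file size) is only a stand-in for whatever your script actually does.

```python
import os


def process_file(path):
    """Hypothetical per-file logic: here it only returns the file size in bytes."""
    return os.path.getsize(path)


def run_for_all_files(folder):
    """Apply process_file to every regular file directly inside folder."""
    results = {}
    for name in sorted(os.listdir(folder)):
        full_path = os.path.join(folder, name)  # os.listdir returns names only
        if os.path.isfile(full_path):  # skip subfolders
            results[name] = process_file(full_path)
    return results


# In a notebook with a default lakehouse attached, the call would be something like:
# run_for_all_files("/lakehouse/default/Files/yoursubfoldername_here")
```

Collecting the results in a dictionary keyed by file name makes it easy to see afterwards which files were handled.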
 
You could create a pipeline that reads the list of files in a folder, then calls a notebook and passes the file path as a parameter, but that just adds unnecessary complexity.
 
If you're looking for the fully qualified path to your files in the lakehouse, you can get it from the properties of the Files folder.

4 REPLIES
Quique
Frequent Visitor

Thanks! I'm sure this works, and it almost worked for me, but for some reason I keep getting this error: Spark_Ambiguous_MsSparkUtils_UseMountedPathFailure.

I'll keep checking.

 

For me it only works when I use the /lakehouse/default/Files... path.
However, when I try to use it with the abfss path, I get the following error:

FileNotFoundError: [Errno 2] No such file or directory: 'abfss://.../input'

 

file_path = f"abfss://.../input"
lst = os.listdir(file_path)
lst

 

Any idea what causes that issue?
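
A likely cause, offered as a sketch and not a confirmed diagnosis: `os.listdir` only understands paths on the local (mounted) file system, so it treats an `abfss://` URI as a literal local directory name and cannot find it. The hypothetical helper below makes that explicit by rejecting URI-style paths before calling `os.listdir`. For `abfss://` paths, a Spark-side or notebook file-system utility would be needed instead, which is not shown here.

```python
import os


def list_local_files(path):
    """List a local or mounted folder, rejecting URI-style paths such as abfss://..."""
    if "://" in path:
        scheme = path.split("://", 1)[0]
        # os.listdir has no notion of URI schemes; it would look for a local
        # directory literally named like the URI and raise FileNotFoundError.
        raise ValueError(
            f"os.listdir cannot read '{scheme}://' paths; "
            "use the mounted /lakehouse/default/Files/... path instead"
        )
    return sorted(os.listdir(path))
```

With this check in place, the failure points at the path style and no longer looks like a missing folder.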

UPDATE: It worked now, thanks again! There were a couple of problems. First, the delta tables I was creating had blank spaces in their names. Second, when creating the delta tables, I now use the fully qualified path instead of the relative path.
