Quique
Frequent Visitor

Run Script for all Lakehouse Files / Lakehouse files storage account key

I would like to use a notebook to run the same script for all files stored in the Lakehouse Files area. How can this be done? If I need the storage account key for the Lakehouse files, where can I get it?

1 ACCEPTED SOLUTION
alxdean
Advocate V

You can easily list all files in a Lakehouse folder from your notebook. Just make sure to attach the lakehouse in the left pane first; it then becomes the default lakehouse for the notebook.
 
import os

file_path = "/lakehouse/default/Files/yoursubfoldername_here"
lst = os.listdir(file_path)  # names of the entries in the folder
 
Then you can iterate through the files and apply the same logic to each one in a loop.
 
for file in lst:
    full_path = os.path.join(file_path, file)
    print(full_path)  # replace with your per-file logic
 
So the trick is not to run the same notebook once per file, but to list the files inside the notebook and then run the same logic on each of them.
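To make the loop a bit more concrete, here is a minimal sketch of the same idea wrapped in a function. The names process_folder and count_lines are made up for illustration and are not part of any Fabric API; count_lines just stands in for whatever logic you want to run per file.

```python
import os
from typing import Callable, Dict


def process_folder(folder: str, handler: Callable[[str], object]) -> Dict[str, object]:
    """Apply `handler` to every regular file directly inside `folder`."""
    results = {}
    for name in sorted(os.listdir(folder)):
        full_path = os.path.join(folder, name)
        # Skip subfolders; only plain files are handed to the handler.
        if os.path.isfile(full_path):
            results[name] = handler(full_path)
    return results


def count_lines(path: str) -> int:
    """Hypothetical per-file logic: count the lines in a text file."""
    with open(path, "r", encoding="utf-8") as fh:
        return sum(1 for _ in fh)


# In a notebook with a default lakehouse attached, the call would look like:
# process_folder("/lakehouse/default/Files/yoursubfoldername_here", count_lines)
```

Keeping the per-file logic in its own function means you can test it on a single file first and swap it out later without touching the loop.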
 
You could create a pipeline that reads the list of files in a folder, then calls a notebook and passes each file path as a parameter, but that just adds unnecessary complexity.
 
If you're looking for the fully qualified path to your files in the lakehouse, you can copy it from the properties of the Files folder.

 


4 REPLIES
Quique
Frequent Visitor

Thanks! I'm sure this works, and it almost worked for me, but for some reason I keep getting this error: Spark_Ambiguous_MsSparkUtils_UseMountedPathFailure.

I'll keep checking.

 
BoSe
Frequent Visitor

For me it only works when I use the /lakehouse/default/Files... path.
However, when I try to use it with the abfss path, I get the following error:

FileNotFoundError: [Errno 2] No such file or directory: 'abfss://.../input'

 

file_path = f"abfss://.../input"
lst = os.listdir(file_path)
lst

 

Any idea what causes that issue?
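A likely cause: os.listdir only understands paths on the local (mounted) file system, so it treats an abfss:// URI as a literal folder name and cannot find it. Below is a rough sketch of one way to handle both kinds of path. The helper names is_uri and list_names are made up for illustration, and the mssparkutils import assumes the code runs inside a Fabric notebook, where that module is available; outside Fabric only the local branch will work.

```python
import os
from typing import List


def is_uri(path: str) -> bool:
    """Return True for scheme-qualified paths such as abfss://..."""
    return "://" in path


def list_names(path: str) -> List[str]:
    """List entry names for either a mounted path or a storage URI."""
    if is_uri(path):
        # Assumption: running in a Fabric notebook where mssparkutils exists.
        from notebookutils import mssparkutils
        return [info.name for info in mssparkutils.fs.ls(path)]
    # Mounted paths like /lakehouse/default/Files/... work with plain os calls.
    return os.listdir(path)
```

With this split, the /lakehouse/default/Files/... path keeps going through os.listdir, while the abfss path is passed to the Spark file system utilities instead.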

UPDATE: It worked now, thanks again! There were a couple of problems: the delta tables I was creating had blank spaces in their names. Also, when creating the delta tables, I changed to use the qualified path, instead of the relative path. 
