BoSe
Frequent Visitor

Unable to get list of files with ABFS path

Hello,

I added multiple lakehouses to one notebook.

Now I want to find the latest file in a specific folder in each of the lakehouses.

 

I'm able to access the data using the 'File API path' of the default Lakehouse:

 

import glob
import os

list_of_files = glob.glob('/lakehouse/default/Files/.../input/*')
last_modified_file = max(list_of_files, key=os.path.getmtime)
last_modified_file

 

 

 

However, when I try the same with the ABFS path, I don't get a result in list_of_files. It just returns an empty list:

 

list_of_files = glob.glob('abfss://.../input/*')
list_of_files

 

 

If I read data with the ABFS path, it works without any issue, so it cannot be a problem with the path or permissions:

 

df = pd.read_csv('abfss://...input/example.csv')

 

 

 

Any idea how to make this work so that not only the default lakehouse can be accessed, but also the other lakehouses added as data sources?

1 ACCEPTED SOLUTION
v-gchenna-msft
Community Support

Hi @BoSe ,

Thanks for using Fabric Community.
It looks like glob cannot process an ABFS path. glob only searches the local file system, so a pattern that starts with abfss:// matches nothing and you get an empty list.
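To illustrate the point about glob, here is a small sketch that can be run anywhere, not only in a Fabric notebook. The abfss pattern below uses made-up container and account names (they are assumptions, not the poster's path); the temporary directory is only there to show that the same call does find files on the local file system.

```python
import glob
import os
import tempfile

# Hypothetical ABFS-style pattern; the container and account names are invented.
abfs_pattern = "abfss://container@account.dfs.core.windows.net/Files/input/*"

# glob treats the string as a local path, finds no such directory,
# and returns an empty list without raising an error.
remote_matches = glob.glob(abfs_pattern)

# The same call against a real local directory finds the file.
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "example.csv"), "w").close()
    local_matches = glob.glob(os.path.join(tmp, "*"))

print(remote_matches)
print(len(local_matches))
```

This is presumably also why the '/lakehouse/default/...' path works with glob: it is the File API path of the default lakehouse, which behaves like a local path inside the notebook.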

You can try mssparkutils.fs.ls instead (see "Introduction to Microsoft Spark utilities - Azure Synapse Analytics" on Microsoft Learn):

 

files = mssparkutils.fs.ls("abfss://5e****dd/Files")
file_paths = [f.path for f in files]
print(file_paths)

 


Hope this is helpful. Please let me know in case of further questions.


I was able to get the latest file (with all the info), with this addition:

 

latest_file = max(files, key=lambda file: file.modifyTime)

Thanks for the input.
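For anyone who wants to reuse this, a minimal sketch of the same idea as a helper function. It assumes each listing entry exposes path, modifyTime and isDir attributes, as the snippets in this thread suggest. FileEntry is a hypothetical stand-in so the example runs outside a Fabric notebook, and the sample paths and timestamps are invented.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class FileEntry:
    """Hypothetical stand-in for one entry returned by mssparkutils.fs.ls."""
    path: str
    modifyTime: int  # modification timestamp; only needs to be comparable
    isDir: bool = False


def latest_file(entries: Iterable[FileEntry]) -> Optional[FileEntry]:
    """Return the most recently modified non-directory entry, or None if there is none."""
    files = [e for e in entries if not e.isDir]
    return max(files, key=lambda e: e.modifyTime, default=None)


# Invented sample data; the paths are placeholders, not real lakehouse paths.
sample = [
    FileEntry("abfss://<placeholder>/Files/input/a.csv", 1712000000000),
    FileEntry("abfss://<placeholder>/Files/input/b.csv", 1713000000000),
    FileEntry("abfss://<placeholder>/Files/input/archive", 1714000000000, isDir=True),
]

newest = latest_file(sample)
print(newest.path if newest else "no files found")
```

In a notebook, the list returned by mssparkutils.fs.ls for each lakehouse's ABFS path could be passed to latest_file directly, provided the entries really carry these attributes. Skipping directories and returning None for an empty folder avoids the ValueError that a plain max() raises on an empty list.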

 

You get the path, but it does not work with an abfss path.

Glad to know your query got resolved. Please continue using Fabric Community for any further questions.
