BoSe
Frequent Visitor

Unable to get list of files with ABFS path

Hello,

I added multiple lakehouses to one notebook.

Now I want to find the latest file in a specific folder in each of the lakehouses.

 

I'm able to access the data using the 'File API path' of the default Lakehouse:

 

import glob
import os

list_of_files = glob.glob('/lakehouse/default/Files/.../input/*')
last_modified_file = max(list_of_files, key=os.path.getmtime)
last_modified_file

 

 

 

However, when I try to do the same with the ABFS path, I don't get a result in list_of_files. It just returns an empty list:

 

list_of_files = glob.glob('abfss://.../input/*')
list_of_files

 

 

If I try to read data with the ABFS path, it works without any issue, so it cannot be an issue with the path or permissions:

 

import pandas as pd

df = pd.read_csv('abfss://...input/example.csv')

 

 

 

Any idea how to make this work so that not only the default lakehouse can be accessed, but also the other lakehouses added as data sources?

1 ACCEPTED SOLUTION
v-gchenna-msft
Community Support

Hi @BoSe ,

Thanks for using Fabric Community.
It looks like glob cannot process an ABFS path. glob only searches the local file system, so it does not understand abfss:// URIs and returns an empty list instead.
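To illustrate the behaviour described above: glob has no notion of URI schemes, so it treats an abfss:// string as an ordinary local path that does not exist and returns an empty list without raising an error. A minimal sketch; the container and account names below are made up for illustration:

```python
import glob

# Hypothetical ABFS URI, for illustration only (not a real workspace or account).
abfs_pattern = "abfss://example-container@example-account.dfs.core.windows.net/Files/input/*"

# glob looks for a local directory literally named "abfss:" relative to the
# current working directory, finds nothing, and returns an empty list.
matches = glob.glob(abfs_pattern)
print(matches)
```

This is why the same pattern works for /lakehouse/default/..., which is a locally mounted path, but not for an abfss:// path.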

You can try mssparkutils.fs.ls instead (see "Introduction to Microsoft Spark utilities", Azure Synapse Analytics, Microsoft Learn):

 

files = mssparkutils.fs.ls("abfss://5e****dd/Files")
file_paths = [f.path for f in files]
print(file_paths)

 


Hope this is helpful. Please let me know in case of further queries.


4 REPLIES

I was able to get the latest file (with all the info), with this addition:

 

latest_file = max(files, key=lambda file: file.modifyTime)

Thanks for the input.
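Putting the pieces of this thread together, finding the latest file in a folder of each lakehouse could be sketched roughly as follows. This assumes the entries returned by mssparkutils.fs.ls expose path, isFile and modifyTime attributes; the helper names, the list_dir parameter and the placeholder folder names are hypothetical. The listing function is passed in as an argument so the sketch can be tried outside a Fabric notebook with fake data:

```python
from types import SimpleNamespace


def latest_file(entries):
    """Return the most recently modified file entry, or None if there are no files."""
    # Skip sub-folders; entries without an isFile attribute are treated as files.
    files = [e for e in entries if getattr(e, "isFile", True)]
    return max(files, key=lambda e: e.modifyTime, default=None)


def latest_file_per_folder(folders, list_dir):
    """Map each folder path to its newest file entry, using the given listing function."""
    return {folder: latest_file(list_dir(folder)) for folder in folders}


# Stand-in for mssparkutils.fs.ls so the sketch runs anywhere (fake data).
def fake_ls(folder):
    return [
        SimpleNamespace(path=f"{folder}/a.csv", isFile=True, modifyTime=100),
        SimpleNamespace(path=f"{folder}/b.csv", isFile=True, modifyTime=300),
        SimpleNamespace(path=f"{folder}/archive", isFile=False, modifyTime=900),
    ]


# Placeholder folder names; in practice these would be abfss:// folder paths.
result = latest_file_per_folder(["lakehouse_one/input", "lakehouse_two/input"], fake_ls)
for folder, entry in result.items():
    print(folder, "->", entry.path if entry else None)
```

In a Fabric notebook, mssparkutils.fs.ls would be passed as list_dir and the abfss:// folder path of each lakehouse as folders. Using default=None avoids the ValueError that max raises on an empty folder.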

 

You get the path, but it does not work with the abfss path.

Glad to know your query got resolved. Please continue using Fabric Community for your further queries.
