Solved: Re: Unable to Read Multiple Excel Files Using Wild...

ArwaAldoud · ‎02-12-2025

I’m trying to read multiple Excel (.xlsx) files from a folder in a Lakehouse using a PySpark notebook. However, when I use a wildcard (*) in the path, I get a FileNotFound error.

Code that fails (attempting to read multiple files using wildcards):

df_sales = pd.read_excel("abfss://{lakehouse}@onelake.dfs.fabric.microsoft.com/Files/Sales*.xlsx", sheet_name="Sales")

Error Message: Using a wildcard (*) results in FileNotFound.

Code that works (reading a single file):

df_sales = pd.read_excel("abfss://{lakehouse}@onelake.dfs.fabric.microsoft.com/Files/Current/Sales_2023.xlsx", sheet_name="Sales")

When specifying the exact file name, it works fine.

Any guidance or best practices would be greatly appreciated.

Anonymous · ‎02-16-2025

Hi @ArwaAldoud,

I wanted to check if you had the opportunity to review the information provided. Please feel free to contact us if you have any further questions. If my response has addressed your query, please "Accept as Solution" and give a 'Kudos' so other members can easily find it.

Thank you,
Pavan.

ArwaAldoud · ‎02-17-2025

Thank you for your detailed response.

I followed your steps, but unfortunately, it didn’t work for me. Instead, I specified the exact file name in my scenario, and it worked fine.

I appreciate your support and guidance

Anonymous · ‎02-17-2025

Hi @ArwaAldoud,

We trust that your issue has been resolved. Kindly mark my solution as "Accept as Solution." Additionally, give it a 'Kudos' to help others find it easily.

If you need any further assistance, feel free to reach out.

Please continue using Microsoft community forum.

Thank you,
Pavan.

ArwaAldoud · ‎02-18-2025

I truly appreciate your help.
Thanks again, and I’ll continue engaging in the Microsoft Community Forum.

Anonymous · ‎02-12-2025

Hi @ArwaAldoud,

Thank you for reaching out on the Microsoft Community Forum.

The wildcard (*) in pd.read_excel() is not supported because Pandas expects an exact file path.
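To illustrate the point above, here is a minimal sketch using only the Python standard library. It assumes a local temporary folder as a stand-in for the Lakehouse, and the file names are hypothetical empty placeholders rather than real workbooks. It shows that opening a path containing `*` treats the asterisk as a literal character, while `glob.glob` expands the pattern into real file names.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    # Create two empty placeholder files (hypothetical names).
    for name in ("Sales_2023.xlsx", "Sales_2024.xlsx"):
        with open(os.path.join(folder, name), "wb"):
            pass

    pattern = os.path.join(folder, "Sales*.xlsx")

    # Opening the pattern directly looks for a file literally named "Sales*.xlsx".
    try:
        open(pattern, "rb").close()
        literal_open_failed = False
    except OSError:  # FileNotFoundError is a subclass of OSError
        literal_open_failed = True

    # glob expands the wildcard into the matching file names.
    expanded = sorted(os.path.basename(p) for p in glob.glob(pattern))

print(literal_open_failed, expanded)
```

The same idea applies to pd.read_excel(): the pattern has to be expanded into concrete paths first, and each path is then read on its own.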

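Please follow the steps below to resolve the error:

1. Use dbutils.fs.ls() to list all Excel files in the folder, then read them one by one into Pandas. (dbutils is the Databricks utility; in a Fabric notebook the equivalent call is notebookutils.fs.ls(), formerly mssparkutils.fs.ls().)

2. For improved performance and scalability, use PySpark with the Spark Excel reader (com.crealytics.spark.excel) to read multiple files. This is a third-party library, so it may need to be added to the Spark environment first.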
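Step 1 could be sketched roughly as follows. This is not a tested Fabric solution, only an outline under stated assumptions: `select_files` and `read_excel_files` are hypothetical helper names, the paths in `listing` are made-up examples, and in a real notebook the listing would come from the file-system utility rather than a hard-coded list.

```python
import fnmatch
from typing import Iterable, List


def select_files(paths: Iterable[str], pattern: str) -> List[str]:
    """Return the paths whose file name matches a shell-style pattern."""
    matched = [
        p for p in paths
        if fnmatch.fnmatchcase(p.rsplit("/", 1)[-1], pattern)
    ]
    return sorted(matched)


def read_excel_files(paths: Iterable[str], sheet_name: str, reader=None):
    """Read each workbook with pandas and stack the rows into one DataFrame."""
    import pandas as pd  # imported here so select_files works without pandas

    reader = reader or pd.read_excel
    frames = []
    for path in paths:
        frame = reader(path, sheet_name=sheet_name)
        frame["source_file"] = path  # remember which file each row came from
        frames.append(frame)
    if not frames:
        raise FileNotFoundError("no files matched the pattern")
    return pd.concat(frames, ignore_index=True)


# Hypothetical folder listing (assumption: the real one comes from the notebook).
listing = [
    "Files/Current/Sales_2023.xlsx",
    "Files/Current/Sales_2024.xlsx",
    "Files/Current/Budget.xlsx",
]

matches = select_files(listing, "Sales*.xlsx")
print(matches)
# In a notebook: df_sales = read_excel_files(matches, sheet_name="Sales")
```

In a Fabric notebook, the listing might be built from the `path` attribute of the entries returned by `notebookutils.fs.ls(folder)`. That is an assumption to verify in your environment, as is whether pd.read_excel() can open each returned path directly.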

Please continue using Microsoft community forum.

If you found this post helpful, please consider marking it as "Accept as Solution" and giving it a 'Kudos' to help other members find it more easily.

Regards,
Pavan.

ArwaAldoud · ‎02-12-2025

The file name is correct and the path is correct; it's only an issue when using wildcards (*).
