Felpan
Frequent Visitor

Excel Files Extraction from SharePoint & Lakehouse File Sharing Issues

 

Hello everyone,

I'm facing a challenge with extracting data from approximately 1,200 tables spread across 800-900 Excel files stored in a SharePoint location. I've successfully used Dataflow Gen 2, but this approach requires many queries which isn't ideal.

My administrator has provided me with tenantID, AppID, and Secret key credentials to attempt extraction using notebooks. I'm looking for recommendations on the most efficient workflow to extract these files and consolidate them into a single table.

Additionally, I'd like to understand how to share an Excel file stored in a Lakehouse under the "files" section. Initially, I thought users could work directly on these files via Excel, but I don't see how this is possible without installing OneLake File Explorer. My organization is hesitant to implement this software since it's still in preview.

Has anyone solved similar challenges? What approaches would you recommend?

Thank you in advance for your help!

1 ACCEPTED SOLUTION

Hi @Felpan ,

Thanks for the update.

The warning “Unverified HTTPS request is being made to...” typically appears when SSL certificate verification is disabled using verify=False in a requests call. While this can help bypass certificate issues during testing, it is not a secure long-term solution.

 

If you're using this approach:

response = requests.get(url, headers=headers, verify=False)

Remove verify=False and ensure your system trusts the SSL certificate used by SharePoint or Graph API. You can also use the certifi package to include trusted CA bundles.
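As a minimal sketch (the Graph endpoint URL and token below are placeholders, not values from this thread), you could attach the certifi bundle to a requests Session so every call through it verifies against a trusted CA set:

```python
import certifi
import requests

# A Session carries the CA bundle for every request made through it.
session = requests.Session()
session.verify = certifi.where()  # filesystem path to the trusted Mozilla CA bundle

# Placeholder call -- substitute your own Graph endpoint and access token:
# response = session.get(
#     "https://graph.microsoft.com/v1.0/sites",
#     headers={"Authorization": "Bearer <access_token>"},
# )
```

If SharePoint sits behind a corporate proxy that re-signs traffic with an internal certificate, point session.verify at the path of that internal CA bundle instead.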

 

To temporarily suppress the warning during development:

import urllib3
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

or:

import requests
requests.packages.urllib3.disable_warnings()


Note: Suppressing HTTPS verification is not recommended for production environments. It's safer to ensure your system trusts the certificate or use a valid CA certificate bundle.

 

Since you're working with SharePoint files via Graph API in Fabric notebooks, suppressing the warning may help temporarily, but securing the connection should be considered as a next step if you plan to move this into production.

 

Hope this helps. Please reach out for further assistance.
Please consider marking the helpful reply as Accepted Solution to assist others with similar issues. A kudos would also be greatly appreciated.


Thank you.

 


8 REPLIES
v-veshwara-msft
Community Support

Hi @Felpan ,

Following up to see if your query has been resolved. If any of the responses helped, please consider marking the relevant reply as the 'Accepted Solution' to assist others with similar questions.

If you're still facing issues, feel free to reach out.

Thank you.

Felpan
Frequent Visitor

I have tried using a notebook, but I get this warning: "Unverified HTTPS request is being made to..."


burakkaragoz
Community Champion

Hi @Felpan ,

 

This is a common challenge when working with large volumes of Excel files in SharePoint and trying to integrate them with a Fabric Lakehouse.

Here’s a recommended approach that balances scalability and maintainability:

Use Microsoft Graph API with Fabric Notebooks

Since you already have the Tenant ID, App ID, and Secret, you can authenticate using the client credentials flow and leverage the Microsoft Graph API to:

  1. List and download Excel files from SharePoint.
  2. Read each file using pandas or openpyxl in a Fabric Notebook.
  3. Consolidate the data into a Spark DataFrame.
  4. Write the final dataset to a Delta table in your Lakehouse.

This method is highly scalable and avoids the complexity of managing hundreds of Dataflow queries.
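The read-and-consolidate part of steps 2-3 can be sketched with pandas; the function name and the added source_sheet column are illustrative, and the Graph download and Delta write (steps 1 and 4) are environment-specific, so they appear only as comments:

```python
import io

import pandas as pd


def consolidate_workbooks(workbook_payloads: list) -> pd.DataFrame:
    """Read every sheet of every downloaded .xlsx payload (bytes) and stack them."""
    frames = []
    for payload in workbook_payloads:
        # sheet_name=None returns a dict of {sheet name: DataFrame}
        sheets = pd.read_excel(io.BytesIO(payload), sheet_name=None)
        for name, df in sheets.items():
            # Keep the sheet name so rows stay traceable after consolidation.
            frames.append(df.assign(source_sheet=name))
    return pd.concat(frames, ignore_index=True)


# In a Fabric notebook, the combined frame would then go to a Delta table, e.g.:
# spark.createDataFrame(combined).write.format("delta") \
#     .mode("overwrite").saveAsTable("consolidated_excel")
```

This assumes the 1,200 tables share a compatible schema; if they differ, pd.concat will union the columns and fill gaps with NaN, which you may want to clean up before writing to Delta.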

📁 Sharing Excel Files from Lakehouse

Files stored in the Lakehouse "Files" section cannot be opened directly in Excel unless OneLake File Explorer is installed. If your organization is hesitant to use it due to its preview status, consider these alternatives:

  • Store editable Excel files in SharePoint or OneDrive, and use Fabric Pipelines or Notebooks to sync them to the Lakehouse.
  • If users only need to view or analyze the data, connect Power BI or Excel directly to the Lakehouse using the SQL endpoint.

Let me know if you'd like a sample notebook or Graph API script to get started!

v-veshwara-msft
Community Support

Hi @Felpan ,

Just checking in to see if your query is resolved and if any responses were helpful. If so, kindly consider marking the helpful reply as 'Accepted Solution' to help others with similar queries. A kudos would also be greatly appreciated.

Otherwise, feel free to reach out for further assistance.

Thank you.

burakkaragoz
Community Champion

Hi @Felpan ,

Pulling 1,200 tables from 800-900 Excel files and consolidating them into a single table is a huge and complex workload. Also, the situation you are experiencing with sharing files within Lakehouse is understandable. Below I share my suggestions for both issues:


📥1. How to Efficiently Pull Excel Files from SharePoint

🔹Current Status:

  • You have been successful with Dataflow Gen2, but it is not sustainable as it requires a large number of queries.

🔹Proposed Approach:

Azure Data Factory (ADF) or Azure Logic Apps + Azure Functions

  • List all Excel files with the SharePoint API.
  • Loop through each file, read its tables, and write them to Azure Data Lake or a SQL database.
  • You can automate this process with an Azure Function and schedule it with ADF.

🔹Alternative:

  • Using Python, you can pull files from SharePoint with Office365-REST-Python-Client or msal, then process and consolidate them with pandas.
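If you prefer not to add msal as a dependency, the client-credentials flow it wraps boils down to a single token request, sketched here with requests (the function names and parameters are illustrative):

```python
import requests


def token_endpoint(tenant_id: str) -> str:
    """Microsoft identity platform v2.0 token endpoint for a tenant."""
    return f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"


def get_graph_token(tenant_id: str, client_id: str, client_secret: str) -> str:
    """Acquire an app-only access token via the OAuth2 client-credentials flow."""
    resp = requests.post(token_endpoint(tenant_id), data={
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": "https://graph.microsoft.com/.default",
    })
    resp.raise_for_status()
    return resp.json()["access_token"]
```

The returned bearer token then goes in the Authorization header of your Graph API calls. Note that app-only access to SharePoint content requires admin-consented application permissions (e.g. Sites.Read.All) on the app registration.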

📤2. Sharing Excel File Under Lakehouse “Files”

🔹The current situation:

  • Working directly on files in the Lakehouse "Files" section requires OneLake File Explorer, which is still in preview.
  • That's why your organization is cautious.

🔹Recommended Approach:

  • To share a file via OneLake, grant the user the Viewer or Contributor role in the workspace where the file is located.
  • Alternatively, you can copy the file to OneDrive or SharePoint and share it from there.
  • Creating a report that reads the file via Power BI and sharing this report with the App may also be a solution.

Summary Recommendations:

 

  • Extracting bulk data from SharePoint: Azure Data Factory + Azure Function, or a Python script.
  • Consolidating Excel files: merge using pandas or load into a SQL staging table.
  • Sharing Excel files from Lakehouse: grant workspace access and share via a OneLake link, or use SharePoint as an alternative.

 

v-veshwara-msft
Community Support

Hi @Felpan , 

Thanks for using Microsoft Fabric Community.

Thanks for providing the details. Since you're working with over 800 Excel files and 1,200+ tables, handling them through multiple queries in Dataflow Gen2 becomes difficult to scale. Now that you have your App ID, Tenant ID, and Secret, a better option would be to use Fabric Notebooks with the Microsoft Graph API.

Here’s an overview of the approach:

  1. Authenticate using the client credentials flow to acquire an access token.

  2. Use the Microsoft Graph API to list and download files from SharePoint.

  3. Read each Excel file using libraries such as pandas or openpyxl.

  4. Append the extracted data to a consolidated Spark DataFrame.

  5. Write the combined result to a Delta table in your Lakehouse.

A similar implementation has been discussed in this thread, where Fabric notebooks and the Graph API are used for this purpose. You can adapt that pattern to loop through all files and consolidate tables efficiently. 
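Steps 1-2 above (listing and downloading the SharePoint files via Graph) could look roughly like this; site_id, folder_path, and the function names are placeholders for your own values:

```python
import requests

GRAPH = "https://graph.microsoft.com/v1.0"


def drive_children_url(site_id: str, folder_path: str) -> str:
    """URL for the children of a folder in the site's default document library."""
    return f"{GRAPH}/sites/{site_id}/drive/root:/{folder_path}:/children"


def list_drive_files(token: str, site_id: str, folder_path: str) -> list:
    """List the driveItems in the given folder (step 2: list files)."""
    resp = requests.get(drive_children_url(site_id, folder_path),
                        headers={"Authorization": f"Bearer {token}"})
    resp.raise_for_status()
    return resp.json().get("value", [])


def download_file(item: dict) -> bytes:
    """Fetch one file's bytes via its pre-authenticated download URL."""
    resp = requests.get(item["@microsoft.graph.downloadUrl"])
    resp.raise_for_status()
    return resp.content
```

With 800-900 files, Graph will page the folder listing: follow the @odata.nextLink property in each response until it is absent, and consider requesting it explicitly with a $select on the fields you need.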

 

Regarding Excel File Sharing in Lakehouse

Files stored in the Lakehouse "Files" section cannot be opened directly in Excel unless OneLake File Explorer is installed. Although it is in preview, many customers are successfully using it as it maps OneLake storage as a local drive, allowing seamless Excel integration.

If your organization prefers not to use a preview tool, here are some alternatives:

  • Store Excel files in SharePoint or OneDrive for editing.

  • Use notebooks or pipelines to move files from SharePoint to Lakehouse for reporting.

  • If users only need to view the data, connect Power BI or Excel directly to the Lakehouse for live datasets.

Hope this helps. Please reach out for further assistance.
If this post helps, please consider accepting it as the solution to help other members find it more quickly. A kudos would also be appreciated.

Thank you.

suparnababu8
Super User

Hi @Felpan 

 

Yes, if you use Dataflow Gen2 to extract the files and consolidate them into a single table, performance will slow down.

 

By using notebooks, you can handle processing, cleaning, extracting, and consolidating the data into a single table with the help of the tenantID, AppID, and Secret key credentials. Notebooks will be faster than Dataflow Gen2, so my recommendation is to use notebooks.

 

At present, OneLake File Explorer is the only way to access the Excel files in a Lakehouse, and it is still in preview. There is no way to work directly on these files without OneLake File Explorer. As an alternative approach, you may use Azure Blob Storage / ADLS so users can access the files more easily.

 

Let me know if it works for you.

 

Thank you!!

 

 
