I was trying to connect to and read a SharePoint Excel file from a notebook using PySpark, but I cannot find any tutorial that lets me do this accurately.
Any help would be appreciated.
I just want to connect to and read an Excel file in SharePoint, so I don't have to import it with a dataflow and create a table, or import it into the lakehouse and refresh the dataflow every time I make a change.
Solved!
The solution requires going to the Azure portal, creating an app registration, and granting it read permissions for Microsoft Graph and SharePoint.
From that app registration you take the tenant ID, client ID, and client secret.
When that's done, you can run the following code:
import requests
import pandas as pd
from io import BytesIO

# Authentication details
tenant_id = "enter-your-tenant-id"
client_id = "enter-your-client-id"
client_secret = "enter-your-client-secret"
sharepoint_domain = "enter-your-sharepoint-domain"  # e.g. contoso.sharepoint.com
site_name = "enter-your-site"
file_path = "path/to/excel.xlsx"  # relative to the drive root, no leading slash

# Step 1: Get an access token using the Microsoft identity platform (OAuth 2.0 client credentials)
token_url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
token_data = {
    "grant_type": "client_credentials",
    "client_id": client_id,
    "client_secret": client_secret,
    "scope": "https://graph.microsoft.com/.default"
}
response = requests.post(token_url, data=token_data)
response.raise_for_status()  # Raise an error if the request fails
access_token = response.json().get("access_token")
# Debugging: print a token preview to confirm it is not empty
print("Access token received:", access_token[:50], "...")

# Step 2: Retrieve the SharePoint site ID
headers = {"Authorization": f"Bearer {access_token}"}
site_url = f"https://graph.microsoft.com/v1.0/sites/{sharepoint_domain}:/sites/{site_name}"
response = requests.get(site_url, headers=headers)
response.raise_for_status()
site_id = response.json()["id"]
print(f"Retrieved site ID: {site_id}")

# Step 3: Retrieve the file content from SharePoint via the Graph API
file_url = f"https://graph.microsoft.com/v1.0/sites/{site_id}/drive/root:/{file_path}:/content"
response = requests.get(file_url, headers=headers)
if response.status_code != 200:
    raise Exception(f"Failed to download file: {response.text}")
print("File successfully downloaded from SharePoint!")

# Step 4: Read the desired sheet from the downloaded Excel file
xls = BytesIO(response.content)  # Wrap the response content in an in-memory file
df = pd.read_excel(xls, sheet_name="your-sheet-name")
print("Excel fetched successfully!")
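One failure mode worth guarding against before Step 4: if permissions or the path are wrong, some endpoints return an HTML error page rather than the workbook, and `pd.read_excel` then fails with a confusing message. A minimal sketch of a sanity check on the downloaded bytes (the helper name `looks_like_xlsx` is my own, not part of any API):

```python
def looks_like_xlsx(content: bytes) -> bool:
    # .xlsx files are ZIP archives, so they always start with the "PK" magic bytes.
    # An HTML error page that slipped through with a 200 status fails this check.
    return content[:2] == b"PK"

# Hypothetical payloads for illustration
assert looks_like_xlsx(b"PK\x03\x04rest-of-zip")
assert not looks_like_xlsx(b"<html>Access denied</html>")
```

Calling this on `response.content` before handing it to `pd.read_excel` turns a cryptic parser error into an obvious diagnosis.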
This seems like it should be simpler. With Google Sheets, it's a simple and straightforward call from a notebook; it needs to be just as simple here. Dataflows and staging tables in my lakehouse are a pain, given how much business data I need to ingest from Excel sources.
When I call the site with
site_url = f"https://graph.microsoft.com/v1.0/sites/{sharepoint_domain}:/sites/{site_name}"
I get an error: "code": "generalException", "message": "General exception while processing".
The site is not in English (Italian).
Yet I have the necessary permissions.
site: "https://<name>.sharepoint.com/sites/Commerciale/"
siteurl: "https://<name>.sharepoint.com/sites/Commerciale/Documenti%20condivisi/Forms/AllItems.aspx"
Do you have any idea what the problem might be?
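One common cause of that error is passing the full site URL (including "https://" or a trailing slash) as the hostname: Graph expects the form "hostname:/sites/name". A small hedged sketch of building the lookup URL from a full site URL (the helper name `graph_site_lookup_url` is my own):

```python
from urllib.parse import urlsplit, quote

def graph_site_lookup_url(full_site_url: str) -> str:
    # Graph expects "hostname:/sites/name" -- no scheme, no trailing slash,
    # e.g. contoso.sharepoint.com:/sites/Commerciale
    parts = urlsplit(full_site_url)
    path = parts.path.rstrip("/")
    # quote() percent-encodes spaces and non-ASCII characters but keeps "/"
    return f"https://graph.microsoft.com/v1.0/sites/{parts.netloc}:{quote(path)}"

url = graph_site_lookup_url("https://contoso.sharepoint.com/sites/Commerciale/")
print(url)
```

Note that the second URL you posted points at the document library view page ("/Documenti condivisi/Forms/AllItems.aspx"); only the site part ("/sites/Commerciale") belongs in the site lookup.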
Hi @giupegiupe ,
Thank you for being an active member of the Microsoft Fabric Community. Since this thread has already been marked as resolved, we kindly request you to create a new question post for your current query. This will help ensure that your issue is prioritized and visible to the community and support members, so we can assist you more effectively.
We appreciate your understanding and look forward to helping you.
Hi @CrhIT ,
Thanks for reaching out to the Microsoft Fabric community forum.
Have you had a chance to review the information @nilendraFabric provided? Please feel free to contact us if you have any further questions. If my response has addressed your query, please accept it as a solution and give a 'Kudos' so other members can easily find it.
Thank you.
Hello @CrhIT ,
I wanted to inquire if you had the chance to review the provided information. Please do not hesitate to reach out if you have any additional questions. If the issue is resolved, kindly accept it as a solution and give a 'Kudos' so other members can easily find it.
Thank you.
Hi @CrhIT ,
May I ask if you have resolved this issue? If so, please mark the helpful reply and accept it as the solution. This will help other community members with similar problems solve them faster.
Thank you.
Hello @CrhIT
Here is some very basic code. It is not production-ready, since the secrets are in plain text, but it is good enough to try out the logic:
import os
import requests

# Authentication details
tenant_id = "your_tenant_id"
client_id = "your_client_id"
client_secret = "your_client_secret"
sharepoint_site = "https://yourtenant.sharepoint.com/sites/yoursite"
# Server-relative path: include the site prefix, not just the library path
file_path = "/sites/yoursite/Shared Documents/yourfile.xlsx"

# Step 1: Get an access token
# Note: the scope must target the SharePoint resource, since Step 2 calls the
# SharePoint REST API rather than Microsoft Graph
auth_url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
auth_data = {
    "grant_type": "client_credentials",
    "client_id": client_id,
    "client_secret": client_secret,
    "scope": "https://yourtenant.sharepoint.com/.default"
}
response = requests.post(auth_url, data=auth_data)
if response.status_code != 200:
    raise Exception(f"Failed to authenticate: {response.text}")
access_token = response.json().get("access_token")

# Step 2: Download the file from SharePoint
headers = {"Authorization": f"Bearer {access_token}"}
file_url = f"{sharepoint_site}/_api/web/getfilebyserverrelativeurl('{file_path}')/$value"
response = requests.get(file_url, headers=headers)
if response.status_code != 200:
    raise Exception(f"Failed to download file: {response.text}")

# Step 3: Save the file to OneLake via the mounted lakehouse path
# Write the raw bytes directly; decoding binary content to a string and
# re-encoding it would corrupt the .xlsx file
one_lake_dir = "/lakehouse/default/Files/subfolder"
os.makedirs(one_lake_dir, exist_ok=True)  # Create the subfolder if it doesn't exist
one_lake_path = os.path.join(one_lake_dir, "yourfile.xlsx")
with open(one_lake_path, "wb") as f:
    f.write(response.content)
print(f"File downloaded and saved to OneLake at: {one_lake_path}")
I haven't tried this myself. Once the data is written to Fabric, read it back using pandas:
import pandas as pd

file_path = "/lakehouse/default/Files/subfolder/yourfile.xlsx"

# Read the Excel file using pandas
df = pd.read_excel(file_path)

# Display the first few rows of the data
print(df.head())
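Once the frame is in memory, it can be landed in the lakehouse without any dataflow at all. A minimal sketch of the round trip; the frame contents, the file name, and `target_path` are my own illustrations, and a temporary directory stands in for the mounted `/lakehouse/default/Files` path so the sketch runs anywhere:

```python
import os
import tempfile

import pandas as pd

# Hypothetical frame standing in for the Excel data read above
df = pd.DataFrame({"plu": [101, 102], "qty": [5, 9]})

# In a Fabric notebook you would write under /lakehouse/default/Files;
# a temp directory substitutes for it here
target_dir = tempfile.mkdtemp()
target_path = os.path.join(target_dir, "yourfile.csv")
df.to_csv(target_path, index=False)

# Round-trip read to confirm the file landed intact
assert pd.read_csv(target_path).equals(df)
print("written:", target_path)
```

In practice you would likely prefer Parquet or a Delta table over CSV for lakehouse storage, but the file-writing pattern is the same.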
see if this works.
Thanks