If you're working with files stored in SharePoint and need to regularly sync them to Microsoft Fabric Lakehouse, you have a few options. Dataflow Gen2 provides a UI-driven approach for connecting to SharePoint data sources, but it has limitations: it can't handle certain file types, may struggle with complex folder structures, and doesn't always offer the flexibility needed for custom ETL logic.
What if you needed more control? A code-based solution that could download any file type from SharePoint, apply custom transformations, and load the results into your Lakehouse with a single notebook run?
I've built an open-source PySpark notebook that does exactly that. In this post, I'll walk you through the solution, explain how it works, and show you how to get it running in your environment.
This notebook automatically downloads each configured SharePoint file and loads it into your Lakehouse on every run. It's perfect for teams that need regular data syncs without manual intervention.
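To make the idea concrete, here's a minimal sketch of the core download step. The helper name, default path, and download trick are illustrative, not the notebook's actual API, and it assumes the file is shared via an "Anyone with the link" link:

import os
import requests

def download_to_lakehouse(sharing_link, lakehouse_name,
                          files_dir="/lakehouse/default/Files/data"):
    """Hypothetical helper: fetch a shared SharePoint file and save it
    under the Lakehouse Files area (mounted at /lakehouse/default in
    Fabric notebooks)."""
    # Assumption: an "Anyone with the link" share returns the raw file
    # when download=1 is appended; stricter tenant settings may require
    # an authenticated session instead.
    sep = "&" if "?" in sharing_link else "?"
    resp = requests.get(sharing_link + sep + "download=1", timeout=60)
    resp.raise_for_status()
    os.makedirs(files_dir, exist_ok=True)
    with open(os.path.join(files_dir, lakehouse_name), "wb") as f:
        f.write(resp.content)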
For each Excel file, the notebook downloads the workbook and writes it to your Lakehouse Files path; from there you can optionally load the contents into a Delta table, as sketched below.
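A rough sketch of that optional per-file step, assuming pandas and openpyxl are available in the Fabric runtime (the file and table names here are illustrative):

import pandas as pd

# Hypothetical per-file step: read the downloaded workbook with pandas
# (requires openpyxl) and land it as a Delta table in the Lakehouse.
pdf = pd.read_excel("/lakehouse/default/Files/data/sales_data.xlsx")
df = spark.createDataFrame(pdf)  # `spark` is predefined in Fabric notebooks
df.write.mode("overwrite").format("delta").saveAsTable("sales_data")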
Update Cell 3 with your paths and file details:
# Your Lakehouse path
lakehouse_abfs_path = "abfss://workspace@onelake.dfs.fabric.microsoft.com/lakehouse.Lakehouse/Files/data"

# Your files
source_files = [
    {
        "url": "SHAREPOINT_FILE_URL",
        "sharing_link": "PASTE_SHARING_LINK_HERE",
        "lakehouse_name": "sales_data.xlsx",
        "description": "Monthly sales report"
    }
]

That's it! Your data will sync automatically.
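For context, consuming this configuration takes little more than a loop. This is a hypothetical driver, reusing the download helper sketched earlier; the actual notebook's cells and names may differ:

# Hypothetical driver loop: iterate the configuration above and call
# the download helper from the earlier sketch.
for f in source_files:
    download_to_lakehouse(f["sharing_link"], f["lakehouse_name"])
    print(f"Synced {f['description']} -> {f['lakehouse_name']}")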
Create a Fabric Pipeline, add a Notebook activity, and configure a schedule trigger (hourly, daily, etc.). Now your SharePoint data automatically flows into your Lakehouse without any manual work.
The current implementation uses "Anyone with the link can edit" for simplicity. For production environments, I recommend implementing Azure App Registration with client credentials for proper authentication. The README includes guidance on this approach, and contributions to add native authentication support are welcome!
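For illustration, here's a minimal sketch of that pattern using MSAL's client-credentials flow against Microsoft Graph. The tenant, client, site, and item IDs are placeholders you'd supply yourself, and the Graph endpoint shown is one possible way to fetch a file; it is not the notebook's built-in behavior:

import msal
import requests

# Placeholders: supply your own app registration details.
TENANT_ID = "YOUR_TENANT_ID"
CLIENT_ID = "YOUR_CLIENT_ID"
CLIENT_SECRET = "YOUR_CLIENT_SECRET"  # prefer Key Vault over hardcoding

app = msal.ConfidentialClientApplication(
    CLIENT_ID,
    authority=f"https://login.microsoftonline.com/{TENANT_ID}",
    client_credential=CLIENT_SECRET,
)

# Acquire an app-only token for Microsoft Graph.
token = app.acquire_token_for_client(scopes=["https://graph.microsoft.com/.default"])

# Download a file by site and item ID via Graph (IDs are placeholders).
resp = requests.get(
    "https://graph.microsoft.com/v1.0/sites/{site-id}/drive/items/{item-id}/content",
    headers={"Authorization": f"Bearer {token['access_token']}"},
)
resp.raise_for_status()
with open("/lakehouse/default/Files/data/sales_data.xlsx", "wb") as f:
    f.write(resp.content)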
The notebook is available on GitHub (https://github.com/dyfatai/SharePoint-To-MicrosoftFabric-Lakehouse-Notebook) with complete documentation, configuration examples, and troubleshooting tips.
Are you automating SharePoint to Lakehouse data flows? What's your approach? Drop a comment below; I'd love to hear how you're solving this challenge!