<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic How to load an excel file from Sharepoint to a Lakehouse delta table using Notebooks in Data Engineering</title>
    <link>https://community.fabric.microsoft.com/t5/Data-Engineering/How-to-load-an-excel-file-from-Sharepoint-to-a-Lakehouse-delta/m-p/3701543#M1771</link>
    <description>&lt;P&gt;I have a file located in sharepoint and I want to load the contents of that file into a Lakehouse. The file can have multiple sheets so I need to load each sheet into a separate table.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I couldn't find any reference article that provided the steps to do this.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Even if I upload the file into the Lakehouse itself, and just try to read the file, I am not able to get this to work&lt;/P&gt;&lt;LI-CODE lang="python"&gt;df = (spark.read.format("com.crealytics.spark.excel")
    .option("location", "Files/Test.xlsx")
    .option("useHeader", "true")
    .option("sheetName", "test_sheet")
    .load()
    )

display(df)&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;It gives an error "&lt;SPAN&gt;[DATA_SOURCE_NOT_FOUND] Failed to find the data source: com.crealytics.spark.excel."&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;&lt;STRONG&gt;Note:&lt;/STRONG&gt; I know I can do this using Dataflows. I just want to know how I would do it using Notebooks.&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;</description>
    <pubDate>Thu, 15 Feb 2024 14:29:08 GMT</pubDate>
    <dc:creator>VickyDev18</dc:creator>
    <dc:date>2024-02-15T14:29:08Z</dc:date>
    <item>
      <title>How to load an excel file from Sharepoint to a Lakehouse delta table using Notebooks</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/How-to-load-an-excel-file-from-Sharepoint-to-a-Lakehouse-delta/m-p/3701543#M1771</link>
      <description>&lt;P&gt;I have a file located in sharepoint and I want to load the contents of that file into a Lakehouse. The file can have multiple sheets so I need to load each sheet into a separate table.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I couldn't find any reference article that provided the steps to do this.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Even if I upload the file into the Lakehouse itself, and just try to read the file, I am not able to get this to work&lt;/P&gt;&lt;LI-CODE lang="python"&gt;df = (spark.read.format("com.crealytics.spark.excel")
    .option("location", "Files/Test.xlsx")
    .option("useHeader", "true")
    .option("sheetName", "test_sheet")
    .load()
    )

display(df)&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;It gives an error "&lt;SPAN&gt;[DATA_SOURCE_NOT_FOUND] Failed to find the data source: com.crealytics.spark.excel."&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;&lt;STRONG&gt;Note:&lt;/STRONG&gt; I know I can do this using Dataflows. I just want to know how I would do it using Notebooks.&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 15 Feb 2024 14:29:08 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/How-to-load-an-excel-file-from-Sharepoint-to-a-Lakehouse-delta/m-p/3701543#M1771</guid>
      <dc:creator>VickyDev18</dc:creator>
      <dc:date>2024-02-15T14:29:08Z</dc:date>
    </item>
    <item>
      <title>Re: How to load an excel file from Sharepoint to a Lakehouse delta table using Notebooks</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/How-to-load-an-excel-file-from-Sharepoint-to-a-Lakehouse-delta/m-p/3702122#M1772</link>
      <description>&lt;P&gt;Hello&amp;nbsp;&lt;a href="https://community.fabric.microsoft.com/t5/user/viewprofilepage/user-id/509497"&gt;@VickyDev18&lt;/a&gt;&amp;nbsp;&lt;BR /&gt;I can answer the read the excel from a lakeshouse and I did tried the below price of code and it did worked . I am using pandas here .&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;
&lt;DIV&gt;
&lt;DIV&gt;&lt;SPAN&gt;import&lt;/SPAN&gt;&lt;SPAN&gt; pandas &lt;/SPAN&gt;&lt;SPAN&gt;as&lt;/SPAN&gt;&lt;SPAN&gt; pd&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;SPAN&gt;# Load data into pandas DataFrame from "/lakehouse/default/" + "Files/test2/data.csv"&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;SPAN&gt;df = pd.read_excel(&lt;/SPAN&gt;&lt;SPAN&gt;"/lakehouse/default/"&lt;/SPAN&gt;&lt;SPAN&gt; + &lt;/SPAN&gt;&lt;SPAN&gt;"Files/test2/Book1.xlsx"&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;SPAN&gt;display(df)&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="HimanshuSmsft_0-1708021616647.png" style="width: 400px;"&gt;&lt;img src="https://community.fabric.microsoft.com/t5/image/serverpage/image-id/1043506i584A2A14430D9318/image-size/medium?v=v2&amp;amp;px=400" role="button" title="HimanshuSmsft_0-1708021616647.png" alt="HimanshuSmsft_0-1708021616647.png" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;HTH&amp;nbsp;&lt;BR /&gt;Himanshu&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 15 Feb 2024 18:27:37 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/How-to-load-an-excel-file-from-Sharepoint-to-a-Lakehouse-delta/m-p/3702122#M1772</guid>
      <dc:creator>HimanshuS-msft</dc:creator>
      <dc:date>2024-02-15T18:27:37Z</dc:date>
    </item>
    <item>
      <title>Re: How to load an excel file from Sharepoint to a Lakehouse delta table using Notebooks</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/How-to-load-an-excel-file-from-Sharepoint-to-a-Lakehouse-delta/m-p/3703967#M1773</link>
      <description>&lt;P&gt;Thanks. Pandas option worked!&lt;/P&gt;&lt;P&gt;I'm curious to know why the spark option isn't working though.&lt;/P&gt;&lt;P&gt;The option in the UI menu "Load data &amp;gt; Spark" does nothing when using an xlsx file either. Works for csv.&amp;nbsp; &amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 16 Feb 2024 12:31:02 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/How-to-load-an-excel-file-from-Sharepoint-to-a-Lakehouse-delta/m-p/3703967#M1773</guid>
      <dc:creator>VickyDev18</dc:creator>
      <dc:date>2024-02-16T12:31:02Z</dc:date>
    </item>
    <item>
      <title>Re: How to load an excel file from Sharepoint to a Lakehouse delta table using Notebooks</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/How-to-load-an-excel-file-from-Sharepoint-to-a-Lakehouse-delta/m-p/3971949#M2414</link>
      <description>&lt;P&gt;are we able to write to the excel document in this way as well?&lt;/P&gt;</description>
      <pubDate>Tue, 04 Jun 2024 16:17:27 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/How-to-load-an-excel-file-from-Sharepoint-to-a-Lakehouse-delta/m-p/3971949#M2414</guid>
      <dc:creator>HiKent</dc:creator>
      <dc:date>2024-06-04T16:17:27Z</dc:date>
    </item>
    <item>
      <title>Re: How to load an excel file from Sharepoint to a Lakehouse delta table using Notebooks</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/How-to-load-an-excel-file-from-Sharepoint-to-a-Lakehouse-delta/m-p/4147461#M4043</link>
      <description>&lt;P&gt;Hello, I am working on a similar issue. I able to read multiple sheets using pandas as mentioned below if the file is in lakehouse, but is there a automated way that it picks the file from sharepoint using pipeline ? and then can be accessed in a notebook to read it. Please let me know if you have found any solution to this problem.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;P&gt;Sruthi&lt;/P&gt;</description>
      <pubDate>Thu, 12 Sep 2024 14:26:14 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/How-to-load-an-excel-file-from-Sharepoint-to-a-Lakehouse-delta/m-p/4147461#M4043</guid>
      <dc:creator>sruthiramana</dc:creator>
      <dc:date>2024-09-12T14:26:14Z</dc:date>
    </item>
    <item>
      <title>Re: How to load an excel file from Sharepoint to a Lakehouse delta table using Notebooks</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/How-to-load-an-excel-file-from-Sharepoint-to-a-Lakehouse-delta/m-p/4314866#M5496</link>
      <description>&lt;P&gt;&lt;a href="https://community.fabric.microsoft.com/t5/user/viewprofilepage/user-id/731469"&gt;@sruthiramana&lt;/a&gt;&amp;nbsp; I think for your need a good starting point could be this article &lt;A href="https://medium.com/@khanosman1917/automating-data-ingestion-from-sharepoint-to-azure-lakehouse-using-pyspark-and-microsoft-graph-api-da69782b15bc" target="_blank" rel="noopener"&gt;Automating Data Ingestion from SharePoint to Azure Lakehouse using PySpark and Microsoft Graph API&lt;/A&gt; .&lt;/P&gt;&lt;P&gt;It took me several weeks to develop an automated and general-purpose solution to ingest Sharepoint files in lakehose suitable for an enterprise environment and the above article was my starting point.&lt;/P&gt;&lt;P&gt;The main challenge were permissions: Sharepoint and Fabric are two different and unconnected worlds, there are no shortcuts!&lt;/P&gt;&lt;P&gt;To develop this solution you need to be familiar with app registration in EntraID and with the Microsoft Graph API (I had to learn how to work with Postman to debug my code) and you need the support of a Sharepoint Tenant admin and an Azure Tenant admin to grant proper permissions to your "automated solution".&lt;/P&gt;&lt;P&gt;And last, but for sure not the least, another big issue was that all the official Microsoft documentation on the topic (December 2024) is &lt;SPAN class=""&gt;&lt;SPAN class=""&gt;&lt;SPAN class=""&gt;contradictory&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/SPAN&gt; or outdated.&lt;/P&gt;&lt;P&gt;Below the rationale of the solution I developed, if you need further information, just let me know.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Copy files from Sharepoint to Lakehouse&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;How to copy all files contained in a SharePoint Online folder into a Fabric Lakehouse folder&lt;/STRONG&gt;&lt;/P&gt;&lt;H1&gt;Use Case&lt;/H1&gt;&lt;P&gt;Sometimes users consume custom classifications for departmental or product data analysis or other specific investigations that integrate or complete the official company classifications defined in the source data systems. These custom definitions are also subject to frequent changes that are managed directly by end users typically using spreadsheets in which semantic dictionaries are defined. These dictionaries are typically managed by users on SharePoint and the data engineer task is to make these additional classifications available in the EDW (Enterprise Data Warehouse).&lt;/P&gt;&lt;H1&gt;Goal&lt;/H1&gt;&lt;P&gt;Automatically ingest business user’s files stored in SharePoint to have a staging copy independent and detached from the source system. Depending on the data process requirement, files can be extracted at regular intervals, such as once a day or every time a file is modified.&lt;/P&gt;&lt;H1&gt;Solution&lt;/H1&gt;&lt;P&gt;Developing a general-purpose notebook for Fabric able to programmatically copy the full content of any SharePoint Online folder in any Lakehouse folder when called by a pipeline activity.&lt;/P&gt;&lt;H1&gt;Note&lt;/H1&gt;&lt;P&gt;A similar but simpler solution has already been developed for Data Factory and Synapse Analytics, but since in Fabric today (October 2024) the required connectors cannot be parameterized to allow a general-purpose solution, we decided to handle the entire process with a Spark Notebook.&lt;/P&gt;&lt;H1&gt;Requirements and constraints&lt;/H1&gt;&lt;OL&gt;&lt;LI&gt;SharePoint Online is not an Azure component, so you need to properly authenticate to access it from Fabric. In addition, since we access SharePoint using an app and not a user (delegated authentication), we need to setup a non-interactive authentication (app identity authentication) using a registered app identity in EntraId.&lt;/LI&gt;&lt;LI&gt;Today you can access SharePoint using SharePoint REST API with ACS (Azure Access Control Service) authentication, or using Microsoft Graph REST API with AD (Azure Active Directory) authentication (ref. &lt;A href="https://learn.microsoft.com/en-us/sharepoint/dev/sp-add-ins-modernize/from-acs-to-aad-apps" target="_blank" rel="noopener"&gt;Upgrading SharePoint applications from Azure Access Control Service to Azure Active Directory | Microsoft Learn&lt;/A&gt;). ACS has been discontinued in 2018, so we decided to develop this solution using Microsoft Graph using AD.&lt;/LI&gt;&lt;LI&gt;Fabric connections today (October 2024) can’t manage secrets using KeyVault, for this reason we decided to develop this solution using a Spark notebook to properly manage secrets stored in Key Vault using the pySpark notebookutils module.&lt;/LI&gt;&lt;LI&gt;Required Fabric connectors for this solution today (October 2024) cannot be parametrised, so we decided to develop this general-purpose solution using a Fabric pySpark notebook passing the configuration parameters to the notebook from the calling pipeline as pipeline parameters.&lt;/LI&gt;&lt;/OL&gt;</description>
      <pubDate>Thu, 05 Dec 2024 07:50:42 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/How-to-load-an-excel-file-from-Sharepoint-to-a-Lakehouse-delta/m-p/4314866#M5496</guid>
      <dc:creator>_maclura_</dc:creator>
      <dc:date>2024-12-05T07:50:42Z</dc:date>
    </item>
    <item>
      <title>Re: How to load an excel file from Sharepoint to a Lakehouse delta table using Notebooks</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/How-to-load-an-excel-file-from-Sharepoint-to-a-Lakehouse-delta/m-p/4708949#M9676</link>
      <description>&lt;P&gt;Hi,&lt;BR /&gt;I'm able to read the files and folders in the notebook. How I can download the file directly in its format to lakhouse files folders?&lt;/P&gt;</description>
      <pubDate>Tue, 27 May 2025 12:31:23 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/How-to-load-an-excel-file-from-Sharepoint-to-a-Lakehouse-delta/m-p/4708949#M9676</guid>
      <dc:creator>DiKi-I</dc:creator>
      <dc:date>2025-05-27T12:31:23Z</dc:date>
    </item>
  </channel>
</rss>

