<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Hello everyone! I am trying to access to excel files (with multiple tabs) that is in storage con in Data Engineering</title>
    <link>https://community.fabric.microsoft.com/t5/Data-Engineering/Hello-everyone-I-am-trying-to-access-to-excel-files-with/m-p/4176943#M4288</link>
    <description>&lt;P&gt;HI&amp;nbsp;&lt;a href="https://community.fabric.microsoft.com/t5/user/viewprofilepage/user-id/731469"&gt;@sruthiramana&lt;/a&gt;,&lt;/P&gt;
&lt;P&gt;The error message mention you can't directly access them without credentials, perhaps you can try to add authorization steps during load data from storage account.&lt;/P&gt;
&lt;P&gt;For example:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;LI-CODE lang="python"&gt;from notebookutils import mssparkutils

# service principal
tenant_id = "&amp;lt;tenant_id&amp;gt;"
client_id = "&amp;lt;client_id&amp;gt;"
client_secret = mssparkutils.credentials.getSecret("https://YourKeyVault.vault.azure.net/","your-client-secret-secret-name")

# Azure storage detail
storage_account_name = "&amp;lt;account&amp;gt;"
container_name = "&amp;lt;container_name&amp;gt;"
file_path= "&amp;lt;file_path&amp;gt;"

# Spark configuration
spark.conf.set("fs.azure.account.auth.type", "OAuth")
spark.conf.set("fs.azure.account.oauth.provider.type", "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
spark.conf.set("fs.azure.account.oauth2.client.id", client_id)
spark.conf.set("fs.azure.account.oauth2.client.secret", client_secret)
spark.conf.set("fs.azure.account.oauth2.client.endpoint", f"https://login.microsoftonline.com/{tenant_id}/oauth2/token")

# Full file path
data_path = f"abfss://{container_name}@{storage_account_name}.dfs.core.windows.net/{file_path}"

# Read the data into a Spark DataFrame
df = spark.read.format("csv").option("header", "true").load(data_path)

# show the result
df.show()
&lt;/LI-CODE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Regards,&lt;/P&gt;
&lt;P&gt;Xiaoxin Sheng&lt;/P&gt;</description>
    <pubDate>Thu, 26 Sep 2024 07:01:53 GMT</pubDate>
    <dc:creator>Anonymous</dc:creator>
    <dc:date>2024-09-26T07:01:53Z</dc:date>
    <item>
      <title>Hello everyone! I am trying to access to excel files (with multiple tabs) that is in storage contain</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/Hello-everyone-I-am-trying-to-access-to-excel-files-with/m-p/4175582#M4269</link>
      <description>&lt;DIV class="lia-align-left"&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;Hello everyone! I am trying to access to excel files (with multiple tabs) that is in storage container in a Azure Blob. So that I can automate this process using pipeline.&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV class="lia-align-left"&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV class="lia-align-left"&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;1) For a Lakehouse, and I created a shortcut to access azure blob storage. (Successfully done)&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV class="lia-align-left"&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;2) I can see the excel file in the Files folder.&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV class="lia-align-left"&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;3) Now trying to access the sheets using the ABFSS path but it fails to recognize the file.&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV class="lia-align-left"&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV class="lia-align-left"&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;# Define the ABFSS path to the Excel file&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV class="lia-align-left"&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;excel_file_path ="abfss_path/TestFile.xlsx"&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV class="lia-align-left"&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV class="lia-align-left"&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;# Step 1: Read the Excel file using Pandas&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV class="lia-align-left"&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;# You might need to specify engine='openpyxl' depending on the Excel format&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV class="lia-align-left"&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;excel_data = pd.read_excel(excel_file_path, sheet_name=None)&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV class="lia-align-left"&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV class="lia-align-left"&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;# Step 2: Initialize a Spark session if not already available&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV class="lia-align-left"&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;spark = SparkSession.builder.getOrCreate()&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV class="lia-align-left"&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV class="lia-align-left"&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;# Step 3: Loop through each sheet and convert it to a Spark DataFrame&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV class="lia-align-left"&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;for sheet_name, sheet_df in excel_data.items():&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV class="lia-align-left"&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;# Convert Pandas DataFrame to Spark DataFrame&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV class="lia-align-left"&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;spark_df = spark.createDataFrame(sheet_df)&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV class="lia-align-left"&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV class="lia-align-left"&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;# Perform any actions on the Spark DataFrame&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV class="lia-align-left"&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;spark df.show()&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV class="lia-align-left"&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV class="lia-align-left"&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;4) This throws the error&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV class="lia-align-left"&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;Py4JJavaError: An error occurred while calling &amp;nbsp;. : java.nio.file.AccessDeniedException: Operation failed: "Server failed to authenticate the request. Please refer to the information in the www-authenticate header."&lt;/FONT&gt;&lt;/DIV&gt;</description>
      <pubDate>Wed, 25 Sep 2024 19:08:43 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/Hello-everyone-I-am-trying-to-access-to-excel-files-with/m-p/4175582#M4269</guid>
      <dc:creator>sruthiramana</dc:creator>
      <dc:date>2024-09-25T19:08:43Z</dc:date>
    </item>
    <item>
      <title>Re: Hello everyone! I am trying to access to excel files (with multiple tabs) that is in storage con</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/Hello-everyone-I-am-trying-to-access-to-excel-files-with/m-p/4175756#M4272</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.fabric.microsoft.com/t5/user/viewprofilepage/user-id/731469"&gt;@sruthiramana&lt;/a&gt;,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;What is the abfss path you are using? You can replace the actual id-s with xxx, but I want to check if the right folders are selected. Furthermore, you don't need to start a Spark session within a Fabric notebook. Could you remove step 2? They may be a permission error there, since it is another session. A bit guessing here, but that session is not needed.&lt;/P&gt;</description>
      <pubDate>Wed, 25 Sep 2024 21:04:28 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/Hello-everyone-I-am-trying-to-access-to-excel-files-with/m-p/4175756#M4272</guid>
      <dc:creator>FabianSchut</dc:creator>
      <dc:date>2024-09-25T21:04:28Z</dc:date>
    </item>
    <item>
      <title>Re: Hello everyone! I am trying to access to excel files (with multiple tabs) that is in storage con</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/Hello-everyone-I-am-trying-to-access-to-excel-files-with/m-p/4176943#M4288</link>
      <description>&lt;P&gt;HI&amp;nbsp;&lt;a href="https://community.fabric.microsoft.com/t5/user/viewprofilepage/user-id/731469"&gt;@sruthiramana&lt;/a&gt;,&lt;/P&gt;
&lt;P&gt;The error message mention you can't directly access them without credentials, perhaps you can try to add authorization steps during load data from storage account.&lt;/P&gt;
&lt;P&gt;For example:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;LI-CODE lang="python"&gt;from notebookutils import mssparkutils

# service principal
tenant_id = "&amp;lt;tenant_id&amp;gt;"
client_id = "&amp;lt;client_id&amp;gt;"
client_secret = mssparkutils.credentials.getSecret("https://YourKeyVault.vault.azure.net/","your-client-secret-secret-name")

# Azure storage detail
storage_account_name = "&amp;lt;account&amp;gt;"
container_name = "&amp;lt;container_name&amp;gt;"
file_path= "&amp;lt;file_path&amp;gt;"

# Spark configuration
spark.conf.set("fs.azure.account.auth.type", "OAuth")
spark.conf.set("fs.azure.account.oauth.provider.type", "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
spark.conf.set("fs.azure.account.oauth2.client.id", client_id)
spark.conf.set("fs.azure.account.oauth2.client.secret", client_secret)
spark.conf.set("fs.azure.account.oauth2.client.endpoint", f"https://login.microsoftonline.com/{tenant_id}/oauth2/token")

# Full file path
data_path = f"abfss://{container_name}@{storage_account_name}.dfs.core.windows.net/{file_path}"

# Read the data into a Spark DataFrame
df = spark.read.format("csv").option("header", "true").load(data_path)

# show the result
df.show()
&lt;/LI-CODE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Regards,&lt;/P&gt;
&lt;P&gt;Xiaoxin Sheng&lt;/P&gt;</description>
      <pubDate>Thu, 26 Sep 2024 07:01:53 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/Hello-everyone-I-am-trying-to-access-to-excel-files-with/m-p/4176943#M4288</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2024-09-26T07:01:53Z</dc:date>
    </item>
    <item>
      <title>Re: Hello everyone! I am trying to access to excel files (with multiple tabs) that is in storage con</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/Hello-everyone-I-am-trying-to-access-to-excel-files-with/m-p/4178863#M4307</link>
      <description>&lt;P&gt;Thanks for your help, it was a mistake at my end. I was looking for a wrong file name.!&lt;/P&gt;</description>
      <pubDate>Thu, 26 Sep 2024 18:46:19 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/Hello-everyone-I-am-trying-to-access-to-excel-files-with/m-p/4178863#M4307</guid>
      <dc:creator>sruthiramana</dc:creator>
      <dc:date>2024-09-26T18:46:19Z</dc:date>
    </item>
    <item>
      <title>Re: Hello everyone! I am trying to access to excel files (with multiple tabs) that is in storage con</title>
      <link>https://community.fabric.microsoft.com/t5/Data-Engineering/Hello-everyone-I-am-trying-to-access-to-excel-files-with/m-p/4181754#M4320</link>
      <description>&lt;P&gt;HI&amp;nbsp;&lt;a href="https://community.fabric.microsoft.com/t5/user/viewprofilepage/user-id/731469"&gt;@sruthiramana&lt;/a&gt;,&lt;/P&gt;
&lt;P&gt;I'm glad here you find the root causing and share it here.&lt;/P&gt;
&lt;P&gt;Regards,&lt;/P&gt;
&lt;P&gt;Xiaoxin Sheng&lt;/P&gt;</description>
      <pubDate>Fri, 27 Sep 2024 06:21:08 GMT</pubDate>
      <guid>https://community.fabric.microsoft.com/t5/Data-Engineering/Hello-everyone-I-am-trying-to-access-to-excel-files-with/m-p/4181754#M4320</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2024-09-27T06:21:08Z</dc:date>
    </item>
  </channel>
</rss>

