dead_DE
Frequent Visitor

Can local PySpark access OneLake using ABFSS paths?

Hi everyone,

I’m new to Fabric and Azure, and I’m trying to set up a local development workflow.

I’m running PySpark inside a container on my machine. I can access Microsoft Fabric using only my Microsoft Entra ID (no Azure subscription tied to me personally). Using azure.identity, I’m able to generate tokens and successfully access files in OneLake through the Python SDK.

What I haven’t been able to do is configure my local Spark environment to use these tokens to read data directly from OneLake using abfss:// paths. Every attempt to configure the ABFS drivers fails, and I’ve seen some comments suggesting that this scenario isn’t currently supported.

Right now it looks like the only viable approach for local Spark development is to download the files from OneLake (via the SDK) and work with them locally, essentially mirroring the Lakehouse on my machine.

Given that my access is limited, I’m wondering:

  • Is there any supported way to authenticate Spark directly against OneLake from outside Fabric?

  • If token‑based access isn’t possible, is there another authentication method or permission I could request that would allow this?

Any guidance or clarification would be greatly appreciated.


13 REPLIES
AparnaRamakris
Microsoft Employee

I have personally used the Azure CLI to log in to the Fabric tenant and azure.identity to authenticate, and this has worked for me for connecting and testing things locally, though it may not be suitable for production deployments.

 

# First, sign in with the Azure CLI (no subscription needed, just the tenant):
# az login --allow-no-subscriptions --tenant <TenantId>

from azure.identity import DefaultAzureCredential
from deltalake import DeltaTable  # delta-rs: pip install deltalake

# DefaultAzureCredential picks up the az login session
delta_token = DefaultAzureCredential().get_token("https://storage.azure.com/.default").token
storage_options = {"bearer_token": delta_token, "use_fabric_endpoint": "true"}

DELTA_TABLE_PATH: str = '<complete abfss path of the delta table in the Fabric lakehouse>'
df = DeltaTable(DELTA_TABLE_PATH, storage_options=storage_options)
print(df.to_pandas().head(10))

If this code helps, please Accept as a Solution to help others as well.

Hi @AparnaRamakris, I agree - connecting to Fabric OneLake from local Python (Pandas) is not an issue; it works! I think the user was trying to connect local Spark to Fabric, which is slightly different.

I trust this will be helpful. If you found this guidance useful, you are welcome to acknowledge with a Kudos or by marking it as a Solution.
v-hashadapu
Community Support

Hi @dead_DE, hope you're doing well. Can you confirm whether the problem is solved or still persists? Sharing the details will help others in the community.

v-hashadapu
Community Support

Hi @dead_DE , Thank you for reaching out to the Microsoft Community Forum.

 

We find the answer shared by @deborshi_nag is correct. Can you please confirm if the solution worked for you? It will help others with similar issues find the answer easily.

 

Thank you @deborshi_nag  for your valuable response.

deborshi_nag
Power Participant

Hello @dead_DE

 

You can use local Spark to access your OneLake storage; however, you'll have to use a service principal. These are the prerequisites and the steps involved:

 

1. Make sure your Fabric tenant allows external apps

Your Fabric admin must enable:

  • Users can access data stored in OneLake with apps external to Fabric
  • (For service principals) Service principals can use Fabric APIs

Then grant your service principal Contributor (or above) access to the Fabric workspace.

 

2. Collect the OneLake ABFSS path

OneLake uses ADLS Gen2‑compatible URIs. The account name is always onelake, and the filesystem is your workspace name (or GUID). Typical pattern:

 

abfss://<workspaceName or workspaceGUID>@onelake.dfs.fabric.microsoft.com/<lakehouseName or itemGUID>.lakehouse/Files/
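
For illustration only (the workspace and lakehouse names below are hypothetical), a path to a file under Files/ would look like:

abfss://SalesWorkspace@onelake.dfs.fabric.microsoft.com/SalesLakehouse.lakehouse/Files/raw/orders.csv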
 

3. Ensure your local Spark has the ABFS connector

You need Hadoop’s hadoop-azure (ABFS) connector and its Azure storage dependencies on the classpath.
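
If you don't already have the JARs in your image, one way (a sketch; 3.3.4 is an assumption - pin the version to match your Spark build's Hadoop version) is to let Spark pull the connector from Maven at launch:

pyspark --packages org.apache.hadoop:hadoop-azure:3.3.4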

 

4. Create a Microsoft Entra service principal

Record the Tenant (Directory) ID, Client ID (Application ID), and Client Secret, and grant the SP access to your Fabric workspace.

 

5. Configure Spark for ABFS OAuth against OneLake host

 
pyspark \
  --conf "fs.azure.account.auth.type.onelake.dfs.fabric.microsoft.com=OAuth" \
  --conf "fs.azure.account.oauth.provider.type.onelake.dfs.fabric.microsoft.com=org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider" \
  --conf "fs.azure.account.oauth2.client.id.onelake.dfs.fabric.microsoft.com=<APP_CLIENT_ID>" \
  --conf "fs.azure.account.oauth2.client.secret.onelake.dfs.fabric.microsoft.com=<APP_CLIENT_SECRET>" \
  --conf "fs.azure.account.oauth2.client.endpoint.onelake.dfs.fabric.microsoft.com=https://login.microsoftonline.com/<TENANT_ID>/oauth2/token"

 

Kindly accept this solution if it solves your problem. 

I trust this will be helpful. If you found this guidance useful, you are welcome to acknowledge with a Kudos or by marking it as a Solution.

Thank you, I will look into getting a service principal from my organization and check back on this some time this upcoming week.

Just a heads up, I am still trying to get more than Entra-level access to the account to try this solution.
Will update soon.

Hi @dead_DE, thanks for the update. Hope you will get the access soon. Please share the details here once you've had a chance to try it.

Here’s where I’m currently stuck. I’ve been told that we need to create an Azure App Registration for programmatic access. However, when I create the App Registration, I don’t see any way to associate it directly with Fabric or OneLake. Because of that, I’m not sure how to ensure the app gets the correct permissions or how to generate the right client ID for Fabric access.

I also can’t just ask IT to give me broad permissions to all of Blob Storage and hope that covers wherever OneLake lives. Everything I’ve read suggests authenticating to Fabric programmatically by generating a token through the Azure CLI, but that’s what led me here since there’s no public Spark driver that supports that token flow yet.

Hello @dead_DE 

 

Here's how you can create a service principal (SPN). If there's a team that creates SPNs in your organisation, they'd know this. A CLI sketch follows the portal steps below.

  • Go to the Entra ID portal
    > Applications > App registrations > New registration
  • Note the:
    • Client ID (App ID)
    • Tenant ID
    • Client Secret (create one under Certificates & secrets)
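
If you prefer scripting this, a minimal Azure CLI sketch (assuming you have the CLI installed and sufficient directory permissions; the display name is a placeholder):

# Create the app registration - the display name is a placeholder
az ad app create --display-name "local-spark-onelake"

# Create the service principal for the app, using the appId returned above
az ad sp create --id <APP_CLIENT_ID>

# Create a client secret (the command prints the secret value - store it safely)
az ad app credential reset --id <APP_CLIENT_ID>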

 

A tenant setting must be changed for the SPN to access a Fabric workspace:

Tenant Settings > Developer Settings > “Service principals can use Fabric APIs”

 

Next, you need to assign the SPN Contributor access to your workspace.

In Fabric:

  1. Open the Workspace
  2. Select Manage access
  3. Click Add people or groups
  4. Search for your App Registration name (NOT the GUID!)
    • SPNs show up by the registered application name
  5. Assign an appropriate role:
    • Contributor (usually recommended)
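
Before wiring up Spark, a quick way to sanity-check both the tenant setting and the workspace role (a sketch, assuming the azure-identity and requests packages; the List Workspaces call should return the workspace you just granted):

import requests
from azure.identity import ClientSecretCredential

# Acquire a token for the Fabric REST API with the SPN's client credentials
cred = ClientSecretCredential(TENANT_ID, APP_CLIENT_ID, APP_CLIENT_SECRET)
token = cred.get_token("https://api.fabric.microsoft.com/.default")

# List the workspaces the SPN can access
resp = requests.get(
    "https://api.fabric.microsoft.com/v1/workspaces",
    headers={"Authorization": f"Bearer {token.token}"},
)
print(resp.status_code, resp.json())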

Once both steps are done, you can follow the message I posted on this thread earlier. Assigning the SPN Contributor access to the workspace allows it to read and write OneLake data.

 

I trust this will be helpful. If you found this guidance useful, you are welcome to acknowledge with a Kudos or by marking it as a Solution.

TY for this, I will check with IT on Monday!

Following the guide above on creating an SPN, I was able to get set up with an app client ID and an app client secret.

My container running Spark needed some configuring as well. I had to set it up with the Delta Lake JARs and Azure Blob Storage configurations.

Full example of how I got this to work:

from pyspark.sql import SparkSession

# Create Spark session (Java 17 compatible versions) with Delta Lake and the ABFS connector
spark = (SparkSession.builder
    .appName("OneLakeAccess")
    .master("local[*]")
    .config("spark.jars.packages", "org.apache.hadoop:hadoop-azure:3.3.4,io.delta:delta-spark_2.12:3.1.0")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .config("spark.hadoop.fs.abfss.impl", "org.apache.hadoop.fs.azurebfs.SecureAzureBlobFileSystem")
    .config("spark.hadoop.fs.abfs.impl", "org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem")
    .getOrCreate())

# Point the ABFS OAuth config at the OneLake host, using the SPN credentials
conf = spark.sparkContext._jsc.hadoopConfiguration()
conf.set("fs.azure.account.auth.type.onelake.dfs.fabric.microsoft.com", "OAuth")
conf.set("fs.azure.account.oauth.provider.type.onelake.dfs.fabric.microsoft.com", "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
conf.set("fs.azure.account.oauth2.client.id.onelake.dfs.fabric.microsoft.com", APP_CLIENT_ID)
conf.set("fs.azure.account.oauth2.client.secret.onelake.dfs.fabric.microsoft.com", APP_CLIENT_SECRET)
conf.set("fs.azure.account.oauth2.client.endpoint.onelake.dfs.fabric.microsoft.com", f"https://login.microsoftonline.com/{TENANT_ID}/oauth2/token")

# Read Lakehouse Files (CSV) and a Delta table by their ABFSS paths
files_df = spark.read.csv("<abfss_path_to_fabric_resource_in_lakehouse>")
delta_df = spark.read.format("delta").load("<abfss_path_to_fabric_delta_table_in_lakehouse>")
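
A small hardening sketch (the environment variable names are my own; export them when starting the container) for supplying the three credentials used above without hardcoding them:

import os

# Assumed environment variable names - set these on the container
APP_CLIENT_ID = os.environ["APP_CLIENT_ID"]
APP_CLIENT_SECRET = os.environ["APP_CLIENT_SECRET"]
TENANT_ID = os.environ["TENANT_ID"]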

I am now able to connect to my Fabric Lakehouse Files and Delta tables with ABFSS paths from my Spark container.

spencer_sa
Impactful Individual

I'm not sure about ABFSS paths, but I've successfully accessed data in OneLake using the ADLS dfs endpoints.
I tend to use SPN (rather than organizational) credentials, and it works just fine.
I've connected to the SQL endpoint too.
Having had a read around, this may be of use:
https://christianhenrikreich.medium.com/microsoft-fabric-diving-into-lakehouse-access-from-local-mac...

If this helps, please Accept as a Solution to help others find it more easily.
