antoniofarias
Regular Visitor

Authentication to read data from Lakehouse using ABFS path and python/spark code

I'm trying to read data from a Lakehouse using Python code. Is there any documentation on which authentication to use? I couldn't find any.

I already created an app registration connected to Fabric and have the client ID, secret value, and so on.

I know how to read and write with an ABFS path, but I don't know how to authenticate.

I found an approach that uses Azure Databricks, but this time I need to do it without using Azure.

1 ACCEPTED SOLUTION

Hi @v-nikhilan-msft, I'm running the code from VS Code against Spark in Docker. I found the problem: my firewall was blocking the connection. Here is the code to read a file from outside a Microsoft service:

 

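from pyspark.sql import SparkSession

# Placeholders: fill in the values for your own app registration and workspace.
tenant_id = ''
service_principal_id = ''        # client ID of the app registration
service_principal_password = ''  # client secret value
workspace_name = ''
lakehouse_name = ''
table_name = ''

# hadoop-azure provides the ABFS driver used for abfss:// paths.
spark = (SparkSession.builder.config("spark.jars.packages", "org.apache.hadoop:hadoop-azure:3.3.1")
         .appName("Fabric").getOrCreate())

# Authenticate as the service principal with the OAuth client-credentials flow.
spark.conf.set("fs.azure.account.auth.type", "OAuth")
spark.conf.set("fs.azure.account.oauth.provider.type", "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
spark.conf.set("fs.azure.account.oauth2.client.id", service_principal_id)
spark.conf.set("fs.azure.account.oauth2.client.secret", service_principal_password)
spark.conf.set("fs.azure.account.oauth2.client.endpoint", f"https://login.microsoftonline.com/{tenant_id}/oauth2/token")

# Optional settings: Arrow-based conversion to pandas, and skipping corrupt files.
spark.conf.set("spark.sql.execution.arrow.pyspark.enabled", "true")
spark.conf.set("spark.sql.files.ignoreCorruptFiles", "true")

# Read the Parquet files under the table's folder in OneLake.
df = spark.read.format("parquet").load(f"abfss://{workspace_name}@onelake.dfs.fabric.microsoft.com/{lakehouse_name}.Lakehouse/Tables/{table_name}/*.parquet")

df.show()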
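
If it helps, the settings and path used above can be gathered into two small helpers so they are easier to reuse and check. This is only a sketch: the function names `onelake_table_path` and `oauth_settings` are made up for illustration, and the keys and URL patterns are copied from the snippet above rather than from any official reference.

```python
def onelake_table_path(workspace_name, lakehouse_name, table_name):
    """Build the abfss:// URI of a Lakehouse table folder (hypothetical helper)."""
    return (
        f"abfss://{workspace_name}@onelake.dfs.fabric.microsoft.com/"
        f"{lakehouse_name}.Lakehouse/Tables/{table_name}"
    )


def oauth_settings(tenant_id, client_id, client_secret):
    """Return the ABFS OAuth client-credentials settings as a dict (hypothetical helper)."""
    return {
        "fs.azure.account.auth.type": "OAuth",
        "fs.azure.account.oauth.provider.type":
            "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
        "fs.azure.account.oauth2.client.id": client_id,
        "fs.azure.account.oauth2.client.secret": client_secret,
        "fs.azure.account.oauth2.client.endpoint":
            f"https://login.microsoftonline.com/{tenant_id}/oauth2/token",
    }


# Usage with an existing SparkSession named `spark`:
# for key, value in oauth_settings(tenant_id, client_id, client_secret).items():
#     spark.conf.set(key, value)
# path = onelake_table_path(workspace_name, lakehouse_name, table_name)
# df = spark.read.format("parquet").load(path + "/*.parquet")
```

Keeping the secret out of the script itself (for example, reading it from an environment variable) is probably a good idea, but that part is left out here.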


4 REPLIES
v-nikhilan-msft
Community Support

Hi @antoniofarias 
Thanks for using Fabric Community.
You can refer to this article by Sam Debruyn: Connect to Fabric Lakehouses & Warehouses from Python code

Hope this helps. Please let me know if you have any further questions.

Hi @v-nikhilan-msft, thanks for your reply. I used this code yesterday; it worked in Databricks but not in VS Code.

 

antoniofarias_0-1715779264892.png

 

Some errors I got:

 

antoniofarias_1-1715779420811.png
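
Since the cause in this thread turned out to be a firewall block (see the accepted solution), a quick reachability check from the machine or container running Spark may save some time. The sketch below only tests whether a TCP connection can be opened; the two host names are the ones that appear in the accepted solution's code, port 443 is assumed because both are HTTPS endpoints, and the function names `can_connect` and `check_hosts` are made up for illustration.

```python
import socket

# Hosts taken from the code in this thread; 443 is the standard HTTPS port.
HOSTS = ["onelake.dfs.fabric.microsoft.com", "login.microsoftonline.com"]


def can_connect(host, port=443, timeout=5.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


def check_hosts(hosts=HOSTS):
    """Map each host to whether it is reachable on port 443."""
    return {host: can_connect(host) for host in hosts}


# Usage:
# for host, ok in check_hosts().items():
#     print(host, "reachable" if ok else "blocked or unreachable")
```

A successful connection does not prove that authentication will work; it only rules out the network path as the problem.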

 

Hi @antoniofarias 
Are you running the above code in Databricks or in Fabric? Could you please share these details?

Thanks

