I'm trying to read data from a lakehouse using Python code. Is there any documentation on which authentication to use? I couldn't find any.
I already created an app registration connected to Fabric and have the client ID, secret value, etc.
I know how to read and write with an abfss path, but I don't know how to authenticate.
I found something using Azure Databricks, but in this case I need to do it without using Azure.
Hi @antoniofarias
Thanks for using Fabric Community.
You can refer to this doc: Connect to Fabric Lakehouses & Warehouses from Python code - Sam Debruyn
Hope this helps. Please let me know if you have any further questions.
Hi @v-nikhilan-msft, thanks for your reply. I tried this code yesterday and it worked in Databricks, but not in VS Code.
Some errors I got:
Hi @antoniofarias
Are you running the above code in Databricks or in Fabric? Can you please provide these details?
Thanks
Hi @v-nikhilan-msft, I'm running the code in VS Code with Spark in Docker. I found the problem: my firewall was blocking the connection. Here is the code to read a file from outside a Microsoft service:
from pyspark.sql import SparkSession

# Pull in the hadoop-azure connector so Spark can read OneLake over abfss
spark = (
    SparkSession.builder
    .config("spark.jars.packages", "org.apache.hadoop:hadoop-azure:3.3.1")
    .appName("Fabric")
    .getOrCreate()
)

# Authenticate as the app registration (service principal) via OAuth client credentials
spark.conf.set("fs.azure.account.auth.type", "OAuth")
spark.conf.set("fs.azure.account.oauth.provider.type", "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
spark.conf.set("fs.azure.account.oauth2.client.id", service_principal_id)
spark.conf.set("fs.azure.account.oauth2.client.secret", service_principal_password)
spark.conf.set("fs.azure.account.oauth2.client.endpoint", f"https://login.microsoftonline.com/{tenant_id}/oauth2/token")

spark.conf.set("spark.sql.execution.arrow.pyspark.enabled", "true")
spark.conf.set("spark.sql.files.ignoreCorruptFiles", "true")

table_name = ''
df = spark.read.format("parquet").load(
    f"abfss://{workspace_name}@onelake.dfs.fabric.microsoft.com/{lakehouse_name}.Lakehouse/Tables/{table_name}/*.parquet"
)
df.show()
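As a side note, the abfss URI passed to load() follows a fixed OneLake pattern, so it can be built with a small helper. This is just a sketch; the workspace, lakehouse, and table names below are placeholders, not values from this thread:

```python
def onelake_table_path(workspace: str, lakehouse: str, table: str) -> str:
    """Build the OneLake abfss URI for a Fabric lakehouse table.

    Pattern: abfss://<workspace>@onelake.dfs.fabric.microsoft.com/<lakehouse>.Lakehouse/Tables/<table>
    """
    return (
        f"abfss://{workspace}@onelake.dfs.fabric.microsoft.com/"
        f"{lakehouse}.Lakehouse/Tables/{table}"
    )

# Placeholder names for illustration only
print(onelake_table_path("MyWorkspace", "MyLakehouse", "sales"))
```

The resulting path can then be passed to spark.read (with "/*.parquet" appended if you read the raw parquet files directly, as in the snippet above).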