In Microsoft Fabric's 'Synapse Data Engineering' module we can create a notebook that uses Python (more specifically, the PySpark library) to process large volumes of data.
How can I, programmatically in a notebook, connect to an SAP HANA database and fetch, for example, one table to insert into OneLake?
Here are some resources I found, but I still feel they are not quite what I want.
1. Efficient Data Ingestion with Spark and Microsoft Fabric Notebook | by Satya Nerurkar | Medium
(This one, for example, connects to Azure Blob Storage rather than directly to the SAP database.)
2. Use Apache Spark in Microsoft Fabric - Training | Microsoft Learn
(There's also this, but it is a generic introduction to Apache Spark and not specifically about connecting to SAP HANA.)
3. Accessing SAP HANA Data lake Relational Engine using Apache Spark | SAP Tutorials
(This one seems closest to what I was expecting to do; however, Step 1 seems unnecessary for me, while Step 2 seems to be the actual code that makes the connection to the database.)
If you could provide any more resources, or code that simply connects to the SAP HANA database to fetch a table and then uploads it to OneLake, I would appreciate it.
Thank you in advance.
Hi @SergioSurfer,
Perhaps you can try using JDBC with the SAP driver to connect to the SAP HANA data source:
from pyspark.sql import SparkSession

# Initialize the Spark session
spark = SparkSession.builder \
    .appName("SAP HANA") \
    .getOrCreate()

# SAP HANA connection properties
hana_url = "jdbc:sap://<SAP_HANA_HOST>:<PORT>"
hana_properties = {
    "user": "<USERNAME>",
    "password": "<PASSWORD>",
    "driver": "com.sap.db.jdbc.Driver"
}

# Load data from SAP HANA
df = spark.read.jdbc(url=hana_url, table="<TABLE_NAME>", properties=hana_properties)

# Show the data
df.show()
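To complete the round trip into OneLake, the DataFrame read over JDBC can be written out as a Delta table in the Lakehouse attached to the notebook. Here is a minimal sketch, assuming the SAP HANA JDBC driver (ngdbc.jar) is available to the Spark session and that the notebook has a default Lakehouse attached; the host, credentials, and table names are placeholders, and the helper names are my own, not a Fabric API:

```python
def build_hana_url(host, port=443, encrypt=True):
    """Build a jdbc:sap:// connection URL for SAP HANA.

    encrypt=True appends the TLS option that SAP HANA Cloud requires;
    on-premise systems typically use a 3<instance>15 port instead.
    """
    url = f"jdbc:sap://{host}:{port}"
    if encrypt:
        url += "/?encrypt=true"
    return url


def copy_hana_table_to_lakehouse(spark, host, user, password,
                                 source_table, target_table, port=443):
    """Read one SAP HANA table over JDBC and save it as a Delta table
    in the Lakehouse attached to the Fabric notebook."""
    properties = {
        "user": user,
        "password": password,
        "driver": "com.sap.db.jdbc.Driver",  # needs ngdbc.jar on the cluster
    }
    df = spark.read.jdbc(url=build_hana_url(host, port),
                         table=source_table, properties=properties)
    # saveAsTable lands the data as Delta files in OneLake, under the
    # attached Lakehouse's Tables section.
    df.write.mode("overwrite").format("delta").saveAsTable(target_table)
    return df.count()
```

In a Fabric notebook, calling copy_hana_table_to_lakehouse(spark, ...) with your HANA host and credentials would then make the table queryable from the Lakehouse SQL endpoint and visible in OneLake like any other Delta table.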
Regards,
Xiaoxin Sheng
Thanks for raising this question. I had a similar one, so I'm glad I found this post. Bookmarking the links now.
Hi @SergioSurfer,
Is this something that could work for you:
Do note that this is the cloud version of SAP HANA. If you are using the on-premises version, you should first install the on-premises data gateway. Using the gateway in a notebook is not supported yet, so you will have to use a copy activity in a data pipeline instead.