In Microsoft Fabric's 'Synapse Data Engineering' experience we can create a Notebook that uses Python (more specifically the PySpark library), which we can use to process large volumes of data.
How can I programmatically, in a Notebook, connect to an SAP HANA database and fetch, for example, one table to insert into OneLake?
Here are some resources that I found, but I still feel they are not quite what I want.
1. Efficient Data Ingestion with Spark and Microsoft Fabric Notebook | by Satya Nerurkar | Medium
(This one, for example, connects to Azure Blob Storage rather than directly to the SAP database.)
2. Use Apache Spark in Microsoft Fabric - Training | Microsoft Learn
(There's also this, but it's a generic introduction to Apache Spark, not specifically about connecting to SAP HANA.)
3. Accessing SAP HANA Data lake Relational Engine using Apache Spark | SAP Tutorials
(This one seems to be closer to what I was expecting to do; however, Step 1 seems unnecessary in my case, while Step 2 seems to be the actual code that makes the connection to the database.)
If you could provide any more resources or code that simply connects to the SAP HANA database, fetches a table, and then uploads it to OneLake, I would appreciate it.
Thank you in advance.
Solved! Go to Solution.
Hi @SergioSurfer,
Perhaps you can try using JDBC with the SAP HANA driver (ngdbc.jar) to connect to the SAP HANA data source:
from pyspark.sql import SparkSession

# Initialize the Spark session (in a Fabric notebook, `spark` already exists)
spark = SparkSession.builder \
    .appName("SAP HANA") \
    .getOrCreate()

# SAP HANA connection properties; the SAP JDBC driver (ngdbc.jar) must be
# available to the Spark cluster for this driver class to resolve
hana_url = "jdbc:sap://<SAP_HANA_HOST>:<PORT>"
hana_properties = {
    "user": "<USERNAME>",
    "password": "<PASSWORD>",
    "driver": "com.sap.db.jdbc.Driver"
}

# Load a table from SAP HANA into a Spark DataFrame
df = spark.read.jdbc(url=hana_url, table="<TABLE_NAME>", properties=hana_properties)

# Show the data
df.show()
Regards,
Xiaoxin Sheng
Hi @SergioSurfer,
Is this something that could work for you:
Do note that this applies to the cloud version of SAP HANA. If you are using the on-premises version, you should first install the Fabric on-premises data gateway. Using the gateway from a notebook is not supported yet, so you will have to use a Copy activity in a data pipeline instead.
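One extra note on the cloud version: SAP HANA Cloud only accepts TLS connections (on port 443), so the JDBC URL needs the driver's encryption options. A small, hypothetical helper for building such a URL (the host is a placeholder):

```python
def hana_cloud_jdbc_url(host: str, port: int = 443) -> str:
    # encrypt and validateCertificate are standard SAP JDBC driver options;
    # SAP HANA Cloud rejects unencrypted connections.
    return f"jdbc:sap://{host}:{port}/?encrypt=true&validateCertificate=true"

url = hana_cloud_jdbc_url("<your-instance>.hanacloud.ondemand.com")
print(url)
```

The resulting URL can be passed as hana_url in the JDBC snippet above, with the rest of the properties unchanged.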