
Shanthan118
Frequent Visitor

Can we load data from a Lakehouse to a Warehouse in Microsoft Fabric using a notebook?

Hi Community,

In Microsoft Fabric I have a lakehouse that holds raw data landed from another source. Now I want to build aggregate tables on top of this data in a warehouse using a notebook.

The lakehouse and warehouse are in the same Fabric workspace.
I want to perform a full load (truncate and load) into the warehouse from the notebook. How can I do this?
Can anyone explain how to connect to the warehouse in Fabric from a notebook and write data to it?

1 ACCEPTED SOLUTION

Hi,

Here is one example I have used.

First, create a service principal and make sure it has the right permissions and access to the workspace and warehouse.

Below is example code I have used to get access using a service principal (I followed this article: https://debruyn.dev/2023/connect-to-fabric-lakehouses-warehouses-from-python-code/). You can modify this and use it in a Fabric notebook. It is also a good idea to store the service principal secret in Key Vault and retrieve it from there; a minimal sketch of that follows.
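For the Key Vault part, here is a minimal sketch of pulling the secret inside a Fabric notebook via mssparkutils; the vault URL and secret name below are placeholders, not values from this thread:

from notebookutils import mssparkutils  # available in Fabric notebooks

# Hypothetical vault URL and secret name -- replace with your own values.
client_secret = mssparkutils.credentials.getSecret(
    "https://<your-key-vault>.vault.azure.net/",  # Key Vault URI
    "<your-secret-name>",                         # name of the stored service principal secret
)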

 

//Alexander

 

import logging
import os
import struct
from itertools import chain, repeat

import pyodbc
from azure.identity import ClientSecretCredential


def get_database_connection():
    try:
        tenant_id = os.getenv("TENANT_ID")
        client_id = os.getenv("CLIENT_ID")      # service principal (app) ID
        client_secret = os.getenv("SP_SECRET")  # service principal secret
        sql_endpoint = os.getenv("YOUR_SQL_ENDPOINT")
        database = os.getenv("DATABASE_NAME")

        credential = ClientSecretCredential(tenant_id, client_id, client_secret)
        token_object = credential.get_token("https://database.windows.net//.default")  # retrieve an access token valid for SQL databases
        token_as_bytes = bytes(token_object.token, "UTF-8")  # convert the token to a UTF-8 byte string
        encoded_bytes = bytes(chain.from_iterable(zip(token_as_bytes, repeat(0))))  # widen to UTF-16-LE, as the driver expects
        token_bytes = struct.pack("<i", len(encoded_bytes)) + encoded_bytes  # prefix with the length to build the token structure

        connection_string = f"Driver={{ODBC Driver 18 for SQL Server}};Server={sql_endpoint},1433;Database={database};Encrypt=Yes;TrustServerCertificate=No"
        attrs_before = {1256: token_bytes}  # 1256 = SQL_COPT_SS_ACCESS_TOKEN: passes the access token to the driver

        return pyodbc.connect(connection_string, attrs_before=attrs_before)
    except Exception as e:
        logging.error(f"Error in get_database_connection: {str(e)}")
        raise
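To tie this back to the original full-load question, here is a minimal usage sketch on top of get_database_connection(). The table names (dbo.agg_sales, MyLakehouse.dbo.sales) are hypothetical. Since TRUNCATE is not in the warehouse T-SQL surface area, DELETE is used to empty the target, and the lakehouse table is read through three-part naming, which should work for items in the same workspace:

# Hypothetical full-load sketch -- adjust table names to your own items.
conn = get_database_connection()
cursor = conn.cursor()

cursor.execute("DELETE FROM dbo.agg_sales;")  # full load: empty the target (TRUNCATE is not supported)
cursor.execute(
    "INSERT INTO dbo.agg_sales (product, total_amount) "
    "SELECT product, SUM(amount) FROM MyLakehouse.dbo.sales GROUP BY product;"
)  # rebuild the aggregate by reading the lakehouse table cross-database
conn.commit()
cursor.close()
conn.close()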

 

 


7 REPLIES
v-huijiey-msft
Community Support

Hi @Shanthan118 ,

 

Thanks for the reply from @AlexanderPowBI .

 

Yes, you can build an aggregate table on top of the warehouse, but once created, the aggregate table belongs to the warehouse, so it is not visible in the SQL endpoint of the lakehouse.

 

To schedule data transfer and keep the aggregate table updated, you can use Dataflow Gen2 in Microsoft Fabric.

 

After selecting the Lakehouse in Get Data, you can perform the aggregation, set the destination to the Warehouse, and configure a scheduled refresh after publishing, so that when your Lakehouse is updated, the aggregate table is refreshed as well.

 

[screenshots: Dataflow Gen2 aggregation and destination settings]

 

Best Regards,
Yang
Community Support Team

 

If any post helps, then please consider Accepting it as the solution to help the other members find it more quickly.
If I misunderstood your needs or you still have problems, please feel free to let us know. Thanks a lot!

v-huijiey-msft
Community Support

Hi @Shanthan118 ,

 

Thanks for the reply from @frithjof_v / @AlexanderPowBI .

 

Transferring data from the Lakehouse to the Warehouse using a notebook can be a little difficult; I have a more convenient solution here.

 

Once you're in the warehouse in Fabric, select the place I've highlighted.

[screenshot: the highlighted option in the warehouse view]

 

Select the Lakehouse you want to connect to.

[screenshot: selecting the Lakehouse to connect to]

 

Now it's time to use your connected Lakehouse.

[screenshots: querying the connected Lakehouse from the warehouse]

 

If you have any other questions please feel free to contact me.

 

Best Regards,
Yang
Community Support Team

 

If any post helps, then please consider Accepting it as the solution to help the other members find it more quickly.
If I misunderstood your needs or you still have problems, please feel free to let us know. Thanks a lot!

Hi @v-huijiey-msft

First, thanks for the solution.

From the solution you gave, my understanding is that I can connect the warehouse directly to the lakehouse that has the required tables, and build aggregate tables on top of that warehouse. Is my understanding correct?

Also, if we create an aggregate table, will it be accessible from the lakehouse SQL endpoint?

And how can we schedule the data transfer that feeds the aggregate tables? Will changes in the lakehouse data flow through to the warehouse? For example, if there are 20 records when I first connect and the record count later grows to 40, will all 40 records be available in the warehouse table?

AlexanderPowBI
Resolver I

Just a heads up that TRUNCATE is not a supported operation as of now:

https://learn.microsoft.com/en-us/fabric/data-warehouse/tsql-surface-area

For your question, I have used a stored procedure scheduled in a pipeline for this data movement. However, you can also do this from a notebook (a sketch follows the list):

1. In your notebook, connect to the warehouse via pyodbc. Use a service principal if you need to automate this; otherwise, I have used DeviceCodeCredential(prompt_callback=device_code_callback) for interactive authentication.
2. Drop your table.
3. Create the new table.
4. Write data to the table.
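A minimal sketch of those four steps, assuming interactive device-code authentication; the endpoint, database, table name, and sample rows are hypothetical placeholders:

# Hypothetical sketch: full load (drop / create / write) with interactive auth.
import struct

import pyodbc
from azure.identity import DeviceCodeCredential


def device_code_callback(verification_uri, user_code, expires_on):
    # Shown to the user so they can sign in from any browser.
    print(f"Go to {verification_uri} and enter code {user_code}")


credential = DeviceCodeCredential(prompt_callback=device_code_callback)
token = credential.get_token("https://database.windows.net//.default").token
token_bytes = token.encode("utf-16-le")  # widen to UTF-16-LE for the driver
attrs_before = {1256: struct.pack("<i", len(token_bytes)) + token_bytes}  # SQL_COPT_SS_ACCESS_TOKEN

conn = pyodbc.connect(
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=<your-sql-endpoint>,1433;Database=<your-warehouse>;"  # placeholders
    "Encrypt=Yes;TrustServerCertificate=No",
    attrs_before=attrs_before,
)

cursor = conn.cursor()
cursor.execute("DROP TABLE IF EXISTS dbo.agg_sales;")  # step 2: drop (table name is hypothetical)
cursor.execute(
    "CREATE TABLE dbo.agg_sales (product VARCHAR(100), total_amount DECIMAL(18, 2));"
)  # step 3: create
cursor.executemany(
    "INSERT INTO dbo.agg_sales (product, total_amount) VALUES (?, ?);",
    [("widgets", 120.50), ("gadgets", 98.00)],  # step 4: write (sample rows)
)
conn.commit()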

//Alexander

 

Hi @AlexanderPowBI

Thanks for the solution.
Can you please provide a script or the steps to connect to the warehouse from Fabric using a service principal and the SQL connection string with pyodbc?


frithjof_v
Community Champion

I don't think it's possible to write data to a Fabric Data Warehouse by using a notebook.

I think you can only use T-SQL (scripts, stored procedures) inside the Data Warehouse interface, or use a Data Pipeline or Dataflow Gen2.

However, I am interested if anyone knows how to write data to a Fabric Data Warehouse by using a notebook.
