Hi Community,
In Microsoft Fabric I have a lakehouse that holds the raw data landed from other sources. Now I want to build aggregate tables on top of this data in a warehouse using a notebook.
The lakehouse and the warehouse are in the same Fabric workspace.
I want to perform a full load (truncate and load) operation in the warehouse using the notebook. How do I do that?
Can anyone explain how to connect to the warehouse in Fabric from a notebook and write data to it?
Hi @Shanthan118 ,
Thanks for the reply from @AlexanderPowBI .
Yes, you can build an aggregate table in the warehouse on top of the lakehouse data, but once created, the aggregate table belongs to the warehouse, so it is not visible in the SQL analytics endpoint of the lakehouse.
To schedule data transfer and keep the aggregate table updated, you can use Dataflow Gen2 in Microsoft Fabric.
After selecting the Lakehouse in Get Data, you can perform the aggregation, set the destination to the Warehouse, and configure a scheduled refresh after publishing, so that when your Lakehouse is updated, the aggregate table is refreshed as well.
Best Regards,
Yang
Community Support Team
If any post helps, please consider accepting it as the solution to help other members find it more quickly.
If I have misunderstood your needs or you still have problems, please feel free to let us know. Thanks a lot!
Hi @Shanthan118 ,
Thanks for the reply from @frithjof_v / @AlexanderPowBI .
Transferring data from a Lakehouse to a Warehouse using a notebook can be a little difficult, so here is a more convenient approach.
Once you're in the warehouse in Fabric, select the place I've highlighted.
Select the Lakehouse you want to connect to.
Now it's time to use your connected Lakehouse.
If you have any other questions please feel free to contact me.
Best Regards,
Yang
Community Support Team
If any post helps, please consider accepting it as the solution to help other members find it more quickly.
If I have misunderstood your needs or you still have problems, please feel free to let us know. Thanks a lot!
Hi @v-huijiey-msft
First of all, thanks for the solution.
From the solution you gave, what I understood is that I can connect the lakehouse that has the required tables directly to the warehouse and build aggregate tables on top of that warehouse.
Is my understanding correct?
If we create an aggregate table, can this table be accessed from the Lakehouse SQL endpoint?
Also, how can we schedule the data transfer to keep the aggregate tables updated? And will changes in the lakehouse data flow through to the warehouse? For example, suppose I have 20 records when I first connect and later the record count grows to 40; will all 40 records be available in the warehouse table or not?
Just a heads-up that TRUNCATE is not a supported operation as of now:
https://learn.microsoft.com/en-us/fabric/data-warehouse/tsql-surface-area
For your question, I have used a stored procedure and scheduled it in a pipeline for this data movement. However, you can also do it along the lines sketched below.
//Alexander
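To illustrate the idea, here is a minimal sketch of a full load that avoids TRUNCATE by using DELETE and then rebuilds the aggregate from the lakehouse table with a cross-database INSERT ... SELECT. It assumes you already have a pyodbc connection to the warehouse SQL endpoint (for example via the service principal code later in this thread), that the lakehouse and warehouse sit in the same workspace so the lakehouse table can be referenced with three-part naming, and that all object names (dbo.agg_sales, MyLakehouse.dbo.sales, the columns) are placeholders.

def full_load_aggregate(conn):
    # Full load without TRUNCATE: empty the target table with DELETE,
    # then rebuild it from the lakehouse table via three-part naming.
    # All table and column names here are placeholders.
    cursor = conn.cursor()
    cursor.execute("DELETE FROM dbo.agg_sales;")
    cursor.execute("""
        INSERT INTO dbo.agg_sales (product_id, total_amount)
        SELECT product_id, SUM(amount)
        FROM MyLakehouse.dbo.sales
        GROUP BY product_id;
    """)
    conn.commit()  # pyodbc connections are not autocommit by default
    cursor.close()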
Hi @AlexanderPowBI
Thanks for the solution,
Can you please provide a script or the steps to connect to the warehouse from Fabric using the service principal and the SQL connection string with pyodbc?
Hi,
Here is one example I have used.
First, create a service principal and make sure it has the right permissions and workspace/warehouse access.
Below is example code I have used to get access using a service principal (I followed this article: https://debruyn.dev/2023/connect-to-fabric-lakehouses-warehouses-from-python-code/). You can modify this and use it in a Fabric notebook. It is also a good idea to store the service principal secret in Key Vault and retrieve it from there.
//Alexander
import logging
import os
import struct
from itertools import chain, repeat

import pyodbc
from azure.identity import ClientSecretCredential

def get_database_connection():
    try:
        tenant_id = os.getenv('TENANT_ID')
        client_id = os.getenv('CLIENT_ID')      # Service principal application (client) ID
        client_secret = os.getenv('SP_SECRET')  # Service principal secret
        sql_endpoint = os.getenv('YOUR_SQL_ENDPOINT')
        database = os.getenv('DATABASE_NAME')

        credential = ClientSecretCredential(tenant_id, client_id, client_secret)
        token_object = credential.get_token("https://database.windows.net//.default")  # Retrieve an access token valid to connect to SQL databases
        token_as_bytes = bytes(token_object.token, "UTF-8")  # Convert the token to a UTF-8 byte string
        encoded_bytes = bytes(chain.from_iterable(zip(token_as_bytes, repeat(0))))  # Encode the bytes to a Windows byte string
        token_bytes = struct.pack("<i", len(encoded_bytes)) + encoded_bytes  # Package the token into a bytes object

        connection_string = f"Driver={{ODBC Driver 18 for SQL Server}};Server={sql_endpoint},1433;Database={database};Encrypt=Yes;TrustServerCertificate=No"
        attrs_before = {1256: token_bytes}  # 1256 = SQL_COPT_SS_ACCESS_TOKEN, passes the access token to the driver
        return pyodbc.connect(connection_string, attrs_before=attrs_before)
    except Exception as e:
        logging.error(f"Error in get_database_connection: {str(e)}")
        raise
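For example, a minimal way to use this connection from a Fabric notebook could look like the sketch below. Whereas the earlier sketch pushes the aggregation into the warehouse with a cross-database query, here the aggregation is done in Spark and the result is written through pyodbc; agg_df, the target table dbo.agg_sales and its columns are placeholder names.

# Minimal usage sketch (placeholder names throughout): compute the aggregate in Spark,
# then do a full load of the warehouse table through the pyodbc connection above.
agg_df = spark.sql("""
    SELECT product_id, SUM(amount) AS total_amount
    FROM sales
    GROUP BY product_id
""")  # 'spark' is the session available in a Fabric notebook; 'sales' is a placeholder lakehouse table

conn = get_database_connection()
cursor = conn.cursor()

cursor.execute("DELETE FROM dbo.agg_sales;")  # DELETE instead of TRUNCATE, which is not supported
cursor.executemany(
    "INSERT INTO dbo.agg_sales (product_id, total_amount) VALUES (?, ?);",
    [(row["product_id"], row["total_amount"]) for row in agg_df.collect()],  # collect() only for small aggregates
)

conn.commit()
cursor.close()
conn.close()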
I don't think it's possible to write data to a Fabric Data Warehouse by using a notebook.
I think you can only use T-SQL (scripts, stored procedures) inside the Data Warehouse interface, or use a Data Pipeline or Dataflow Gen2.
However, I am interested if anyone knows how to write data to a Fabric Data Warehouse by using a notebook.