Read data from a lakehouse table from an Azure Function

Hi,

I need to read data from a lakehouse table in Fabric with an Azure Function (Python), but I cannot find any code examples. Can anyone provide an example or explain how to achieve this?

Regards from Norway!

1 ACCEPTED SOLUTION

Hi, I found that the easiest way to solve this was to write the data I need out as a CSV file from a DataFrame in a Fabric Python notebook, and then read that file in the Azure Function into a pandas DataFrame.

This is the Azure Python Function code:

from azure.storage.filedatalake import DataLakeServiceClient
from azure.identity import DefaultAzureCredential
import pandas as pd
from io import StringIO

# Set your account and workspace details
ACCOUNT_NAME = "onelake"
WORKSPACE_NAME = "GUID"  # Workspace GUID
DIRECTORY_PATH = "GUID/Files/FILENAME"  # Directory path containing the CSV file

def main():
    # Create a service client using the default Azure credential
    account_url = f"https://{ACCOUNT_NAME}.dfs.fabric.microsoft.com"
    token_credential = DefaultAzureCredential()
    service_client = DataLakeServiceClient(account_url, credential=token_credential)

    # Create a file system client for the workspace
    file_system_client = service_client.get_file_system_client(WORKSPACE_NAME)

    # List all files in the specified directory
    paths = file_system_client.get_paths(path=DIRECTORY_PATH)

    # Find the CSV file in the directory
    csv_file_path = None
    for path in paths:
        if path.name.endswith('.csv'):
            csv_file_path = path.name
            break  # Stop after finding the first CSV file

    if csv_file_path:
        print(f"Found CSV file: {csv_file_path}")
       
        # Create a file client for the specific CSV file
        file_client = file_system_client.get_file_client(csv_file_path)

        # Download the file content
        download = file_client.download_file()
        downloaded_bytes = download.readall()

        # Convert the downloaded bytes to a StringIO object for use with pandas
        csv_data = StringIO(downloaded_bytes.decode('utf-8'))

        # Load the CSV data into a pandas DataFrame
        df = pd.read_csv(csv_data)

        # Print or process the DataFrame
        print(df)
    else:
        print("No CSV file found in the directory.")
 


2 REPLIES

collinq
Super User

Hi @AslakJonhaugen ,

To do this you must first connect to your lakehouse. For Python, I believe Direct Lake mode is the best fit, but I am not an expert in this area.

Then, enable Python scripting (in Power BI Desktop this is under File > Options and settings > Options > Python scripting).

Then, use your script to connect and perform your operations, as sketched below.
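Another option, if you want to skip the CSV detour and read the Delta table behind the lakehouse directly, is the deltalake (delta-rs) Python package, which can read Delta tables over the OneLake endpoint without Spark. A minimal sketch, assuming WORKSPACE, LAKEHOUSE, and my_table are placeholders for your own item names, and that your installed deltalake version supports the bearer_token and use_fabric_endpoint storage options:

from azure.identity import DefaultAzureCredential
from deltalake import DeltaTable

# Acquire an Azure Storage token; OneLake accepts the same audience.
token = DefaultAzureCredential().get_token("https://storage.azure.com/.default").token

# abfss URL to the table; WORKSPACE, LAKEHOUSE and my_table are placeholders.
table_url = (
    "abfss://WORKSPACE@onelake.dfs.fabric.microsoft.com/"
    "LAKEHOUSE.Lakehouse/Tables/my_table"
)

dt = DeltaTable(
    table_url,
    storage_options={
        "bearer_token": token,
        "use_fabric_endpoint": "true",  # route requests via the Fabric OneLake endpoint
    },
)

# Materialize the table's current snapshot as a pandas DataFrame.
df = dt.to_pandas()
print(df.head())

Reading the table this way avoids the intermediate export step entirely, which fits the lightweight runtime of an Azure Function.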



