
Read data from a lakehouse table from an Azure Function

Hi,

I need to read data from a lakehouse table in Fabric with an Azure Function (Python). However, I cannot find any code examples. Can anyone provide an example or explain how to achieve this?

Regards from Norway!

1 ACCEPTED SOLUTION

Hi, I found that the easiest way to solve this was to write the data the Azure Function needs to a CSV file from a DataFrame in a Fabric Python notebook, and then read that CSV in the Azure Function with a pandas DataFrame (see the notebook-side sketch after the function code below):

This is the Azure Function (Python) code:

from azure.storage.filedatalake import DataLakeServiceClient
from azure.identity import DefaultAzureCredential
import pandas as pd
from io import StringIO

# Set your account and workspace details
ACCOUNT_NAME = "onelake"
WORKSPACE_NAME = "GUID"  # Workspace GUID
DIRECTORY_PATH = "GUID/Files/FILENAME"  # Directory path containing the CSV file

def main():
    # Create a service client using the default Azure credential
    account_url = f"https://{ACCOUNT_NAME}.dfs.fabric.microsoft.com"
    token_credential = DefaultAzureCredential()
    service_client = DataLakeServiceClient(account_url, credential=token_credential)

    # Create a file system client for the workspace
    file_system_client = service_client.get_file_system_client(WORKSPACE_NAME)

    # List all files in the specified directory
    paths = file_system_client.get_paths(path=DIRECTORY_PATH)

    # Find the CSV file in the directory
    csv_file_path = None
    for path in paths:
        if path.name.endswith('.csv'):
            csv_file_path = path.name
            break  # Stop after finding the first CSV file

    if csv_file_path:
        print(f"Found CSV file: {csv_file_path}")
       
        # Create a file client for the specific CSV file
        file_client = file_system_client.get_file_client(csv_file_path)

        # Download the file content
        download = file_client.download_file()
        downloaded_bytes = download.readall()

        # Convert the downloaded bytes to a StringIO object for use with pandas
        csv_data = StringIO(downloaded_bytes.decode('utf-8'))

        # Load the CSV data into a pandas DataFrame
        df = pd.read_csv(csv_data)

        # Print or process the DataFrame
        print(df)
    else:
        print("No CSV file found in the directory.")
 


2 REPLIES

collinq
Super User

Hi @AslakJonhaugen ,

 

To do this you must first connect to your lakehouse. For Python, I believe that Direct Lake mode is the best option, but I am not an expert in this area.

 

Then, configure Python scripting in Power BI Desktop under File > Options and settings > Options > Python scripting.

 

Then, use your script to connect and do your operations.
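
For reference, it should also be possible to read the lakehouse Delta table directly from Python, skipping the CSV step, using the open-source deltalake package. This is a minimal sketch, assuming a recent deltalake version with OneLake endpoint support; WORKSPACE, LAKEHOUSE, and TABLENAME are placeholders:

from azure.identity import DefaultAzureCredential
from deltalake import DeltaTable

# Acquire a bearer token for Azure Storage (OneLake accepts the same audience)
token = DefaultAzureCredential().get_token("https://storage.azure.com/.default").token

# Placeholder OneLake path: workspace, lakehouse, and table names are assumptions
table_uri = ("abfss://WORKSPACE@onelake.dfs.fabric.microsoft.com/"
             "LAKEHOUSE.Lakehouse/Tables/TABLENAME")

dt = DeltaTable(
    table_uri,
    storage_options={
        "bearer_token": token,
        "use_fabric_endpoint": "true",  # treat the endpoint as OneLake, not ADLS Gen2
    },
)

df = dt.to_pandas()  # load the Delta table into a pandas DataFrame
print(df.head())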







