Microsoft Fabric's Lakehouse helps us manage enterprise-level data environments in a unified way. As we move toward AI, we cannot do without this enterprise data. In my previous blog post, I described how to build RAG applications on data in the Microsoft Fabric environment. In this post, I will show how to build a RAG application with prompt flow in a more specialized machine learning environment: Azure Machine Learning Service combined with Microsoft Fabric's Lakehouse data.
Using_Microsoft_Fabrics_Lakehouse_Data_and_prompt_flow_in_Azure_Machine_Learning
Azure Machine Learning Service is a machine learning platform that I enjoy using. It covers the machine learning process from data preparation through training, testing, deployment, and monitoring. A short script is enough to bring Microsoft Fabric Lakehouse data into Azure Machine Learning Service.
1. Get the ABFS path of the Lakehouse in Microsoft Fabric.
Choose your Microsoft Fabric Lakehouse, then click Files -> Properties.
Copy the ABFS path. It has the following form:
abfss://<One Lake workspace name>@msit-onelake.dfs.fabric.microsoft.com/<Lakehouse ID>/Files
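The ABFS path carries the three values the datastore definition in the next step needs: the OneLake workspace name, the endpoint, and the Lakehouse ID. As a minimal sketch, assuming the path follows exactly the form shown above, the helper below splits it into those parts. The function name `parse_onelake_abfs_path` and the sample values are hypothetical and only serve as illustration.

```python
def parse_onelake_abfs_path(abfs_path: str) -> dict:
    """Split a OneLake ABFS path into workspace name, endpoint, and Lakehouse ID."""
    prefix = "abfss://"
    if not abfs_path.startswith(prefix):
        raise ValueError(f"Expected a path starting with {prefix!r}: {abfs_path!r}")

    # What follows the scheme is "<workspace>@<endpoint>/<Lakehouse ID>/Files".
    authority, _, path = abfs_path[len(prefix):].partition("/")
    workspace, separator, endpoint = authority.partition("@")
    if not separator or not workspace or not endpoint:
        raise ValueError(f"Expected '<workspace>@<endpoint>' in: {abfs_path!r}")

    segments = [segment for segment in path.split("/") if segment]
    if not segments:
        raise ValueError(f"No Lakehouse ID found in: {abfs_path!r}")

    return {
        "one_lake_workspace_name": workspace,
        "endpoint": endpoint,
        "lakehouse_id": segments[0],
        "sub_path": "/".join(segments[1:]),
    }


# Hypothetical sample values, not taken from a real workspace.
sample_path = (
    "abfss://my-workspace@msit-onelake.dfs.fabric.microsoft.com"
    "/00000000-0000-0000-0000-000000000000/Files"
)
parts = parse_onelake_abfs_path(sample_path)
print(parts)
```

The returned `one_lake_workspace_name`, `endpoint`, and `lakehouse_id` map onto the parameters used when the datastore is created.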
2. Create a new notebook on your local machine. Run the following code to import the Lakehouse data into Azure Machine Learning Service.
! pip install azure-ai-ml -U
! pip install mltable azureml-dataprep[pandas] -U
! pip install azureml-fsspec -U
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential
from azure.ai.ml.entities import OneLakeDatastore, OneLakeArtifact
subscription_id = "Your Azure Subscription ID"
resource_group = "Your Azure Machine Learning Service Workspace Resource Group"
workspace = "Your Azure Machine Learning Service Workspace Name"
ml_client = MLClient(
    DefaultAzureCredential(), subscription_id, resource_group, workspace
)
# Replace the placeholder with the Lakehouse ID from the ABFS path.
artifact = OneLakeArtifact(
    name="<Lakehouse ID>",
    type="lake_house"
)
# Replace the placeholder with the OneLake workspace name from the ABFS path.
store = OneLakeDatastore(
    name="onelake_lh_for_azureml",
    description="Credential-less OneLake datastore.",
    endpoint="msit-onelake.dfs.fabric.microsoft.com",
    artifact=artifact,
    one_lake_workspace_name="<One Lake workspace name>",
)
# Register (or update) the datastore in the workspace.
ml_client.create_or_update(store)
3. Test the data to see if it is imported successfully.
from azure.ai.ml.constants import AssetTypes, InputOutputModes
from azureml.fsspec import AzureMachineLearningFileSystem
uri = 'azureml://subscriptions/<Your Azure Subscription ID>/resourcegroups/<Your Azure Machine Learning Service Resource Group>/workspaces/<Your Azure Machine Learning Service Workspace Name>/datastores/onelake_lh_for_azureml'
# create the filesystem
fs = AzureMachineLearningFileSystem(uri)
# list the contents of the datastore root
fs.ls()
# read a CSV file; the with block closes the file automatically
with fs.open('Files/csv/sales.csv') as f:
    data = f.readlines()
print(data[0:5])
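The long datastore URI is easy to mistype when it is edited by hand. As a small sketch, assuming the URI layout shown above (`azureml://subscriptions/.../resourcegroups/.../workspaces/.../datastores/...`, optionally followed by `/paths/<path>`), a helper can assemble it from its parts and trim stray whitespace. The function name `build_datastore_uri` and the sample values are hypothetical.

```python
def build_datastore_uri(subscription_id, resource_group, workspace, datastore, path=""):
    """Assemble a long-form azureml:// datastore URI from its components."""
    components = {
        "subscription_id": subscription_id,
        "resource_group": resource_group,
        "workspace": workspace,
        "datastore": datastore,
    }
    # Trim whitespace so a stray space in a value does not end up in the URI.
    cleaned = {key: value.strip() for key, value in components.items()}
    for key, value in cleaned.items():
        if not value:
            raise ValueError(f"{key} must not be empty")

    uri = (
        f"azureml://subscriptions/{cleaned['subscription_id']}"
        f"/resourcegroups/{cleaned['resource_group']}"
        f"/workspaces/{cleaned['workspace']}"
        f"/datastores/{cleaned['datastore']}"
    )
    path = path.strip().strip("/")
    if path:
        uri += f"/paths/{path}"
    return uri


# Hypothetical sample values for illustration only.
datastore_uri = build_datastore_uri(
    "00000000-0000-0000-0000-000000000000 ",
    "my-resource-group",
    "my-aml-workspace",
    "onelake_lh_for_azureml",
)
print(datastore_uri)
```

The resulting string can be passed to `AzureMachineLearningFileSystem` in place of the hand-written `uri`.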
You can also register the folder as a data asset in Azure Machine Learning Service and read it back to confirm that the data was imported successfully.
from azure.ai.ml.entities import Data
import pandas as pd
import mltable
csv_path = 'azureml://datastores/onelake_lh_for_azureml/paths/Files/csv'
my_csv_data = Data(
    path=csv_path,
    type=AssetTypes.URI_FOLDER,
    description="demo",
    name="csv_data_source",
    version="1.0.0"
)
ml_client.data.create_or_update(my_csv_data)
csv_data = ml_client.data.get("csv_data_source", version="1.0.0")
path = {
    'folder': csv_data.path
}
tbl = mltable.from_delimited_files(paths=[path])
df = pd.read_csv(csv_data.path + '/sales.csv')
df
Of course, you can also check the data in the Azure Machine Learning Service workspace to confirm that it has been synchronized correctly.
In my previous post we used Semantic Kernel. In this post, we use prompt flow to build the application. Prompt flow is a development tool designed to streamline the entire development cycle of AI applications powered by Large Language Models (LLMs). It simplifies prototyping, experimenting, iterating on, and deploying LLM-based AI applications.
The biggest strength of prompt flow is that it integrates prompt engineering into the rest of the project. In particular, it helps stabilize LLM output by letting you compare prompts, choose the best one, and combine it with the LLM.
Prompt flow applications can be developed in Azure Machine Learning Service, on the command line, or in Visual Studio Code. I recommend developing in Visual Studio Code. First, install the prompt flow for VS Code extension.
After the extension is installed, click the prompt flow extension in the left sidebar and select Install dependencies. Once the environment is configured, you can create and build a prompt flow application.
Prompt flow supports different connections, such as Azure OpenAI Service, Azure Cognitive Search, and Azure Content Safety, as well as custom connections. You can configure them according to your needs.
The custom connection is used often. It stores connection settings, mainly in the form of key-value pairs.
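To make the key-value idea concrete, here is a minimal plain-Python sketch of such a settings container. It does not use the prompt flow SDK. The class name `ConnectionSettings`, the split into `configs` and `secrets`, and all keys and values are assumptions made for illustration, chosen to show why secret values are kept apart from plain configuration.

```python
from dataclasses import dataclass, field


@dataclass
class ConnectionSettings:
    """Hypothetical container for connection settings stored as key-value pairs."""
    name: str
    configs: dict = field(default_factory=dict)   # non-sensitive values
    secrets: dict = field(default_factory=dict)   # sensitive values such as API keys

    def get(self, key: str) -> str:
        """Look a key up in secrets first, then in configs."""
        if key in self.secrets:
            return self.secrets[key]
        return self.configs[key]

    def describe(self) -> dict:
        """Return a printable view in which every secret value is masked."""
        return {
            "name": self.name,
            "configs": dict(self.configs),
            "secrets": {key: "******" for key in self.secrets},
        }


# Hypothetical keys and values for illustration only.
settings = ConnectionSettings(
    name="my_custom_connection",
    configs={"endpoint": "https://example.invalid/api"},
    secrets={"api_key": "not-a-real-key"},
)
print(settings.describe())
```

A node in a flow would read values through `settings.get("endpoint")` or `settings.get("api_key")`, while logs only ever see the masked view.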
Use prompt flow to quickly build a flow for enterprise data. The following are implementations for structured data and unstructured data, as well as a simple example of a chat flow. All of this data comes from our Azure Machine Learning Service workspace (imported from the Microsoft Fabric Lakehouse).
This is a RAG application for unstructured and structured data built with prompt flow.
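For readers new to RAG, the core of such a flow can be sketched in a few lines of plain Python: retrieve the passages most relevant to the question, then place them in the prompt sent to the LLM. This is not the flow from this post. The word-overlap scoring stands in for a real vector or search index, the LLM call is left out, and the function names and sample documents are hypothetical.

```python
import re


def tokenize(text: str) -> set:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, documents: list, top_k: int = 2) -> list:
    """Return up to top_k documents that share the most words with the question."""
    question_words = tokenize(question)
    scored = []
    for index, document in enumerate(documents):
        overlap = len(question_words & tokenize(document))
        if overlap > 0:
            scored.append((overlap, index, document))
    # Highest overlap first; ties keep the original document order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [document for _, _, document in scored[:top_k]]


def build_prompt(question: str, context_documents: list) -> str:
    """Combine the retrieved context and the question into a single prompt."""
    if context_documents:
        context = "\n".join(f"- {document}" for document in context_documents)
    else:
        context = "- (no relevant context found)"
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\n"
    )


# Hypothetical sample documents, not taken from the Lakehouse data.
documents = [
    "Sales in March were 120 units for product A.",
    "The return policy allows refunds within 30 days.",
    "Product B launched in April.",
]
question = "What were the sales for product A in March?"
retrieved = retrieve(question, documents)
prompt = build_prompt(question, retrieved)
print(prompt)
```

In a flow, each of these functions would typically become its own node, with the prompt passed on to an LLM node.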
You can download the samples from my GitHub repo.