kirah2128
Helper II

How to apply a machine learning model in Fabric?

Dear All,

 

Is there any link or tutorial on how to do machine learning and deploy the models to production using Fabric?

 

1. My datasets are stored in a Lakehouse.

2. We trained the model and saved it in Fabric.

[Screenshot: kirah2128_0-1715683630675.png]

But when I apply the script below to score the new data coming from the lakehouse, it fails with:

RuntimeError: Unable to get model info: Registered Model with name=component_classification_v3 not found

from pyspark.sql import SparkSession
from pyspark.sql.functions import col, udf
from pyspark.sql.types import ArrayType, IntegerType
from transformers import BertTokenizer
import mlflow
from synapse.ml.predict import MLFlowTransformer

# Initialize Spark session
spark = SparkSession.builder.getOrCreate()

# Load data from your Spark SQL environment or DataFrame
df = spark.sql("SELECT removal_reasons, reliability_tracked FROM lakehouse1.part_removal")

# Initialize the tokenizer
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')

# Tokenization function
def tokenize_text(text):
    tokens = tokenizer(text, padding="max_length", truncation=True, max_length=128, return_tensors="np")
    return tokens['input_ids'][0].tolist(), tokens['attention_mask'][0].tolist()

# Register UDF for tokenization
@udf(ArrayType(IntegerType()))
def udf_tokenize_input_ids(text):
    return tokenize_text(text)[0]

@udf(ArrayType(IntegerType()))
def udf_tokenize_attention_mask(text):
    return tokenize_text(text)[1]

# Apply the UDF to add tokenized columns
df = df.withColumn("input_ids", udf_tokenize_input_ids(col("removal_reasons")))
df = df.withColumn("attention_mask", udf_tokenize_attention_mask(col("removal_reasons")))

# Ensure the DataFrame has the correct format
df = df.select("input_ids", "attention_mask", "reliability_tracked")

# Configure MLflow
mlflow.set_tracking_uri("UNKNOWN")  # Your MLflow tracking URI here
mlflow.set_experiment("Notebook-1")  # Your experiment name here

# Load the model
model = MLFlowTransformer(
    inputCols=["input_ids", "attention_mask"],  # Your input columns here
    outputCol="predictions",  # Your new column name here
    modelName="component_classification_v3",  # Your model name here
    modelVersion=1  # Your model version here
)

# Transform the data using the model
predicted_df = model.transform(df)

# Write the predictions to Delta Lake
predicted_df.write.format('delta').mode("overwrite").save("predicted_amos_part_removal")  # Your output table filepath here

Regards,

King

 

7 REPLIES
Anonymous
Not applicable

Hi @kirah2128 ,

Thanks for using Fabric Community.
Did you get a chance to look at this doc: Machine learning model - Microsoft Fabric | Microsoft Learn?

Hope this is helpful. Do let me know in case of further queries.

Hi, the link is not helpful.


What I want to achieve now is to deploy the model. There is a wizard option, but that won't work if the input is converted to other data types - in my case, a TensorFlow data type. I attached a picture below for your reference.

[Screenshot: kirah2128_1-1715686515298.png]

Anonymous
Not applicable

Hi @kirah2128 ,

Can you please check your input data and also the model name and version configuration on your end? For example, you can first list what is actually registered in the workspace, as shown below.
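One common cause of that "Registered Model ... not found" error is that the model was logged to an experiment run but never registered under that exact name in the workspace the notebook is attached to. A minimal sketch for checking this from the notebook (the model name is taken from your snippet; everything else is illustrative):

import mlflow
from mlflow.tracking import MlflowClient

client = MlflowClient()

# List every registered model visible from this notebook's workspace
for rm in client.search_registered_models():
    print(rm.name)

# If the expected name shows up, list its versions to confirm version 1 exists
for mv in client.search_model_versions("name='component_classification_v3'"):
    print(mv.name, mv.version)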

Do let me know in case of further queries.

This is the model:

[Screenshot: kirah2128_0-1715774664395.png]

Here is how I supply the model with the new inputs from the lakehouse source:

from pyspark.sql import SparkSession
from pyspark.sql.functions import col, udf
from pyspark.sql.types import ArrayType, IntegerType
from transformers import BertTokenizer

# Initialize Spark session
spark = SparkSession.builder.getOrCreate()

# Load data from your Spark SQL environment or DataFrame
df = spark.sql("SELECT removal_reasons, reliability_tracked FROM lakehouse1.part_removal")

# Initialize the tokenizer
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')

# Tokenization function
def tokenize_text(text):
    tokens = tokenizer(text, padding="max_length", truncation=True, max_length=128, return_tensors="np")
    return tokens['input_ids'][0].tolist(), tokens['attention_mask'][0].tolist()

# Register UDF for tokenization
@udf(ArrayType(IntegerType()))
def udf_tokenize_input_ids(text):
    return tokenize_text(text)[0]

@udf(ArrayType(IntegerType()))
def udf_tokenize_attention_mask(text):
    return tokenize_text(text)[1]

# Apply the UDF to add tokenized columns
df = df.withColumn("input_ids", udf_tokenize_input_ids(col("removal_reasons")))
df = df.withColumn("attention_mask", udf_tokenize_attention_mask(col("removal_reasons")))

# Ensure the DataFrame has the correct format
df = df.select("input_ids", "attention_mask", "reliability_tracked")
Anonymous
Not applicable

Hi @kirah2128 ,

I found this link on YouTube - click here.
It looks like a similar issue, and he made some changes to get the code to work.

Before -

[Screenshot: vgchennamsft_0-1715853673225.png]


After -

[Screenshot: vgchennamsft_1-1715853758393.png]


Can you please check the video and let me know if it is helpful.
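Also, if it turns out the model was only logged to a run and not registered, one option is to register that run's model explicitly under the name your MLFlowTransformer expects, then retry. A rough sketch (the "<run_id>" and the "model" artifact path are placeholders - use the actual values from your training run):

import mlflow

# Register the logged model artifact under the name the transformer looks up
result = mlflow.register_model(
    model_uri="runs:/<run_id>/model",   # placeholder: your training run ID and artifact path
    name="component_classification_v3"
)
print(result.name, result.version)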

Anonymous
Not applicable

Hello @kirah2128 ,

We haven't heard from you since the last response and were just checking back to see whether you have a resolution yet.
If you do, please share it with the community, as it can be helpful to others.
Otherwise, we will respond with more details and try to help.

Anonymous
Not applicable

Hi @kirah2128 ,

We haven't heard from you since the last response and were just checking back to see whether you have a resolution yet.
If you do, please share it with the community, as it can be helpful to others.
Otherwise, we will respond with more details and try to help.
