Hi! I have saved my ML model in Fabric and import it into my notebook using the code below. How can I get the probability for each prediction instead of 0 or 1?
Hello @Ulrich
The key idea is to create a custom PyFunc model that calls `predict_proba` or an equivalent function on your base model.
You can make MLFlowTransformer return probabilities by packaging a model whose predict function itself outputs probabilities, rather than just class labels. In other words, if the underlying model supports something like `predict_proba`, you need to ensure that the MLflow model’s prediction method calls that instead of `predict` when it runs.
One way to do this is to define a custom PyFunc model that wraps your existing classifier and overrides its predict method to invoke `predict_proba`. For a scikit-learn model, for example, you could do something like:
import mlflow.pyfunc
import mlflow.sklearn

class ProbaWrapper(mlflow.pyfunc.PythonModel):
    def load_context(self, context):
        # Load the underlying model (scikit-learn, XGBoost, etc.)
        self.model = mlflow.sklearn.load_model(context.artifacts["base_model"])

    def predict(self, context, model_input):
        # Return probability outputs instead of class labels
        return self.model.predict_proba(model_input)

# Train or load your existing model (e.g. a scikit-learn classifier),
# then log it in MLflow with a 'base_model' artifact, wrapped in ProbaWrapper:
with mlflow.start_run():
    mlflow.pyfunc.log_model(
        artifact_path="proba_model",
        python_model=ProbaWrapper(),
        artifacts={"base_model": "<path_or_registered_model_reference>"},
    )
Register that model in Fabric, then use the MLFlowTransformer just as before (pointing `modelName` and `modelVersion` to this custom PyFunc model). The result of `model.transform(df)` will now be per-class probabilities instead of 0/1 predictions.
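To see the difference this makes, here is a minimal standalone sketch (using a toy scikit-learn classifier, not your actual model) contrasting what `predict` and `predict_proba` return:

```python
# Toy example: predict returns hard class labels, while predict_proba
# returns one probability per class for every row.
import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0, 0, 1, 1])

clf = LogisticRegression().fit(X, y)

labels = clf.predict(X)       # hard 0/1 labels, shape (n_samples,)
probs = clf.predict_proba(X)  # probabilities, shape (n_samples, n_classes)

print(labels.shape)  # (4,)
print(probs.shape)   # (4, 2) for a binary classifier
```

Each row of `probs` sums to 1, and column order follows `clf.classes_`, so `probs[:, 1]` is the probability of class 1.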
Hope this helps
Hello @Ulrich,
Thank you for posting your query in the Microsoft Fabric community forum.
Upon investigating your concern, we found that the code you are using appears correct. However, to obtain the probabilities for each prediction, please make the following modification:
Instead of using:
df_selection = df.select("MED_KEY", "predictions")
we recommend replacing it with one of the following:
If this helps, then please accept it as a solution and drop a "Kudos" so other members can find it more easily.
Thank you.
Thank you for your answer. But I don't think this will work since predictions is a column with a value and can't be called like that.
Thank you! This will work.
Another way, if I don't want to use wrappers, is this solution: