

Find articles, guides, information and community news

Sahir_Maharaj
Super User

In this edition, we will explore the art of uncovering hidden patterns in your data using KMeans and DBSCAN. By the time you finish reading, you’ll have a clear sense of how each algorithm thinks, how to decide which one fits your data, and how to interpret the clusters they create. You’ll observe how KMeans brings structure and precision, while DBSCAN adds flexibility and adaptability for messier, real-world data. We’ll also bring in UMAP, a powerful tool that turns complex, high-dimensional data into something you can actually understand.

Read more...

Sahir_Maharaj
Super User

In this edition, we will explore cross-validation strategies and how to put them to work using scikit-learn’s model_selection module. By the time you’re done, you’ll know when to reach for K-Fold versus StratifiedKFold, how to handle grouped data with GroupKFold, and why TimeSeriesSplit is the only safe option for time-based problems. We’ll also walk through practical Python examples in Microsoft Fabric so you can see these strategies in action and apply them right away.

Read more...

Sahir_Maharaj
Super User

In this edition, we will explore model evaluation metrics. By the time you’re through, you’ll know how to make sense of precision, recall, F1-score, AUC, and MCC in plain language, and more importantly, when to reach for each one depending on the problem in front of you. I’ll also show you how to implement these metrics in Python using scikit-learn, and how to bring them to life with visualizations like precision-recall and ROC curves.

Read more...

Sahir_Maharaj
Super User

In this edition, we will explore one of the most common challenges in feature engineering - how to handle categorical data. I’ll walk you through three different encoding techniques: one-hot encoding, ordinal encoding, and target encoding. Along the way, I’ll show you how each method works, when it makes sense to use it, and how to put it into practice with pandas and scikit-learn. We’ll start simple, then build up to more advanced approaches, so by the time you’re done, you’ll not only know how to transform categories into numbers but also which encoding strategy gives your model the best shot at success.

Read more...

Sahir_Maharaj
Super User

What you will learn: In this edition, we’re exploring how to fill in those gaps in your dataset without losing its integrity. By the time you’re through, you’ll know exactly how to handle missing values using sklearn.impute for quick, reliable fixes, fancyimpute for more advanced, context-aware approaches, and KNNImputer when similarity-based estimates make the most sense. You’ll learn when each technique shines, when it’s best to avoid them, and how to put them into action in Python using a Microsoft Fabric notebook.

Read more...

Sahir_Maharaj
Super User

In this edition, you will learn how to detect outliers and handle them using NumPy, SciPy, and Seaborn inside Microsoft Fabric’s Python environment. By the time you’re done, you’ll know how to spot unusual data points using statistical methods, confirm them visually with clear, informative plots, and decide whether to remove, transform, or cap them based on context. And because finding outliers is only half the story, you’ll also learn how to build the instinct to know when those “odd” values are actually your most valuable insights.

Read more...

Sahir_Maharaj
Super User

In this edition, I will take you into a deeper layer of Pandas using Microsoft Fabric. By the time you’re done, you’ll know how to reshape your data with precision, using tools like melt() and pivot_table() to get it into exactly the structure you need. You’ll learn how to go beyond basic .groupby() operations, building complex aggregations and transformations that give richer insights without losing important detail. And because clean, maintainable code matters just as much as correct results, we’ll wrap it all together with method chaining and .pipe() so your transformations read like a clear story from start to finish.

Read more...

Rufyda
Kudo Kingpin

MLflow is a powerful tool that helps you manage your machine learning (ML) projects.

In Microsoft Fabric, MLflow makes it easier to train, track, and use your models to make predictions on new data.

 

What is MLflow?


MLflow is an open-source platform to manage the ML lifecycle, including:

Tracking experiments

Logging model parameters and metrics

Saving and versioning models

Reusing models for predictions

 

Using MLflow in Microsoft Fabric helps you organize and reproduce your work easily.

Steps to Use MLflow in Microsoft Fabric:

 

1. Create an Experiment


Start by creating an experiment. Each time you train a model, MLflow records that training as a run under the experiment, which lets you keep track of each version of your model.

 

2. Log Parameters and Metrics

During training, use MLflow to log:

 

Model parameters (like learning rate or depth)

Metrics (like accuracy or RMSE)

This helps you compare different models later.

 

3. Save the Model
After training, save the model in Microsoft Fabric. MLflow stores it along with:

 

The model file (like a .pkl file)

A metadata file called MLmodel

The environment files needed to run the model (such as conda.yaml and requirements.txt)

 

 

What is the MLmodel File?
The MLmodel file includes:

 

Path to the model (where it’s saved)

Flavors (which ML library was used, like scikit-learn)

Signature (what kind of input the model expects and what output it gives)

Customizing Model Behavior


Sometimes the inputs and outputs a model expects need to be stated explicitly so that it works with new data. You can define the input and output schema (the model signature) using MLflow:

Define input columns (e.g., age, gender, BMI)

Define output (e.g., prediction result)

This is important when applying the model to different datasets.

Using the Model for Batch Predictions

 

After saving the model, you can use it to make batch predictions in Microsoft Fabric:

1. Prepare the New Data
Make sure your data is in the correct format. The column names and types should match what the model expects.

 

2. Store Data in Delta Tables
Microsoft Fabric stores lakehouse tables in the Delta format. To save or load data with Spark:

 

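# Save data as a Delta table (df is an existing Spark DataFrame).
# mode("overwrite") replaces the table if it already exists.
df.write.format("delta").mode("overwrite").save("Tables/new_table")

# Read the Delta table back into a Spark DataFrame
# (spark is the session that the notebook provides).
df = spark.read.format("delta").load("Tables/new_table")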

3. Generate Predictions
Once your data is ready, apply the saved model to make predictions. Then, save the results for further use, like showing them in Power BI.

 

Important: Match Data Types
Make sure the data types in your new dataset match the model’s input schema:

Use String for text

Use Integer or Float for numbers

Use Datetime for dates and times

 

If the types don’t match, the model may raise a schema error or produce incorrect predictions.
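A quick way to line up types before scoring is to convert each column explicitly. The sketch below uses pandas on made-up data where every value arrives as text. The column names and target types are assumptions.

```python
import pandas as pd

# Made-up raw data where every column was read in as text.
raw = pd.DataFrame({
    "age": ["34", "51"],
    "bmi": ["22.5", "30.1"],
    "visit_date": ["2024-01-05", "2024-02-10"],
})

# Convert each column to the type the model is assumed to expect.
fixed = raw.assign(
    age=raw["age"].astype("int64"),
    bmi=raw["bmi"].astype("float64"),
    visit_date=pd.to_datetime(raw["visit_date"]),
)
print(fixed.dtypes)
```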

 

Conclusion:

 

MLflow in Microsoft Fabric helps you manage your machine learning process from start to finish. It makes it easy to:

Track your training process

Save and reuse models

Apply models to new data

Store and share predictions

 

This helps you build better models and make better decisions using your data.
Let’s connect on LinkedIn: https://www.linkedin.com/in/rufyda-abdelhadirahma/

 

Sahir_Maharaj
Super User

I was chatting with a colleague earlier this week and they mentioned that Power BI intimidates them. Dashboards, DAX, data models… I get it. It can feel like you need a translator just to get started. But that’s exactly why the PL-300 livestream series is so useful.

Read more...

Sahir_Maharaj
Super User

In this edition, we’re exploring data relationships and how to make sense of them using Microsoft Fabric and the SemPy library. By the time you’re done with this, you’ll have a clear approach to mapping out your data, visualizing those connections, and making sure everything checks out. And because no dataset is perfect, we’ll also dive into validation - making sure your data is as solid as you need it to be.

Read more...

Sahir_Maharaj
Super User

In this edition, you'll learn how to transform your approach to big data using the powerful integration of Azure OpenAI, SynapseML, and Microsoft Fabric. You'll explore how Azure OpenAI acts as the brain for natural language understanding and generation, while SynapseML serves as the computational muscle for scalable machine learning. By the end, you'll be equipped with the knowledge and confidence to create AI-driven workflows that deliver actionable insights and drive impactful decisions.

Read more...

Sahir_Maharaj
Super User

In this edition, you’ll gain an understanding of SemPy and its transformative role within Microsoft Fabric. Whether you're taking your first steps into semantic modeling or you’re a seasoned pro looking to streamline your workflow, this read is designed to meet you where you are and elevate your capabilities.

Read more...
