If you’ve ever trained a machine learning model and felt disappointed that it didn’t quite capture the full picture, you’re not alone. Models can be powerful, but they’re rarely perfect on their own. One model might predict well on some parts of your dataset but stumble badly on others. Another might be the exact opposite. Relying on just one of them is like asking a single person to describe a movie... the story will always have gaps.
This is where ensemble methods come in. Instead of putting all your trust in a single model, you combine multiple models to build something stronger, more reliable, and more accurate. It’s the wisdom of the crowd, but for algorithms. Among the many ensemble techniques out there, stacking and voting stand out because of how intuitive and effective they are. They take ordinary models and turn them into something extraordinary. If you’ve ever wanted to get that extra edge in your predictions, stacking and voting might be exactly what you need.
What you will learn: In this edition, we’re exploring ensemble methods, focusing on stacking and voting, and how they can help you get more performance out of your models. By the time you’re done, you’ll have a clear understanding of what voting classifiers are, when to use them, and why they often outperform a single model on its own. And because theory only takes you so far, we’ll also explore how these concepts fit into Microsoft Fabric, so you can go from experimenting in a notebook to applying them in real-world projects.
Source: Sahir Maharaj (https://sahirmaharaj.com)
Voting is the simpler of the two techniques, but don’t let that fool you: it can be surprisingly powerful. Think of voting as a democratic process where each model casts a vote for its prediction, and the final decision is based on the majority (hard voting) or on averaged probabilities (soft voting). For example, say you’re predicting whether a customer will churn. A logistic regression model might give you one answer, a random forest another, and a gradient boosting model yet another. Alone, each one is flawed. But when you put them together, the “vote” often lands closer to the truth than any single model could.
The appeal of voting is its transparency. Unlike some complex algorithms where the reasoning is hard to follow, voting gives you a clear explanation: the final outcome was based on how the models collectively decided. This simplicity makes it an excellent starting point for teams who are new to ensemble learning but still want noticeable improvements in performance. But the real strength of voting lies in diversity. If all your models are similar (say, three logistic regression models trained on nearly identical features) you won’t see much of a performance boost. It’s like having three people with the same opinion. But if you combine fundamentally different models, such as a decision tree, a support vector machine, and a neural network, their disagreements help balance out weaknesses. Diversity of thought, even in algorithms, often leads to better decisions.
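One quick way to check whether your candidate models actually disagree (and are therefore worth combining) is to measure pairwise disagreement on a held-out set. This is a minimal sketch; it reuses the same wine dataset and the same kinds of models as the voting example that follows, so the exact model settings here are illustrative:

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC

# Same wine dataset and split as the voting example below
X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

models = {
    'lr': LogisticRegression(max_iter=5000),
    'dt': DecisionTreeClassifier(max_depth=6, random_state=42),
    'svc': SVC(kernel='rbf'),
}
preds = {name: m.fit(X_train, y_train).predict(X_test) for name, m in models.items()}

# Pairwise disagreement: the fraction of test rows where two models differ.
# Disagreement between individually accurate models is what makes an ensemble pay off.
names = list(preds)
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        rate = np.mean(preds[names[i]] != preds[names[j]])
        print(f"{names[i]} vs {names[j]}: {rate:.1%} disagreement")
```

If two models disagree on almost nothing, adding the second one to a voting ensemble will barely move the needle.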
Another subtle point about voting is how you weigh influence. Hard voting gives every model the same say, which might be risky if one model is consistently unreliable. Soft voting, however, gives you the ability to emphasize probabilities, essentially letting stronger models whisper louder. This often makes soft voting the better choice, particularly in cases where the dataset is noisy or unbalanced. And finally, voting isn’t just about accuracy. In many projects, interpretability matters. Stakeholders may want to know why a prediction was made, and showing them that multiple models agreed can build trust. In fields like finance, healthcare, or customer analytics, that trust can make all the difference when adopting machine learning at scale.
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.ensemble import VotingClassifier
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# Load the wine dataset and hold out 30% for testing
X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Three deliberately different base models
log_clf = LogisticRegression(max_iter=5000)
tree_clf = DecisionTreeClassifier(max_depth=6)
svm_clf = SVC(probability=True, kernel='rbf')  # probability=True is required for soft voting

# Soft voting averages the predicted class probabilities
voting_clf = VotingClassifier(
    estimators=[('lr', log_clf), ('dt', tree_clf), ('svc', svm_clf)],
    voting='soft'
)
voting_clf.fit(X_train, y_train)
y_pred = voting_clf.predict(X_test)

print("Voting Accuracy:", accuracy_score(y_test, y_pred))
print(classification_report(y_test, y_pred))

# Visualize where the ensemble still gets confused
cm = confusion_matrix(y_test, y_pred)
plt.figure(figsize=(6,5))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues', cbar=False)
plt.title("Voting Classifier - Confusion Matrix (Wine Dataset)")
plt.xlabel("Predicted")
plt.ylabel("Actual")
plt.show()
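It can also be worth checking that the ensemble actually beats its members. A quick sanity check, sketched here on the same wine split, scores each base model alongside the voting classifier; this version also demonstrates the `weights` parameter for letting stronger models "whisper louder" (the weights below are illustrative, not tuned):

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.ensemble import VotingClassifier
from sklearn.metrics import accuracy_score

X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

estimators = [
    ('lr', LogisticRegression(max_iter=5000)),
    ('dt', DecisionTreeClassifier(max_depth=6, random_state=42)),
    ('svc', SVC(probability=True, kernel='rbf')),
]

# weights=[2, 1, 1] gives the logistic regression twice the say in the
# averaged probabilities -- an untuned, illustrative choice
voting = VotingClassifier(estimators=estimators, voting='soft', weights=[2, 1, 1])

scores = {}
for name, model in estimators + [('voting', voting)]:
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))
    print(f"{name}: {scores[name]:.3f}")
```

If one base model consistently drags the ensemble down, that is usually a sign to lower its weight or drop it.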
Stacking goes beyond simple voting. Instead of just letting models vote, stacking trains a “meta-model” to learn how to best combine the outputs of your base models. Imagine you have three models: a decision tree, a logistic regression, and an SVM. Each one makes predictions. Instead of deciding the outcome by majority, stacking feeds those predictions into another model (like a linear regression or gradient boosting machine) that learns the right way to combine them. The beauty of stacking is that it can uncover complex relationships between models. Maybe the decision tree is great at capturing non-linear patterns, but weak on outliers. The logistic regression might handle those outliers better. The meta-model learns these strengths and weaknesses and adjusts accordingly. This layered approach can deliver results that are consistently more accurate than either individual models or voting alone.
To really appreciate stacking, think about it as model collaboration with coaching. The base models are players, each with different skills. The meta-model is the coach who watches them perform and then makes smarter final decisions by blending their strengths. Unlike voting, which assumes each model should have the same weight (or proportional weight), stacking learns weights dynamically from the data. As a data scientist, I’ve leaned on stacking when single models plateaued in performance. I’ve had situations where a random forest was good, but an SVM captured different nuances of the same data. When I let a meta-model combine them, the results improved beyond what I could have achieved through parameter tuning alone. It often feels like having a second set of eyes, one that spots details I’d miss.
That said, stacking does come with challenges. Overfitting is the biggest one. If you’re not careful, the meta-model may learn patterns from the training set that don’t generalize well to new data. To prevent this, stacking typically uses cross-validation to generate predictions for the meta-model, ensuring that no information leaks from training into testing. This adds complexity but also provides robustness. Another interesting aspect of stacking is its flexibility. You’re not limited to classification or regression. Stacking can work for both, and you can combine nearly any set of models. Want to mix neural networks with random forests and gradient boosting machines? That’s entirely possible. This makes stacking particularly useful in Kaggle competitions or high-stakes projects where every fraction of accuracy matters.
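To make the leakage-prevention point concrete, here is a minimal manual sketch of what scikit-learn’s stacking does internally with its `cv` parameter: out-of-fold predictions become the meta-model’s training features. The model choices (logistic regression and a decision tree on the wine data) are illustrative:

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split, cross_val_predict
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

base_models = [
    LogisticRegression(max_iter=5000),
    DecisionTreeClassifier(max_depth=6, random_state=42),
]

# Out-of-fold probabilities: every training row is scored by a model that
# never saw it during fitting, so nothing leaks to the meta-model
train_meta = np.hstack([
    cross_val_predict(m, X_train, y_train, cv=5, method='predict_proba')
    for m in base_models
])

# For the test set, the base models are refit on the full training data
test_meta = np.hstack([
    m.fit(X_train, y_train).predict_proba(X_test) for m in base_models
])

# The meta-model trains only on the leak-free out-of-fold features
meta_model = LogisticRegression(max_iter=5000).fit(train_meta, y_train)
print("Manual stacking accuracy:", accuracy_score(y_test, meta_model.predict(test_meta)))
```

This is exactly the bookkeeping `StackingClassifier(cv=5)` handles for you; doing it once by hand makes it obvious why skipping the cross-validation step invites overfitting.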
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.metrics import accuracy_score, classification_report

# Load the digits dataset and hold out 30% for testing
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Base models whose predictions become features for the meta-model
base_estimators = [
    ('lr', LogisticRegression(max_iter=5000)),
    ('dt', DecisionTreeClassifier(max_depth=10)),
    ('svc', SVC(probability=True, kernel='poly'))
]

# The meta-model learns how best to combine the base models' outputs
meta_model = RandomForestClassifier(n_estimators=200)
stacking_clf = StackingClassifier(
    estimators=base_estimators,
    final_estimator=meta_model,
    cv=5  # cross-validated predictions keep the meta-model leak-free
)

stacking_clf.fit(X_train, y_train)
y_pred = stacking_clf.predict(X_test)
print("Stacking Accuracy:", accuracy_score(y_test, y_pred))
print(classification_report(y_test, y_pred))
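If you want to see the "learned weights" idea directly, one option is to use a linear meta-model, since its coefficients are inspectable. This is a small illustrative sketch on the wine dataset (not the digits example above), with a logistic regression as the final estimator; the model choices are assumptions for the sake of readability:

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import StackingClassifier

X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

stack = StackingClassifier(
    estimators=[
        ('lr', LogisticRegression(max_iter=5000)),
        ('dt', DecisionTreeClassifier(max_depth=6, random_state=42)),
    ],
    # A linear meta-model so the learned combination is readable
    final_estimator=LogisticRegression(max_iter=5000),
    cv=5,
)
stack.fit(X_train, y_train)

# Each column of the meta-features is one base model's probability for one
# class (2 models x 3 classes = 6 columns); the coefficient magnitudes show
# how much the meta-model leans on each
print(stack.final_estimator_.coef_.shape)
print(stack.final_estimator_.coef_.round(2))
```

With a random forest as the meta-model (as in the digits example) you would look at `feature_importances_` instead, though those are harder to read as weights.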
Now, if you're new to data science, you might be wondering: how does this translate into Microsoft Fabric? Fabric provides a single place to work with data, build models, and experiment with machine learning. What’s exciting is that the exact code you’ve just seen can run directly in a Fabric notebook without modification. You don’t have to think about infrastructure, and you don’t have to piece together multiple tools. Another advantage is integration. In Fabric, once you’ve trained a voting or stacking model, you can visualize results, share them across your organization, or even deploy them into pipelines for real-time predictions.
This makes ensemble learning not just a research exercise but a practical, production-ready approach. When I’ve used ensemble methods in Fabric, one thing I noticed was the ease of collaboration. Executives or team members could open the same notebook, run the cells, and see the results themselves. That transparency turned abstract discussions into concrete evidence. It’s one thing to claim that stacking outperforms voting... it’s another to let others see the numbers firsthand. Whether you’re testing small datasets or deploying enterprise-level solutions, voting and stacking fit right in.
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor
from sklearn.svm import SVR
from sklearn.ensemble import VotingRegressor, StackingRegressor, RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score
import matplotlib.pyplot as plt

# Load the California housing data and hold out 25% for testing
X, y = fetch_california_housing(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

# Three different regressors as base models
lin_reg = LinearRegression()
tree_reg = DecisionTreeRegressor(max_depth=8)
svr_reg = SVR(kernel='rbf')

# Voting simply averages the base models' predictions
voting_reg = VotingRegressor(
    estimators=[('lr', lin_reg), ('dt', tree_reg), ('svr', svr_reg)]
)

# Stacking learns how to combine them via a random forest meta-model
stacking_reg = StackingRegressor(
    estimators=[('lr', lin_reg), ('dt', tree_reg), ('svr', svr_reg)],
    final_estimator=RandomForestRegressor(n_estimators=200),
    cv=5
)

voting_reg.fit(X_train, y_train)
stacking_reg.fit(X_train, y_train)
y_pred_vote = voting_reg.predict(X_test)
y_pred_stack = stacking_reg.predict(X_test)

# Compare the two ensembles on the same test split
mse_vote = mean_squared_error(y_test, y_pred_vote)
r2_vote = r2_score(y_test, y_pred_vote)
mse_stack = mean_squared_error(y_test, y_pred_stack)
r2_stack = r2_score(y_test, y_pred_stack)
print("Voting Regressor MSE:", mse_vote)
print("Voting Regressor R2:", r2_vote)
print("Stacking Regressor MSE:", mse_stack)
print("Stacking Regressor R2:", r2_stack)

# Plot predictions against actual values; the dashed line marks a perfect fit
plt.figure(figsize=(6,4))
plt.scatter(y_test, y_pred_vote, label=f"Voting (R2={r2_vote:.2f})", alpha=0.6)
plt.scatter(y_test, y_pred_stack, label=f"Stacking (R2={r2_stack:.2f})", alpha=0.6)
plt.plot([y_test.min(), y_test.max()], [y_test.min(), y_test.max()], 'k--')
plt.xlabel("Actual")
plt.ylabel("Predicted")
plt.title("Voting vs Stacking Regressor (California Housing Dataset)")
plt.legend()
plt.show()
What I like most about this is how accessible it really is. You don’t need to reinvent the wheel or explore exotic algorithms to see the benefits. With just a handful of lines in scikit-learn, you can go from “good enough” predictions to results that feel polished and professional. And if you’re working in Microsoft Fabric, it gets even more exciting.
So here’s the big takeaway: don’t stop at building one solid model and calling it a day. Keep going and see what happens when you combine models, experiment with ensemble techniques, and watch how the results transform. The next time you open a notebook, let your models work together - you might just surprise yourself with how far they can take you.
Thanks for taking the time to read my post! I’d love to hear what you think and connect with you 🙂