If you’ve ever trained a machine learning model, you know the feeling of staring at the results and thinking, “This could perform better, but where do I EVEN start?” You update a learning rate, nudge the number of trees in a random forest, or adjust dropout in a neural network... only to realize that the trial-and-error approach feels like rolling dice in the dark. The reality is, hyperparameter tuning is one of the most overlooked yet powerful levers you can pull to transform a model from average to exceptional. And yet, it’s also the most intimidating because the search space can be huge, the results can be noisy, and time feels like it’s always running out. That’s where Optuna comes in.
What you will learn: In this edition, you will explore hyperparameter tuning and why Optuna is such a powerful tool for the job. By the time you’re done, you’ll understand what tuning really means, see how Optuna makes the process smarter than traditional search methods, and walk through the process step by step in Python. Along the way, you’ll also build the confidence to take these ideas and apply them to your own models.
Read Time: 8 minutes
Source: Sahir Maharaj (https://sahirmaharaj.com)
Hyperparameters are those “knobs and dials” you don’t learn during model training - they’re set before the training even starts. Think about the learning rate in gradient boosting, the number of neighbors in k-NN, or the regularization strength in logistic regression. Each of these values can dramatically change your model’s performance. The challenge is that hyperparameters interact with one another in unpredictable ways. A slightly higher learning rate might be great when the number of trees is small but disastrous when the tree depth is high.
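To make that distinction concrete, here’s a minimal sketch (using scikit-learn’s LogisticRegression, which isn’t part of this tutorial’s code) showing a hyperparameter you set up front versus the parameters the model learns:

from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_breast_cancer

# C (inverse regularization strength) is a hyperparameter: you choose it
# before fitting. The coefficients, by contrast, are learned during fit().
X, y = load_breast_cancer(return_X_y=True)
model = LogisticRegression(C=0.5, max_iter=5000)
model.fit(X, y)
print(model.coef_)  # learned parameters, shaped by your choice of C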
This is where hyperparameter tuning becomes less about guessing and more about strategy. Traditional methods like grid search or random search are easy to understand but wasteful... they either exhaustively test combinations (grid) or test randomly without direction (random). Optuna takes a smarter route by treating tuning as an optimization problem, learning from each trial, and narrowing down to promising regions of the search space.
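Before the full walkthrough, here’s a minimal sketch of that idea on a toy problem: Optuna samples a value, sees the resulting score, and steers later trials toward the promising region instead of sweeping blindly.

import optuna

# Toy objective: find the x that minimizes (x - 2)^2
def objective(trial):
    x = trial.suggest_float("x", -10, 10)
    return (x - 2) ** 2

study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=30)
print(study.best_params)  # lands near x = 2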
Optuna is designed with two principles in mind: efficiency and ease of use. Unlike other libraries, it doesn’t make you write endless configuration files or wrestle with rigid syntax. Instead, you define an objective function, tell Optuna what you want to maximize (or minimize), and it takes care of the rest. But the real magic is that Optuna uses algorithms like Tree-structured Parzen Estimators (TPE) to guide the search intelligently. This means it learns from past trials, making future suggestions more likely to improve performance. It also supports pruning, which stops unpromising trials early (saving you hours of wasted compute!).
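As a rough sketch of how you could opt into those features explicitly (TPE is already Optuna’s default sampler, so this is mostly illustrative):

import optuna
from optuna.samplers import TPESampler
from optuna.pruners import MedianPruner

# TPE guides sampling using past trials; MedianPruner stops trials whose
# intermediate scores fall below the median of earlier trials at that step
study = optuna.create_study(
    direction="maximize",
    sampler=TPESampler(seed=42),
    pruner=MedianPruner(n_warmup_steps=5),
)

# Inside an iterative objective, you report intermediate scores so that
# unpromising trials can be cut short (partial_score here is hypothetical):
#     trial.report(partial_score, step=epoch)
#     if trial.should_prune():
#         raise optuna.TrialPruned()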
At the start of my career, when I first switched from grid search to Optuna, the difference was immediately clear. Grid search felt like brute force... simple but clumsy. Optuna felt like having an assistant who actually paid attention to what worked and what didn’t, and then refined the strategy. What I’ve noticed over time is that Optuna doesn’t just save compute; it also saves energy, focus, and patience. You stop wasting time on experiments you know aren’t going anywhere, and you start spending your energy on interpreting insights. And if you’re like me, that’s the fun part: exploring why certain hyperparameters matter and how that knowledge carries into your next project.
import optuna
import pandas as pd
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

# Load the breast cancer dataset once and reuse it below
breast_cancer = datasets.load_breast_cancer()
X, y = breast_cancer.data, breast_cancer.target

# Hold out 20% of the samples for evaluating each trial's model
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Quick exploratory look at the features and the class balance
df = pd.DataFrame(breast_cancer.data, columns=breast_cancer.feature_names)
df['target'] = breast_cancer.target

print("\nSummary statistics:")
display(df.describe())  # display() is available in notebook environments

print("\nTarget distribution:")
print(df['target'].value_counts())
Now that you understand the “why,” let’s move into the “how.” Let's say you’re training a RandomForestClassifier to classify tumors as malignant or benign. With default parameters, you’d get decent accuracy - but “decent” isn’t what you want if this model impacts real-world decisions. The first step is to build an objective function. This is the core of Optuna: you define the search space for hyperparameters, train the model with those values, and return a score (in this case, accuracy). Optuna then repeats this process across multiple trials, intelligently adjusting based on results.
Notice how the search space is wide but structured: from n_estimators to bootstrap. This gives Optuna freedom to experiment without becoming chaotic. At this stage, you’re essentially giving Optuna the “rules of the game.”
def objective(trial):
    # Search space: Optuna suggests a value for each hyperparameter per trial
    n_estimators = trial.suggest_int("n_estimators", 50, 500, step=25)
    max_depth = trial.suggest_int("max_depth", 2, 32)
    min_samples_split = trial.suggest_int("min_samples_split", 2, 15)
    min_samples_leaf = trial.suggest_int("min_samples_leaf", 1, 8)
    bootstrap = trial.suggest_categorical("bootstrap", [True, False])

    # Train a forest with this trial's suggested values
    clf = RandomForestClassifier(
        n_estimators=n_estimators,
        max_depth=max_depth,
        min_samples_split=min_samples_split,
        min_samples_leaf=min_samples_leaf,
        bootstrap=bootstrap,
        random_state=42,
        n_jobs=-1
    )
    clf.fit(X_train, y_train)

    # The returned score is what Optuna will maximize
    preds = clf.predict(X_test)
    accuracy = accuracy_score(y_test, preds)
    return accuracy
Having completed the objective function, the next stage is to let Optuna do what it does best: explore. This is where we create a study, in which Optuna stores every trial’s results, and then instruct it to optimize over a set number of trials.
# A study records every trial; direction="maximize" because we return accuracy
study = optuna.create_study(direction="maximize")

# Run 60 trials, each new trial informed by the results of earlier ones
study.optimize(objective, n_trials=60, show_progress_bar=True)

print("Number of finished trials:", len(study.trials))
print("Best trial:", study.best_trial)
print("Best hyperparameters:", study.best_params)
print("Best accuracy:", study.best_value)
As Optuna runs, you’ll see a log of each trial, its chosen parameters, and the resulting accuracy. The fascinating part, and something I personally like, is watching how the model performance starts to improve after just a few iterations. The early trials look random, but then the optimizer starts to zoom into promising regions of the search space. But, to really understand what’s happening, Optuna includes visualization tools!
from optuna.visualization import (
    plot_optimization_history,
    plot_param_importances,
    plot_parallel_coordinate
)

plot_optimization_history(study).show()  # accuracy across trials
plot_param_importances(study).show()     # which hyperparameters mattered most
plot_parallel_coordinate(study).show()   # how parameter combinations interact
I find that these charts give a new perspective as we can see which parameters had the biggest impact, how accuracy evolved across trials, and how different parameter combinations interacted. Based on my experience, these visuals are often just as valuable as the final numbers, because they give intuition for how your model behaves under the hood.
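One natural follow-up, not shown in the walkthrough above but worth sketching: once the study finishes, refit a final model with the best hyperparameters and confirm the score on the held-out test set.

# study.best_params maps directly onto RandomForestClassifier's arguments
best_clf = RandomForestClassifier(
    **study.best_params,
    random_state=42,
    n_jobs=-1
)
best_clf.fit(X_train, y_train)
final_preds = best_clf.predict(X_test)
print("Final test accuracy:", accuracy_score(y_test, final_preds))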
And there it is! Now that you’ve learned how to set it up inside Microsoft Fabric, the next step is simple: try it on your own models. Don’t just copy the code: adapt it, experiment with different hyperparameters, and push your models further. Think of every experiment as an investment, not just in metrics and accuracy scores, but in your growth as a data professional. Hyperparameter tuning is no longer a guessing game. With Optuna, it becomes a strategy. And with Fabric, it becomes part of your daily workflow. So open that notebook, install Optuna, and let your models finally shine the way they’re meant to!
Thanks for taking the time to read my post! I’d love to hear what you think and connect with you 🙂