Join us for an expert-led overview of the tools and concepts you'll need to pass exam PL-300. The first session starts on June 11th. See you there!
Get registeredJoin us at FabCon Vienna from September 15-18, 2025, for the ultimate Fabric, Power BI, SQL, and AI community-led learning event. Save €200 with code FABCOMM. Get registered
For this example, assume you have two key tables:
- YourTable — contains the transactional or fact data, including fields like Category, Segment, Value, and Index.
- CategoryTable — a dimension table listing all categories with a unique Index column to control sorting and filtering. (if you don’t have a Y axis table, please create one, we need it)
Relationship
- There is a one-to-many relationship from CategoryTable[Index] → YourTable[Index].
- This means each category in CategoryTable relates to multiple rows in YourTable.
- The Index column is critical for sorting and enables filtering subsets of categories dynamically.
This Python script creates a 100% stacked horizontal bar chart from the dataset based on the columns specified. It shows the percentage composition of different segments within each category, with total values displayed at the end.
import pandas as pd
import matplotlib.pyplot as plt
import pandas as pd
import matplotlib.pyplot as plt
- pandas is used for data handling.
- matplotlib.pyplot is for plotting the bar chart.
# Columns for the axes and legend
y_col = 'Category'
legend_col = 'Segment'
value_col = 'Value'
index_col = 'Index'
# Labels for the chart
x_axis_label = "Percentage of Total Bill"
y_axis_label = y_col
legend_title = legend_col
- These variables allow flexible column mapping.
- index_col is used for sorting categories in the chart.
df = dataset[[y_col, legend_col, value_col, index_col]].dropna()
if df.empty:
print("No data available to plot.")
fig, ax = plt.subplots(figsize=(10, 6))
ax.text(0.5, 0.5, "No data available to plot.", ha='center', va='center', fontsize=14, color='red')
ax.axis('off')
plt.show()
- Select necessary columns and drop rows with missing values.
- Show message if data is empty.
category_order = (
df[[y_col, index_col]]
.drop_duplicates()
.sort_values(by=index_col, ascending=True)[y_col]
.tolist()
)
- Get unique categories ordered by the index_col to control plot order.
grouped = df.groupby([y_col, legend_col])[value_col].sum().unstack(fill_value=0)
grouped = grouped.reindex(category_order)
- Aggregate values by category and segment.
- Fill missing segment data with zeros.
- Reorder categories for plotting.
if grouped.empty or grouped.sum().sum() == 0:
print("No numeric data available to plot.")
fig, ax = plt.subplots(figsize=(10, 6))
ax.text(0.5, 0.5, "No numeric data available to plot.", ha='center', va='center', fontsize=14, color='red')
ax.axis('off')
plt.show()
- Check if numeric data is present before plotting.
percent = grouped.divide(grouped.sum(axis=1), axis=0) * 100
totals = grouped.sum(axis=1)
- Convert absolute values into percentages per category.
- Calculate total sums for each category.
fig, ax = plt.subplots(figsize=(10, 6))
percent.plot(kind='barh', stacked=True, colormap='Pastel1', ax=ax)
- Plot a stacked horizontal bar chart using percentage data.
- Use a pastel color map for aesthetics.
for bar in ax.patches:
bar.set_edgecolor('black')
bar.set_linewidth(1.5)
- Adds a black border around each segment for clarity.
ax.invert_yaxis()
ax.set_yticks(range(len(percent.index)))
ax.set_yticklabels(percent.index)
ax.set_xlim(0, 120) # Extra space for total labels
plt.xlabel(x_axis_label)
plt.ylabel(y_axis_label)
plt.title(f"100% Stacked Bar Chart: {value_col} by {y_axis_label} and {legend_title}")
plt.legend(title=legend_title, loc='upper center', bbox_to_anchor=(0.5, 1.3), ncol=2)
- Inverts the y-axis so the first category is at the top.
- Sets category labels.
- Adds 20% extra space on x-axis for total values.
- Sets axis labels, title, and legend.
for i, (group_val, row) in enumerate(percent.iterrows()):
cumulative = 0
for legend_val in percent.columns:
value = row[legend_val]
if value > 5:
ax.text(cumulative + value / 2, i, f"{value:.1f}%", va='center', ha='center', color='black', fontsize=9)
cumulative += value
- Adds percentage text inside bar segments only if they are bigger than 5% for readability.
for i, total_value in enumerate(totals):
ax.text(102, i, f"Total: ${total_value:.2f}", va='center', ha='left', fontweight='bold', fontsize=9, color='black')
- Places total numeric values just past the end of each bar.
plt.tight_layout()
plt.show()
- Adjust layout and show the final plot.
Drag and drop the Category , Segment , Value , Index columns into Python visual and copy and past the full code
Full code:
import pandas as pd
import matplotlib.pyplot as plt
# ==== VARIABLES to update ====
y_col = 'Category' # column for y-axis categories
legend_col = 'Segment' # column for legend categories
value_col = 'Value' # numeric column
index_col = 'Index' # index column for sorting
x_axis_label = "x_axis_label here"
y_axis_label = y_col
legend_title = legend_col
# ==== DATA PREP ====
df = dataset[[y_col, legend_col, value_col, index_col]].dropna()
if df.empty:
print("No data available to plot.")
fig, ax = plt.subplots(figsize=(10, 6))
ax.text(0.5, 0.5, "No data available to plot.", ha='center', va='center', fontsize=14, color='red')
ax.axis('off')
plt.show()
else:
category_order = (
df[[y_col, index_col]]
.drop_duplicates()
.sort_values(by=index_col, ascending=True)[y_col]
.tolist()
)
grouped = df.groupby([y_col, legend_col])[value_col].sum().unstack(fill_value=0)
grouped = grouped.reindex(category_order)
if grouped.empty or grouped.sum().sum() == 0:
print("No numeric data available to plot.")
fig, ax = plt.subplots(figsize=(10, 6))
ax.text(0.5, 0.5, "No numeric data available to plot.", ha='center', va='center', fontsize=14, color='red')
ax.axis('off')
plt.show()
else:
percent = grouped.divide(grouped.sum(axis=1), axis=0) * 100
totals = grouped.sum(axis=1)
fig, ax = plt.subplots(figsize=(10, 6))
percent.plot(kind='barh', stacked=True, colormap='Pastel1', ax=ax)
for bar in ax.patches:
bar.set_edgecolor('black')
bar.set_linewidth(1.5)
ax.invert_yaxis()
# Set y-axis labels to just category names (no totals)
ax.set_yticks(range(len(percent.index)))
ax.set_yticklabels(percent.index)
# Add white space after bars
ax.set_xlim(0, 120) # 20% extra space beyond 100%
plt.xlabel(x_axis_label)
plt.ylabel(y_axis_label)
plt.title(f"100% Stacked Bar Chart: {value_col} by {y_axis_label} and {legend_title}")
plt.legend(title=legend_title, loc='upper center', bbox_to_anchor=(0.5, 1.3), ncol=2)
for i, (group_val, row) in enumerate(percent.iterrows()):
cumulative = 0
for legend_val in percent.columns:
value = row[legend_val]
if value > 5:
ax.text(
cumulative + value / 2, i, f"{value:.1f}%",
va='center', ha='center', color='black', fontsize=9
)
cumulative += value
# Total at the end of each bar only
for i, total_value in enumerate(totals):
ax.text(
102, # slightly past 100%
i,
f"Total: ${total_value:.2f}",
va='center',
ha='left',
fontweight='bold',
fontsize=9,
color='black'
)
plt.tight_layout()
plt.show()
Once everything in place the output should look like this
To filter the visual to show only a subset of categories dynamically (for example, the Top 5), you can use a numeric parameter slider in Power BI. This allows users to select the range of categories to display.
First create a numeric field parameter
Modeling > New Parameter > Numeric Range
Here is a sample DAX measure to filter categories based on a parameter slider:
ShowCategory =
VAR SelectedFrom = SELECTEDVALUE('Parameter '[Parameter], 1)
VAR SelectedTo = SelectedFrom + 4
VAR CurrentIndex = MAX(CategoryTable[Index])
RETURN
IF(CurrentIndex >= SelectedFrom && CurrentIndex <= SelectedTo, 1, 0)
Explanation:
- 'Parameter '[Parameter] is a numeric slicer that users adjust.
- SelectedFrom is the starting index.
- SelectedTo is 4 indices after SelectedFrom, showing 5 categories total.
- The measure returns 1 for categories whose index falls within the range.
- Use this measure as a filter in your visual to only show categories where ShowCategory = 1.
This technique ensures that your 100% stacked bar chart updates dynamically based on user selection, allowing for a flexible and interactive data exploration experience.
This blog post covered how to create a polished 100% stacked horizontal bar chart in Python with detailed explanations for each part of the code. We also explored how to integrate a dynamic Top N filter in Power BI using a slider parameter. By separating your category dimension from the fact data and leveraging index columns, you gain better control over sorting and filtering in your reports. Feel free to adapt the code and techniques here to your own datasets and dashboards.
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.