Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Learn from the best! Meet the four finalists headed to the FINALS of the Power BI Dataviz World Championships! Register now

Reply
b-kopik
Helper I
Helper I

Best Data Model for Headcount (Monthly) and Course Data in Power BI

I’m working with two datasets in Power BI that both contain multiple rows per employee, but represent different types of information.


Dataset 1: Combined Headcount Data

  • Every month, I load a headcount report.

  • It includes:

    • Employee ID

    • Gender

    • Age

    • Ethnicity

    • Position / Job level

    • Salary

    • Agency (this is large marketing company with multiple agencies)

  • Because I append monthly files together:

    • Each employee appears once per month

    • Therefore, the table contains multiple rows per Employee ID

Example:
Employee ID 12345 appears in January, February, December, etc.


Dataset 2: Campus (Training) Data

  • Contains training/course records

  • Each row represents one course taken

  • Employees can appear multiple times

Example:
If Employee 12345 took:

  • Excel101

  • PowerBI101

  • PowerPoint101

They will appear three times in this dataset.


What I’m Trying to Achieve

My dashboard has three sections:


1. Headcount Section (Independent from Campus)

  • Has a Month slicer

  • Has an Agency slicer

  • If I select January, it should show:

    • Headcount for January only

    • Headcount by Agency for January

  • This section works correctly and should remain independent from Campus data.


2. Campus Section

  • Shows courses taken

  • Needs:

    • Month slicer

    • Agency slicer

  • If I select:

    • January + Agency A
      → It should show how many courses were taken by employees in Agency A in January.


3. Demographics / Employee Info (Based on Campus Students)

This section should show demographics of employees who took training, such as:

  • Age group

  • Ethnicity

  • Job level

  • Agency

But filtered by:

  • Selected Month

  • Selected Agency


The Issue

When I try to create a relationship between Campus and Headcount, Power BI only allows many-to-many because both tables have multiple rows per Employee ID.

If I use that many-to-many relationship:

  • When I select January in a slicer

  • My demographics section shows the total January headcount

  • Instead of only the employees who actually took training in January


What I’m Trying to Achieve

When I select:

  • Month = January

  • Agency = X

I want to see:

  • Courses taken in January

  • And demographic info (age group, ethnicity, job level)

  • But only for employees who took training in that selected month


My Question

What is the correct way to model this?

  • Should I avoid directly relating the two fact tables?

  • How do I structure the relationships so that Month filtering correctly limits demographics to only Campus participants?

Right now, the many-to-many relationship is giving me incorrect filtering behavior.

What is the recommended modeling approach for this scenario?

2 ACCEPTED SOLUTIONS

You could try

AvgAgeTraining =
VAR EmployeesWithTraining =
    CALCULATETABLE ( VALUES ( 'Campus Data'[Employee ID] ) )
VAR Result =
    CALCULATE (
        AVERAGE ( 'Combined Headcount'[Age] ),
        TREATAS ( EmployeesWithTraining, Employee[Employee ID] )
    )
RETURN
    Result

View solution in original post

Hi @b-kopik , Thank you for reaching out to the Microsoft Community Forum.

 

Linking only by Employee ID is not enough in this case, because Agency is a time dependent attribute. An employee can belong to different agencies in different months, so the correct agency must come from the Headcount snapshot for the selected month. The usual approach is to let the Month and Agency slicers filter the Headcount table first, which gives you the employees who belonged to that agency in that month and then pass that employee set to the Campus table when calculating training metrics.

For example:

EmployeesWithTraining :=
CALCULATE (
DISTINCTCOUNT ( 'Campus Data'[Employee ID] ),
TREATAS (
VALUES ( 'Combined Headcount'[Employee ID] ),
'Campus Data'[Employee ID]
)
)

View solution in original post

11 REPLIES 11
v-hashadapu
Community Support
Community Support

Hi @b-kopik , Hope you're doing okay! May we know if it worked for you, or are you still experiencing difficulties? Let us know — your feedback can really help others in the same situation.

v-hashadapu
Community Support
Community Support

Hi @b-kopik , hope you are doing great. May we know if your issue is solved or if you are still experiencing difficulties. Please share the details as it will help the community, especially others with similar issues.

johnt75
Super User
Super User

Never create a relationship between fact tables, that goes completely against the star schema which is the best way to model data. Instead create dimension tables which have one-to-many relationships to your fact tables, this is much more efficient and it is much easier to manipulate the filters in the way that you want to.

As a minimum you will want a date dimension and an employee dimension, both linked to both your fact tables. You can create an employee dimension by retrieving a list of unique employee IDs. You can do this in Power Query or DAX, the only thing you need to be mindful of is whether or not it is possible for an employee ID to appear in the training data and not in the headcount data. If it is not possible then you can create an employee dimension with a simple DISTINCT(Headcount[EmployeeID] ). If there might be employees in training who are not in headcount then you could do something like

Employee =
DISTINCT (
    UNION (
        DISTINCT ( Headcount[Employee ID] ),
        DISTINCT ( Training[Employee ID] )
    )
)

Once you have the employee dimension, link it to both headcount and training.

You could also create dimensions for e.g. Agency, Gender, Ethnicity, Course etc, again by using DISTINCT. If you do create these extra dimensions, make sure that you use the dimension in all your visuals and filters.

You can now create a measure to show the number of employees who did training in the chosen month like

Num employees with training =
CALCULATE ( COUNTROWS ( Headcount ), Training )

Because this is counting from Headcount, you can break it down by any of the columns in that table, or by dimensions you have chosen to extract from that table. It works by using the expanded table of Training, so only employees who have rows in the Training table for the given month will be considered.

One thing to bear in mind is that this method needs to have only 1 month chosen in the slicer. If more than one month is chosen then employees will be counted multiple times. You can enable single select on the slicer to ensure that only 1 month can be chosen.

Thanks, I’ve now remodelled this into a proper star schema as suggested.

I created an Employee dimension (distinct Employee IDs) linked one-to-many to both fact tables

  •  

There is no longer any direct relationship between the two fact tables.

 

Section 1 (Headcount) works correctly.
Month + Agency slicers correctly show headcount for the selected month.

Section 2 (Training/Campus) also works correctly.
Month + Agency slicers correctly show number of courses taken.

However, Section 3 (Demographics of employees who took training) is still not behaving correctly.

When I select:

  • Month = January

  • Agency = X

The demographics visuals (Age group, Ethnicity, Job level, etc.) are still showing the total January headcount, not just the employees who actually took training in January.

So although the model now follows a star schema and filtering works independently for each fact table, I’m not getting the intersection behavior I need for the demographics section.

Am I correct in thinking I now need a specific measure (rather than relying on implicit filtering) to restrict Headcount to only employees that exist in Training for the selected month?

If so, what is the recommended pattern to achieve this without reintroducing many-to-many behavior?

Yes, you need to write an explicit measure. I included an example, and an explanation of how it works, in my original post but it was quite a long post. Have another look at it from the second code snippet down.

Thanks, I am having trouble with the Age metric. I tried both formulas and they didn't work:

 

Option 1:

 

AvgAgeTraining =
CALCULATE(
AVERAGE('Combined Headcount'[Age]),
'Campus Data'
)

 

Option 2:

 

 AvgAgeTraining =

CALCULATE(AVERAGE('Combined Headcount'[Age]), 'Campus Data') 

 

 

I would like for it to only calculate the average age of those employees who attended the training - but the age column is in the combined headcount dataset

You could try

AvgAgeTraining =
VAR EmployeesWithTraining =
    CALCULATETABLE ( VALUES ( 'Campus Data'[Employee ID] ) )
VAR Result =
    CALCULATE (
        AVERAGE ( 'Combined Headcount'[Age] ),
        TREATAS ( EmployeesWithTraining, Employee[Employee ID] )
    )
RETURN
    Result

Hi @b-kopik , Thank you for reaching out to the Microsoft Community Forum.

 

Your model is correct. The issue is simply that filters don’t flow from one fact table to another in a star schema. So, when you average Age from Headcount, it’s filtered by Month and Agency, but it isn’t automatically limited to employees who took training. You need to explicitly pass the filtered Employee IDs from Campus into Headcount. Please try something like:

AvgAgeTraining :=
CALCULATE (
AVERAGE ( 'Combined Headcount'[Age] ),
TREATAS (
VALUES ( 'Campus Data'[Employee ID] ),
'Combined Headcount'[Employee ID]
)
)

 

This forces Headcount to calculate only for employees who exist in Campus for the selected Month and Agency. If this still doesn’t work, can you please confirm that your relationships are single direction (dimension -> fact), whether Age is monthly or static and can you please share a small sample of both tables for one problematic month so the behaviour can be reproduced precisely. Also, share any other relevant details. Do not include sensitive information. Do not include anything that is unrelated to the issue or question.

 

Need help uploading data? https://community.fabric.microsoft.com/t5/Community-Blog/How-to-provide-sample-data-in-the-Power-BI-...

Thanks, it worked!

 

I have a new challenge with my model. My Campus (training) dataset doesn’t include Agency, but my Headcount dataset does, and employees can move between agencies over time (so the headcount dataset is the most up to data). I want the Campus section of my dashboard to filter by Month and Agency, showing how many employees took courses based on the agency they belonged to at the selected month. For example, if I select January + Agency A, it should count only employees who were in Agency A in January, even if they later moved to another agency. What’s the best way to model this in Power BI so I can link the datasets show the correct agency per selected date - even though agency is only in the headcount data? Is it not enough to link employee ID and it will tell me the agency as of that month?

Hi @b-kopik , Thank you for reaching out to the Microsoft Community Forum.

 

Linking only by Employee ID is not enough in this case, because Agency is a time dependent attribute. An employee can belong to different agencies in different months, so the correct agency must come from the Headcount snapshot for the selected month. The usual approach is to let the Month and Agency slicers filter the Headcount table first, which gives you the employees who belonged to that agency in that month and then pass that employee set to the Campus table when calculating training metrics.

For example:

EmployeesWithTraining :=
CALCULATE (
DISTINCTCOUNT ( 'Campus Data'[Employee ID] ),
TREATAS (
VALUES ( 'Combined Headcount'[Employee ID] ),
'Campus Data'[Employee ID]
)
)

Ritaf1983
Super User
Super User

Hi @b-kopik 

order to get this dashboard working correctly, what needs to be done is to transition from a direct many-to-many relationship between your two data tables to a star schema structure with shared dimensions. The current issue exists because Power BI cannot effectively filter the Headcount table using the Campus table when an Employee ID appears in multiple months across both; this is why selecting January shows you the entire headcount instead of just the trainees. To fix this, the approach involves creating a unique Employee table and a dedicated Calendar table that both connect to your existing data. For the Employee table, you can use a DAX calculated table like Employees = DISTINCT(UNION(SELECTCOLUMNS('Headcount', "ID", 'Headcount'[Employee ID]), SELECTCOLUMNS('Campus', "ID", 'Campus'[Employee ID]))) to ensure every ID is represented only once. Similarly, a Calendar table should be generated using Calendar = CALENDARAUTO() so that your Month slicer comes from a single, neutral source.

Once these tables are created, you should connect the Employee ID from the new Employee table to both the Headcount and Campus tables with a one-to-many relationship, and do the same for the Date or Month columns from the Calendar table. When the slicers for Month and Agency are pulled from these new shared dimension tables, they will filter both fact tables simultaneously and accurately. To ensure the Demographics section specifically isolates only those who took training in the selected month, what needs to be done is to create a simple DAX measure such as IsParticipant = IF(ISBLANK(COUNTROWS('Campus')), 0, 1) and add it to the "Filters on this visual" pane for your demographic charts, setting the value to 1. This logic forces the demographic visuals—which pull their descriptive data like Age and Ethnicity from the Headcount table—to only display rows where a corresponding training record exists in the Campus table for that specific month and agency. This structure keeps your Headcount section independent while allowing the Campus and Demographics sections to sync perfectly without double-counting or over-reporting.

If this post helps, then please consider Accepting it as the solution to help the other members find it more quickly

Regards,
Rita Fainshtein | Microsoft MVP
https://www.linkedin.com/in/rita-fainshtein/
Blog : https://www.madeiradata.com/profile/ritaf/profile

Helpful resources

Announcements
Join our Fabric User Panel

Join our Fabric User Panel

Share feedback directly with Fabric product managers, participate in targeted research studies and influence the Fabric roadmap.

February Power BI Update Carousel

Power BI Monthly Update - February 2026

Check out the February 2026 Power BI update to learn about new features.