Need Urgent Help W/ Power Query for HR Analysis

towardhuzzahmo · ‎01-18-2024

Hi,

I'm a college student currently working on a non-school related project using Excel, Power BI, and Power Query. I've got a dataset I'm trying to make sense of and could really use some advice on how to best analyze it. I won't share many specific details, but here's a brief overview of my data structure:

The data is consolidated in a master sheet from all fiscal year sheets.
Each employee has a unique Employee ID, but an ID can appear multiple times across different fiscal years.
The "Last Start" date refers to the most recent start date in a specific fiscal year.
The "Separation Date" indicates when an employee permanently left the company in a specific fiscal year. This field might be blank if the separation occurred in a later fiscal year or if the employee is currently active.
The data I am working with has about 14000 rows!
note: I have already created a query that I would like to reference later on and will be my source.

HTML Fake Sample Below:

Employee ID	Regular/Temporary	Full Time/Part Time	Location	Sex	Last Start	Manager (only mark yes if true)	Ethnicity	Fiscal Year	Seperation Date
000069	Regular	p	santa land	f	1/2/1986		Black	2015
000069	Regular	p	santa land	f	1/2/1986		Black	2016
000069	Regular	p	santa land	f	1/2/1986		Black	2017	9/21/2017
008677	Regular	f	bowling green	f	6/1/2015	yes	Asian	2015
008677	Regular	f	bowling green	f	6/1/2015	yes	Asian	2016	6/16/2017

Overall Goal: I want to explore employment trends, job roles, retention, and exits, focusing on diversity over time.

How I could do this:

Grouping repeating employee IDs by the fiscal years they appear in, creating a matrix where each fiscal year is a column. In this matrix, I'd mark a '1' if an employee is present in that fiscal year, and '0' if not. I plan to do this either in the Advanced Query Editor or Power Pivot. I got the idea from this post.
The result would be the addition of new columns for each fiscal year. The current column named fiscal year column would be deleted.
- Why: I essentially, want to condense all of the repeat Employee ID's which is why i was thinking of using the Group By function. I dont necassarily want to treat the repeating ID's as duplicates because they arent, they are unique records for a different fiscal year.

Mock of what I think I want:

*note (my real data spans from 2015-2023)

Employee ID

Regular/Temporary

Full Time/Part Time

Location

Sex

Last Start

Manager

(only mark yes if true)

Ethnicity

Seperation Date

2015

2016

2017

2018

2019

2020

2021

2022

2023

000069

Regular

p

santa land

f

1/2/1986

Black

9/21/2017

1

0

008677

Regular

f

bowling green

f

6/1/2015

yes

Asian

6/16/2017

1

0

I want, whatever the final product is, to be able to be used, to make pivot tables and graphs, maybe even an HR dashboard. Basically, I know I cant do that with the current way my data is structured.

My questions to the community are:

How do I do this? I've tried so many codes, and i end up with error messages every single time. If you are willing, I would love if you could provide me with some helpful lines of code for this 🙂
Less important question: How would I add calculated columns if I need to like tenure?
Less important question: Are there any alternative approaches or techniques in Power Query or Power Pivot that could be more effective or efficient for this kind of analysis?

Any insights, suggestions, or examples from similar projects would be incredibly helpful. Thank you in advance for your time and assistance!

Thank you,

A very stressed out amost college graduate trying to complete internship project

dufoq3 · ‎01-18-2024

Hi I've tried to create something with sample data, but I'm not sure if I understand what do you exactly need.

(I've added 2 more rows to your sample data).

v1

You can check it by creating blank query and replacing whole code with this one:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("rY/BDsIgDIZfhXBewiCxwFETX8DrsgMqkkXClrHF+PYWh4lznnQNpT8FvvSvKlpigKYFPVg3etOj6jCjCYMh3oQzHi6YnAnGtQKUuHbenK5YRck3WAiti59QZM6CFVkSi2YCHyQ9MRVIOWMmxLG9+SY44nprQ+4Be/5L3u424r6NjQlLw6sAYbqHt0l1igV4sHHIUrFs7MXbj33b2S8z/ov6mK5+AA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Employee ID" = _t, #"Regular/Temporary" = _t, #"Full Time/Part Time" = _t, Location = _t, Sex = _t, #"Last Start" = _t, #"Manager#(lf)#(lf)(only mark yes if true)" = _t, Ethnicity = _t, #"Fiscal Year" = _t, #"Seperation Date" = _t]),
    CleanTrimColumnNames = Table.TransformColumnNames(Source, each Text.Trim(Text.Clean(_))),
    ChangedTypeUS = Table.TransformColumnTypes(CleanTrimColumnNames,{{"Last Start", type date}, {"Fiscal Year", Int64.Type}, {"Seperation Date", type date}}, "en-US"),
    Ad_FiscalYearsAsColumns = List.Accumulate(
    List.Sort(List.Distinct(ChangedTypeUS[Fiscal Year])),
    ChangedTypeUS,
    (s,c)=> Table.AddColumn(s, Text.From(c), each if Date.Year([Last Start]) <= Date.Year(#date(Number.From(c), 1, 1)) then 1 else 0, Int64.Type)
),
    #"Reordered Columns" = Table.ReorderColumns(Ad_FiscalYearsAsColumns,{"Employee ID", "Last Start", "2015", "2016", "2017", "Regular/Temporary", "Full Time/Part Time", "Location", "Sex", "Manager(only mark yes if true)", "Ethnicity", "Fiscal Year", "Seperation Date"})
in
    #"Reordered Columns"

v2

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("rY/BDsIgDIZfhXBewiCxwFETX8DrsgMqkkXClrHF+PYWh4lznnQNpT8FvvSvKlpigKYFPVg3etOj6jCjCYMh3oQzHi6YnAnGtQKUuHbenK5YRck3WAiti59QZM6CFVkSi2YCHyQ9MRVIOWMmxLG9+SY44nprQ+4Be/5L3u424r6NjQlLw6sAYbqHt0l1igV4sHHIUrFs7MXbj33b2S8z/ov6mK5+AA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Employee ID" = _t, #"Regular/Temporary" = _t, #"Full Time/Part Time" = _t, Location = _t, Sex = _t, #"Last Start" = _t, #"Manager#(lf)#(lf)(only mark yes if true)" = _t, Ethnicity = _t, #"Fiscal Year" = _t, #"Seperation Date" = _t]),
    CleanTrimColumnNames = Table.TransformColumnNames(Source, each Text.Trim(Text.Clean(_))),
    ChangedTypeUS = Table.TransformColumnTypes(CleanTrimColumnNames,{{"Last Start", type date}, {"Fiscal Year", Int64.Type}, {"Seperation Date", type date}}, "en-US"),
    FiscalYears = List.Buffer(List.Transform(List.Sort(List.Distinct(ChangedTypeUS[Fiscal Year])), Text.From)),
    StepBack = ChangedTypeUS,
    Ad_FiscalYearsAsColumns = 
        List.Accumulate(
            FiscalYears,
            StepBack,
            (s,c)=> Table.AddColumn(s, c, each if Date.Year([Last Start]) <= Date.Year(#date(Number.From(c), 1, 1)) then 1 else 0, Int64.Type)
        ),
    #"Grouped Rows" = Table.Group(Ad_FiscalYearsAsColumns, {"Employee ID"}, {{"Details", each _}}),
    Result = 
        List.Accumulate(
            FiscalYears,
            #"Grouped Rows",
            (s,c)=> Table.AddColumn(s, c, each Record.Field(Table.Max(Table.SelectColumns([Details], c), c), c), Int64.Type)
        )
in
    Result

Note: Check this link to learn how to use my query.
Check this link if you don't know how to provide sample data.

towardhuzzahmo · ‎01-18-2024

hi @dufoq3 ,

Thank you so much for reading and responding to my post!!!! You are defintly on the right track!!!!! I should have includeded an example of the table I was envisioning as the end product. I essentially, want to condense all of the repeat Employee ID's which is why i was thinking of using the Group By function. I dont necassarily want to treat the repeating ID's as duplicates because they arent, they are unique records for a different fiscal year.

Mock of what I think I want:

*note (my real data spans from 2015-2023)

Employee ID

Regular/Temporary

Full Time/Part Time

Location

Sex

Last Start

Manager

(only mark yes if true)

Ethnicity

Seperation Date

2015

2016

2017

2018

2019

2020

2021

2022

2023

000069

Regular

p

santa land

f

1/2/1986

Black

9/21/2017

1

0

008677

Regular

f

bowling green

f

6/1/2015

yes

Asian

6/16/2017

1

0

I want, whatever the final product is, to be able to be used, to make pivot tables and graphs, maybe even an HR dashboard. Basically, I know I cant do that with the current way my data is structured.

dufoq3 · ‎01-19-2024

If you want me to edit the code - len me know the conditions i.e. Fiscal Year starts 1st of October. Seperation date must be greather than.... etc.

Note: Check this link to learn how to use my query.
Check this link if you don't know how to provide sample data.

towardhuzzahmo · ‎01-19-2024

Hi,

I would love if you could edit the code. That would be very helpful 🙂 The fiscal year runs from July 1 to June 30. I'm not quite sure what the seperation date should be greater than. But I think, it would be greater than July 1, 2015. If there are any other conditions that would be helpful for me to share, let me know. Thank you, thank you!

dufoq3 · ‎01-19-2024

Try this one. Edit 2nd step YourSource = Source and refer your data.

Logic: if employee worked at least 1 day in fiscal year then then 1 else 0.

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("rY/BDsIgDIZfhXBewjZdB0dNfAGvCwdUXBYJW2CL8e0tQozLPOkaSn+gfPnbNDTHAEEzetTtZJRDNWB6ZUdFjLIXPFwxC1ayQnBAiWtv1PmGtcyLCguhMvsJReYsWJFVYxGsxIagI5NDXc+YAXHq76azLWmd1jbdAXv9C7M9tMd95ztllwOvAoT4Dh9ORYgFeNR+THIbefzNO0yuH/QXj/+igruKbVKPlE8=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Employee ID" = _t, #"Regular/Temporary" = _t, #"Full Time/Part Time" = _t, Location = _t, Sex = _t, #"Last Start" = _t, #"Manager#(lf)#(lf)(only mark yes if true)" = _t, Ethnicity = _t, #"Fiscal Year" = _t, #"Seperation Date" = _t]),
    YourSource = Source,
    CleanTrimColumnNames = Table.TransformColumnNames(YourSource, each Text.Trim(Text.Clean(_))),
    ChangedTypeUS = Table.TransformColumnTypes(CleanTrimColumnNames,{{"Last Start", type date}, {"Fiscal Year", Int64.Type}, {"Seperation Date", type date}}, "en-US"),
    FiscalYears = List.Buffer(List.Transform(List.Sort(List.Distinct(ChangedTypeUS[Fiscal Year])), Text.From)),
    StepBack = ChangedTypeUS,
    Ad_FiscalYearsAsColumns = 
        List.Accumulate(
            FiscalYears,
            StepBack,
            (s,c)=> Table.AddColumn(s, c, each 
                if Date.Year([Last Start]) < Number.From(c) and ( [Seperation Date] = null or Date.Year([Seperation Date]) > Number.From(c) ) then 1 else
                if [Last Start] >= #date(Number.From(c), 7, 1) or ( [Seperation Date] = null or [Seperation Date] >= #date(Number.From(c), 7, 1) ) then 1 else 0 )
        ),
    GroupedRows = Table.Group(Ad_FiscalYearsAsColumns, {"Employee ID"}, {{"Details", each _, type table}}),
    Ad_FiscalYearsAsColumns2 = 
        List.Accumulate(
            FiscalYears,
            GroupedRows,
            (s,c)=> Table.AddColumn(s, c, each Record.Field(Table.Last(Table.SelectColumns([Details], c)), c), Int64.Type)
        ),
    DetailsLastRow = Table.TransformColumns(Ad_FiscalYearsAsColumns2, {{"Details", Table.Last}}),
    ExpandedDetails = Table.ExpandRecordColumn(DetailsLastRow, "Details", {"Regular/Temporary", "Full Time/Part Time", "Location", "Sex", "Last Start", "Manager(only mark yes if true)", "Ethnicity", "Seperation Date"}, {"Regular/Temporary", "Full Time/Part Time", "Location", "Sex", "Last Start", "Manager(only mark yes if true)", "Ethnicity", "Seperation Date"})
in
    ExpandedDetails

Note: Check this link to learn how to use my query.
Check this link if you don't know how to provide sample data.

Need Urgent Help W/ Power Query for HR Analysis

Helpful resources

Power BI Monthly Update - August 2025

Fabric Community Update - August 2025

Join us at FabCon Vienna from September 15-18, 2025

Need Urgent Help W/ Power Query for HR Analysis

Helpful resources

Power BI Monthly Update - August 2025

Fabric Community Update - August 2025