Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

The Power BI Data Visualization World Championships is back! Get ahead of the game and start preparing now! Learn more

Reply
Anonymous
Not applicable

Duplicate row values based on Date field

Hello all,

 

 

I connect with Power BI to BambooHR API. This is an HRIS system, consisting of:

- As of today data for Status (Active and Inactive), Gender, Ethnicit, Hire Date etc 

- Transactional (historical) data for Department, Job Title, Employment Status change (mostly business related data)

 

My issue: this HRIS system does not suport retroactive reports. So let`s assume that if someone was hired in January 2022 and was ACTIVE and Terminated in April 2022, the status will be updated to INACTIVE. When I run a report, I will get the most recent status...

 

So, for the purpose of my analysis, I want to "Transform" this in the following:

 

How it Currently Looks 
Employee #NameStatusHire DateDate of Leaving 
1DianaINACTIVE1/1/20204/4/2021 
2MaryACTIVE2/2/2021NULL 
3BobACTIVE4/4/2021NULL 
What I want to do
Date of Export
Will be generate by myself in PQ
Employee #NameStatusHire DateDate of Leaving
12/31/20201DianaACTIVE1/1/20204/4/2021 (or null)
1/31/20211DianaACTIVE1/1/20204/4/2021 (or null)
2/28/20211DianaACTIVE1/1/20204/4/2021 (or null)
2/28/20212MaryACTIVE2/2/2021null
3/31/20211DianaACTIVE1/1/20204/4/2021 (or null)
3/31/20212MaryACTIVE2/2/2021null
4/30/20211DianaINACTIVE1/1/20204/4/2021
4/30/20212MaryACTIVE2/2/2021null
4/30/20213BobACTIVE4/4/2021null

 

How can I obtain this behaviour in PQ in an automated way? So when I update the power bi, it`s going to do what I need.

 

Any suggestions would be apprecicated. I also tried to look online, but I did not find any advice.

1 ACCEPTED SOLUTION
HotChilli
Super User
Super User

I think I'm going to politely disagree with Pete's response here.

The information in the first table is enough to generate the 2nd table.  All we need is a column of dates for which we want the status to be evaluated.  That column can be hardcoded or generated from parameters or a list up until today, for example.  I'll leave that for the original poster to decide.

Here's my M code:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjTSNzbUNzIwMlCK1QFyoTxDMM9I38gCF88YRSUqz0Tf2IAIXiwA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t]),
    #"Changed Type with Locale" = Table.TransformColumnTypes(Source, {{"Column1", type date}}, "en-US"),
    #"Removed Duplicates" = Table.Distinct(#"Changed Type with Locale"),
    #"Added Custom" = Table.AddColumn(#"Removed Duplicates", "QTab", each TableQ),
    #"Expanded QTab" = Table.ExpandTableColumn(#"Added Custom", "QTab", {"Employee #", "Name", "Status", "Hire Date", "Date of Leaving"}, {"QTab.Employee #", "QTab.Name", "QTab.Status", "QTab.Hire Date", "QTab.Date of Leaving"}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Expanded QTab",{{"Column1", type date}, {"QTab.Employee #", Int64.Type}, {"QTab.Name", type text}, {"QTab.Status", type text}, {"QTab.Hire Date", type date}, {"QTab.Date of Leaving", type date}}),
    #"Added Conditional Column" = Table.AddColumn(#"Changed Type1", "KeepRow", each if [Column1] < [QTab.Hire Date] then 0 else 1),
    #"Filtered Rows" = Table.SelectRows(#"Added Conditional Column", each ([KeepRow] = 1)),
    #"Removed Columns1" = Table.RemoveColumns(#"Filtered Rows",{"KeepRow"}),
    #"Added Custom1" = Table.AddColumn(#"Removed Columns1", "StatusAtDate", each if [QTab.Date of Leaving] = null or [Column1] < [QTab.Date of Leaving] then "Active" else "Inactive"),
    #"Removed Columns2" = Table.RemoveColumns(#"Added Custom1",{"QTab.Status"})
in
    #"Removed Columns2"

TableQ is the first table provided with the 3 rows.

There are quite a few steps but a lot of it is housekeeping for US-style dates and getting a column of month ends to start with.

I just apply a few rules to get the status on each date.  There hasn't been any testing for for more complex scenarios, for example, firing and rehiring individuals but it does produce the 2nd table.

Obviously, it recreates from scratch each time the queries are refreshed.

Let me know what you think.

View solution in original post

5 REPLIES 5
HotChilli
Super User
Super User

I think I'm going to politely disagree with Pete's response here.

The information in the first table is enough to generate the 2nd table.  All we need is a column of dates for which we want the status to be evaluated.  That column can be hardcoded or generated from parameters or a list up until today, for example.  I'll leave that for the original poster to decide.

Here's my M code:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjTSNzbUNzIwMlCK1QFyoTxDMM9I38gCF88YRSUqz0Tf2IAIXiwA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t]),
    #"Changed Type with Locale" = Table.TransformColumnTypes(Source, {{"Column1", type date}}, "en-US"),
    #"Removed Duplicates" = Table.Distinct(#"Changed Type with Locale"),
    #"Added Custom" = Table.AddColumn(#"Removed Duplicates", "QTab", each TableQ),
    #"Expanded QTab" = Table.ExpandTableColumn(#"Added Custom", "QTab", {"Employee #", "Name", "Status", "Hire Date", "Date of Leaving"}, {"QTab.Employee #", "QTab.Name", "QTab.Status", "QTab.Hire Date", "QTab.Date of Leaving"}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Expanded QTab",{{"Column1", type date}, {"QTab.Employee #", Int64.Type}, {"QTab.Name", type text}, {"QTab.Status", type text}, {"QTab.Hire Date", type date}, {"QTab.Date of Leaving", type date}}),
    #"Added Conditional Column" = Table.AddColumn(#"Changed Type1", "KeepRow", each if [Column1] < [QTab.Hire Date] then 0 else 1),
    #"Filtered Rows" = Table.SelectRows(#"Added Conditional Column", each ([KeepRow] = 1)),
    #"Removed Columns1" = Table.RemoveColumns(#"Filtered Rows",{"KeepRow"}),
    #"Added Custom1" = Table.AddColumn(#"Removed Columns1", "StatusAtDate", each if [QTab.Date of Leaving] = null or [Column1] < [QTab.Date of Leaving] then "Active" else "Inactive"),
    #"Removed Columns2" = Table.RemoveColumns(#"Added Custom1",{"QTab.Status"})
in
    #"Removed Columns2"

TableQ is the first table provided with the 3 rows.

There are quite a few steps but a lot of it is housekeeping for US-style dates and getting a column of month ends to start with.

I just apply a few rules to get the status on each date.  There hasn't been any testing for for more complex scenarios, for example, firing and rehiring individuals but it does produce the 2nd table.

Obviously, it recreates from scratch each time the queries are refreshed.

Let me know what you think.

Anonymous
Not applicable

I just tested your method and wanted to say, this is GENIOUS!!!!!!! The missing piece was this one:

 

= Table.AddColumn(#"Removed Duplicates", "QTab", each TableQ)

 

This is great, thank you SO very much!!!!!!!!!!!!!!!!!!!!

Anonymous
Not applicable

Wow, thank you, I will definetly give it a try 🙂 I am not sure how it will work exactly, but I am sure to give it a try! There is a lot of modelling I must do. 

 

Even so, thank you for taking your time, I love this solution! 

 

@BA_Pete thank you for your suggestion too, I learned something new - I always heard about incremental refresh, but never really gotten to learn more about it. This is because I did not need it. Thank you!!

Hi @HotChilli ,

 

You were right to disagree. Your answer is clearly in line with what OP was actually asking for.

Additionally, I think your concern around firing and rehiring individuals within your solution is moot as I presume that they would be issued a new [Employee #] on rehire, so your answer still works as intended.

 

@Anonymous : Apologies. My assessment of your post was hasty and incorrect. I believe @HotChilli has provided exactly what you need.

 

Pete



Now accepting Kudos! If my post helped you, why not give it a thumbs-up?

Proud to be a Datanaut!




BA_Pete
Super User
Super User

Hi @Anonymous ,

 

It looks as though you want to use Power Query to retain historical data when it is not present in the source.

Power Query is a mashup tool, not a data warehouse, thus the Power Query refresh process looks like this:

Wipe all data > Get current data from source > Perform transformations.

Your historical data will also be wiped if it is no longer present in the source data.

There are a few hacks that people have done to get this to work but, they're hacks, so I'll not link them. Just know that they do exist if you really wanted to go down that route.

 

The closest you might get to this within the Power BI sphere is by using incremental refresh in the PBI Service:

https://docs.microsoft.com/en-us/power-bi/connect-data/incremental-refresh-overview 

 

Other than that, see if your BambooHR system devs can build you an Employee Status SCD at source.

 

Pete



Now accepting Kudos! If my post helped you, why not give it a thumbs-up?

Proud to be a Datanaut!




Helpful resources

Announcements
Power BI DataViz World Championships

Power BI Dataviz World Championships

The Power BI Data Visualization World Championships is back! Get ahead of the game and start preparing now!

December 2025 Power BI Update Carousel

Power BI Monthly Update - December 2025

Check out the December 2025 Power BI Holiday Recap!

FabCon Atlanta 2026 carousel

FabCon Atlanta 2026

Join us at FabCon Atlanta, March 16-20, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.

Top Solution Authors