In this article, we will discuss how to handle some common problems when creating reports based on employee attendance sheets. Let's take a closer look at the scenarios below.
Scenario 1: Clean and extract valid records from attendance sheets
Invalid or repeated punch-ins can lead to duplicated or missing data, which affects report creation.
This requires us to eliminate useless records as much as possible and keep only each employee's valid punch records (the earliest and latest time per day).
Sample data:
Solution 1:
In DAX, we can use FIRSTNONBLANK and LASTNONBLANK to return the first/last value in a column, filtered by the current context, for which the expression is not blank.
Calculated column:
First_record =
CALCULATE (
FIRSTNONBLANK ( Table1_D[DateTime], 1 ), // MIN(Table1_D[DateTime])
ALLEXCEPT ( Table1_D, Table1_D[EmployeeID], Table1_D[Date] )
)
Last_record =
CALCULATE (
LASTNONBLANK ( Table1_D[DateTime], 1 ), // MAX(Table1_D[DateTime])
ALLEXCEPT ( Table1_D, Table1_D[EmployeeID], Table1_D[Date] )
)
We can also create a calculated table with SUMMARIZE, as below:
Table1_D2 =
SUMMARIZE (
Table1_D,
Table1_D[Date],
Table1_D[EmployeeID],
"First Record", FIRSTNONBLANK ( Table1_D[DateTime], 1 ),//MIN ( Table1_D[DateTime] )
"Last Record", LASTNONBLANK ( Table1_D[DateTime], 1 ) //MAX ( Table1_D[DateTime] )
)
Solution 2:
In Power Query, we can use the 'Group By' button to achieve the same result.
The button generates the equivalent of the following M code:
= Table.Group(#"Inserted Date", {"EmployeeID", "Date"}, {{"First_record", each List.Min([DateTime]), type nullable datetime}, {"Last_record", each List.Max([DateTime]), type nullable datetime}})
Solution 3:
Since the sample data is sorted by the DateTime field (that is, the first/last record in each group is naturally the earliest/latest record), we can also use List.First and List.Last on each group to get the first and last records.
M code:
= Table.Group(Custom1, {"EmployeeID", "Date"}, {{"First_record", each List.First([DateTime]), type datetime}, {"Last_record", each List.Last([DateTime]), type datetime}})
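Note that Solution 3 only works if the rows within each group are already ordered by DateTime. If that order is not guaranteed, a sorting step can be added before the grouping; the sketch below is not part of the original query and assumes the same step name Custom1 used above:
// sort so that List.First/List.Last return the earliest/latest punch within each EmployeeID + Date group
= Table.Sort(Custom1, {{"EmployeeID", Order.Ascending}, {"Date", Order.Ascending}, {"DateTime", Order.Ascending}})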
Scenario 2: Analyze consecutive absences for each employee
Now that the data has been filtered and extracted, we can add a custom conditional column to obtain each employee's daily attendance status (a possible sketch of such a column follows below).
The following figure is the processed sample table, which records the daily attendance of each employee. Next, we'll explore how to calculate the number of consecutive absences for each employee.
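For illustration only, the Status column could be derived with a conditional custom column in Power Query. This is a minimal sketch, not the exact step from the attachment: it assumes the per-day punch summary from Scenario 1 has already been joined to a workday calendar (so that workdays with no punches appear with a null First_record), and that the previous step is named #"Grouped Rows".
// hypothetical step: a workday with no punch record (null First_record) is treated as an absence
= Table.AddColumn(#"Grouped Rows", "Status", each if [First_record] = null then "absent" else "present", type text)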
In DAX:
Solution 1: We can use the formula below to count consecutive absences.
consecutive absences =
VAR _date =
CALCULATE (
MAX ( Table2_D[Date] ),
FILTER (
ALL ( Table2_D ),
Table2_D[Status] <> "absent"
&& Table2_D[Date] <= EARLIER ( Table2_D[Date] )
&& Table2_D[EmployeeID] = EARLIER ( Table2_D[EmployeeID] )
)
) // group and get the last non-absence date before current date
VAR _times =
CALCULATE (
COUNTROWS ( Table2_D ),
FILTER (
ALL ( Table2_D ),
Table2_D[Date] <= EARLIER ( Table2_D[Date] )
&& Table2_D[Date] > _date
&& Table2_D[EmployeeID] = EARLIER ( Table2_D[EmployeeID] )
)
) + 0 //Group and count all records between the current date and the last non-absent date
RETURN
_times
Result:
In Power Query, we can pre-group the data and then analyze absences within the grouped sub-tables.
Below are two methods based on two different groupings.
Solution 2.1:
1. Group the filtered data by 'EmployeeID'.
2. Group each sub-table obtained in step 1 by the 'Status' column.
3. Use Table.RowCount to count the rows for each status in each sub-table, then filter each sub-table to keep only records with a status of 'absent'.
4. Extract the 'Times' values from all sub-tables and use List.Max to get the maximum value.
Result:
Entire M code:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("hZJLCoNAEETvMmvFsdr4WXoBLyAuIrgNIfH+qKAQTPXUohdNPbqnHzOOoQxZQATyaHk8mqPen+W7vNYwZTcACjAFVAp47E3cayj6/7BOhc05+jnzya3IO/I0KD1gen42gNnx8krkdzdIuYFwA+bGPY3JMSXHmBwXIHaM2XEHXHpcoFZAo1aQP2RJTdMG", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [EmployeeID = _t, Date = _t, Workday = _t, Status = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"EmployeeID", Int64.Type}, {"Date", type date}, {"Workday", Int64.Type}, {"Status", type text}}),
#"Filtered Rows" = Table.SelectRows(#"Changed Type", each ([Workday] = 1)),
#"Grouped Rows" = Table.Group(#"Filtered Rows", {"EmployeeID"}, {"Maximum consecutive absences", each
List.Max(Table.SelectRows(
Table.Group(_,"Status",{"Times",each Table.RowCount(_)},0),
each [Status]="absent")[Times])
})
in
#"Grouped Rows"
Solution 2.2:
1. Group the filtered data and count rows by 'EmployeeID' and 'Status'.
Unlike the grouping in Solution 2.1, here the 'GroupKind' parameter of Table.Group is set to GroupKind.Local (GroupKind.Global is used by default).
GroupKind.Global treats all rows with the same key value as a single group.
GroupKind.Local treats consecutive rows with the same key value as a single group.
Therefore, when the same key value appears in several non-consecutive runs, global grouping produces a single group for it, while local grouping produces a separate group for each run, as shown in the sketch after these steps.
2. Group the resulting table by 'EmployeeID' and 'Status' again and aggregate with List.Max to get the maximum value of 'Count'.
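To make the GroupKind difference concrete, here is a small self-contained illustration (the sample values are made up, not taken from the attendance data):
let
    // made-up sample: two separate runs of "absent" separated by one "present"
    Sample = Table.FromRecords({
        [Status = "absent"], [Status = "absent"],
        [Status = "present"],
        [Status = "absent"], [Status = "absent"], [Status = "absent"]
    }),
    // GroupKind.Global (the default): all "absent" rows fall into one group -> absent = 5, present = 1
    GlobalGroups = Table.Group(Sample, {"Status"}, {{"Count", each Table.RowCount(_), Int64.Type}}),
    // GroupKind.Local: each consecutive run is its own group -> absent = 2, present = 1, absent = 3
    LocalGroups = Table.Group(Sample, {"Status"}, {{"Count", each Table.RowCount(_), Int64.Type}}, GroupKind.Local)
in
    LocalGroups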
Result:
Entire M code:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("hZJLCoNAEETvMmvFsdr4WXoBLyAuIrgNIfH+qKAQTPXUohdNPbqnHzOOoQxZQATyaHk8mqPen+W7vNYwZTcACjAFVAp47E3cayj6/7BOhc05+jnzya3IO/I0KD1gen42gNnx8krkdzdIuYFwA+bGPY3JMSXHmBwXIHaM2XEHXHpcoFZAo1aQP2RJTdMG", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [EmployeeID = _t, Date = _t, Workday = _t, Status = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"EmployeeID", Int64.Type}, {"Date", type date}, {"Workday", Int64.Type}, {"Status", type text}}),
#"Filtered Rows" = Table.SelectRows(#"Changed Type", each ([Workday] = 1)),
#"Grouped Rows" = Table.Group(#"Filtered Rows", {"EmployeeID", "Status"}, {{"Count", each Table.RowCount(_), Int64.Type}},GroupKind.Local),
#"Grouped Rows1" = Table.Group(#"Grouped Rows", {"EmployeeID", "Status"}, {{"Count", each List.Max([Count]), type number}})
in
#"Grouped Rows1"
For more information, please check the attachment. We hope you found this article helpful.
Author: Eason Fang
Reviewer: Kerry Wang & Ula Huang