Earn a 50% discount on the DP-600 certification exam by completing the Fabric 30 Days to Learn It challenge.
Hi there,
I'm not very good with power bi so was hoping someone can help me with a formula
I have the following table of employee IDs and the date of any sick days that they have taken. Each row therefore represences an instance of sickness for the employee:
Employee ID | Sickness Start |
A | 01/01/2018 |
A | 01/03/2018 |
A | 01/03/2018 |
A | 01/04/2018 |
A | 01/06/2018 |
A | 01/07/2018 |
A | 01/07/2018 |
B | 01/02/2018 |
B | 01/04/2018 |
B | 01/05/2018 |
B | 01/06/2018 |
B | 01/06/2018 |
B | 01/07/2018 |
B | 01/08/2018 |
C | 01/02/2018 |
C | 01/03/2018 |
C | 01/03/2018 |
C | 01/04/2018 |
C | 01/04/2018 |
C | 01/06/2018 |
C | 01/07/2018 |
C | 01/07/2018 |
C | 01/07/2018 |
C | 01/07/2018 |
C | 01/09/2018 |
C | 01/10/2018 |
C | 01/10/2018 |
C | 01/10/2018 |
C | 01/10/2018 |
C | 01/10/2018 |
----------
I'd like to add a new column to this table which - for each employee id - puts a numeric flag against each set of sickness instances where 3 different instances of sickness have occured within a 2 month time period. No instance of sickness should be used twice for this calculation and the dates should be "allocated" from earliest sick date.
I'll explain using an example as it'll be easier to understand - here is the final result i should end up with based on my sample data:
Employee ID | Sickness Start | Flag |
A | 01/01/2018 | |
A | 01/03/2018 | 1 |
A | 01/03/2018 | 1 |
A | 01/04/2018 | 1 |
A | 01/06/2018 | 2 |
A | 01/07/2018 | 2 |
A | 01/07/2018 | 2 |
B | 01/02/2018 | |
B | 01/04/2018 | |
B | 01/05/2018 | 1 |
B | 01/06/2018 | 1 |
B | 01/06/2018 | 1 |
B | 01/07/2018 | |
B | 01/08/2018 | |
C | 01/02/2018 | 1 |
C | 01/03/2018 | 1 |
C | 01/03/2018 | 1 |
C | 01/04/2018 | |
C | 01/04/2018 | |
C | 01/06/2018 | 2 |
C | 01/07/2018 | 2 |
C | 01/07/2018 | 2 |
C | 01/07/2018 | |
C | 01/07/2018 | |
C | 01/09/2018 | 3 |
C | 01/10/2018 | 3 |
C | 01/10/2018 | 3 |
C | 01/10/2018 | 4 |
C | 01/10/2018 | 4 |
C | 01/10/2018 | 4 |
so for employee A - their first set of 3 sick days that were taken in a 2 month period are as follows:
A | 01/03/2018 | 1 |
A | 01/03/2018 | 1 |
A | 01/04/2018 | 1 |
so have all been flaged with a 1 - to denote it is the first occurance
....then second occurance for employee A is the following set of dates:
A | 01/06/2018 | 2 |
A | 01/07/2018 | 2 |
A | 01/07/2018 | 2 |
and so all have been flagged with a 2.
and so on.
I hope that makes sense? Please note that the source table will not be in date order as per my example - and I also may want to change the logic rules - e.g. make it 4 instances of sickness within a 6 month period for example. Also my dates will not always fall on the 01st of the month - however i've given the example as a good way of checking boundary conditions.
Any help you can provide with the DAX will be much appreciated 🙂
Hi,
Just cannot understand the business logic at all. Why should there be 2 rows for the same emploee for the same day?
Hi Ashish,
you are right - please consider the amended sickness table below instead (removed duplicate sickness days per employee):
Employee ID | Sickness Start |
A | 01/01/2018 |
A | 01/03/2018 |
A | 02/03/2018 |
A | 01/04/2018 |
A | 01/06/2018 |
A | 01/07/2018 |
A | 02/07/2018 |
B | 01/02/2018 |
B | 01/04/2018 |
B | 01/05/2018 |
B | 01/06/2018 |
B | 02/06/2018 |
B | 01/07/2018 |
B | 01/08/2018 |
C | 01/02/2018 |
C | 01/03/2018 |
C | 02/03/2018 |
C | 01/04/2018 |
C | 02/04/2018 |
C | 01/06/2018 |
C | 01/07/2018 |
C | 02/07/2018 |
C | 03/07/2018 |
C | 04/07/2018 |
C | 01/09/2018 |
C | 01/10/2018 |
C | 02/10/2018 |
C | 03/10/2018 |
C | 04/10/2018 |
C | 05/10/2018 |
and the results should look as follows:
Employee ID | Sickness Start | flag |
A | 01/01/2018 | |
A | 01/03/2018 | 1 |
A | 02/03/2018 | 1 |
A | 01/04/2018 | 1 |
A | 01/06/2018 | 2 |
A | 01/07/2018 | 2 |
A | 02/07/2018 | 2 |
B | 01/02/2018 | |
B | 01/04/2018 | |
B | 01/05/2018 | |
B | 01/06/2018 | 1 |
B | 02/06/2018 | 1 |
B | 01/07/2018 | 1 |
B | 01/08/2018 | |
C | 01/02/2018 | 1 |
C | 01/03/2018 | 1 |
C | 02/03/2018 | 1 |
C | 01/04/2018 | |
C | 02/04/2018 | |
C | 01/06/2018 | |
C | 01/07/2018 | 2 |
C | 02/07/2018 | 2 |
C | 03/07/2018 | 2 |
C | 04/07/2018 | |
C | 01/09/2018 | 3 |
C | 01/10/2018 | 3 |
C | 02/10/2018 | 3 |
C | 03/10/2018 | 4 |
C | 04/10/2018 | 4 |
C | 05/10/2018 | 4 |
hope that makes a bit more sense.......
please note that the following set of sicknesses:
B | 01/04/2018 | |
B | 01/05/2018 | |
B | 01/06/2018 |
isn't being flagged as a set that fall in a 2 month period - because 01/04/2018 to 01/06/2018 is one day over the 2 month threshold (just for boundary condition logic)
hope that makes sense and many thanks for any help you can give
thanks
Hi,
I am at a loss. I do not know what DAX logic can work here.
Hi Ashish,
Maybe if i explain what I am trying to achieve - then you could think of a way to do this?
Basically - I have a table showing employees and the day they are off sick (assume they are only off sick for 1 day at a time)
What I need to be able to identfy is where an employee has taken 4 sick days within a 6 week period.
can you think of any way to achieve this?
thanks
Dom
Hi,
Do they have to be 4 consecutive days?
no - just 4 days that fall within a 6 week period
many thanks
Dom
Hi,
Based on the source dataset that you shared, show the expected result so that i can match my answer with yours. Also, what will the user be selecting in the slicer/filter. What will be there in the row/column labels of the visual?
Hi again,
I've shared the results already - obviously i don't know how you are going to implement this - but they should flag the following series of dates as falling within the time period that I specified:
A | 01/03/2018 |
A | 02/03/2018 |
A | 01/04/2018 |
A | 01/06/2018 |
A | 01/07/2018 |
A | 02/07/2018 |
B | 01/06/2018 |
B | 02/06/2018 |
B | 01/07/2018 |
C | 01/02/2018 |
C | 01/03/2018 |
C | 02/03/2018 |
C | 01/07/2018 |
C | 02/07/2018 |
C | 03/07/2018 |
C | 01/09/2018 |
C | 01/10/2018 |
C | 02/10/2018 |
C | 03/10/2018 |
C | 04/10/2018 |
C | 05/10/2018 |
again don't worry about slicers/filters/row/col labels - I just need a method of identifying the above dates out of the main dataset - and I will be doing further calculations on them
thanks so much
I cannot relation your question with your expected result. My interpretation is that the user will select a certain date in the slicer (say September 30, 2018) and for each employee, we will count how many leaves that person took in the 6 weeks ended September 30? This measure can then be filtered on values >=4.
No worries Ashish - i think this is too complex to do in Dax
I'm going to write an ETL script to do these calculations before loading instead
many thanks though for all your help as always!
@domdom Just want to understand bit more detail about the data, Could you please explain what's the reason behind having same sickstartdate more than once, for example - For Employee A, 01/03/2018 was repeated twice.
Proud to be a PBI Community Champion