The ultimate Fabric, Power BI, SQL, and AI community-led learning event. Save €200 with code FABCOMM.
Get registeredCompete to become Power BI Data Viz World Champion! First round ends August 18th. Get started.
Hi,
I have a table which looks like this (simplified version):
Date Employee Workflow-Step Workflow Duration
2020-01-01 Smith AA A 5
2020-01-01 Garcia AA A 2
2020-01-01 Jones AB A 6
2020-01-02 Smith AC A 10
2020-01-03 Dixon BA B 15
2020-01-03 Dixon BB B 15
2020-01-03 Dixon BC B 15
2020-01-03 Garcia CA C 20
2020-01-03 Garcia CB C 20
2020-01-03 Garcia CC C 20
.
.
.
There are thousands of rows with hundreds of different Workflow-Steps and dozens of different Workflows. Every Workflow Step belongs to one Workflow.
I want to calculate the sum of the duration of all Workflow-Steps grouped by the Workflow. So in my example the sum of Workflow A would be 23, of Workflow B 45 and Workflow C 60.
Then I want to calculate the cumulative shares to see which Workflow-Steps account for 95% for the duration of all Workflow-Steps.
Ultimately I want to add a calculated column to the initial table which indicates if a Workflow-Step belongs to Workflow-Steps which account for 95% of the durations of a Workflow.
So lets say for example for Workflow A the Workflow Steps AB and AC, for Workflow B the Workflow Steps BA and BB and for Workflow C the Workflow-Step CA would account for 95% of the sum of duration. Then my table would look like this:
Date Employee Workflow-Step Workflow Duration Relevant
2020-01-01 Smith AA A 5 No
2020-01-01 Garcia AA A 2 No
2020-01-01 Jones AB A 6 Yes
2020-01-02 Smith AC A 10 Yes
2020-01-03 Dixon BA B 15 Yes
2020-01-03 Dixon BB B 15 Yes
2020-01-03 Dixon BC B 15 No
2020-01-03 Garcia CA C 20 Yes
2020-01-03 Garcia CB C 20 No
2020-01-03 Garcia CC C 20 No
.
.
.
How can I add this column?
Thanks.
Hi @Anonymous ,
According to my understand, you want to set a flag when the Duration is larger than 95% of the total duration each Workflow.
For A, the total duration is 23 ,so when the Duration >=23*0.95, then set Yes, otherwise ,set No ,right?
You could use ALLEXCEPT() to calculate each group's total:
Measure =
VAR sumEachWorkflow =
CALCULATE (
SUM ( EmpTable[Duration] ),
ALLEXCEPT ( EmpTable, EmpTable[Workflow] )
)
RETURN
IF (
SELECTEDVALUE ( EmpTable[Duration] ) / sumEachWorkflow >= sumEachWorkflow * 0.95,
"Yes",
"No"
)
But It seems different from the expected output you gave...
Could explain to me in more detail? Thanks.
Best regards,
Eyelyn Qin
Hi @Anonymous
Thanks for your time.
Not exactly, I want to set a flag with "Yes" if a Workflow-Step belongs to the Workflow-Steps which account for 95% of the duration of all Workflow-Steps in a Workflow, so I need the cumulative share. All the other Workflow-Steps, for which the sum of the durations is very small (96% - 100% of the cumulative share) should be flagged with "No".
Thanks.
Hi,
On the data that you have shared, show the expected result very clearly.
Hi @Anonymous ,
Sorry,I felt a little confused about your description...
For A ,when the workflow-step begins with the same workflow, then each step accounts for 2/(2+5+6+10), 5/(2+5+6+10)
,6/(2+5+6+10) 10/(2+5+6+10), Is it correct ?
Could you use specific values as examples?
Measure =
VAR sumEachWorkflow =
CALCULATE (
SUM ( EmpTable[Duration] ),
ALLEXCEPT ( EmpTable, EmpTable[Workflow] )
)
VAR _belongsToWorkflow =
IF (
LEFT ( SELECTEDVALUE ( EmpTable[Workflow-Step] ), 1 )
= SELECTEDVALUE ( EmpTable[Workflow] ),
1,
0
)
RETURN
IF (
_belongsToWorkflow = 1
&& SELECTEDVALUE ( EmpTable[Duration] ) / sumEachWorkflow <= sumEachWorkflow * 0.95,
"Yes",
"No"
)
Did I answer your question ? Please mark my reply as solution. Thank you very much.
If not, please upload some insensitive data samples and expected output.
Best Regards,
Eyelyn Qin
You said that you want to "calculate the sum of the duration of all Workflow-Steps grouped by the Workflow."
In that case, A is 23, B is 45, and C = 60.
But you said, "A would be 23, of Workflow B 15 and Workflow C 20. "
Why?
Also, How did you calculate the "Yes/No" for the relevant column? It appears random to me. If you can show the calculation method, I think I can help you with the DAX code for the calculated column.
Hi,
You are right , A is 23, B is 45, and C = 60., that was my mistake.
Regarding the YES/NO column, I made the assumption that there are way more rows than I posted. - so yes, it seems random if you only take the durations I posted). If I only took the Workflow Steps I posted the sample would be too small. So lets assume that there are a million more rows and it turns out Workflow Steps AB, AC, BA, BB and CA are the relevant ones in their respective Workflows.