Power BI is turning 10, and we’re marking the occasion with a special community challenge. Use your creativity to tell a story, uncover trends, or highlight something unexpected.
Get startedJoin us for an expert-led overview of the tools and concepts you'll need to become a Certified Power BI Data Analyst and pass exam PL-300. Register now.
Hello everyone 😃
I do have a table containing the following columns:
Personal Number, Date, Team, Project, TimeCategory, Time
There are multiple Personal Numbers per Team, multiple Teams per Project and also multiple TimeCategories per Date and Personal Number.
It is my goal to determine whether the high qutoas of illness are due to some extreme examples of high illness rates or a case of illness rates being high in general.
In order to answer this question, I want to use a statistical tool called the coefficient of variation. This is the standard deviation divided by the expected value.
https://en.wikipedia.org/wiki/Coefficient_of_variation
Because of different contract lengths, I will not use some equal distribution of the quotas.
I will use the illness quota per employee as a random variable with probability being equal to the proportion of the employees hours to the hours of the team/project within the selected time period.
Times relevant may be filtered using the column TimeCategory.
I know how to use CALUCATE() im combination with FILTER(), ALL(), ALLEXCEPT(), but I am lost here.
My problem is: The data must be dynamically preaggregated by Personal Number in order to get the right results. And I do not not know how to do this.
The formula itself will look like this:
CV = (((Illness Quota of employee A- Illness Quota of Team)^2)*(Total Hours of Employee A)/(Total Hours of Team) +.....+((Illness Quota of employee Y- Illness Quota of Team)^2)*(Total Hours of Employee Y)/(Total Hours of Team) )/(IllnessQuota of Team)
As said: My problem is the fact, that the data are not preaggregated by Personal Number.
Need help!
Solved! Go to Solution.
All filtering is one-way only from a dimension to the fact table.
// Dimensions:
// Employee connected to FactTable[PIN] (1:*)
// Calendar connected to FactTable[Date] (1:*)
// Team connected to FactTable[TeamId] (1:*)
// Project connected to FactTable[ProjectId] (1:*)
// TimeCategory connected to FactTable[TimeCategoryId] (1:*)
// All *Id fields are hidden in dimensions.
//
// All columns in the FactTable must be hidden.
// Only measures can be visible. All slicing is
// done through dimensions. This is the correct
// star schema model.
// Define the following measures. They work for any
// slice for any dimension. In particular for
// slices on Employee.
[Total] = SUM( FactTable[Hours] )
[Illness] =
CALCULATE(
[Total],
KEEPFILTERS( TimeCategory[TimeCategoryId] = "Illness" )
)
[IllnessQuota] = DIVIDE( [Illness], [Total] )
[Proportion To Team] =
var __totalForEmps = [Total]
var __teamsOfEmps =
summarize(
FactTable,
Team[TeamId]
)
var __totalForTeams =
calculate(
[Total],
__teamsOfEmps,
all( Employee ),
all( Team )
)
var __result =
divide( __totalForEmps, __totalForTeams )
return
__result
[VC for Team] =
var __oneTeamVisible = hasonevalue( Team[TeamId] )
var __team =
SUMMARIZE(
FactTable,
Team[TeamId]
)
var __employees =
SUMMARIZE(
FactTable,
Employee[PIN]
)
var __numerator =
SQRT(
SUMX(
__employees,
var __iqForEmp = [IllnessQuota]
var __iqForTeam =
calculate(
[IllnessQuota],
__team,
all( Team ),
all( Employee )
)
var __propToTeam = [Proportion To Team]
var __result =
__propToTeam
* POWER( __iqForEmp - __iqForTeam, 2 )
return
__result
)
)
var __denominator =
calculate(
[IllnessQuota],
__team,
all( Team ),
all( Employee )
)
var __varCoeff =
DIVIDE( __numerator, __denominator )
return
if( __oneTeamVisible, __varCoeff )
Once you've implemented the correct model, please let me know how it goes. Thanks 🙂
Best
D
I'll have to look, I believe I cover Covariance in my upcoming book, DAX Cookbook. Comes out next week. But would need sample data. Please see this post regarding How to Get Your Question Answered Quickly: https://community.powerbi.com/t5/Community-Blog/How-to-Get-Your-Question-Answered-Quickly/ba-p/38490
Ah yes, here is the formula that I was using for covariance.
Covariance =
VAR __Table = 'R04_Table'
VAR __Count = COUNTROWS(__Table)
VAR __AvgA = AVERAGEX(__Table,[A])
VAR __AvgB = AVERAGEX(__Table,[B])
VAR __Table1 =
ADDCOLUMNS(
__Table,
"__Covariance",
DIVIDE(
([A] - __AvgA) * ([B] - __AvgB),
__Count
)
)
RETURN
SUMX(__Table1,[__Covariance])
All filtering is one-way only from a dimension to the fact table.
// Dimensions:
// Employee connected to FactTable[PIN] (1:*)
// Calendar connected to FactTable[Date] (1:*)
// Team connected to FactTable[TeamId] (1:*)
// Project connected to FactTable[ProjectId] (1:*)
// TimeCategory connected to FactTable[TimeCategoryId] (1:*)
// All *Id fields are hidden in dimensions.
//
// All columns in the FactTable must be hidden.
// Only measures can be visible. All slicing is
// done through dimensions. This is the correct
// star schema model.
// Define the following measures. They work for any
// slice for any dimension. In particular for
// slices on Employee.
[Total] = SUM( FactTable[Hours] )
[Illness] =
CALCULATE(
[Total],
KEEPFILTERS( TimeCategory[TimeCategoryId] = "Illness" )
)
[IllnessQuota] = DIVIDE( [Illness], [Total] )
[Proportion To Team] =
var __totalForEmps = [Total]
var __teamsOfEmps =
summarize(
FactTable,
Team[TeamId]
)
var __totalForTeams =
calculate(
[Total],
__teamsOfEmps,
all( Employee ),
all( Team )
)
var __result =
divide( __totalForEmps, __totalForTeams )
return
__result
[VC for Team] =
var __oneTeamVisible = hasonevalue( Team[TeamId] )
var __team =
SUMMARIZE(
FactTable,
Team[TeamId]
)
var __employees =
SUMMARIZE(
FactTable,
Employee[PIN]
)
var __numerator =
SQRT(
SUMX(
__employees,
var __iqForEmp = [IllnessQuota]
var __iqForTeam =
calculate(
[IllnessQuota],
__team,
all( Team ),
all( Employee )
)
var __propToTeam = [Proportion To Team]
var __result =
__propToTeam
* POWER( __iqForEmp - __iqForTeam, 2 )
return
__result
)
)
var __denominator =
calculate(
[IllnessQuota],
__team,
all( Team ),
all( Employee )
)
var __varCoeff =
DIVIDE( __numerator, __denominator )
return
if( __oneTeamVisible, __varCoeff )
Once you've implemented the correct model, please let me know how it goes. Thanks 🙂
Best
D
@Anonymous Thanks a bunch.
SUMMARIZE was all the thing I actually needed. Was able to put it all into one smaller measure:
Yeah, sorry about that @Schmidtmayer , read that one too quickly!
Hi there. Before you think you've got the solution right, have a look at this article:
https://www.sqlbi.com/articles/all-the-secrets-of-summarize/
Here's an excerpt from it:
UPDATE 2018-01-24 : The content of this article is obsolete as of January 2018. Recent versions of Excel 2016, Power BI, and Analysis Services have a SUMMARIZE behavior that is different from the one described in this article. As noted below, using SUMMARIZE should be deprecated for aggregations and you should use SUMMARIZECOLUMNS instead. Read more in Introducing SUMMARIZECOLUMNS.
Disregard the "obsolete" note. It's not relevant to your case. You should not use SUMMARIZE to do any calculations inside it. Instead, you should always use the SUMMARIZE/ADDCOLUMNS combination.
Please read the article if you want to be sure your DAX is always correct. But there's more to it. You should also have the proper dimensional design (as I outlined in my measure in the preamble) if you want to be safe in the knowledge that your measures always work correctly in any circumstances. If you want to know what can happen if you don't do this, then go and read this: https://www.sqlbi.com/articles/understanding-dax-auto-exist/
Be careful with how you structure your models because the correctness of figures DAX produces depends on it.
Best
D
@Anonymous:
Thanks for pointing out. I was aware of this during testing of my measure. All the values were complete garbage the first time I tried it. That's why I used the second table for the values of the whole group I am looking at and crossjoining it. Til now, everything works fine. Also checked the values via pocket calculator.
Regaring the DataModell: Sorry for ignoring it. We are using this model as a standard within our departement, so I didn't answer this.
Greetings Tom
This is your chance to engage directly with the engineering team behind Fabric and Power BI. Share your experiences and shape the future.
Check out the June 2025 Power BI update to learn about new features.
User | Count |
---|---|
10 | |
10 | |
10 | |
9 | |
7 |
User | Count |
---|---|
17 | |
12 | |
11 | |
11 | |
10 |