The ultimate Fabric, Power BI, SQL, and AI community-led learning event. Save €200 with code FABCOMM.
Get registeredCompete to become Power BI Data Viz World Champion! First round ends August 18th. Get started.
Hi,
I am rather new to Power BI, and would be truly greatfull for some help. I would like to calculate the simpson's diversity index based on the data avaliblie in my report. The calculation is rather straight forward:
It will all be based on the values of a single column. And what is needed is to calculate the number of times a specific value occurs (n), as well as to calculate the total numer of rows (observations) in that column (N). Then these values can be used to calculate the diversity based on the formula.
How can one write such a formula in Power BI?
If anyone kowns and would like to explain I would be forever greatfull 🙂
Best, Ville
Solved! Go to Solution.
Hi @vemee ,
Here some steps that I want to share, you can check them if they suitable for your requirement.
Here is my test data:
Create a calculate column
Simpson's Diversity Index =
VAR Total_Observations = COUNTROWS('Table')
VAR Value_Occurrences =
CALCULATE(
COUNTROWS('Table'),
FILTER('Table',
'Table'[Colors] = EARLIER('Table'[Colors])
)
)
RETURN
1 - SUMX('Table',Value_Occurrences * (Value_Occurrences-1)) / (Total_Observations * (Total_Observations - 1))
Final output
Best regards,
Albert He
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly
Hi,
Thank you for your qick reply. For me it does not seem to work. How would the SUMX function be able to calculate the number of occurances for each type in the column? And how to sum over all different types?
Lets say my column looks like this:
Colors:
Red
Blue
Blue
Green
Red
Red
Then, what the formula should do is to calculate the total number of values (in this case 6), and also the number of "red" = 3, "blue" = 2 and "green" = 1. Then the diversity can be calculated as,
1 - ( ( 3*(3-1) + 2*(1-1) + 1*(1-1) ) / (6*(6-1) )
Any suggestions how to write a formula for this?
Best, Ville
Hi @vemee ,
Here some steps that I want to share, you can check them if they suitable for your requirement.
Here is my test data:
Create a calculate column
Simpson's Diversity Index =
VAR Total_Observations = COUNTROWS('Table')
VAR Value_Occurrences =
CALCULATE(
COUNTROWS('Table'),
FILTER('Table',
'Table'[Colors] = EARLIER('Table'[Colors])
)
)
RETURN
1 - SUMX('Table',Value_Occurrences * (Value_Occurrences-1)) / (Total_Observations * (Total_Observations - 1))
Final output
Best regards,
Albert He
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly
Hola Ville,
Podés probar:
Simpson_Index =
VAR TotalCount = SUM('Species'[Count])
RETURN
DIVIDE(
SUMX('Species', 'Species'[Count] * ('Species'[Count] - 1)),
TotalCount * (TotalCount - 1) )