Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Compete to become Power BI Data Viz World Champion! First round ends August 18th. Get started.

Reply
vemee
Regular Visitor

Help calculating simpson's diversity index

Hi,

I am rather new to Power BI, and would be truly greatfull for some help. I would like to calculate the simpson's diversity index based on the data avaliblie in my report. The calculation is rather straight forward:

 

Skärmavbild 2024-02-27 kl. 08.48.27.png

 

It will all be based on the values of a single column. And what is needed is to calculate the number of times a specific value occurs (n), as well as to calculate the total numer of rows (observations) in that column (N). Then these values can be used to calculate the diversity based on the formula.

How can one write such a formula in Power BI?

If anyone kowns and would like to explain I would be forever greatfull 🙂

Best, Ville

1 ACCEPTED SOLUTION
Anonymous
Not applicable

Hi @vemee ,

Here some steps that I want to share, you can check them if they suitable for your requirement.

Here is my test data:

vheqmsft_0-1709103058225.png
Create a calculate column

Simpson's Diversity Index = 
VAR Total_Observations = COUNTROWS('Table')
VAR Value_Occurrences = 
    CALCULATE(
        COUNTROWS('Table'),
         FILTER('Table',
          'Table'[Colors] = EARLIER('Table'[Colors])
        )
    )
RETURN
1 -  SUMX('Table',Value_Occurrences * (Value_Occurrences-1)) / (Total_Observations * (Total_Observations - 1))

Final output

vheqmsft_1-1709103317436.png

 

Best regards,

Albert He

 

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly

 

 

View solution in original post

3 REPLIES 3
vemee
Regular Visitor

Hi,

Thank you for your qick reply. For me it does not seem to work. How would the SUMX function be able to calculate the number of occurances for each type in the column? And how to sum over all different types?

Lets say my column looks like this:

Colors:
Red
Blue
Blue
Green
Red
Red

Then, what the formula should do is to calculate the total number of values (in this case 6), and also the number of "red" = 3, "blue" = 2 and "green" = 1. Then the diversity can be calculated as,

1 - ( ( 3*(3-1) + 2*(1-1) + 1*(1-1) ) / (6*(6-1) )

Any suggestions how to write a formula for this?

Best, Ville

Anonymous
Not applicable

Hi @vemee ,

Here some steps that I want to share, you can check them if they suitable for your requirement.

Here is my test data:

vheqmsft_0-1709103058225.png
Create a calculate column

Simpson's Diversity Index = 
VAR Total_Observations = COUNTROWS('Table')
VAR Value_Occurrences = 
    CALCULATE(
        COUNTROWS('Table'),
         FILTER('Table',
          'Table'[Colors] = EARLIER('Table'[Colors])
        )
    )
RETURN
1 -  SUMX('Table',Value_Occurrences * (Value_Occurrences-1)) / (Total_Observations * (Total_Observations - 1))

Final output

vheqmsft_1-1709103317436.png

 

Best regards,

Albert He

 

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly

 

 

Syndicate_Admin
Administrator
Administrator

Hola Ville,
Podés probar:
Simpson_Index =
VAR TotalCount = SUM('Species'[Count])
RETURN
DIVIDE(
SUMX('Species', 'Species'[Count] * ('Species'[Count] - 1)),
TotalCount * (TotalCount - 1) )

Helpful resources

Announcements
August Power BI Update Carousel

Power BI Monthly Update - August 2025

Check out the August 2025 Power BI update to learn about new features.

August 2025 community update carousel

Fabric Community Update - August 2025

Find out what's new and trending in the Fabric community.

Top Solution Authors