Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

To celebrate FabCon Vienna, we are offering 50% off select exams. Ends October 3rd. Request your discount now.

Reply
Loubot3000
Resolver II
Resolver II

Significance testing

Is there any way to do significance testing in Power BI and have it somehow display on a graph with the raw numbers?

For example, I have the following graph, which is calculated by doing the weighted percentages of people responding with certain answers at certain questions each month.

 

Loubot3000_1-1690379715665.png

 

I want to do significance testing to compare the number of each month with the previous month, and highlight it or possibly have an arrow or something if it significantly different.

 

Is this possible? How would I do this?

 

Thanks

1 ACCEPTED SOLUTION
Loubot3000
Resolver II
Resolver II

I think I've figured it out. After a bunch of research, for my case where I'm comparing the population of two weighted independent samples whose sizes are quite large (in the thousands), I think the correct statistical test to use is a weighted two-sample two-tailed Z-test of populations.

 

Scroll to the bottom to see my explanation of why this test makes sense. Also pease somebody correct me if I'm wrong.

 

Here is my code:

 

Buzz significance test =

VAR currentMonthSuccesses = CALCULATE(SUM(Sheet1[Weight]), BUZZ[BUZZ] = "SkyShowtime", BUZZ[Response] = "Yes")
VAR currentMonthPopulation = SUM(Sheet1[Weight])
VAR currentMonthProportion = currentMonthSuccesses / currentMonthPopulation

VAR prevMonthSuccesses = CALCULATE(SUM(Sheet1[Weight]), BUZZ[BUZZ] = "SkyShowtime", BUZZ[Response] = "Yes", PREVIOUSMONTH(CALANDER[Date]))
VAR prevMonthPopulation = CALCULATE(SUM(Sheet1[Weight]), PREVIOUSMONTH(CALANDER[Date]))
VAR prevMonthProportion = prevMonthSuccesses / prevMonthPopulation

VAR pooledProportion =
    (currentMonthSuccesses + prevMonthSuccesses)
    / (currentMonthPopulation + prevMonthPopulation)

VAR Z_score =
    IF(
        NOT(currentMonthPopulation) || NOT(prevMonthPopulation),
        BLANK(),
        (currentMonthProportion - prevMonthProportion)
        / SQRT(pooledProportion * (1 - pooledProportion) * (1/currentMonthPopulation + 1/prevMonthPopulation))
    )
// This is the Z-test statistic, used to compare against the critical Z-score(s) for a given confidence interval. It has to be encased in an IF() statement to make sure it's not calculating the score for empty months, otherwise the SQRT() function breaks.

VAR Z_test =
    IF(Z_score > 1.96, 1, IF(z_score < -1.96, -1, 0))
// +/-1.96 is the critical Z-score for a two-tailed Z-test with a confidence level of 5%.

RETURN Z_test

 

 

The weight is accounted for by just summing the weights instead of doing distinct counts to calculate the proportions.

 

Here you can see the significance for each month illustrated in a table above the graph:

 

Loubot3000_0-1691051281931.png

 

The arrows and +/- 1 show the direct of significance. You can see that it visually makes sense with what the data is doing. I achieved that by right clicking the significance test field in the visualization and doing conditional formatting.

 

This way of illustrating it (arrows) should work well if added to an actual table of data, however not so much for a line graph. Will have to find another way of doing it.

 

Why Z-test:

The reason the T-test isn't done is because that assumes that the samples are so small (<30) that the mean/variance can't reliably be calculated. In my case, it's in the thousands. Also one reason to rule out a bunch of other statistical tests is that they test the dependence of one variable on another (e.g. proportion of people answering a question, and month), whereas in this case we just have two samples that happen to be from two consecutive months, but the month variable doesn't really matter.

 

On my earlier question about the mean/standard deviation of a proportion - I think actually it does make sense to talk about the varience of a proportion. It is a statistic about the normal probability distribution of making a new observation of a proportion. That distribution definitely has a mean and stdev. Another way of thinking about it is that if you have 500 successes out of 1000 people, the stdev is smaller because you're more likely to see new observations of successes centred around 500 (e.g. 501 successes would make no difference to the proportion), however if you had 5 successes out of 10 people, the stdev would be much bigger because the new observations wouldn't be as cented around 5 (6 or 7 people out of 10 would be a much bigger difference to the proportion).

 

At least that's the way I understand it. Please somebody correct me if I'm wrong.

 

Here's more on the Z-test of proportions: Z Test: Definition & Two Proportion Z-Test - Statistics How To

View solution in original post

10 REPLIES 10
Loubot3000
Resolver II
Resolver II

I think I've figured it out. After a bunch of research, for my case where I'm comparing the population of two weighted independent samples whose sizes are quite large (in the thousands), I think the correct statistical test to use is a weighted two-sample two-tailed Z-test of populations.

 

Scroll to the bottom to see my explanation of why this test makes sense. Also pease somebody correct me if I'm wrong.

 

Here is my code:

 

Buzz significance test =

VAR currentMonthSuccesses = CALCULATE(SUM(Sheet1[Weight]), BUZZ[BUZZ] = "SkyShowtime", BUZZ[Response] = "Yes")
VAR currentMonthPopulation = SUM(Sheet1[Weight])
VAR currentMonthProportion = currentMonthSuccesses / currentMonthPopulation

VAR prevMonthSuccesses = CALCULATE(SUM(Sheet1[Weight]), BUZZ[BUZZ] = "SkyShowtime", BUZZ[Response] = "Yes", PREVIOUSMONTH(CALANDER[Date]))
VAR prevMonthPopulation = CALCULATE(SUM(Sheet1[Weight]), PREVIOUSMONTH(CALANDER[Date]))
VAR prevMonthProportion = prevMonthSuccesses / prevMonthPopulation

VAR pooledProportion =
    (currentMonthSuccesses + prevMonthSuccesses)
    / (currentMonthPopulation + prevMonthPopulation)

VAR Z_score =
    IF(
        NOT(currentMonthPopulation) || NOT(prevMonthPopulation),
        BLANK(),
        (currentMonthProportion - prevMonthProportion)
        / SQRT(pooledProportion * (1 - pooledProportion) * (1/currentMonthPopulation + 1/prevMonthPopulation))
    )
// This is the Z-test statistic, used to compare against the critical Z-score(s) for a given confidence interval. It has to be encased in an IF() statement to make sure it's not calculating the score for empty months, otherwise the SQRT() function breaks.

VAR Z_test =
    IF(Z_score > 1.96, 1, IF(z_score < -1.96, -1, 0))
// +/-1.96 is the critical Z-score for a two-tailed Z-test with a confidence level of 5%.

RETURN Z_test

 

 

The weight is accounted for by just summing the weights instead of doing distinct counts to calculate the proportions.

 

Here you can see the significance for each month illustrated in a table above the graph:

 

Loubot3000_0-1691051281931.png

 

The arrows and +/- 1 show the direct of significance. You can see that it visually makes sense with what the data is doing. I achieved that by right clicking the significance test field in the visualization and doing conditional formatting.

 

This way of illustrating it (arrows) should work well if added to an actual table of data, however not so much for a line graph. Will have to find another way of doing it.

 

Why Z-test:

The reason the T-test isn't done is because that assumes that the samples are so small (<30) that the mean/variance can't reliably be calculated. In my case, it's in the thousands. Also one reason to rule out a bunch of other statistical tests is that they test the dependence of one variable on another (e.g. proportion of people answering a question, and month), whereas in this case we just have two samples that happen to be from two consecutive months, but the month variable doesn't really matter.

 

On my earlier question about the mean/standard deviation of a proportion - I think actually it does make sense to talk about the varience of a proportion. It is a statistic about the normal probability distribution of making a new observation of a proportion. That distribution definitely has a mean and stdev. Another way of thinking about it is that if you have 500 successes out of 1000 people, the stdev is smaller because you're more likely to see new observations of successes centred around 500 (e.g. 501 successes would make no difference to the proportion), however if you had 5 successes out of 10 people, the stdev would be much bigger because the new observations wouldn't be as cented around 5 (6 or 7 people out of 10 would be a much bigger difference to the proportion).

 

At least that's the way I understand it. Please somebody correct me if I'm wrong.

 

Here's more on the Z-test of proportions: Z Test: Definition & Two Proportion Z-Test - Statistics How To

Hello,

 

I wanted to check if you could find a way to emplement the test results into charts as colorful arrows or something similar...

Have you figured out a way to do this dynamically rather than hard coding? For example, I have data like this: 

 AAAABBBBTotalTotal
 N%N%N%
A330%770%10100%
B440%660%10100%
C660%440%10100%
D990%110%10100%
E110%990%10100%


I kind of doubt it's possible but in an ideal world I want to know for each row if the difference between AA and total is stat sig and if the difference between BB and total is stat sig.  

Hi @Loubot3000, Could you please confirm the above formula is fully correct, were you able to get the desired result?

Pretty sure yeah.

Thanks for the confirmation!
Anonymous
Not applicable

Hi @Loubot3000 ,

I've changed the formula. I apologize. I may have misunderstood you.

 

How to Get Your Question Answered Quickly 

 

If it does not help, please provide more details with your desired output and pbix file without privacy information (or some sample data) .

 

Best Regards
Community Support Team _ Rongtie

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

Anonymous
Not applicable

Hi @Loubot3000 ,

Please have a try.

 

 

Sales Difference =
VAR CurrentMonthSales = [Sales]
VAR PreviousMonthSales =
    CALCULATE ( [Sales], DATEADD ( 'Calendar'[Date], -1, MONTH ) )
RETURN
    CurrentMonthSales - PreviousMonthSales

 

 

 

Please check the CONFIDENCE.T.

 

I have also found the document, please refer to .

Paired T-test in Power BI using DAX - Ben's Blog (datakuity.com)

 

 

More details: Apply conditional table formatting in Power BI - Power BI | Microsoft Learn

 

How to Get Your Question Answered Quickly 

 

If it does not help, please provide more details with your desired output and pbix file without privacy information (or some sample data) .

 

Best Regards
Community Support Team _ Rongtie

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

Thanks for the reply.

The confidence level is not really what is needed to perform a t-test.

Also that doesn't really address the main concern:
Are you sure Student's T-test is a suitable test for this scenario? The data is categorical - people can either choose answer A or answer B or answer C. I'm graphing the percentage of people answering C each month, and looking to determine whether it is significantly different between two consecutive months.  Since it is categorical, there's no standard deviation or mean or anything, and therefore you can't do a student's t-test.

Please do correct me if I'm wrong about this though, I don't fully understand all of this.

T.TEST() doesn't seem to be an existing function.

Also, are you sure Student's T-test is a suitable test for this scenario? The data is categorical - people can either choose answer A or answer B or answer C. I'm graphing the percentage of people answering C each month, and looking to determine whether it is significantly different each month.  Since it is categorical, there's no standard deviation or mean or anything, and therefore you can't do a student's t-test.

Please do correct me if I'm wrong about this though, I don't fully understand all of this.

Helpful resources

Announcements
September Power BI Update Carousel

Power BI Monthly Update - September 2025

Check out the September 2025 Power BI update to learn about new features.

August 2025 community update carousel

Fabric Community Update - August 2025

Find out what's new and trending in the Fabric community.

Top Kudoed Authors