Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Get certified in Microsoft Fabric—for free! For a limited time, get a free DP-600 exam voucher to use by the end of 2024. Register now

Reply
SteveBe
New Member

Standard Deviation of a group of unique workers

Newbie Alert!

I have something that I can do easily in SQL, but has me stumped in PowerBI/DAX.

I'm helping our space planners understand how a site is being used. I have a table of badge swipes (when, which building, who).

I need to calc the std dev of the unique daily workers who come to the site.

Here's the SQL:

SELECT
	StdDevUniqueDailyWorkers = FLOOR(STDEV(slice.CDWhoSwiped))
FROM (
	SELECT bs.WhenSwiped, CDWhoSwiped = COUNT(DISTINCT bs.WhoSwiped)
	FROM BadgeSwipes AS bs
	INNER JOIN DayTypesByDate AS dtd ON bs.WhenSwiped = dtd.Day_Of_Year
	WHERE DATEPART(YEAR, bs.WhenSwiped) = 2018
		  AND dtd.DayType = 'Weekday'
	GROUP BY bs.WhenSwiped
	--ORDER BY bs.WhenSwiped
) AS slice

 

The inner select creates the date-uniqueCount list, and the outer select computes the std dev over that column.

I know the WHERE parts will all be taken care of by the filters in play, but I just don't see how to get from the data I have to a column of daily unique counts to feed into the STDEV function.

Please turn on the light bulb!

Thanks,

S~

1 ACCEPTED SOLUTION

Putting the whole solution together:

 

Can you try a new measure of:

StdDevUniqueDailyWorkers  = 
STDEVX.P (
    CALCULATETABLE (
        SUMMARIZE (
            'BadgeSwipes',
            'BadgeSwipes'[WhenSwiped],
            "CDWhoSwiped", DISTINCTCOUNT( 'BadgeSwipes'[WhoSwiped] )
        ),
        YEAR ( 'BadgeSwipes'[WhenSwiped] ) = 2018,
        WEEKDAY ( 'BadgeSwipes'[WhenSwiped], 2 ) < 6
    ),
    [CDWhoSwiped]
)

 


Breaking this out. The first part is the SUMMARIZE function.

        SUMMARIZE (
            'BadgeSwipes',
            'BadgeSwipes'[WhenSwiped],
            "CDWhoSwiped", DISTINCTCOUNT( 'BadgeSwipes'[WhoSwiped] )
        )

It looks at the table BadgeSwipes, and returns a uniqe list of dates WhenSwiped. Then it adds a column, I've called "CDWhoSwiped", and perfoms a count of distinct values from WhoSwiped.

This the step in your query:

	SELECT bs.WhenSwiped, CDWhoSwiped = COUNT(DISTINCT bs.WhoSwiped)
	FROM BadgeSwipes AS bs

 Note: The result of this is a table, If you added just this expression to a New Table in Power BI, it would result in  a table with two columns.

 

 

 

Next, it applies the filters to that table, by wrapping this new table in CALCULATETABLE:

    CALCULATETABLE (
  .....
        YEAR ( 'BadgeSwipes'[WhenSwiped] ) = 2018,
        WEEKDAY ( 'BadgeSwipes'[WhenSwiped], 2 ) < 6

This applies your WHERE condition:

	WHERE DATEPART(YEAR, bs.WhenSwiped) = 2018
		  AND dtd.DayType = 'Weekday'

I've assumed dtd.DayType = 'Weekday' means Monday to Friday, which DAX can abstract from the date so no need of getting the other table involved. This still outputs a table - it just simply applies some filteres to it.

 

 

 

Lastly, we have the actual calculation:

STDEVX.P (
    ...
        ...
  ...
    [CDWhoSwiped]
)

This says to take the standard deviation of our newly created column. It takes two arguments, the first is the table, which is the new one we created. The second is the column, which is the second column in our new table, one the I called "CDWhoSwiped".

 

 

 

You could also wrap a FLOOR or ROUNDOWN around this. Note that Microsoft needs a second argument to both these arguments.
Also, note that this is STDEVX.P, meaning that we assume this is the entire population, wheras STDEVX.S would be that of a sample. The calculation differes from having a denominator of N or N-1.

 

 

Asking for page filters instead of hard coded:

 

By setting up your model correctly, you can add in the table DayTypesByDate table and join it in  power bi desktop (in the "relationships" tab). If Day_Of_Year is not a unique column, you can set the relationship direction to "both", so that this table can filter BadgeSwipes table.

 

Then you take out the (make sure you get the preceeding comma):

, 
YEAR ( 'BadgeSwipes'[WhenSwiped] ) = 2018, WEEKDAY ( 'BadgeSwipes'[WhenSwiped], 2 ) < 6

 Your filter and your page level filters should recalculate correctly!

 

 

(Or you can take out all the green CALCULATETABLE condition if you didn't want to pass any filters at all)



Did I answer your question? Mark my post as a solution! Proud to be a Super User!


Connect with me!
Stay up to date on  
Read my blogs on  



View solution in original post

3 REPLIES 3
SteveCampbell
Memorable Member
Memorable Member

Hello,

 

Can you try a new measure of:

StdDevUniqueDailyWorkers  = 
STDEVX.P (
    CALCULATETABLE (
        SUMMARIZE (
            'BadgeSwipes',
            'BadgeSwipes'[WhenSwiped],
            "CDWhoSwiped", DISTINCTCOUNT( 'BadgeSwipes'[WhoSwiped] )
        ),
        YEAR ( 'BadgeSwipes'[WhenSwiped] ) = 2018,
        WEEKDAY ( 'BadgeSwipes'[WhenSwiped], 2 ) < 6
    ),
    [CDWhoSwiped]
)

 


Breaking this out. The first part is the SUMMARIZE function.

        SUMMARIZE (
            'BadgeSwipes',
            'BadgeSwipes'[WhenSwiped],
            "CDWhoSwiped", DISTINCTCOUNT( 'BadgeSwipes'[WhoSwiped] )
        )

It looks at the table BadgeSwipes, and returns a uniqe list of dates WhenSwiped. Then it adds a column, I've called "CDWhoSwiped", and perfoms a count of distinct values from WhoSwiped.

This the step in your query:

	SELECT bs.WhenSwiped, CDWhoSwiped = COUNT(DISTINCT bs.WhoSwiped)
	FROM BadgeSwipes AS bs

 Note: The result of this is a table, If you added just this expression to a New Table in Power BI, it would result in  a table with two columns.

 

 

 

Next, it applies the filters to that table, by wrapping this new table in CALCULATETABLE:

    CALCULATETABLE (
  .....
        YEAR ( 'BadgeSwipes'[WhenSwiped] ) = 2018,
        WEEKDAY ( 'BadgeSwipes'[WhenSwiped], 2 ) < 6

This applies your WHERE condition:

	WHERE DATEPART(YEAR, bs.WhenSwiped) = 2018
		  AND dtd.DayType = 'Weekday'

I've assumed dtd.DayType = 'Weekday' means Monday to Friday, which DAX can abstract from the date so no need of getting the other table involved. This still outputs a table - it just simply applies some filteres to it.

 

 

 

Lastly, we have the actual calculation:

STDEVX.P (
    ...
        ...
  ...
    [CDWhoSwiped]
)

This says to take the standard deviation of our newly created column. It takes two arguments, the first is the table, which is the new one we created. The second is the column, which is the second column in our new table, one the I called "CDWhoSwiped".

 

 

 

You could also wrap a FLOOR or ROUNDOWN around this. Note that Microsoft needs a second argument to both these arguments.
Also, note that this is STDEVX.P, meaning that we assume this is the entire population, wheras STDEVX.S would be that of a sample. The calculation differes from having a denominator of N or N-1.




Did I answer your question? Mark my post as a solution! Proud to be a Super User!


Connect with me!
Stay up to date on  
Read my blogs on  



Hi Steve,

Thanks!

This works and I learned something in the process!

FYI, the DayType of "WeekDay" is because I exclude Inclement Weather days when a site is shut down and company holidays.

I would mark this as a solution, BUT (why is there always a "BUT" ;)...

The year and the weekday values are set as page level filters.

Is there a way the CALCULATETABLE could respect those, instead of having them spelled out for this measure?

Again, thanks for the quick resonse!

S~

Putting the whole solution together:

 

Can you try a new measure of:

StdDevUniqueDailyWorkers  = 
STDEVX.P (
    CALCULATETABLE (
        SUMMARIZE (
            'BadgeSwipes',
            'BadgeSwipes'[WhenSwiped],
            "CDWhoSwiped", DISTINCTCOUNT( 'BadgeSwipes'[WhoSwiped] )
        ),
        YEAR ( 'BadgeSwipes'[WhenSwiped] ) = 2018,
        WEEKDAY ( 'BadgeSwipes'[WhenSwiped], 2 ) < 6
    ),
    [CDWhoSwiped]
)

 


Breaking this out. The first part is the SUMMARIZE function.

        SUMMARIZE (
            'BadgeSwipes',
            'BadgeSwipes'[WhenSwiped],
            "CDWhoSwiped", DISTINCTCOUNT( 'BadgeSwipes'[WhoSwiped] )
        )

It looks at the table BadgeSwipes, and returns a uniqe list of dates WhenSwiped. Then it adds a column, I've called "CDWhoSwiped", and perfoms a count of distinct values from WhoSwiped.

This the step in your query:

	SELECT bs.WhenSwiped, CDWhoSwiped = COUNT(DISTINCT bs.WhoSwiped)
	FROM BadgeSwipes AS bs

 Note: The result of this is a table, If you added just this expression to a New Table in Power BI, it would result in  a table with two columns.

 

 

 

Next, it applies the filters to that table, by wrapping this new table in CALCULATETABLE:

    CALCULATETABLE (
  .....
        YEAR ( 'BadgeSwipes'[WhenSwiped] ) = 2018,
        WEEKDAY ( 'BadgeSwipes'[WhenSwiped], 2 ) < 6

This applies your WHERE condition:

	WHERE DATEPART(YEAR, bs.WhenSwiped) = 2018
		  AND dtd.DayType = 'Weekday'

I've assumed dtd.DayType = 'Weekday' means Monday to Friday, which DAX can abstract from the date so no need of getting the other table involved. This still outputs a table - it just simply applies some filteres to it.

 

 

 

Lastly, we have the actual calculation:

STDEVX.P (
    ...
        ...
  ...
    [CDWhoSwiped]
)

This says to take the standard deviation of our newly created column. It takes two arguments, the first is the table, which is the new one we created. The second is the column, which is the second column in our new table, one the I called "CDWhoSwiped".

 

 

 

You could also wrap a FLOOR or ROUNDOWN around this. Note that Microsoft needs a second argument to both these arguments.
Also, note that this is STDEVX.P, meaning that we assume this is the entire population, wheras STDEVX.S would be that of a sample. The calculation differes from having a denominator of N or N-1.

 

 

Asking for page filters instead of hard coded:

 

By setting up your model correctly, you can add in the table DayTypesByDate table and join it in  power bi desktop (in the "relationships" tab). If Day_Of_Year is not a unique column, you can set the relationship direction to "both", so that this table can filter BadgeSwipes table.

 

Then you take out the (make sure you get the preceeding comma):

, 
YEAR ( 'BadgeSwipes'[WhenSwiped] ) = 2018, WEEKDAY ( 'BadgeSwipes'[WhenSwiped], 2 ) < 6

 Your filter and your page level filters should recalculate correctly!

 

 

(Or you can take out all the green CALCULATETABLE condition if you didn't want to pass any filters at all)



Did I answer your question? Mark my post as a solution! Proud to be a Super User!


Connect with me!
Stay up to date on  
Read my blogs on  



Helpful resources

Announcements
November Carousel

Fabric Community Update - November 2024

Find out what's new and trending in the Fabric Community.

Live Sessions with Fabric DB

Be one of the first to start using Fabric Databases

Starting December 3, join live sessions with database experts and the Fabric product team to learn just how easy it is to get started.

Las Vegas 2025

Join us at the Microsoft Fabric Community Conference

March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Early Bird pricing ends December 9th.

Nov PBI Update Carousel

Power BI Monthly Update - November 2024

Check out the November 2024 Power BI update to learn about new features.