Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Join the Fabric FabCon Global Hackathon—running virtually through Nov 3. Open to all skill levels. $10,000 in prizes! Register now.

Reply
jmfillman
Helper I
Helper I

Help With Complex Standard Deviation

I have a table of data, like the below sample data and trying to write a measure from the parent table for Item ID.

 

I need to calculate Standare Deviation based on the individual Plan ID, where State <> Cancelled that is then aggregated together to get a Standard Deviation for the all the Plans ID's for each Item ID.

 

Taking the first Item ID (12345), I need the STDV of Plan A for Feb, Mar, and Apr (Jan excluded because it is Cancelled), then for Plan B and Plan C, but the value I need to return is, what I see as a weighted average of the STDV for Plan A, Plan B, and Plan C or whatever gives me the overall STDV weighted by the variability of months of each plan, and then again the same approach for each subsequent Item ID.

 

 

Item IDPlan IDDateHoursState
12345AJan5Cancelled
12345AFeb1Complete
12345AMar4Active
12345AApr5Active
12345BJan10Complete
12345BFeb8Active
12345BMar9Cancelled
12345BApr5Active
12345CFeb5Complete
12345CMar5Active
54321AAJan5Complete
54321AAFeb1Complete
54321AAMar4Active
54321AAApr5Active
54321BBJan10Complete
54321BBFeb8Complete
54321BBMar9Cancelled
54321BBApr5Active
54321CCFeb5Complete
54321CCMar5Active
5 REPLIES 5
talespin
Solution Sage
Solution Sage

hi @jmfillman 

 

Please see if this is what you require.

The logic for 

Measure1 - SD formula at Product level.

Measure2 - SD formula at Item level(All Products).

Measure3 - Please ignore, Removed measure 3, The formula I used for Weighted SD is not correct. Will add once I understand the formula.

 

 

Measure1

------------------

StdDev Plan = CALCULATE( STDEV.P(SD[Hours]), REMOVEFILTERS(), SUMMARIZE(SD, SD[Item ID], SD[Plan ID]), SD[State] <> "Cancelled" )
 
Measure2
------------------
StdDev Item = CALCULATE( STDEV.P(SD[Hours]), REMOVEFILTERS(), VALUES(SD[Item ID]), SD[State] <> "Cancelled" )
 
 
talespin_0-1710681369236.png

 

Hi @tailspin,

 

Measure 2 doesn't reference Measure 1, so seems you are proposing only Measure 2?

 

I've done some general reading on the Web on topics related, and seems a formula that does the following will work:
Square Rool of the squared sums of the STDV of each resource plan.

 

Square Root (STDV.P(Plan A^2)+STDV.P(Plan B^2)+etc...)

hi @jmfillman 

 

"Square Root of the squared sums of the STDV of each resource plan."

Used below measure to implement above statement.

 

Measure Used

---------------------------------------------------------------------------------------------------

Weighted SD =
VAR _SelectItem = SELECTEDVALUE(SD[Item ID])
VAR _SummTbl =
ADDCOLUMNS(
    SUMMARIZE(ALL(SD), SD[Item ID], SD[Plan ID]),
    "@Count", CALCULATE( COUNT(SD[Plan ID]), ALLEXCEPT(SD, SD[Item ID], SD[Plan ID]), SD[State] <> "Cancelled"),
    "@STDDEV", CALCULATE( STDEV.P(SD[Hours]), ALLEXCEPT(SD, SD[Item ID], SD[Plan ID]), SD[State] <> "Cancelled")
)
--VAR _CountPlanByItem = SUMX( FILTER(_SummTbl, [Item ID] = _SelectItem), [@Count] )
VAR _SD = SUMX( FILTER(_SummTbl, [Item ID] = _SelectItem), [@STDDEV]*[@STDDEV] )

RETURN SQRT(_SD)
---------------------------------------------------------------------------------------------------

 

Based on what I have read on web

When calculating SD, you use mean in the SD formula.

When calculating Weighted SD, you need to use weighted mean instead of simple mean. I am unable to fully understand the formula which I read on web, that is where I am stuck.

 

Measure 2 doesn't reference Measure 1, so seems you are proposing only Measure 2?

Measure1 was just for testing. Measure2 gives you SD at item level, but it is not weighted SD.

 

talespin_0-1710731848268.png

 

lbendlin
Super User
Super User

There is not enough data to make this statistically significant.  How many rows are in your actual data?

This is just sample data. Each plan can have 1 to n rows and each Item ID can have 1 to n plans. Actual data is currently about 11,000 rows, but varies month-to-month. Ultimately, I need a value to indicate whether month-to-month hours are nearly the same, by plan, or have significant variability, where higher variability between months of the same Plan ID and for the Item ID is a good thing, where little or no variability is not good.

Helpful resources

Announcements
September Power BI Update Carousel

Power BI Monthly Update - September 2025

Check out the September 2025 Power BI update to learn about new features.

FabCon Atlanta 2026 carousel

FabCon Atlanta 2026

Join us at FabCon Atlanta, March 16-20, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.