Don't miss your chance to take the Fabric Data Engineer (DP-600) exam for FREE! Find out how by attending the DP-600 session on April 23rd (pacific time), live or on-demand.
Learn moreNext up in the FabCon + SQLCon recap series: The roadmap for Microsoft SQL and Maximizing Developer experiences in Fabric. All sessions are available on-demand after the live show. Register now
Hello, I am fairly new to Power BI and need some help determining the best method to calculate outliers using IQR for many unique items in one transaction table ('Unfiltered Forecast') with transaction data. The idea is to be able to forecast inventory needs while excluding any outliers which would skew the data. I would like to create another transaction table while filtering out the outliers so that they are not included in the item forecast demand. For this step of the process, I need to filter out the outliers. There are roughly 900,000 rows for 1,000 unique items in the transaction table, here is an example of what it contains:
| GroupSKU | Total Quantity |
| Item1 | 55 |
| Item1 | 54 |
| Item1 | 58 |
| Item1 | 500 |
| Item1 | 68 |
| Item2 | 95 |
| Item2 | 98 |
| Item2 | 97 |
| Item2 | 7000 |
| Item2 | 73 |
| Item3 | 5 |
| Item3 | 4 |
| Item3 | 5 |
| Item3 | 6 |
| Item3 | 75 |
| Item3 | 3 |
I would expect the output to look like this:
| GroupSKU | Total Quantity | isOutlier |
| Item1 | 55 | no |
| Item1 | 54 | no |
| Item1 | 58 | no |
| Item1 | 500 | yes |
| Item1 | 68 | no |
| Item2 | 95 | no |
| Item2 | 98 | no |
| Item2 | 97 | no |
| Item2 | 7000 | yes |
| Item2 | 73 | no |
| Item3 | 5 | no |
| Item3 | 4 | no |
| Item3 | 5 | no |
| Item3 | 6 | no |
| Item3 | 75 | yes |
| Item3 | 3 | no |
Since there are about 1,000 unique items, it does not make sense to me to generate a table for each item. The main issue I have come across is being able to specify that I want the outlier calculations to apply only to each item in the table, rather than perform the calculation on the entire data set. Here is the calculated column I am currently working with:
Solved! Go to Solution.
Please try this variation on your expression as a calculated column in your original table. You can then use that column as a filter in your other measures for analysis.
IsOutlier =
VAR thisSKU = 'Unfiltered Forecast'[GroupSKU]
VAR SKUValues =
FILTER ( 'Unfiltered Forecast', 'Unfiltered Forecast'[GroupSKU] = thisSKU )
VAR LowerQuartile =
PERCENTILEX.INC ( SKUValues, 'Unfiltered Forecast'[Total Quantity], .25 )
VAR UpperQuartile =
PERCENTILEX.INC ( SKUValues, 'Unfiltered Forecast'[Total Quantity], .75 )
VAR InterQuartileRange = UpperQuartile - LowerQuartile
VAR OutlierThresholdLower = LowerQuartile - InterQuartileRange * 1.5
VAR OutlierThresholdUpper = UpperQuartile + InterQuartileRange * 1.5
RETURN
IF (
'Unfiltered Forecast'[Total Quantity] >= OutlierThresholdUpper,
"yes",
"no"
)
If this works for you, please mark it as the solution. Kudos are appreciated too. Please let me know if not.
Regards,
Pat
To learn more about Power BI, follow me on Twitter or subscribe on YouTube.
Please try this variation on your expression as a calculated column in your original table. You can then use that column as a filter in your other measures for analysis.
IsOutlier =
VAR thisSKU = 'Unfiltered Forecast'[GroupSKU]
VAR SKUValues =
FILTER ( 'Unfiltered Forecast', 'Unfiltered Forecast'[GroupSKU] = thisSKU )
VAR LowerQuartile =
PERCENTILEX.INC ( SKUValues, 'Unfiltered Forecast'[Total Quantity], .25 )
VAR UpperQuartile =
PERCENTILEX.INC ( SKUValues, 'Unfiltered Forecast'[Total Quantity], .75 )
VAR InterQuartileRange = UpperQuartile - LowerQuartile
VAR OutlierThresholdLower = LowerQuartile - InterQuartileRange * 1.5
VAR OutlierThresholdUpper = UpperQuartile + InterQuartileRange * 1.5
RETURN
IF (
'Unfiltered Forecast'[Total Quantity] >= OutlierThresholdUpper,
"yes",
"no"
)
If this works for you, please mark it as the solution. Kudos are appreciated too. Please let me know if not.
Regards,
Pat
To learn more about Power BI, follow me on Twitter or subscribe on YouTube.
Pat, thanks so much! This worked perfectly. Great idea to put those into their own variables.
If you have recently started exploring Fabric, we'd love to hear how it's going. Your feedback can help with product improvements.
A new Power BI DataViz World Championship is coming this June! Don't miss out on submitting your entry.
Experience the highlights from FabCon & SQLCon, available live and on-demand starting April 14th.
| User | Count |
|---|---|
| 47 | |
| 44 | |
| 40 | |
| 20 | |
| 15 |
| User | Count |
|---|---|
| 70 | |
| 67 | |
| 32 | |
| 27 | |
| 25 |