The ultimate Fabric, Power BI, SQL, and AI community-led learning event. Save €200 with code FABCOMM.
Get registeredEnhance your career with this limited time 50% discount on Fabric and Power BI exams. Ends September 15. Request your voucher.
Hello, I am fairly new to Power BI and need some help determining the best method to calculate outliers using IQR for many unique items in one transaction table ('Unfiltered Forecast') with transaction data. The idea is to be able to forecast inventory needs while excluding any outliers which would skew the data. I would like to create another transaction table while filtering out the outliers so that they are not included in the item forecast demand. For this step of the process, I need to filter out the outliers. There are roughly 900,000 rows for 1,000 unique items in the transaction table, here is an example of what it contains:
GroupSKU | Total Quantity |
Item1 | 55 |
Item1 | 54 |
Item1 | 58 |
Item1 | 500 |
Item1 | 68 |
Item2 | 95 |
Item2 | 98 |
Item2 | 97 |
Item2 | 7000 |
Item2 | 73 |
Item3 | 5 |
Item3 | 4 |
Item3 | 5 |
Item3 | 6 |
Item3 | 75 |
Item3 | 3 |
I would expect the output to look like this:
GroupSKU | Total Quantity | isOutlier |
Item1 | 55 | no |
Item1 | 54 | no |
Item1 | 58 | no |
Item1 | 500 | yes |
Item1 | 68 | no |
Item2 | 95 | no |
Item2 | 98 | no |
Item2 | 97 | no |
Item2 | 7000 | yes |
Item2 | 73 | no |
Item3 | 5 | no |
Item3 | 4 | no |
Item3 | 5 | no |
Item3 | 6 | no |
Item3 | 75 | yes |
Item3 | 3 | no |
Since there are about 1,000 unique items, it does not make sense to me to generate a table for each item. The main issue I have come across is being able to specify that I want the outlier calculations to apply only to each item in the table, rather than perform the calculation on the entire data set. Here is the calculated column I am currently working with:
Solved! Go to Solution.
Please try this variation on your expression as a calculated column in your original table. You can then use that column as a filter in your other measures for analysis.
IsOutlier =
VAR thisSKU = 'Unfiltered Forecast'[GroupSKU]
VAR SKUValues =
FILTER ( 'Unfiltered Forecast', 'Unfiltered Forecast'[GroupSKU] = thisSKU )
VAR LowerQuartile =
PERCENTILEX.INC ( SKUValues, 'Unfiltered Forecast'[Total Quantity], .25 )
VAR UpperQuartile =
PERCENTILEX.INC ( SKUValues, 'Unfiltered Forecast'[Total Quantity], .75 )
VAR InterQuartileRange = UpperQuartile - LowerQuartile
VAR OutlierThresholdLower = LowerQuartile - InterQuartileRange * 1.5
VAR OutlierThresholdUpper = UpperQuartile + InterQuartileRange * 1.5
RETURN
IF (
'Unfiltered Forecast'[Total Quantity] >= OutlierThresholdUpper,
"yes",
"no"
)
If this works for you, please mark it as the solution. Kudos are appreciated too. Please let me know if not.
Regards,
Pat
To learn more about Power BI, follow me on Twitter or subscribe on YouTube.
Please try this variation on your expression as a calculated column in your original table. You can then use that column as a filter in your other measures for analysis.
IsOutlier =
VAR thisSKU = 'Unfiltered Forecast'[GroupSKU]
VAR SKUValues =
FILTER ( 'Unfiltered Forecast', 'Unfiltered Forecast'[GroupSKU] = thisSKU )
VAR LowerQuartile =
PERCENTILEX.INC ( SKUValues, 'Unfiltered Forecast'[Total Quantity], .25 )
VAR UpperQuartile =
PERCENTILEX.INC ( SKUValues, 'Unfiltered Forecast'[Total Quantity], .75 )
VAR InterQuartileRange = UpperQuartile - LowerQuartile
VAR OutlierThresholdLower = LowerQuartile - InterQuartileRange * 1.5
VAR OutlierThresholdUpper = UpperQuartile + InterQuartileRange * 1.5
RETURN
IF (
'Unfiltered Forecast'[Total Quantity] >= OutlierThresholdUpper,
"yes",
"no"
)
If this works for you, please mark it as the solution. Kudos are appreciated too. Please let me know if not.
Regards,
Pat
To learn more about Power BI, follow me on Twitter or subscribe on YouTube.
Pat, thanks so much! This worked perfectly. Great idea to put those into their own variables.
User | Count |
---|---|
68 | |
63 | |
59 | |
54 | |
28 |
User | Count |
---|---|
182 | |
81 | |
64 | |
46 | |
38 |