shinney
Helper I

Best way to filter out outliers when plotting count of a string column

Hello,

I've been trying to figure out a way to filter out outliers when plotting the counts of a string column. For example, my table:

Table: ActivityLogs

Date | Activity
Jan 1, 2021 | Create
Jan 1, 2021 | View
Jan 1, 2021 | View
Jan 1, 2021 | Delete
Jan 1, 2021 | Create
Jan 2, 2021 | Create
Jan 2, 2021 | Create
Jan 3, 2021 | View
Jan 9, 2021 | View
Jan 10, 2021 | View
etc. | etc.

This goes on for hundreds of thousands of rows.

For instance, on a few days View has 10,000+ rows, which is a clear outlier; I would typically expect fewer than 1,000 views per day. I would like to filter out those days so that a line chart of the counts is not heavily skewed.

 

I've tried creating the new column below. The idea was to express each count as a percentage of the overall sum of Activity counts and filter out anything above a cutoff, for example 97%. However, ActivityLogs_Count is a measure I created, and SUM only accepts a column reference, so it rejects the measure. SUM(COUNT(ActivityLogs[Activity])) doesn't work either, since SUM cannot wrap a COUNT expression.

 

Percentage = ActivityLogs[ActivityLogs_Count] / CALCULATE(SUM([ActivityLogs_Count]),ALLSELECTED())*100

 

 

I also tried STDEV.P(COUNT(ActivityLogs[Activity])), but that fails for the same reason: STDEV.P doesn't accept a nested COUNT expression.

 

Does anyone have suggestions for handling outliers like this? Thanks.

1 ACCEPTED SOLUTION
v-eqin-msft
Community Support

Hi @shinney ,

 

You can add the following measure to the visual's filter pane and set its condition to "is less than or equal to 0.97":

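Measure =
// Running count of rows dated on or before the latest date in the current context
VAR _count = CALCULATE ( COUNTROWS ( 'Table' ), FILTER ( ALL ( 'Table' ), 'Table'[Date] <= MAX ( 'Table'[Date] ) ) )
// Total row count, ignoring all filters on the table
VAR _overall = CALCULATE ( COUNTROWS ( 'Table' ), ALL ( 'Table' ) )
RETURN DIVIDE ( _count, _overall )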
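
For the per-day outliers described in the question, one possible alternative is to compute the mean and standard deviation with iterator functions (AVERAGEX, STDEVX.P) over the distinct dates, which avoids nesting COUNT inside STDEV.P. The measure below is only a sketch under assumptions: the measure name and the cutoff of 3 standard deviations are hypothetical, the table and column names follow the question, and the chart axis is assumed to use ActivityLogs[Date] directly rather than a separate date table.

```dax
-- Hypothetical measure; the 3-standard-deviation cutoff is an assumption, adjust as needed.
Activity Count (No Outliers) =
VAR _daily =
    -- One row per selected date with that day's row count.
    -- Filters from a legend or slicer (for example Activity = "View") still apply.
    ADDCOLUMNS (
        ALLSELECTED ( ActivityLogs[Date] ),
        "@cnt", CALCULATE ( COUNTROWS ( ActivityLogs ) )
    )
VAR _mean = AVERAGEX ( _daily, [@cnt] )
VAR _sd = STDEVX.P ( _daily, [@cnt] )
VAR _limit = _mean + 3 * _sd
-- Row count for the date currently being plotted
VAR _current = COUNTROWS ( ActivityLogs )
RETURN
    -- Return BLANK for days above the limit so the line chart skips them
    IF ( _current <= _limit, _current )
```

Plotted against ActivityLogs[Date] with Activity in the legend, this should leave a gap on days whose count exceeds the limit for that activity. A grand total would normally come back blank as well, because the total count is compared with a per-day limit.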


 

 

Best Regards,
Eyelyn Qin
If this post helps, please consider accepting it as the solution to help other members find it more quickly.



amitchandak
Super User

Thanks for the link. The log trick may work, but the main component my table lacks is that count column. I had to create a Measure for Count of Activity but it doesn't work for these calculations or stdev.

Is there any way to make a Count of Activity calculated column instead?
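
A per-row count can be written as a calculated column. The sketch below assumes the ActivityLogs table from the question with its Date and Activity columns, and that Date holds whole dates without a time part; the column name is hypothetical.

```dax
-- Hypothetical calculated column on ActivityLogs.
-- CALCULATE turns the current row into a filter context (context transition);
-- ALLEXCEPT then keeps only the filters on Date and Activity, so each row
-- receives the number of rows sharing its Date and Activity.
Count of Activity =
CALCULATE (
    COUNTROWS ( ActivityLogs ),
    ALLEXCEPT ( ActivityLogs, ActivityLogs[Date], ActivityLogs[Activity] )
)
```

Because the value is stored on every row, it repeats within each Date and Activity group. Aggregating it with SUM would therefore overcount, so MAX or AVERAGE per day is likely the safer choice in a visual.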
