Get certified in Microsoft Fabric—for free! For a limited time, the Microsoft Fabric Community team will be offering free DP-600 exam vouchers. Prepare now
One of our machines is measuring the temperature development of certain articles. Problem is that we sometimes have 'outliers' in our measurements of the temperature. Now we want to filter out these outliers to analyze where they are caused.
The whole file has about 5M rows and contains many articles. The temperature measurements take place every 10 minutes. Question is: How can I create a formula in which I ask Power BI to:
- filter out the rows with an identical ArticleNr
- place the measurements in chronological order (so every ten minutes)
- filter out the measurements that differ 10 degrees opposed to the previouw measure (so +10 degrees or - 10 degrees), these are the outliers we want to filter out and analyze
ArticleNr | MeasurementDate | Temperature |
12896007858555 | 5-2-2018 11:40 | 21 |
12896007858555 | 5-2-2018 11:50 | 21 |
12896007858555 | 5-2-2018 12:00 | 20,8 |
12896007858555 | 5-2-2018 12:10 | 20,8 |
12896007858555 | 5-2-2018 12:20 | 5,8 |
12896007858555 | 5-2-2018 12:30 | 20,8 |
12896007858555 | 5-2-2018 12:40 | 20,8 |
12896007858555 | 5-2-2018 12:50 | 21 |
12896007858555 | 5-2-2018 13:00 | 20,9 |
12896007858555 | 5-2-2018 13:10 | 20,9 |
12896007864199 | 26-2-2018 00:40 | 20,1 |
12896007864199 | 26-2-2018 00:50 | 20,3 |
12896007864199 | 26-2-2018 01:00 | 20,1 |
12896007864199 | 26-2-2018 01:10 | 20,3 |
12896007864199 | 26-2-2018 01:20 | 54,8 |
12896007864199 | 26-2-2018 01:30 | 20,1 |
12896007864199 | 26-2-2018 01:40 | 19,9 |
12896007864199 | 26-2-2018 01:50 | 20,1 |
12896007864199 | 26-2-2018 02:00 | 19,9 |
12896007864199 | 26-2-2018 02:10 | 19,9 |
Above example shows two different article numbers. In both measurements is an outlier in Bold.
Can you please advise if it's possible to filter out the outliers in Power BI?
Solved! Go to Solution.
Hi ,
Based on my test, you can modify the column [pre10minvalue] I have created in the previous pbix.
pre10minvalue =
var previousweek = Sheet1[pre10min]
var saleonpre = CALCULATE(SUM(Sheet1[Temperature]),FILTER(Sheet1,Sheet1[MeasurementDate] = previousweek && Sheet1[ArticleNr]=EARLIER(Sheet1[ArticleNr])))
return
IF(ISBLANK(saleonpre),0,saleonpre)
Now you can see the correct result.
You can also download the PBIX file to have a view.
https://www.dropbox.com/s/y1l6phnzvo7zw0o/Filtering%20table%20rows%20on%20outliers.pbix?dl=0
Regards,
Daniel He
Hi @Mark88,
Based on my test, you can follow below steps:
1.I have entered some sample data to test for your problem.
2.Create three calculated columns to calculate your [ProblemValue].
pre10min = [MeasurementDate]-10/1440
pre10minvalue =
var previousweek = Sheet1[pre10min]
var saleonpre = CALCULATE(SUM(Sheet1[Temperature]),ALL(Sheet1),Sheet1[MeasurementDate] = previousweek)
return
IF(ISBLANK(saleonpre),0,saleonpre)
ProblemValue = IF(([Temperature]-[pre10minvalue])=[Temperature],BLANK(),IF(([Temperature]-[pre10minvalue])>10,[Temperature],IF(([Temperature]-[pre10minvalue])<-10,[Temperature])))
3.Create a Slicer visual and add the [ArticleNr] field to filter out the rows with an identical ArticleNr.
4.Create a Table visual and add the [ArticleNr], [MeasurementData], [Temperature] and the [ProblemValue].
5.You can use the sort function in the table visual to order the [MeasurementData].
You can also download the PBIX file to have a view.
https://www.dropbox.com/s/y1l6phnzvo7zw0o/Filtering%20table%20rows%20on%20outliers.pbix?dl=0
Regards,
Daniel He
Hi @v-danhe-msft,
Thanks for your reply! When testing this solution on our 5M row file I faced a problem. Our machine is measuring multiple articles on the same time, so we have duplicate MeasurementsDates in the file. The formula you've created in the 'pre10minvalue' column works perfect when measured in unique time intervals (unique MeasurementDates). However, our machine is sometimes measuring 5 ArticleNr in the same time interval and your formula sums up the temperature per time interval. See the example below:
ArticleNr | MeasurementDate | Temperature | pre10min | pre10minvalue |
12896007858555 | 5-2-2018 11:40 | 21 | 5-2-2018 11:30 | 0 |
12896007858555 | 5-2-2018 11:50 | 21 | 5-2-2018 11:40 | 21 |
12896007858555 | 5-2-2018 12:00 | 20,8 | 5-2-2018 11:50 | 21 |
12896007858555 | 5-2-2018 12:10 | 20,8 | 5-2-2018 12:00 | 20,8 |
12896007858555 | 5-2-2018 12:20 | 5,8 | 5-2-2018 12:10 | 20,8 |
12896007858555 | 5-2-2018 12:30 | 20,8 | 5-2-2018 12:20 | 5,8 |
12896007858555 | 5-2-2018 12:40 | 20,8 | 5-2-2018 12:30 | 20,8 |
12896007858555 | 5-2-2018 12:50 | 21 | 5-2-2018 12:40 | 20,8 |
12896007858555 | 5-2-2018 13:00 | 20,9 | 5-2-2018 12:50 | 21 |
12896007858555 | 5-2-2018 13:10 | 20,9 | 5-2-2018 13:00 | 20,9 |
12896007864199 | 26-2-2018 00:40 | 20,1 | 26-2-2018 00:30 | 0 |
12896007864199 | 26-2-2018 00:50 | 20,3 | 26-2-2018 00:40 | 20,1 |
12896007864199 | 26-2-2018 01:00 | 20,1 | 26-2-2018 00:50 | 20,3 |
12896007864199 | 26-2-2018 01:10 | 20,3 | 26-2-2018 01:00 | 20,1 |
12896007864199 | 26-2-2018 01:20 | 54,8 | 26-2-2018 01:10 | 20,3 |
12896007864199 | 26-2-2018 01:30 | 20,1 | 26-2-2018 01:20 | 54,8 |
12896007864199 | 26-2-2018 01:40 | 19,9 | 26-2-2018 01:30 | 20,1 |
12896007864199 | 26-2-2018 01:50 | 20,1 | 26-2-2018 01:40 | 43,7 |
12896007864199 | 26-2-2018 02:00 | 19,9 | 26-2-2018 01:50 | 20,1 |
12896007864199 | 26-2-2018 02:10 | 19,9 | 26-2-2018 02:00 | 19,9 |
12896007864188 | 26-2-2018 01:36 | 23,6 | 26-2-2018 01:26 | 0 |
12896007864188 | 26-2-2018 01:40 | 23,8 | 26-2-2018 01:30 | 20,1 |
The rows where the problem starts are highlighted with an underscore. You see that it sums up the temperature for that unique time interval. What I am looking for is the temperature for a unique ArticleNR+MeasurementDate combination.
Can you please advise how I can fix this?
Hi ,
Based on my test, you can modify the column [pre10minvalue] I have created in the previous pbix.
pre10minvalue =
var previousweek = Sheet1[pre10min]
var saleonpre = CALCULATE(SUM(Sheet1[Temperature]),FILTER(Sheet1,Sheet1[MeasurementDate] = previousweek && Sheet1[ArticleNr]=EARLIER(Sheet1[ArticleNr])))
return
IF(ISBLANK(saleonpre),0,saleonpre)
Now you can see the correct result.
You can also download the PBIX file to have a view.
https://www.dropbox.com/s/y1l6phnzvo7zw0o/Filtering%20table%20rows%20on%20outliers.pbix?dl=0
Regards,
Daniel He
Check out the October 2024 Power BI update to learn about new features.
Learn from experts, get hands-on experience, and win awesome prizes.
User | Count |
---|---|
115 | |
112 | |
105 | |
95 | |
58 |
User | Count |
---|---|
174 | |
147 | |
136 | |
102 | |
82 |