March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Early bird discount ends December 31.
Register NowBe one of the first to start using Fabric Databases. View on-demand sessions with database experts and the Microsoft product team to learn just how easy it is to get started. Watch now
This Quick Measure is an update to my Mean Time Between Failure (MTBF) measure that I developed in this article:
The original method had some significant column-based elements and my goal was to make this measure truly dynamic. A significant issue with the original approach is that when slicing by "Cause", the MTBF figure calculated was not truly dynamic by Cause of failure. In other words, the Uptime calculation did not account for only looking at particular causes, only between failures of the same machine. So, uptime calculated by considering all causes is vastly different than when only looking at a particular cause.
This measure corrects that flaw, although I have also included a version of the measure that does not need a calculated column but calculates uptime in the exact same way as in the aforementioned article. The new measures are down in the bottom, right-hand corner. You will notice that the new fully dynamic measure is much larger when selecting a Cause. This is because the MTBF calculation is only considering the MTBF for that particular cause.
MTBF (Hours) Measure = VAR __table = 'Repairs' VAR __table1 = ADDCOLUMNS(__table,"__next", MINX( FILTER(__table, [MachineName]=EARLIER([MachineName]) && [RepairStarted]>EARLIER([RepairStarted]) && [RepairType]<>"PM" ), [RepairStarted] ) ) VAR __table2 = ADDCOLUMNS(__table1,"__uptime", IF([RepairType]="PM", 0, IF(ISBLANK([__next]), DATEDIFF([RepairCompleted],NOW(),SECOND), DATEDIFF([RepairCompleted],[__next],SECOND) ) ) ) VAR __repairs = CALCULATE(COUNTROWS('Repairs'),FILTER(ALLSELECTED('Repairs'),[RepairType]<>"PM")) RETURN DIVIDE(SUMX(__table2,[__uptime]),__repairs,BLANK())/3600
eyJrIjoiZmFjYTQwYjMtMGJlOS00Mjk4LTlkM2QtNmQ2NTc4MmU4YjAwIiwidCI6IjRhMDQyNzQzLTM3M2EtNDNkMi04MjdiLTAwM2Y0YzdiYTFlNSIsImMiOjN9
This is really handy and would like to implement it in my model, however my data model is structured abit differently. I have a fact table (Notificatons) with notification data (notification, location id, breakdown status) and this is connected to a dimension table via "notificaiton" field showing the start and end times of the notifcation (lets call it MTBF table).
The fact table is connected to another dim table (location data) via location id and this contains the machine field. I am not sure how to determine the next_repair with this current model, any help will be much appreciated
Hi @Greg_Deckler,
Thanks for the awesome post.
However, with the previous calculated column for uptime, MTBF measure can be filter by period. Let's say for example, I don't want machine7 MTBF for the whole period, I just want machine7 MTBF for a particular peiord only. How can I tweak your codes to suit my needs?
Also, is it possible to get separate measure for just Uptime, which is the same as Uptime calculated column?
Should be possible, would need more information. Definitely been through a lot of edits with this one, the best version is in my book which fixed a bunch of bugs and other little things that I found annoying. I can reference that version but I believe some of the fixes were related to being able to properly slice the metric by things like date ranges, etc.
Thanks for the reply. Please find the sample attached.
You are right,
I think my issue could be solved if I can create a separate uptime measure (the one I created is not right?).
Another separate issue in my data is eliminating multiple alarms. My data source is the daily csv logs which contain thousands of events. Some sites' event duration last more than a day (>24hr) but they are captured as separate entry in the next-day csv logs. I kind of manage to eliminate those events (via Critical Alarm v2) but I am not sure that is fail safe outside of this sample data.
There are also dummy events (downtime 0hr) which I'd like to eliminate as well. Eliminating them would reduce the total critical alarm, which is vital in calculating correct mtbf and availability.
Im struggling with some points:
I really need this and your post is the only one I found that gets closer to my issue.
Thanks,