If you’ve ever sat in a maintenance review meeting and watched two people argue about whether MTBF dropped because reliability got worse or because the fleet got bigger, you’ve seen the problem. These metrics are simple to define and easy to get wrong.
MTBF — Mean Time Between Failures
MTBF measures the average operating time between unplanned failures of a repairable asset. The formula is straightforward:
MTBF = total operating hours ÷ number of failures
If a pump ran 9,000 hours over two years and failed three times, its MTBF is 3,000 hours. (A single calendar year holds only 8,760 hours, so 9,000 hours of runtime cannot fit inside one.) The trap is in the inputs. Operating hours must mean actual runtime, not calendar time. Failures must mean unplanned breakdowns, not every work order.
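To make the input traps concrete, here is a minimal Python sketch. The `WorkOrder` type, its `kind` categories, and the sample records are hypothetical, not taken from any particular CMMS. The figures are illustrative: 9,000 runtime hours logged inside a 17,520-hour (two-year) calendar window, with three breakdowns among five work orders.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class WorkOrder:
    asset_id: str
    kind: str  # hypothetical categories: "breakdown", "preventive", "inspection"


def mtbf_hours(operating_hours: float,
               work_orders: Sequence[WorkOrder]) -> Optional[float]:
    """MTBF = operating hours / unplanned failures.

    Returns None when there were no failures, because the ratio is undefined.
    """
    failures = sum(1 for wo in work_orders if wo.kind == "breakdown")
    if failures == 0:
        return None
    return operating_hours / failures


orders = [
    WorkOrder("PUMP-01", "breakdown"),
    WorkOrder("PUMP-01", "preventive"),
    WorkOrder("PUMP-01", "breakdown"),
    WorkOrder("PUMP-01", "inspection"),
    WorkOrder("PUMP-01", "breakdown"),
]

runtime_mtbf = mtbf_hours(9000, orders)      # runtime hours, breakdowns only
calendar_mtbf = mtbf_hours(17520, orders)    # wrong numerator: calendar hours
all_orders_mtbf = 9000 / len(orders)         # wrong denominator: every work order

print(runtime_mtbf, calendar_mtbf, all_orders_mtbf)  # 3000.0 5840.0 1800.0
```

The same asset reports 3,000, 5,840, or 1,800 hours depending only on how the inputs are counted, which is why the definitions need to be agreed before anyone compares numbers.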
MTTR — Mean Time To Repair
MTTR measures how long it takes to restore an asset to service after a failure.
MTTR = total repair time ÷ number of repairs
Define “repair time” carefully. A common default is to measure from the moment the work order is opened to the moment it’s closed. That bundles together waiting for the technician, waiting for parts, the actual hands-on repair, and post-repair testing. Each of those is a different lever you can pull, and a CMMS that timestamps each phase tells you which one to focus on.
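As a rough sketch of what phase-level timestamps make possible, the Python below splits the open-to-close interval into the four phases named above. The field names (`opened`, `tech_assigned`, `parts_available`, `repair_done`, `closed`) and the sample work orders are assumptions for illustration; a real CMMS will use its own schema.

```python
from datetime import datetime

# Hypothetical phase boundaries: (phase name, start field, end field).
PHASES = [
    ("wait_for_technician", "opened", "tech_assigned"),
    ("wait_for_parts", "tech_assigned", "parts_available"),
    ("hands_on_repair", "parts_available", "repair_done"),
    ("post_repair_testing", "repair_done", "closed"),
]


def hours_between(start: datetime, end: datetime) -> float:
    return (end - start).total_seconds() / 3600


def mttr_by_phase(work_orders):
    """Mean hours per phase, plus the open-to-close total, across repairs."""
    n = len(work_orders)
    if n == 0:
        return {}
    result = {
        name: sum(hours_between(wo[start], wo[end]) for wo in work_orders) / n
        for name, start, end in PHASES
    }
    result["total"] = sum(
        hours_between(wo["opened"], wo["closed"]) for wo in work_orders
    ) / n
    return result


repairs = [
    {
        "opened": datetime(2024, 3, 1, 8, 0),
        "tech_assigned": datetime(2024, 3, 1, 10, 0),
        "parts_available": datetime(2024, 3, 1, 16, 0),
        "repair_done": datetime(2024, 3, 1, 19, 0),
        "closed": datetime(2024, 3, 1, 20, 0),
    },
    {
        "opened": datetime(2024, 4, 10, 9, 0),
        "tech_assigned": datetime(2024, 4, 10, 10, 0),
        "parts_available": datetime(2024, 4, 11, 10, 0),
        "repair_done": datetime(2024, 4, 11, 12, 0),
        "closed": datetime(2024, 4, 11, 13, 0),
    },
]

breakdown = mttr_by_phase(repairs)
for phase, hours in breakdown.items():
    print(f"{phase}: {hours:.1f} h")
```

In this made-up data the overall MTTR is 20 hours, but 15 of them are spent waiting for parts and only 2.5 on hands-on repair, so the lever worth pulling is spares availability rather than technician speed.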
Use them together, not separately
MTBF in isolation rewards teams that hide failures (skip the work order, fix it quietly). MTTR in isolation rewards teams that do the bare minimum to restart and skip root cause. Together they create the right tension: increase the time between failures, decrease the time to recover when one happens.
What to do with the numbers
Trend them over rolling 90-day windows for your top 20 assets. Compare similar assets to find outliers. When MTBF drops on one machine while peers stay flat, that’s a signal worth investigating. When MTTR drops across the fleet, your process is probably improving: celebrate it, then ask why, so you can confirm the gain comes from faster response and not from skipped root-cause work.
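One possible way to sketch the rolling-window comparison in Python is shown below. Several things here are assumptions rather than recommendations: the asset names and failure dates are invented, runtime is simplified to a constant number of hours per day, and the "below half the peer median" cutoff is an arbitrary illustrative threshold.

```python
from datetime import date, timedelta
from statistics import median


def rolling_mtbf(failure_dates, runtime_hours_per_day, window_end, window_days=90):
    """MTBF over the window (window_end - window_days, window_end].

    Assumes constant daily runtime; returns None if no failures fall in the window.
    """
    window_start = window_end - timedelta(days=window_days)
    failures = sum(1 for d in failure_dates if window_start < d <= window_end)
    if failures == 0:
        return None
    return (runtime_hours_per_day * window_days) / failures


def flag_outliers(mtbf_by_asset, ratio=0.5):
    """Assets whose MTBF is below `ratio` times the peer median (hypothetical rule)."""
    known = [v for v in mtbf_by_asset.values() if v is not None]
    if not known:
        return []
    cutoff = ratio * median(known)
    return sorted(a for a, v in mtbf_by_asset.items()
                  if v is not None and v < cutoff)


# Invented failure history for three similar pumps, each running 20 h/day.
failures = {
    "PUMP-A": [date(2024, 5, 10)],
    "PUMP-B": [date(2024, 4, 20), date(2024, 5, 5),
               date(2024, 5, 30), date(2024, 6, 15)],
    "PUMP-C": [date(2024, 2, 1), date(2024, 6, 1)],  # Feb failure is outside the window
}

window_end = date(2024, 6, 30)
trend = {asset: rolling_mtbf(dates, 20, window_end)
         for asset, dates in failures.items()}
outliers = flag_outliers(trend)

print(trend)     # {'PUMP-A': 1800.0, 'PUMP-B': 450.0, 'PUMP-C': 1800.0}
print(outliers)  # ['PUMP-B']
```

Re-running the calculation with a sliding `window_end` (for example, once a week) gives the trend line; the peer comparison then separates a machine-specific problem from a fleet-wide shift.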
Writes about CMMS, reliability and operations excellence at UniCMMS.