What Is Availability?

Availability is the share of time a system is operational and usable, expressed as a percentage over a defined window. In reliability math it is calculated as Availability = MTBF / (MTBF + MTTR), so it rises when failures are rarer and when recovery is faster.

Availability answers a single question: out of all the time in a period, what fraction was the service actually working? It is usually quoted as a percentage in "nines" (99.9%, 99.99%) and is, for most practical purposes, the same idea as uptime. The terms are often used interchangeably, with one nuance: uptime tends to describe what you observe and measure directly, while availability is the broader reliability concept that the formula below makes precise.

The availability formula

In reliability engineering, availability is derived from two component metrics: how often a system fails and how long it takes to recover each time. The standard formula is:

  Availability = MTBF / (MTBF + MTTR)

  • MTBF (mean time between failures) is the average time the system runs normally between one failure and the next. Higher is better.
  • MTTR (mean time to recovery) is the average time to restore service once a failure begins. Lower is better.

Worked example: a service that runs 200 hours between failures and takes 1 hour to recover is available 200 / (200 + 1), which is 200 / 201, or about 99.5% of the time. Cut the recovery time in half to 30 minutes and availability climbs to 200 / 200.5, roughly 99.75%. Doubling the time between failures to 400 hours while keeping the 1-hour recovery gives 400 / 401, the same 99.75%. Availability therefore improves either by failing less often (raising MTBF) or by recovering faster (lowering MTTR), which is why both metrics matter.
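The worked example can be reproduced in a few lines of Python. This is a minimal sketch: the function name availability and its argument names are chosen here for illustration and do not come from any particular library.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Return availability as a fraction between 0 and 1 (MTBF / (MTBF + MTTR))."""
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR must not be negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


baseline = availability(200, 1)           # 200 / 201
faster_recovery = availability(200, 0.5)  # 200 / 200.5
fewer_failures = availability(400, 1)     # 400 / 401

print(f"{baseline:.4%}")         # 99.5025%
print(f"{faster_recovery:.4%}")  # 99.7506%
print(f"{fewer_failures:.4%}")   # 99.7506%
```

Halving MTTR and doubling MTBF give the same result because only the ratio between the two values enters the formula.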

What the percentages allow

High availability targets sound nearly perfect but still permit real outage time, and each extra nine cuts the allowance to about a tenth of the previous one. Over a 30-day month, 99.9% ("three nines") allows about 43 minutes of downtime; 99.95% allows about 22 minutes; and 99.99% ("four nines") allows only about 4.3 minutes. Because a single extended outage can exhaust an entire monthly budget at once, recovery speed often matters as much as raw stability. For a full breakdown of what each nine permits, see what 99.9% uptime actually means.
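The monthly allowances follow directly from the target percentage. The sketch below assumes a 30-day month (43,200 minutes); the function and constant names are illustrative, and a calendar-accurate budget would use the real length of each month.

```python
MINUTES_PER_30_DAY_MONTH = 30 * 24 * 60  # 43,200 minutes; assumes a 30-day month


def allowed_downtime_minutes(target_percent: float,
                             window_minutes: float = MINUTES_PER_30_DAY_MONTH) -> float:
    """Return the downtime, in minutes, that a target percentage permits in the window."""
    if not 0 <= target_percent <= 100:
        raise ValueError("target_percent must be between 0 and 100")
    return window_minutes * (100 - target_percent) / 100


for target in (99.9, 99.95, 99.99):
    print(f"{target}% -> {allowed_downtime_minutes(target):.1f} minutes per month")
# 99.9% -> 43.2 minutes per month
# 99.95% -> 21.6 minutes per month
# 99.99% -> 4.3 minutes per month
```

Passing a different window_minutes value gives other periods, for example 365 * 24 * 60 for a non-leap year.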

Availability is also the metric most often written into contracts. A Service Level Agreement (SLA) commits a provider to a stated availability target over a defined window and sets out remedies, such as service credits, if it is missed. The figure is only trustworthy when measured the way a visitor actually connects, from outside your own infrastructure, since a server can report itself as healthy while DNS, a CDN, an expired certificate, or a broken checkout makes the site unusable. Continuous external uptime and SLA monitoring records the data needed to track availability over time and prove it against an SLA.
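Checking a measured figure against an SLA target can be sketched as follows, using the empirical form (total time minus downtime) divided by total time. Everything specific here is an assumption for illustration: the function names, the sample window, and the single 50-minute outage are hypothetical, and outage intervals are assumed not to overlap. Real SLAs define their own measurement rules and exclusions.

```python
from datetime import datetime


def measured_availability(window_start, window_end, outages):
    """Return (total time - downtime) / total time as a fraction.

    outages is a list of (start, end) datetime pairs, assumed non-overlapping.
    """
    total = (window_end - window_start).total_seconds()
    if total <= 0:
        raise ValueError("window_end must be after window_start")
    downtime = 0.0
    for start, end in outages:
        # Count only the part of each outage that falls inside the window.
        clipped_start = max(start, window_start)
        clipped_end = min(end, window_end)
        if clipped_end > clipped_start:
            downtime += (clipped_end - clipped_start).total_seconds()
    return (total - downtime) / total


def meets_sla(availability_fraction, target_percent):
    """Return True when the measured availability reaches the SLA target."""
    return availability_fraction * 100 >= target_percent


# Hypothetical data: a 30-day window with one 50-minute outage.
window_start = datetime(2024, 6, 1)
window_end = datetime(2024, 7, 1)
outages = [(datetime(2024, 6, 12, 9, 0), datetime(2024, 6, 12, 9, 50))]

observed = measured_availability(window_start, window_end, outages)
print(f"{observed:.3%}")          # 99.884%
print(meets_sla(observed, 99.9))  # False: 50 minutes exceeds the ~43-minute budget
```

In this sample, one 50-minute outage is enough to miss a 99.9% target for the month, which illustrates how a single extended incident can use up the whole allowance.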

See also: Uptime & SLA monitoring

Frequently asked questions

  • How is availability calculated?

    In reliability math, Availability = MTBF / (MTBF + MTTR). With a mean time between failures of 200 hours and a mean time to recovery of 1 hour, availability is 200 / 201, or about 99.5%. Measured empirically, it is also just (total time minus downtime) divided by total time.

  • What is the difference between availability and uptime?

    The terms are usually treated as the same thing: both describe the share of time a service is usable. The slight distinction is that uptime is typically the measured percentage you observe, while availability is the broader reliability concept defined by MTBF / (MTBF + MTTR), which produces the same figure over a long enough window.

  • How much downtime does 99.9% availability allow?

    About 43 minutes per 30-day month, or roughly 8.8 hours per year. Each additional nine cuts the allowance to about a tenth: 99.99% permits only around 4.3 minutes per month, and 99.95% sits in between at about 22 minutes.

  • How do MTBF and MTTR affect availability?

    Availability rises when failures are less frequent (higher MTBF) and when recovery is faster (lower MTTR). Because the two combine in Availability = MTBF / (MTBF + MTTR), only their ratio matters: halving recovery time raises availability by exactly as much as doubling the time between failures, so improving either metric improves availability.