System design basics (Part 4) — Latency, throughput and availability

Thomas Varghese
2 min read · Oct 28, 2021


In any system, latency, throughput and availability are key measures of system performance.

How do we define each of these?

Latency is the time it takes for a certain operation to complete in a system.

Latency is usually expressed as a duration, in units such as milliseconds or seconds.

What are some typical latencies?

  • Reading 1 MB from RAM: 0.25 ms
  • Reading 1 MB from SSD: 1 ms
  • Transferring 1 MB over a network: 10 ms
  • Reading 1 MB from HDD: 20 ms
  • Inter-continental round trip: 150 ms
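To get a feel for these numbers, here is a minimal Python sketch that scales the 1 MB figures above to larger payloads. The linear scaling is a simplifying assumption (real systems have caching, seek times and congestion), and the names `LATENCY_MS_PER_MB` and `estimate_ms` are made up for illustration.

```python
# Rough per-megabyte latencies from the list above, in milliseconds.
LATENCY_MS_PER_MB = {
    "ram": 0.25,
    "ssd": 1.0,
    "network": 10.0,
    "hdd": 20.0,
}


def estimate_ms(medium: str, megabytes: float) -> float:
    """Estimate transfer time by scaling the 1 MB figure linearly (an assumption)."""
    return LATENCY_MS_PER_MB[medium] * megabytes


for medium in LATENCY_MS_PER_MB:
    print(f"{medium}: about {estimate_ms(medium, 100):.0f} ms for 100 MB")
```

Even as a rough estimate, this shows why the choice of storage medium matters: the same 100 MB read differs by almost two orders of magnitude between RAM and HDD.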

Throughput is the number of operations that a system can handle correctly per unit of time.

For instance, the throughput of a server is often measured in requests per second.
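As a rough illustration, the sketch below measures both average latency and throughput for a single-threaded loop in Python. `handle_request` is a hypothetical stand-in for real work, and `measure` is an illustrative helper, not part of any library.

```python
import time


def measure(operation, runs: int = 1000):
    """Run `operation` `runs` times; return (avg latency in ms, throughput in ops/s)."""
    start = time.perf_counter()
    for _ in range(runs):
        operation()
    elapsed = time.perf_counter() - start
    avg_latency_ms = (elapsed / runs) * 1000
    throughput = runs / elapsed
    return avg_latency_ms, throughput


def handle_request():
    # Hypothetical stand-in for the work a server does per request.
    sum(range(1000))


latency_ms, ops_per_sec = measure(handle_request)
print(f"average latency: {latency_ms:.4f} ms")
print(f"throughput: {ops_per_sec:.0f} operations per second")
```

In this sequential loop, throughput is simply the inverse of average latency. Once a system handles requests concurrently, the two can diverge: throughput can grow while the latency of each individual request stays the same.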

Availability is the probability that a particular server or service is up and running at any given point in time, usually expressed as a percentage.

For example, a server that has 99% availability will be operational 99% of the time.
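One common way to compute this figure is uptime divided by total time. Here is a minimal Python sketch under that definition; the function name and the 30-day example numbers are invented for illustration.

```python
def availability_percent(uptime_hours: float, downtime_hours: float) -> float:
    """Availability as the share of total time the service was up."""
    total_hours = uptime_hours + downtime_hours
    return 100.0 * uptime_hours / total_hours


# Hypothetical 30-day month (720 hours) with 7.2 hours of downtime.
print(f"{availability_percent(720 - 7.2, 7.2):.2f}%")
```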

When designing any system, it is important to keep the following factors in mind:

  • Which parts of the system need high availability, and which parts can function without it?
  • For example, a payment gateway needs high availability, whereas a service that updates user profile information is a lower priority.
  • Which parts of the system could be single points of failure? For these, it is important to add redundancy (duplicated resources) so that the failure of one component does not bring down the whole system.
  • Passive redundancy can be achieved by adding more servers or load balancers.
  • Active redundancy is achieved when multiple machines are set up to work together, i.e., the remaining machines take over the work of a machine that has failed.
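To see why redundancy helps, here is a small Python sketch of the availability of several redundant replicas where any single working replica is enough. It assumes replica failures are independent, which is often not true in practice (shared power, network or deployment bugs), so treat the result as an upper bound. The function name is hypothetical.

```python
def combined_availability(single: float, replicas: int) -> float:
    """Availability of `replicas` redundant copies when any one copy suffices.

    Assumes failures are independent: the system is down only when
    every replica is down at the same time.
    """
    return 1 - (1 - single) ** replicas


for n in (1, 2, 3):
    print(f"{n} replica(s) at 99% each: {combined_availability(0.99, n) * 100:.4f}%")
```

Under these assumptions, adding a second 99% replica takes the system from two 9s to four 9s.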

High availability describes systems that have particularly high levels of availability, typically five ‘nines’ or more.

Five ‘nines’ of availability refers to an uptime of 99.999%.

The expected downtime per year for each number of ‘nines’ is roughly:

  • 99% (two 9s): 87.7 hours
  • 99.9% (three 9s): 8.8 hours
  • 99.99% (four 9s): 52.6 minutes
  • 99.999% (five 9s): 5.3 minutes
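These figures follow directly from the availability percentage. A short Python sketch of the arithmetic, assuming an average year of 365.25 days (the function and constant names are illustrative):

```python
HOURS_PER_YEAR = 365.25 * 24  # 8766 hours, averaging in leap years (assumption)


def downtime_hours_per_year(percent: float) -> float:
    """Expected yearly downtime, in hours, for a given availability percentage."""
    return HOURS_PER_YEAR * (1 - percent / 100)


for pct in (99, 99.9, 99.99, 99.999):
    hours = downtime_hours_per_year(pct)
    print(f"{pct}%: {hours:.1f} hours ({hours * 60:.1f} minutes)")
```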

Lastly, availability is typically guaranteed to end users through SLAs, or service-level agreements. An SLA is made up of multiple SLOs, or service-level objectives.
