System design basics (Part 4) — Latency, throughput and availability
In any system, latency, throughput and availability are key measures of system performance.
How do we define each of these?
Latency is the time it takes for a certain operation to complete in a system.
Most often, this measure is a time duration, like milliseconds or seconds.
What are some typical latencies?
- Reading 1 MB from RAM: 0.25 ms
- Reading 1 MB from SSD: 1 ms
- Transfer 1 MB over network: 10 ms
- Reading 1 MB from HDD: 20 ms
- Inter-continental round trip: 150 ms
Throughput is the number of operations that a system can handle properly, per time unit.
For instance, the throughput of a server can often be measured in requests per second.
Availability of a system, is the odds of a particular server or service being up and running at any point in time, usually measured in percentages.
For example, a server that has 99% availability will be operational 99% of the time.
When designing any system, it is important to keep the following factors in mind:
- What are the parts of the system which need high availability and which are the parts of the system which can function without high availability?
- For example, a payment gateway would need to have high availability, but a service to say, update user profile information may be less important to prioritize for high availability
- What are the potential areas in the system which could be single points of failure; in this case, it is important to add redundancy, or duplication of resources in the system to ensure it does not fail at a single point
- Passive redundancy can be achieved with more servers or load balancers
- Active redundancy can be achieved when multiple machines are setup to work together; i.e active machines take over work of a machine that has failed
High availability is used to describe systems that have particularly high levels of availability, typically five ‘nines’ or more;
Five ‘nines’ of availability would refer to an uptime of 99.999%.
So typically downtimes expected per year depending on the number of ‘nines’ could be as:
- 99% (two 9s): 87.7 hours
- 99.9% (three 9s): 8.8 hours
- 99.99% (four 9s): 52.6 minutes
- 99.999% (five 9s): 5.3 minutes
Lastly, availabilities are typically provided or guaranteed to end users via SLAs, or service level agreements; multiple SLOs, or service-level objectives would make up an SLA.