distributed_system – unreasonably effective

Distributed Cache

Memcache: a global, in-memory hash table distributed across servers. Keys and values are strings, and each item has a TTL on top of LRU eviction. Memcache@scale
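To make the "TTL on top of LRU eviction" idea concrete, here is a minimal single-process sketch. It is not Memcache's implementation: the class name `TTLLRUCache`, its methods, and the injectable `clock` parameter are all hypothetical, and distribution across servers is left out entirely.

```python
import time
from collections import OrderedDict


class TTLLRUCache:
    """Toy cache (hypothetical): per-item TTL plus LRU eviction, one process only."""

    def __init__(self, capacity, clock=time.monotonic):
        self.capacity = capacity
        self.clock = clock  # injectable so expiry can be exercised without sleeping
        self._items = OrderedDict()  # key -> (value, expires_at), least recent first

    def set(self, key, value, ttl_seconds):
        self._items.pop(key, None)  # re-inserting moves the key to the recent end
        self._items[key] = (value, self.clock() + ttl_seconds)
        while len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used item

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self._items[key]  # expired items count as misses
            return None
        self._items.move_to_end(key)  # a hit makes the key most recently used
        return value
```

An item can therefore disappear for two independent reasons: its TTL ran out, or the cache was full and it was the least recently used entry.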

tanvirdotzaman

May 6, 2025

distributed_system, Uncategorized

Distributed Systems: Replication

If you are not doing replication, you are not fault-tolerant. The more availability we want from replication, the more work is required. The more consistent we want writes and reads to be, the more work is required. Six consistency levels have varying costs in terms of performance and availability. Guarantee Consistency Performance Availability Strong consistency Excellent…

tanvirdotzaman

April 26, 2025

distributed_system, Uncategorized

Distributed Systems: Consistency Tradeoffs

CAP: A partition-tolerant distributed database system continues to work in the face of a network partition. Modern distributed database systems are partition-tolerant. In the face of a network partition, the system can trade off between consistency and availability. When the network becomes partitioned, a CP system chooses consistency over availability and an AP system chooses availability over consistency. How…

tanvirdotzaman

April 22, 2025

distributed_system

Distributed Systems: Reading

Topology Concurrency Consistency and Availability Scalability Systems Patterns

tanvirdotzaman

April 21, 2025

distributed_system

Scalability

With a given amount of resources, as QPS increases, a system will eventually hit the maximum QPS it can support. We can also look at how a system’s output rate (responses/sec) is related to its input rate (requests/sec). A system is scalable if, by adding more resources to it, we can move the max throughput level…

tanvirdotzaman

March 9, 2025

distributed_system

Reliability

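Measured by Mean Time Between Failures (MTBF) and Mean Time To Recover (MTTR). MTBF $= \frac{\text{total uptime}}{\text{number of failures}}$. Note that the numerator excludes downtime, which may be contributed by maintenance or repair. In other words, MTBF $= \frac{\text{total time} - \text{total downtime}}{\text{number of failures}}$. It measures how often good time has been punctured by failures. We want a high MTBF. MTTR $= \frac{\text{total downtime}}{\text{number of failures}}$. Measures how good the repair mechanism…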
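A small worked example may help with the two metrics. This sketch assumes the common textbook definitions (MTBF as total uptime divided by the number of failures, MTTR as total downtime divided by the number of failures); the function names and the sample numbers are made up for illustration.

```python
def mtbf(total_time, total_downtime, failures):
    """Mean Time Between Failures: uptime per failure (assumed definition)."""
    return (total_time - total_downtime) / failures


def mttr(total_downtime, failures):
    """Mean Time To Recover: downtime per failure (assumed definition)."""
    return total_downtime / failures


# Hypothetical observation window: 1000 hours, 4 failures, 20 hours of downtime.
hours_between_failures = mtbf(total_time=1000, total_downtime=20, failures=4)
hours_to_recover = mttr(total_downtime=20, failures=4)
print(hours_between_failures, hours_to_recover)
```

With these numbers the system runs 245 hours on average between failures and takes 5 hours on average to recover from one.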

tanvirdotzaman

March 8, 2025

distributed_system, Uncategorized

Availability

Within a time period of $T$ units, if a system is down for $D$ units, then its availability for the time period is defined as $A = \frac{T - D}{T}$. It is usually expressed as a percentage. In other words, $A = 1 - \frac{D}{T}$. We note that $A$ is a linear function of down time $D$ with a slope of $-\frac{1}{T}$. Therefore, if down time increases, availability…
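As a quick illustration of the definition, the sketch below computes availability as one minus the fraction of the period spent down. The function name `availability` and the sample figures (a 30-day month with 43.2 minutes of downtime) are hypothetical.

```python
def availability(period, downtime):
    """Fraction of the period the system was up; both arguments share one unit."""
    if period <= 0 or not 0 <= downtime <= period:
        raise ValueError("need period > 0 and 0 <= downtime <= period")
    return 1 - downtime / period


minutes_in_month = 30 * 24 * 60  # 43,200 minutes in a 30-day month
a = availability(minutes_in_month, 43.2)
print(f"{a:.3%}")  # roughly "three nines"
```

Because the relationship is linear, doubling the downtime to 86.4 minutes doubles the unavailability from about 0.1% to about 0.2%.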

tanvirdotzaman

March 7, 2025

distributed_system, Uncategorized

Category: distributed_system