Distributed Systems

date: May 02, 2026

To understand distributed systems we need to move away from the "single computer" mindset and embracing the messy reality of loosly linked networks that transports are data through messy lines of Fiber and CAT cables.

Distributed systems are notoriously hard to implement because they are required to handle partial failures, unreliable networks, and data inconsistency.

The Foundations

Over the years engineers have come up with practices and concepts to make these systems be more reliable at scale.

Network Topology

1. CAP Theorem

The foundation of distributed database design. It states that in a distributed system, you can only provide two of the following three guarantees at once:

  • (C) Consistency: Every read receives the most recent write or an error.

  • (A) Availability: Every request receives a (non-error) response, without the guarantee that it contains the most recent write.

  • (P) Partition Tolerance: The system continues to operate despite an arbitrary number of messages being dropped or delayed by the network between nodes.

The Reality: Partition Tolerance is not optional in distributed systems, networks will fail, meaning "Partition Tolerance" is required. Therefore, we must choose between Consistency and Availability.

2. PACELC Theorem

An extension of CAP that accounts for what happens during normal operation.

  • (P) Partition: If a network partition occurs, do you choose Availability (A) or Consistency (C)?

  • (A) Availability: Prioritize system availability over data consistency during a partition.

  • (C) Consistency: Prioritize strong consistency over availability during a partition.

  • (E) Else: If no partition exists (normal operation), do you choose Latency (L) or Consistency (C)?

  • (L) Latency: Prioritize low latency (fast response) over strict consistency during normal operations.

  • (C) Consistency: Prioritize strong consistency over low latency during normal operations.

Implementation Patterns

The following two patterns are the "how to" for managing data across multiple nodes, directly influenced by the trade-offs of the above theories.

1. Two-Phase Commit (2PC) — The "CP" Approach

2PC is a blocking protocol used to achieve atomic transactions across multiple resources.

2. Saga Pattern — The "AP" Approach

A Saga manages a distributed transaction as a sequence of local transactions. Each step updates its own database and triggers the next step.

1. Choreography (Event-Based)
2. Orchestration (Command-Based)

Notes

Patterns like event driven architecture, domain driven models and distributed systems treadoffs and practices.

ACID Compliance with service to service interactions.

Transactional Outbox pattern in event driven design.

Why adpot domain driven design and package by component coding styles.