Module 5: Consistency Models

What is Consistency?

Consistency defines the contract between a distributed system and its clients about what values reads can return after writes.
┌─────────────────────────────────────────────────────────────────
│ THE CONSISTENCY SPECTRUM
├─────────────────────────────────────────────────────────────────
│
│ STRONGER ◄─────────────────────────────────────► WEAKER
│
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ┌────────┐
│ │ Linearizable │ │ Sequential   │ │ Causal       │ │Eventual│
│ │              │ │ Consistency  │ │ Consistency  │ │        │
│ └──────────────┘ └──────────────┘ └──────────────┘ └────────┘
│
│ ◄── Easier to reason about          Faster performance ──►
│ ◄── Higher latency                  Lower latency ──►
│ ◄── Lower availability              Higher availability ──►
│
└─────────────────────────────────────────────────────────────────

1. Linearizability (Strong Consistency)

The gold standard. Operations appear to happen instantaneously at some point between invocation and response.

Definition

┌─────────────────────────────────────────────────────────────────
│ LINEARIZABILITY
├─────────────────────────────────────────────────────────────────
│
│ Every operation appears to take effect atomically at some
│ instant between its invocation and response.
│
│ Visual representation:
│
│ Client A: ├──── write(x=1) ────┤
│                     │
│                     ▼ (linearization point)
│ Client B:               ├─── read(x) ───┤  → returns 1
│
│ If B's read overlaps with or follows A's write,
│ B MUST see the write (or a later value).
│
│ Key properties:
│ 1. Real-time ordering: If op1 completes before op2 starts,
│    op1 appears before op2
│ 2. Single value: All clients see the same value at any time
│ 3. Atomicity: No partial states visible
│
└─────────────────────────────────────────────────────────────────

Linearizability Examples

┌─────────────────────────────────────────────────────────────────
│ LINEARIZABILITY: VALID vs INVALID
├─────────────────────────────────────────────────────────────────
│
│ VALID execution (linearizable):
│
│ Time ──────────────────────────────────────────────────────►
│
│ Client A: ├─── write(x=1) ───┤
│ Client B:          ├─── read(x) ───┤ → 1
│ Client C:                    ├─── read(x) ───┤ → 1
│
│ Linearization: write(x=1), read→1, read→1  ✓
│
│ ───────────────────────────────────────────────────────────
│
│ INVALID execution (NOT linearizable):
│
│ Time ──────────────────────────────────────────────────────►
│
│ Client A: ├─── write(x=1) ───┤
│ Client B:        ├─── read(x) ───┤ → 1
│ Client C:                          ├─── read(x) ───┤ → 0
│
│ C reads 0 even though B (which started earlier) read 1!
│ No valid linearization order exists.  ✗
│
└─────────────────────────────────────────────────────────────────
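To make "no valid linearization order exists" concrete, here is a minimal brute-force checker I sketched for a single register (an illustration only — real checkers such as Porcupine use much smarter search). It tries every total order that respects real-time precedence and replays it against the register:

```python
from itertools import permutations

def linearizable(history, initial=0):
    """history: list of (name, kind, value, start, end) ops on one register,
    where kind is "read" or "write". Returns True iff some total order
    respects real-time precedence and every read returns the current value."""
    for order in permutations(history):
        # Real-time constraint: an op that finished before another started
        # must not be ordered after it.
        if any(b[4] < a[3]
               for i, a in enumerate(order)
               for b in order[i + 1:]):
            continue
        # Replay the candidate order against the register.
        x = initial
        valid = True
        for _, kind, value, _, _ in order:
            if kind == "write":
                x = value
            elif x != value:           # a read must see the latest write
                valid = False
                break
        if valid:
            return True
    return False

# The INVALID execution above: B's read (→1) completes before C's read (→0) starts.
invalid = [("A", "write", 1, 0, 10),
           ("B", "read",  1, 2, 5),
           ("C", "read",  0, 6, 12)]
print(linearizable(invalid))   # False — no valid linearization exists
```

Changing C's read to return 1 makes the same history linearizable, matching the VALID execution in the diagram.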

The Cost of Linearizability

┌─────────────────────────────────────────────────────────────────
│ LINEARIZABILITY PERFORMANCE
├─────────────────────────────────────────────────────────────────
│
│ To ensure linearizability, you typically need:
│
│ Option 1: Single leader
│   All writes go through one node
│   Reads must either:
│     a) Go to the leader (adds latency)
│     b) Use quorum reads
│     c) Track replication lag
│
│ Option 2: Consensus protocol (Paxos/Raft)
│   Every write requires majority acknowledgment
│   Latency = network round-trip to a majority
│   Cross-region: 100-300ms per write
│
│ Performance comparison:
│   Single-region linearizable: 1-5ms latency
│   Multi-region linearizable:  100-300ms latency
│   Eventual consistency:       <1ms latency
│
└─────────────────────────────────────────────────────────────────

When to Use Linearizability

┌─────────────────────────────────────────────────────────────────
│ LINEARIZABILITY USE CASES
├─────────────────────────────────────────────────────────────────
│
│ ✓ USE LINEARIZABILITY FOR:
│
│ 1. Leader election / distributed locks
│    └─► Only one leader at a time, ever
│
│ 2. Unique constraints
│    └─► Usernames, email addresses, account numbers
│
│ 3. Financial transactions
│    └─► Balance updates, transfers
│
│ 4. Sequence number generation
│    └─► Order IDs, transaction IDs
│
│ ✗ AVOID LINEARIZABILITY FOR:
│
│ 1. User profile reads
│    └─► Slightly stale data is acceptable
│
│ 2. Analytics / reporting
│    └─► Minutes-old data is fine
│
│ 3. Content delivery
│    └─► CDN caching is essential
│
│ 4. Social media feeds
│    └─► Eventual consistency with good UX
│
└─────────────────────────────────────────────────────────────────

2. Sequential Consistency

Weaker than linearizability: operations appear in a total order consistent with program order, but not necessarily real-time order.
┌─────────────────────────────────────────────────────────────────
│ SEQUENTIAL CONSISTENCY
├─────────────────────────────────────────────────────────────────
│
│ All clients see the same order of operations.
│ That order respects each client's program order.
│ But: May NOT respect real-time order!
│
│ Example (sequentially consistent but NOT linearizable):
│
│ Real-time:
│ Client A: ── write(x=1) ──┤
│ Client B:                    ├── write(x=2) ──┤
│ Client C:                                        read(x) → 1
│
│ A finishes before B starts (real-time order).
│ But C reads 1, not 2!
│
│ Valid sequential order: write(x=2), write(x=1), read(x)→1
│ This respects program order but not real-time order.
│
│ Why it matters:
│ • CPU memory models offer at most sequential consistency,
│   not linearizability
│ • ZooKeeper provides sequential consistency for reads
│ • Much cheaper to implement than linearizability
│
└─────────────────────────────────────────────────────────────────
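The only difference from the linearizability check is which constraints the candidate order must respect. A minimal sketch of my own, dropping the real-time constraint and keeping only per-client program order:

```python
from itertools import permutations

def sequentially_consistent(history, initial=0):
    """history maps each client to its ops in program order: [(kind, value), ...].
    Searches for ONE total order that respects every client's program order
    (real-time is deliberately ignored) and replays correctly on a register."""
    flat = [(client, i, kind, value)
            for client, ops in history.items()
            for i, (kind, value) in enumerate(ops)]
    for order in permutations(flat):
        # Program order: each client's ops must appear in their original sequence.
        next_idx = {}
        ok = True
        for client, i, _, _ in order:
            if next_idx.get(client, 0) != i:
                ok = False
                break
            next_idx[client] = i + 1
        if not ok:
            continue
        # Replay the candidate order against a single register.
        x = initial
        for _, _, kind, value in order:
            if kind == "write":
                x = value
            elif x != value:
                ok = False
                break
        if ok:
            return True
    return False

# The example above: the order write(x=2), write(x=1), read(x)->1 is allowed
# because program order says nothing across clients — sequentially consistent
# even though it is not linearizable.
history = {"A": [("write", 1)], "B": [("write", 2)], "C": [("read", 1)]}
print(sequentially_consistent(history))  # True
```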

Sequential vs Linearizable

┌─────────────────────────────────────────────────────────────────
│ SEQUENTIAL vs LINEARIZABLE CONSISTENCY
├─────────────────────────────────────────────────────────────────
│
│                    │ Linearizable │ Sequential
│ ───────────────────┼──────────────┼───────────────────
│  Real-time order   │ Respected    │ Not required
│  Program order     │ Respected    │ Respected
│  Global order      │ Yes          │ Yes
│  Cost              │ Higher       │ Lower
│
│ Linearizable: Operations take effect "during" the call
│ Sequential:   Operations take effect "sometime after" the call
│
│ Example: ZooKeeper
│   Writes: Go through the leader (linearizable)
│   Reads:  Can read from any replica (sequential)
│
│   A client may read stale data after another client's write.
│   But: A client will never see its own writes go backward.
│
└─────────────────────────────────────────────────────────────────

3. Causal Consistency

Operations that are causally related appear in the same order everywhere. Concurrent operations may appear in different orders.
┌─────────────────────────────────────────────────────────────────
│ CAUSAL CONSISTENCY
├─────────────────────────────────────────────────────────────────
│
│ If operation A "happened before" operation B (causally),
│ then everyone sees A before B.
│
│ Causal relationships:
│ 1. Same client: Earlier ops happen before later ops
│ 2. Read-after-write: A read depends on the write it observed
│ 3. Transitive: A→B and B→C means A→C
│
│ Example:
│   Alice posts: "Anyone want coffee?"
│   Bob reads Alice's post, replies: "I do!"
│   Charlie reads Alice's post, replies: "Me too!"
│
│   Causal order: Alice → Bob     (Bob read Alice's post)
│                 Alice → Charlie
│                 Bob ∥ Charlie   (concurrent)
│
│   Valid views:
│     [Alice, Bob, Charlie] ✓
│     [Alice, Charlie, Bob] ✓
│     [Bob, Alice, Charlie] ✗ (violates Alice → Bob)
│
└─────────────────────────────────────────────────────────────────
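The "valid views" check above is just a partial-order test, which fits in a few lines. A small sketch (event names and the happens-before pairs are from the coffee example):

```python
def respects_causality(view, happens_before):
    """view: events in the order one client sees them.
    happens_before: (a, b) pairs meaning a causally precedes b.
    A view is valid iff every cause appears before its effect."""
    position = {event: i for i, event in enumerate(view)}
    return all(position[a] < position[b] for a, b in happens_before)

hb = [("alice", "bob"), ("alice", "charlie")]  # Bob and Charlie read Alice's post
print(respects_causality(["alice", "bob", "charlie"], hb))  # True
print(respects_causality(["alice", "charlie", "bob"], hb))  # True  (Bob ∥ Charlie)
print(respects_causality(["bob", "alice", "charlie"], hb))  # False (violates alice→bob)
```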

Causal Consistency Implementation

┌─────────────────────────────────────────────────────────────────
│ IMPLEMENTING CAUSAL CONSISTENCY
├─────────────────────────────────────────────────────────────────
│
│ Approach 1: Vector Clocks / Version Vectors
│   Each operation carries a vector clock
│   Server delays delivery until causal dependencies are met
│
│   Write from node A: VC=[A:5, B:3, C:2]
│   Server waits until it has seen:
│     - A's operations up to 4
│     - B's operations up to 3
│     - C's operations up to 2
│
│ Approach 2: Explicit Dependency Tracking
│   Each operation includes IDs of operations it depends on
│
│   Op1: write(x=1), deps=[]
│   Op2: write(y=2), deps=[Op1]  // read x before writing y
│
│   Server applies Op2 only after Op1 is applied
│
│ Approach 3: Session Guarantees (Practical)
│   Track per-session "read set" and "write set"
│   Ensure the session sees its own writes
│   MongoDB: causally consistent sessions
│
└─────────────────────────────────────────────────────────────────
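Approach 1's delivery rule can be written directly as a predicate. A sketch of my own (the dict-based clocks are an assumption; real systems compact and version these):

```python
def ready_to_deliver(msg_vc, sender, local_vc):
    """Causal delivery condition for a message stamped with vector clock
    msg_vc, originating at `sender`, arriving at a replica whose applied
    state is summarized by local_vc. Deliverable iff:
      - it is the sender's next operation, and
      - every dependency on other nodes is already applied locally."""
    for node, count in msg_vc.items():
        if node == sender:
            if count != local_vc.get(node, 0) + 1:
                return False
        elif count > local_vc.get(node, 0):
            return False
    return True

# The example from the box: a write from A with VC=[A:5, B:3, C:2]
local = {"A": 4, "B": 3, "C": 2}
print(ready_to_deliver({"A": 5, "B": 3, "C": 2}, "A", local))  # True
print(ready_to_deliver({"A": 5, "B": 4, "C": 2}, "A", local))  # False — missing B's op 4
```

A replica would buffer any message for which this returns False and retry as earlier operations arrive.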

Session Guarantees

┌─────────────────────────────────────────────────────────────────
│ SESSION GUARANTEES
├─────────────────────────────────────────────────────────────────
│
│ Read Your Writes (RYW):
│   If a client writes, subsequent reads see that write
│
│   Client: write(profile.name = "Alice")
│   Client: read(profile.name) → MUST return "Alice"
│
│   Implementation: Track the write timestamp, read from a
│   replica that has caught up to that timestamp
│
│ Monotonic Reads (MR):
│   If a client reads value v, subsequent reads see v or
│   later values, never earlier
│
│   Client: read(x) → 5
│   Client: read(x) → MUST be ≥ 5, never 4
│
│   A violation feels like "going back in time"
│
│ Monotonic Writes (MW):
│   Writes from the same client are applied in order everywhere
│
│   Client: write(x=1), write(x=2)
│   All replicas: x=1 then x=2, never x=2 then x=1
│
│ Writes Follow Reads (WFR):
│   If a client reads then writes, the write is ordered after the read
│
│   Client: read(post) → "Hello"
│   Client: write(reply = "Hi!")
│   Everyone sees: post before reply (causal order)
│
└─────────────────────────────────────────────────────────────────
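RYW and MR can both be enforced client-side with one number: the highest log position the session has observed. A minimal sketch under assumptions of mine (a monotonically increasing log position per write, and a replica map reporting how far each replica has replicated):

```python
class Session:
    """Client-side session giving read-your-writes and monotonic reads
    by remembering the highest log position it has observed."""

    def __init__(self):
        self.min_position = 0   # highest position seen by this session

    def observe_write(self, position):
        """Call after each acknowledged write (RYW)."""
        self.min_position = max(self.min_position, position)

    def observe_read(self, position):
        """Call after each read, so later reads never go backward (MR)."""
        self.min_position = max(self.min_position, position)

    def pick_replica(self, replicas):
        """replicas: {name: replicated-up-to position}. Returns a replica
        that has caught up to everything this session has seen, or None
        (caller should wait, retry, or fall back to the primary)."""
        caught_up = {n: p for n, p in replicas.items() if p >= self.min_position}
        return max(caught_up, key=caught_up.get) if caught_up else None

s = Session()
s.observe_write(1000)                            # "my last write was at t=1000"
print(s.pick_replica({"r1": 900, "r2": 1200}))   # r2 — r1 is still behind
```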

4. Eventual Consistency

The weakest useful consistency model: if no new updates are made, all replicas eventually converge to the same value.
┌─────────────────────────────────────────────────────────────────
│ EVENTUAL CONSISTENCY
├─────────────────────────────────────────────────────────────────
│
│ "Eventually, all replicas will have the same value"
│
│ No guarantees about:
│ • When convergence happens
│ • What intermediate values you'll read
│ • Order of operations
│
│ Timeline:
│
│   write(x=1)
│       │
│       ▼
│ Replica A: x=1 ─────────────────────────────────
│ Replica B: x=0 ─── x=0 ─── x=0 ─── x=1 ────────
│ Replica C: x=0 ─── x=0 ─── x=0 ─── x=0 ─── x=1
│                                             │
│                        Eventually converges ┘
│
│ During convergence:
│ • Different clients may read different values
│ • The same client may read different values on retry
│ • Old values may reappear
│
└─────────────────────────────────────────────────────────────────

Strong Eventual Consistency

┌─────────────────────────────────────────────────────────────────
│ STRONG EVENTUAL CONSISTENCY (SEC)
├─────────────────────────────────────────────────────────────────
│
│ Eventual consistency + a convergence guarantee:
│ "Replicas that have received the same set of updates
│  are in the same state"
│
│ Implemented via CRDTs (Conflict-free Replicated Data Types)
│
│ Example: G-Counter (Grow-only counter)
│   Each node keeps its own count
│   Total = sum of all node counts
│   Merge = max of each node's count
│
│   Node A: {A:5, B:0, C:0} → total = 5
│   Node B: {A:3, B:2, C:0} → total = 5
│
│   After merge:
│   Both: {A:5, B:2, C:0} → total = 7
│
│   No conflicts, always converges!
│
│ Benefits over plain eventual consistency:
│ • Guaranteed convergence (same updates → same state)
│ • No conflict resolution needed
│ • No lost updates
│
└─────────────────────────────────────────────────────────────────
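The G-Counter above fits in a few lines of code. A minimal sketch (dict-based state is my simplification; production CRDT libraries add serialization and garbage collection):

```python
class GCounter:
    """Grow-only counter CRDT: per-node counts, merged by element-wise max."""

    def __init__(self, node_id, counts=None):
        self.node_id = node_id
        self.counts = dict(counts or {})   # {node_id: count}

    def increment(self, n=1):
        """Each node only ever increments its OWN slot."""
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + n

    def value(self):
        """Total = sum of all node counts."""
        return sum(self.counts.values())

    def merge(self, other):
        """Merge = per-node max. Commutative, associative, idempotent,
        so replicas converge regardless of merge order."""
        for node, count in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)

# The example from the box:
a = GCounter("A", {"A": 5})
b = GCounter("B", {"A": 3, "B": 2})
a.merge(b)
b.merge(a)
print(a.value(), b.value())  # 7 7 — same updates, same state
```

Because merge is a max in every slot, replaying or reordering merges never changes the result, which is exactly the SEC guarantee.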

5. Consistency in Practice

Database Consistency Levels

┌─────────────────────────────────────────────────────────────────
│ DATABASE CONSISTENCY OPTIONS
├─────────────────────────────────────────────────────────────────
│
│ CASSANDRA:
│   Write Consistency:
│     ONE:    Write to 1 replica (fastest)
│     QUORUM: Write to ⌈(N+1)/2⌉ replicas
│     ALL:    Write to all replicas (slowest)
│
│   Read Consistency:
│     ONE:    Read from 1 replica
│     QUORUM: Read from ⌈(N+1)/2⌉ replicas
│     ALL:    Read from all replicas
│
│   For linearizability: QUORUM reads + QUORUM writes
│   (assuming W + R > N)
│
│ MONGODB:
│   Write Concern:
│     w:1         Write to primary only
│     w:majority  Write to a majority of replicas
│
│   Read Concern:
│     local         Read from any replica (may be stale)
│     majority      Read data acknowledged by a majority
│     linearizable  Linearizable reads (slow)
│
│   Read Preference:
│     primary    Always read from the primary
│     secondary  Read from a secondary (for read scaling)
│
│ DYNAMODB:
│   Eventually consistent reads (default, cheaper)
│   Strongly consistent reads (2x cost)
│   Transactional operations (for ACID)
│
└─────────────────────────────────────────────────────────────────

Quorum Systems

┌─────────────────────────────────────────────────────────────────
│ QUORUM MATH
├─────────────────────────────────────────────────────────────────
│
│ N = total replicas
│ W = write quorum (replicas that must acknowledge a write)
│ R = read quorum (replicas queried for a read)
│
│ For strong consistency: W + R > N
│ (Read quorum overlaps with write quorum)
│
│ Example: N=5, W=3, R=3 — strong consistency (3+3=6 > 5)
│
│   ┌───┐ ┌───┐ ┌───┐ ┌───┐ ┌───┐
│   │ 1 │ │ 2 │ │ 3 │ │ 4 │ │ 5 │
│   └───┘ └───┘ └───┘ └───┘ └───┘
│   └── Write quorum (W=3): nodes 1, 2, 3
│            └── Read quorum (R=3): nodes 2, 3, 4
│
│   Overlap: nodes 2 and 3 — at least one replica read
│   is guaranteed to hold the latest write
│
│ Common configurations:
│   N=3, W=2, R=2: Balanced (tolerates 1 failure)
│   N=3, W=1, R=3: Fast writes, slow reads
│   N=3, W=3, R=1: Slow writes, fast reads
│   N=5, W=3, R=3: High availability (tolerates 2 failures)
│
└─────────────────────────────────────────────────────────────────
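The quorum arithmetic above is easy to check programmatically. A small sketch of mine for sanity-checking configurations:

```python
def quorum(n):
    """Majority quorum size: ⌈(N+1)/2⌉."""
    return n // 2 + 1

def is_strong(n, w, r):
    """Read and write quorums are guaranteed to overlap iff W + R > N."""
    return w + r > n

def failures_tolerated(n, w, r):
    """Writes need W live replicas and reads need R, so the system
    keeps serving both through min(N-W, N-R) replica failures."""
    return min(n - w, n - r)

# The common configurations from the box:
for n, w, r in [(3, 2, 2), (3, 1, 3), (3, 3, 1), (5, 3, 3)]:
    print(f"N={n} W={w} R={r}: strong={is_strong(n, w, r)}, "
          f"tolerates {failures_tolerated(n, w, r)} failure(s)")
```

Note that N=3, W=1, R=3 is still "strong" by the overlap rule, but reads stall as soon as a single replica is down — the failure-tolerance number is what exposes that trade-off.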

Read-After-Write Consistency

┌─────────────────────────────────────────────────────────────────
│ READ-AFTER-WRITE PATTERNS
├─────────────────────────────────────────────────────────────────
│
│ Problem: User updates profile, refreshes, sees old data
│
│ User:    POST /profile {name: "Alice"}
│             │
│             ▼
│ Primary: ✓ Write successful
│             │
│             │ (replication lag)
│             │
│ User:    GET /profile
│             │
│             ▼
│ Replica: {name: "Old Name"}  ← User sees old data!
│
│ Solutions:
│ 1. Read from primary after write
│    Simple but doesn't scale reads
│
│ 2. Track write timestamp, read from a caught-up replica
│    Client: "My last write was at t=1000"
│    Server: Route to a replica with position ≥ 1000
│
│ 3. Sticky sessions
│    Same user always hits the same replica
│    Problem: Replica failure loses consistency
│
│ 4. Wait for replication
│    Block the read until replicas catch up
│    Adds latency but guarantees consistency
│
│ 5. Client-side caching
│    Cache writes locally, merge with server reads
│    Complex but good UX
│
└─────────────────────────────────────────────────────────────────
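Solution 4 ("wait for replication") can be sketched as a blocking read against a replica that exposes its applied log position. The `ToyReplica` class and its API are entirely hypothetical stand-ins for illustration:

```python
import time

class ToyReplica:
    """In-memory stand-in for a replica that reports how far it has replicated."""
    def __init__(self):
        self.position = 0
        self.data = {}

    def apply(self, position, key, value):
        """The (simulated) replication stream delivers a write."""
        self.data[key] = value
        self.position = position

def read_at_least(replica, key, min_position, timeout=1.0, poll=0.01):
    """Block the read until the replica has applied the client's
    last-write position, or give up after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while replica.position < min_position:
        if time.monotonic() >= deadline:
            raise TimeoutError("replica did not catch up")
        time.sleep(poll)
    return replica.data.get(key)

r = ToyReplica()
r.apply(1000, "profile.name", "Alice")
print(read_at_least(r, "profile.name", 1000))  # Alice
```

Real systems usually park the request on a notification queue instead of polling, but the contract is the same: don't answer until `position ≥ min_position`.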

6. Consistency Comparison

┌─────────────────────────────────────────────────────────────────
│ CONSISTENCY MODEL COMPARISON
├─────────────────────────────────────────────────────────────────
│
│  Model          │ Guarantees           │ Used By
│ ────────────────┼──────────────────────┼───────────────────────
│  Linearizable   │ Real-time + total    │ Spanner, etcd,
│                 │ order                │ ZK writes
│ ────────────────┼──────────────────────┼───────────────────────
│  Sequential     │ Total order, no      │ ZK reads,
│                 │ real-time            │ CPU caches
│ ────────────────┼──────────────────────┼───────────────────────
│  Causal         │ Causal order only    │ COPS, MongoDB
│                 │                      │ causal sessions
│ ────────────────┼──────────────────────┼───────────────────────
│  Eventual       │ Eventually same      │ DNS, Cassandra
│                 │ value                │ (default), DynamoDB
│ ────────────────┼──────────────────────┼───────────────────────
│  Strong         │ Same updates =       │ Riak (CRDTs),
│  Eventual       │ same state           │ Redis CRDT
│
│ Decision factors:
│ • Correctness requirements (can you handle conflicts?)
│ • Latency requirements (cross-region?)
│ • Availability requirements (partition tolerance?)
│ • Developer complexity (who resolves conflicts?)
│
└─────────────────────────────────────────────────────────────────

Interview Questions

Conceptual Questions

  1. What is linearizability and why is it expensive?
    • Operations appear atomic at some point during execution
    • Requires coordination (consensus or single leader)
    • Cross-region latency makes it slow (100s of ms)
  2. What's the difference between linearizability and serializability?
    • Linearizability: Single-object, real-time ordering
    • Serializability: Multi-object transactions, equivalent to serial execution
    • Strict serializability: Both combined
  3. Explain eventual consistency and when it's acceptable.
    • All replicas converge given no new updates
    • Acceptable when: stale reads are OK, high availability needed
    • Examples: DNS, social media feeds, shopping cart
  4. What are session guarantees?
    • Read Your Writes, Monotonic Reads, Monotonic Writes, Writes Follow Reads
    • Provides causal consistency within a session
    • Easier to program against than raw eventual consistency

System Design Questions

  1. How would you implement read-your-writes consistency?
    • Track last write timestamp per user
    • Route reads to replicas caught up to that timestamp
    • Or read from primary for "important" reads
    • Or use client-side caching with server reconciliation
  2. Design a system that needs linearizability for some operations and eventual consistency for others.
    • User creation: Linearizable (unique username)
    • Profile reads: Eventually consistent (cached)
    • Balance updates: Linearizable (financial)
    • Activity feed: Eventual (social)
    • Use different storage systems or consistency levels per operation type

Summary

┌─────────────────────────────────────────────────────────────────
│ MODULE 5 KEY TAKEAWAYS
├─────────────────────────────────────────────────────────────────
│
│ 1. Linearizability is the strongest useful consistency
│    • Real-time ordering of all operations
│    • Expensive: requires consensus or a single leader
│    • Use for: locks, unique constraints, financial data
│
│ 2. Sequential consistency relaxes the real-time requirement
│    • All clients see the same order
│    • But that order may not match the wall clock
│
│ 3. Causal consistency captures dependencies
│    • Causally related events are ordered
│    • Concurrent events may differ between clients
│    • Good balance of correctness and performance
│
│ 4. Eventual consistency is the weakest but fastest
│    • Replicas converge eventually
│    • Strong eventual: same updates = same state (CRDTs)
│
│ 5. Quorum systems provide tunable consistency
│    • W + R > N for strong consistency
│    • Trade off latency vs consistency vs availability
│
│ 6. Choose consistency per operation, not per system
│
└─────────────────────────────────────────────────────────────────

Next Module: CAP Theorem and Beyond - Understanding the fundamental trade-offs in distributed systems.
Tags: consistency, eventual-consistency, linearizability, distributed-databases