Distributed Systems: Complete Course

From Fundamentals to Production Systems

A comprehensive 43-module course covering distributed systems from absolute basics to advanced production patterns. Designed for backend engineers, candidates preparing for FAANG interviews, and engineers building high-scale systems. Written in a narrative book style that explains each concept thoroughly and places it in real-world context.

Course Structure

Part 1: Foundations (Modules 1-6)

Module | Topic | Key Concepts
1 | Introduction to Distributed Systems | Why distributed, types, fallacies, partial failures
2 | Network Fundamentals | TCP/UDP, HTTP/2/3, gRPC, network partitions
3 | Serialization and Data Exchange | JSON, Protobuf, Avro, schema evolution
4 | Time, Clocks, and Ordering | Lamport clocks, vector clocks, HLC
5 | Consistency Models | Linearizability, eventual, causal consistency
6 | CAP Theorem and Beyond | CAP, PACELC, consistency-availability trade-offs

Part 2: Data Management (Modules 7-12)

Module | Topic | Key Concepts
7 | Replication Strategies | Leader-follower, multi-leader, leaderless
8 | Partitioning and Sharding | Hash/range partitioning, rebalancing, hotspots
9 | Consensus Algorithms | Paxos, Raft, ZAB, leader election
10 | Distributed Transactions | 2PC, 3PC, distributed ACID
11 | Kafka Deep Dive | Partitions, consumer groups, exactly-once
12 | Conflict Resolution | LWW, vector clocks, CRDTs

Part 3: Scaling Patterns (Modules 13-17)

Module | Topic | Key Concepts
13 | Horizontal vs Vertical Scaling | When to use which, auto-scaling
14 | Load Balancing | L4/L7, algorithms, health checks
15 | Caching Strategies | Write-through, write-behind, cache invalidation
16 | CDN and Edge Computing | Global distribution, edge functions
17 | Multi-Region Architecture | Active-active, active-passive, data sovereignty

Part 4: Messaging and Streaming (Modules 18-22)

Module | Topic | Key Concepts
18 | Message Queues | RabbitMQ, SQS, at-least-once delivery
19 | Event Sourcing | Event stores, projections, snapshots
20 | CQRS Pattern | Read/write separation, materialized views
21 | Stream Processing | Kafka Streams, Flink, windowing
22 | Failure Detection | Heartbeats, phi accrual, SWIM protocol

Part 5: Fault Tolerance (Modules 23-27)

Module | Topic | Key Concepts
23 | Circuit Breakers | States, fallbacks, half-open recovery
24 | Retry Strategies | Exponential backoff, jitter, idempotency
25 | Bulkhead Pattern | Resource isolation, thread pools, semaphores
26 | Rate Limiting | Token bucket, sliding window, distributed limiting
27 | Load Balancing Deep Dive | Round robin, least connections, consistent hashing

Part 6: Modern Infrastructure (Modules 28-30)

Module | Topic | Key Concepts
28 | Service Mesh | Sidecars, mTLS, traffic management
29 | Distributed Tracing | Spans, context propagation, sampling
30 | Production-Ready Systems | Complete service template, middleware, health checks

Part 7: Advanced Patterns (Modules 31-37)

Module | Topic | Key Concepts
31 | Saga Pattern | Choreography, orchestration, compensating transactions
32 | Chaos Engineering | Failure injection, game days, building confidence
33 | Distributed Locks | Redlock, fencing tokens, ZooKeeper, etcd
34 | CRDTs | G-Counter, PN-Counter, OR-Set, convergence
35 | Gossip Protocols | Epidemic broadcast, SWIM, membership
36 | Consistent Hashing | Hash rings, virtual nodes, minimal disruption
37 | Bloom Filters | Probabilistic membership, false positives, applications

Part 8: Production Excellence (Modules 38-41)

Module | Topic | Key Concepts
38 | Deployment Strategies | Blue-green, canary, feature flags, GitOps
39 | Incident Response | Detection, roles, mitigation, postmortems
40 | Performance Tuning | Profiling, bottlenecks, caching, optimization
41 | Security | mTLS, authorization, secrets, encryption

Part 9: Mastery (Modules 42-43)

Module | Topic | Content
42 | System Design Case Studies | URL shortener, cache, queue, search, payments
43 | Interview Preparation | 100 questions, learning roadmap, resources

What Makes This Course Different

This course is written as a book, not a reference manual. Each chapter:
  • Explains the "why" before the "how"
  • Tells stories that illustrate concepts
  • Connects ideas to previous chapters
  • Discusses tradeoffs rather than prescribing solutions
  • Prepares you for real-world decisions, not just interviews
Early chapters (1-30) include Go code examples for practical implementation. Later chapters (31-43) focus on conceptual depth with narrative explanations.

Prerequisites

  • Basic programming knowledge (examples in Go, concepts are language-agnostic)
  • Understanding of basic data structures
  • Familiarity with REST APIs
  • Basic SQL knowledge

How to Use This Course

  1. Sequential Learning: Modules build on each other; follow the order
  2. Active Reading: Pause to consider how concepts apply to systems you know
  3. Implement Early Chapters: Build the code examples to internalize patterns
  4. Discuss Later Chapters: The advanced topics benefit from conversation and debate
  5. Use for Interview Prep: Chapter 43 provides 100 practice questions

Scaling Milestones Reference

Users | RPS | Data | Architecture
1K | 10 | 1GB | Single server
10K | 100 | 10GB | Read replicas, CDN
100K | 1K | 100GB | Sharding, caching
1M | 10K | 1TB | Microservices, Kafka
10M | 100K | 10TB | Multi-region, edge
100M | 1M | 100TB | Global distribution
1B | 10M | 1PB | Custom infrastructure
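
The rows above follow a steady ratio: roughly 1 request per second for every 100 users and about 1MB of data per user, each growing tenfold per row. As a minimal sketch, the table can be encoded as a Go lookup that returns the first milestone large enough for a given user count. The names Milestone and MilestoneFor are hypothetical and not part of the course material, and rounding up to the next row is an assumption made for illustration.

```go
package main

import "fmt"

// Milestone mirrors one row of the scaling milestones reference.
// The type and field names are illustrative, not taken from the course.
type Milestone struct {
	Users        int64  // upper bound on users for this row
	RPS          int64  // requests per second listed for this row
	Data         string // data volume listed for this row
	Architecture string // architecture listed for this row
}

// milestones holds the reference rows in ascending order of users.
var milestones = []Milestone{
	{1_000, 10, "1GB", "Single server"},
	{10_000, 100, "10GB", "Read replicas, CDN"},
	{100_000, 1_000, "100GB", "Sharding, caching"},
	{1_000_000, 10_000, "1TB", "Microservices, Kafka"},
	{10_000_000, 100_000, "10TB", "Multi-region, edge"},
	{100_000_000, 1_000_000, "100TB", "Global distribution"},
	{1_000_000_000, 10_000_000, "1PB", "Custom infrastructure"},
}

// MilestoneFor returns the first row whose user count is at least users.
// Assumption: counts between two rows round up to the larger row, and
// counts beyond the last row fall back to the last row.
func MilestoneFor(users int64) Milestone {
	for _, m := range milestones {
		if users <= m.Users {
			return m
		}
	}
	return milestones[len(milestones)-1]
}

func main() {
	m := MilestoneFor(250_000)
	fmt.Printf("plan for ~%d RPS, %s of data: %s\n", m.RPS, m.Data, m.Architecture)
}
```

Treat the result as a starting point for discussion rather than a prescription; real workloads rarely track user count this neatly.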

The Journey Ahead

Distributed systems are among the most challenging and rewarding areas of software engineering. They power everything from social networks to financial systems, from search engines to autonomous vehicles.
This course won't make you an expert overnight; nothing can. But it will give you the conceptual foundation to understand distributed systems deeply, the patterns to design them effectively, and the vocabulary to discuss them clearly.
Welcome to the journey.

Begin with Module 1: Introduction to Distributed Systems