System Design — Theory
System Design — Theory (deep concepts)
Always-asked tradeoffs
| Axis | Trade |
|---|---|
| Consistency | latency / availability |
| Read vs write throughput | sharding strategy |
| Sync vs async | latency vs decoupling |
| Cache | freshness vs hit rate |
| Single-region vs multi-region | latency vs availability vs cost |
| SQL vs NoSQL | flexibility vs simplicity |
| Server-side vs client-side rendering | TTFB vs interactivity |
| Push vs pull | freshness vs efficiency |
CAP / PACELC (revisited from distributed-systems)
Under a partition (P): pick consistency (C) or availability (A). Else (E), in normal operation: trade latency (L) against consistency (C).
Many distributed stores (Cassandra and other Dynamo-style systems) lean AP for survivability. Strongly consistent systems (Spanner, etcd) accept a latency cost.
Sharding strategy
- Range — by key range; risk: hot range (timestamps).
- Hash — even distribution; range queries hard.
- Geo — by region; user data near user; data residency.
- Tenant — per-customer shard; isolation.
Re-sharding is painful. Plan for it: consistent hashing, virtual nodes, migration paths.
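
To make the consistent-hashing and virtual-node idea concrete, here is a minimal sketch. The `HashRing` class, the `shard-a` style node names, and the choice of 100 virtual nodes per shard are illustrative assumptions, not a reference to any particular library.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a stable 64-bit integer (MD5 is used for spread, not security)."""
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")


class HashRing:
    """Hypothetical consistent-hash ring with virtual nodes."""

    def __init__(self, nodes, vnodes=100):
        self.vnodes = vnodes      # assumed number of virtual nodes per physical shard
        self._ring = []           # sorted list of (hash, node)
        for node in nodes:
            self.add(node)

    def add(self, node):
        # Each physical node owns many small arcs of the ring.
        for i in range(self.vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove(self, node):
        self._ring = [(h, n) for h, n in self._ring if n != node]

    def lookup(self, key):
        # Owner is the first virtual node clockwise from the key's hash (with wrap-around).
        idx = bisect.bisect_right(self._ring, (_hash(key), ""))
        return self._ring[idx % len(self._ring)][1]
```

When a shard is added, the only keys that change owner are the ones that move to the new shard, which is what keeps a migration bounded. With plain `hash(key) % N`, most keys would move instead.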
Read scale
- Read replicas (sync or async).
- Materialized views.
- CDN cache.
- Application-tier cache (Redis).
- Per-request cache (memoize within request).
- CQRS — separate read schema optimized for queries.
Write scale
- Sharding.
- Async write paths (queue → worker).
- Write-behind cache.
- Batch writes.
- Append-only log instead of mutate-in-place.
Hot keys
When one key gets disproportionate traffic:
- Add an in-process L1 cache in front of the shared L2 cache.
- Replicate that key across many cache nodes.
- Append a random suffix `key.{1..N}` and pick one per request (bucketing spreads the load).
- Read replicas close to clients.
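
The random-suffix technique can be sketched as follows. A plain `dict` stands in for the cache cluster, and the helper names `write_hot_key` and `read_hot_key` plus the default of 8 buckets are assumptions made for illustration.

```python
import random


def write_hot_key(cache: dict, key: str, value, n: int = 8) -> None:
    """Write the value under every suffixed copy key.1 .. key.N."""
    for i in range(1, n + 1):
        cache[f"{key}.{i}"] = value


def read_hot_key(cache: dict, key: str, n: int = 8, rng=random):
    """Read one randomly chosen copy, so load spreads across N cache slots."""
    return cache[f"{key}.{rng.randint(1, n)}"]
```

The cost is N writes per update and a brief window where copies can disagree, so this suits read-heavy keys that tolerate slight staleness.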
Idempotency
Every retried request must be safe. Achieved via:
- Idempotency key (client-supplied).
- Natural keys (`INSERT ... ON CONFLICT DO NOTHING`).
- State machines (only apply transitions that move forward).
Required for any HTTP API likely to be retried by clients.
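
A minimal sketch of the client-supplied idempotency key, assuming an in-memory store. The `ChargeService` class and its response shape are hypothetical; in a real service the lookup and insert would need to be atomic (for example a unique constraint on the key).

```python
class ChargeService:
    """Hypothetical handler that replays the stored response for a repeated key."""

    def __init__(self):
        self._responses = {}   # idempotency key -> response returned the first time
        self.charges = []      # side effects actually performed

    def charge(self, idempotency_key: str, account: str, amount_cents: int) -> dict:
        # A retry with the same key returns the original response and does no new work.
        if idempotency_key in self._responses:
            return self._responses[idempotency_key]
        self.charges.append((account, amount_cents))
        response = {"status": "charged", "charge_id": len(self.charges)}
        self._responses[idempotency_key] = response
        return response
```

A client that times out and retries with the same key gets the first result back and the account is charged once.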
Backpressure
When demand exceeds capacity, decide what gives:
- Reject (load shed).
- Queue (and grow buffer; risk OOM).
- Slow upstream.
- Degrade response quality.
Principle: fail fast and visibly beats silent latency growth.
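
The "reject" option can be sketched with a bounded queue that sheds load instead of growing. `BoundedIntake` and `Overloaded` are illustrative names, and the capacity is an assumption the caller picks.

```python
import queue


class Overloaded(Exception):
    """Raised when the intake queue is full and the job is shed."""


class BoundedIntake:
    def __init__(self, capacity: int):
        self._q = queue.Queue(maxsize=capacity)  # fixed buffer: no unbounded growth
        self.rejected = 0                        # visible counter for monitoring

    def submit(self, job) -> None:
        try:
            self._q.put_nowait(job)
        except queue.Full:
            # Fail fast and visibly rather than letting latency grow silently.
            self.rejected += 1
            raise Overloaded("queue full; retry later") from None

    def take(self):
        return self._q.get_nowait()
```

An HTTP front end would typically translate `Overloaded` into a 429 or 503 so callers can back off.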
Tail latency strategies
- Hedged requests.
- Replicate slow shards.
- Tighter timeouts on inner calls.
- Fewer round trips (combine, prefetch).
- Async paths for non-critical (“eventual” path).
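
A hedged request can be sketched with `asyncio`: start one call, and if it has not finished after a short delay, start a second and take whichever returns first. The `hedged` helper and its `hedge_after` parameter are assumptions for illustration; the call being hedged must be safe to run twice.

```python
import asyncio


async def hedged(make_call, hedge_after: float):
    """Run make_call(); if it is still pending after hedge_after seconds, race a backup."""
    primary = asyncio.create_task(make_call())
    done, _ = await asyncio.wait({primary}, timeout=hedge_after)
    if done:
        return primary.result()

    backup = asyncio.create_task(make_call())
    done, pending = await asyncio.wait(
        {primary, backup}, return_when=asyncio.FIRST_COMPLETED
    )
    for task in pending:
        task.cancel()  # stop the loser so it does not keep consuming capacity
    # If the first finisher failed, its exception propagates to the caller.
    return done.pop().result()
```

Setting `hedge_after` near the observed p95 latency is a common starting point, since it bounds the extra load to a small share of requests.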
Geo-distributed designs
- Single-region: simplest, lowest latency for nearby users, fails as a unit.
- Multi-region active-passive: failover. RTO/RPO tradeoffs. Cost: replicating data + idle standby.
- Multi-region active-active: read locally, complex consistency. Common with eventually consistent storage.
- Edge / regional partition: each region serves its tenants exclusively (data locality, GDPR).
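Spanner and CockroachDB are options for global strong consistency, at a latency cost. DynamoDB Global Tables replicate asynchronously across regions in their default mode (last-writer-wins), so treat them as eventually consistent across regions.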
Common pitfalls in design interviews
- Skipping clarifying questions.
- Overengineering for scale that wasn’t asked.
- Ignoring write path.
- No mention of failure modes.
- Not handling concurrent updates.
- Forgetting auth/observability/deploy.
- Picking exotic tech (Cassandra) for a simple problem.
- Not addressing the interviewer’s prompts.
Frequently-asked deep dives
URL shortener
- ID gen: random 7-char string, base62 encoding of an integer counter, or distributed Snowflake-style IDs.
- DB: KV store (Redis/DynamoDB) or RDBMS for analytics.
- Reads outnumber writes about 100:1 → lean heavily on cache and CDN.
- Custom slugs collide → use `INSERT ... ON CONFLICT`.
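
The "base62 over an int counter" option can be sketched as below. The alphabet ordering (digits, uppercase, lowercase) is an assumption; any fixed 62-character ordering works as long as encode and decode agree.

```python
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"


def encode_base62(n: int) -> str:
    """Turn a non-negative counter value into a short slug."""
    if n < 0:
        raise ValueError("counter must be non-negative")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, rem = divmod(n, 62)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


def decode_base62(slug: str) -> int:
    """Inverse of encode_base62."""
    n = 0
    for ch in slug:
        n = n * 62 + ALPHABET.index(ch)
    return n
```

Seven characters cover 62^7 (about 3.5 trillion) IDs. Sequential counters produce guessable slugs, which is one reason to prefer the random option when enumeration matters.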
Twitter feed
- Fanout-on-write: precompute timelines at tweet time. Fast reads; write fan-out equals follower count (millions for celebrities, so handle those separately).
- Fanout-on-read: assemble the timeline at read time. Cheap writes, but lots of read-time work for active users.
- Hybrid: fanout-on-write for normal users; pull-on-read for celebrity authors.
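
The hybrid approach can be sketched in memory as follows. `HybridFeed`, the follower-count threshold, and the logical clock are all illustrative assumptions; a real system would keep timelines in a cache and posts in a durable store.

```python
from collections import defaultdict


class HybridFeed:
    """Hypothetical feed: push for normal authors, pull for celebrity authors."""

    def __init__(self, celebrity_threshold: int):
        self.celebrity_threshold = celebrity_threshold  # assumed cutoff on follower count
        self.followers = defaultdict(set)    # author -> users following them
        self.following = defaultdict(set)    # user -> authors they follow
        self.timelines = defaultdict(list)   # user -> items pushed at write time
        self.posts = defaultdict(list)       # author -> everything they posted
        self._clock = 0                      # logical timestamp for ordering

    def follow(self, user, author):
        self.followers[author].add(user)
        self.following[user].add(author)

    def _is_celebrity(self, author) -> bool:
        return len(self.followers[author]) >= self.celebrity_threshold

    def post(self, author, text):
        self._clock += 1
        item = (self._clock, author, text)
        self.posts[author].append(item)
        if not self._is_celebrity(author):
            # Fanout-on-write: cost is proportional to follower count.
            for user in self.followers[author]:
                self.timelines[user].append(item)

    def read(self, user, limit=10):
        # Start from the precomputed timeline, then pull celebrity posts at read time.
        items = set(self.timelines[user])
        for author in self.following[user]:
            if self._is_celebrity(author):
                items.update(self.posts[author])
        return [text for _, _, text in sorted(items, reverse=True)[:limit]]
```

This sketch ignores authors who later drop below the threshold; handling that transition is part of what makes the hybrid design harder than either pure strategy.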
Chat system
- Connection mgmt: WebSocket / sticky LB.
- Storage: per-conversation partition.
- Presence: ephemeral, Redis.
- Push delivery: APNS/FCM for offline.
- Retention + search.
Rate limiter
- Fixed window: simple, edge-of-window burst.
- Sliding window log (ZSET): exact, more memory.
- Token bucket: bursts allowed up to bucket size.
- Distributed: Redis Lua atomic check-and-decrement.
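
A single-process token bucket can be sketched as below. The `TokenBucket` class and its injectable `clock` parameter are assumptions made so the behavior is easy to test; the distributed variant would move the same refill-and-decrement logic into one atomic step on the shared store.

```python
import time


class TokenBucket:
    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = rate              # tokens added per second
        self.capacity = capacity      # maximum burst size
        self._clock = clock
        self._tokens = float(capacity)
        self._last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        now = self._clock()
        # Refill lazily based on elapsed time, capped at the bucket size.
        self._tokens = min(self.capacity, self._tokens + (now - self._last) * self.rate)
        self._last = now
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False
```

With `rate=1` and `capacity=3`, a client can burst three requests immediately and then sustains one per second.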
Notification system
- Fanout via queue.
- Per-user dedupe.
- Retry with TTL.
- Per-channel adapter (email, push, SMS).
- Quiet hours / preferences.
- Audit log.
Payment flow
- Idempotency key per request.
- Saga: authorize → fulfill → capture (or cancel).
- Double-entry ledger for accounting.
- Reconciliation against payment provider.
- Watch for race conditions in balance updates (`SELECT ... FOR UPDATE` or atomic increment).
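
A double-entry ledger with guarded balance updates can be sketched in memory as below. The `Ledger` class is hypothetical, and the `threading.Lock` stands in for the row lock a database would provide; the account names are made up for the example.

```python
import threading
from collections import defaultdict


class InsufficientFunds(Exception):
    pass


class Ledger:
    def __init__(self):
        self.entries = []                   # append-only: (txn_id, account, amount_cents)
        self._balances = defaultdict(int)   # derived view of the entries
        self._applied = set()               # txn ids already posted (idempotency)
        self._lock = threading.Lock()       # stand-in for a database row lock

    def transfer(self, txn_id, src, dst, amount_cents, allow_negative=False):
        """Post one debit and one credit atomically. Returns False for a replayed txn_id."""
        with self._lock:
            if txn_id in self._applied:
                return False
            # The check and the write happen under the same lock, so two concurrent
            # transfers cannot both pass the balance check.
            if not allow_negative and self._balances[src] < amount_cents:
                raise InsufficientFunds(src)
            for account, delta in ((src, -amount_cents), (dst, amount_cents)):
                self.entries.append((txn_id, account, delta))
                self._balances[account] += delta
            self._applied.add(txn_id)
            return True

    def balance(self, account):
        with self._lock:
            return self._balances[account]
```

Because every transfer writes two entries that sum to zero, the sum over all entries is an invariant that reconciliation can check.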
Geo dispatch (Uber-like)
- Geohash / S2 / quadtree for spatial index.
- Match: nearest available drivers within radius.
- Driver location updates: high write QPS — use streaming + in-memory grid (Redis Geo, Tile38).
- Surge: time-window aggregation per cell.
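
An in-memory grid for driver matching can be sketched as below. The fixed 0.01-degree cell size, the `DriverGrid` class, and the 3x3 neighbor scan are simplifying assumptions; the scan is only complete for radii smaller than one cell, and production systems would use geohash or S2 cells instead.

```python
import math
from collections import defaultdict

CELL_DEG = 0.01  # assumed cell size; about 1.1 km of latitude


def cell_of(lat, lon):
    return (math.floor(lat / CELL_DEG), math.floor(lon / CELL_DEG))


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin((p2 - p1) / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


class DriverGrid:
    def __init__(self):
        self._cells = defaultdict(set)  # cell -> driver ids currently inside it
        self._pos = {}                  # driver id -> (lat, lon)

    def update(self, driver_id, lat, lon):
        # High-frequency write path: move the driver between cell buckets.
        old = self._pos.get(driver_id)
        if old is not None:
            self._cells[cell_of(*old)].discard(driver_id)
        self._pos[driver_id] = (lat, lon)
        self._cells[cell_of(lat, lon)].add(driver_id)

    def nearest(self, lat, lon, radius_km, k=3):
        # Scan the rider's cell and its 8 neighbors, then rank by true distance.
        cx, cy = cell_of(lat, lon)
        hits = []
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for driver_id in self._cells.get((cx + dx, cy + dy), ()):
                    d = haversine_km(lat, lon, *self._pos[driver_id])
                    if d <= radius_km:
                        hits.append((d, driver_id))
        return [driver_id for _, driver_id in sorted(hits)[:k]]
```

The cell lookup keeps each match query to a handful of buckets regardless of how many drivers exist city-wide.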
News feed ranking
- Candidate generation (followed users, popular).
- Feature retrieval (recency, affinity, content type).
- Scoring (ML model serving with low p99).
- Dedup, diversify, pagination.
Distributed file sync
- Chunk file, hash chunks, dedupe.
- Local cache + lazy sync.
- Conflict resolution (last-write-wins, branch on conflict).
- Delta sync.
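
Chunking, hashing, and dedupe can be sketched with a content-addressed store. The `ChunkStore` class and the tiny 4-byte chunk size are assumptions chosen to keep the example readable; real clients use chunks in the megabyte range.

```python
import hashlib


class ChunkStore:
    """Hypothetical content-addressed blob store keyed by SHA-256 of each chunk."""

    def __init__(self):
        self.blobs = {}  # digest -> chunk bytes

    def upload(self, data: bytes, chunk_size: int = 4):
        """Store a file; return (manifest, number of chunks actually uploaded)."""
        manifest, uploaded = [], 0
        for i in range(0, len(data), chunk_size):
            chunk = data[i:i + chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.blobs:
                # Only chunks the store has never seen cross the network.
                self.blobs[digest] = chunk
                uploaded += 1
            manifest.append(digest)
        return manifest, uploaded

    def download(self, manifest) -> bytes:
        return b"".join(self.blobs[d] for d in manifest)
```

Editing one chunk of a synced file uploads only that chunk, which is the basis of delta sync. Fixed-size chunks handle in-place edits well but not insertions, which shift every later boundary.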
Search
- Inverted index (Elasticsearch / Lucene).
- Indexing pipeline (Kafka → indexer with idempotency).
- Query: filters first, then scoring.
- Personalization layer.
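
A toy inverted index shows the moving parts. `InvertedIndex` is a hypothetical class, and raw term frequency is used as a stand-in for a real relevance function such as BM25; re-indexing a document replaces its old postings so replayed pipeline messages are idempotent.

```python
import re
from collections import Counter, defaultdict


def _tokens(text: str):
    return re.findall(r"[a-z0-9]+", text.lower())


class InvertedIndex:
    def __init__(self):
        self._postings = defaultdict(dict)  # term -> {doc_id: term frequency}
        self._docs = {}                     # doc_id -> (terms, filterable fields)

    def index(self, doc_id, text, **fields):
        # Drop postings from any earlier version so a replay leaves the same state.
        if doc_id in self._docs:
            for term in self._docs[doc_id][0]:
                self._postings[term].pop(doc_id, None)
        counts = Counter(_tokens(text))
        for term, tf in counts.items():
            self._postings[term][doc_id] = tf
        self._docs[doc_id] = (set(counts), fields)

    def search(self, query, **filters):
        scores = Counter()
        for term in _tokens(query):
            for doc_id, tf in self._postings.get(term, {}).items():
                fields = self._docs[doc_id][1]
                # Apply filters before scoring so excluded documents cost nothing more.
                if all(fields.get(k) == v for k, v in filters.items()):
                    scores[doc_id] += tf
        # Highest score first; ties broken by doc_id for stable output.
        return [d for d, _ in sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))]
```

A personalization layer would typically re-rank the list this returns rather than change the index itself.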
Recommended reading
- Designing Data-Intensive Applications — Kleppmann.
- System Design Interview — Alex Xu (vol 1+2).
- High Scalability blog.
- Hello Interview / ByteByteGo / Educative system design courses.
- Engineering blogs of Stripe, Uber, Airbnb, Discord, Dropbox, Cloudflare.