Redis — Theory

Redis — Theory (interview deep-dive)

All data lives in memory, so reads involve no disk seeks.
Commands execute on a single thread: no locks, no context switches, no contention between commands.
Modern Redis (6+) uses I/O threads for network read/write, but command execution stays single-threaded.
The CPU is rarely the bottleneck; memory bandwidth and network usually are.

Cache-aside (lazy loading): the app reads the cache; on a miss, it queries the DB and populates the cache. Pros: only caches what’s used. Cons: stale data possible; thundering herd on a hot-key miss.
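The read path described above can be sketched as follows. This is a minimal illustration, not a production client: TTLCache is a hypothetical in-memory stand-in for Redis GET / SET with an expiry, and get_or_load and load_from_db are names invented for this example.

```python
import time


class TTLCache:
    """Hypothetical in-memory stand-in for Redis GET / SET with an expiry."""

    def __init__(self):
        self._data = {}  # key -> (value, expiry on the monotonic clock)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._data[key]  # expired: behave like a miss
            return None
        return value

    def set(self, key, value, ttl_s):
        self._data[key] = (value, time.monotonic() + ttl_s)


def get_or_load(cache, key, load_from_db, ttl_s=300):
    """Cache-aside read: try the cache, fall back to the DB, then populate."""
    value = cache.get(key)
    if value is not None:
        return value            # hit: the DB is never touched
    value = load_from_db()      # miss: the app itself queries the DB
    if value is not None:
        cache.set(key, value, ttl_s)
    return value
```

Note that nothing here stops many concurrent callers from missing at once and all calling load_from_db; that is the thundering-herd weakness listed under Cons.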

Read-through: the cache library fetches from the DB on a miss, transparently to the app.

Write-through: every write goes to cache and DB synchronously. The cache is always fresh, at the cost of higher write latency.

Write-behind (write-back): write to the cache, then flush to the DB asynchronously. Fastest writes; writes not yet flushed are lost on a crash.
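To make the difference between the synchronous and the asynchronous write path concrete, here is a small sketch. Plain dicts stand in for both the cache and the DB, and the class names are hypothetical; a real write-behind setup would flush from a background worker rather than through an explicit flush() call.

```python
from collections import deque


class WriteThroughCache:
    """Each write hits the DB and the cache before returning."""

    def __init__(self, db):
        self.db = db
        self.cache = {}

    def write(self, key, value):
        self.db[key] = value      # synchronous DB write (the slow part)
        self.cache[key] = value   # cache is updated in the same call


class WriteBehindCache:
    """Writes land in the cache at once and reach the DB only on flush."""

    def __init__(self, db):
        self.db = db
        self.cache = {}
        self.pending = deque()    # buffered writes, lost if the process dies

    def write(self, key, value):
        self.cache[key] = value
        self.pending.append((key, value))

    def flush(self):
        """Drain the buffer into the DB; returns the number of writes applied."""
        flushed = 0
        while self.pending:
            key, value = self.pending.popleft()
            self.db[key] = value
            flushed += 1
        return flushed
```

Until flush() runs, the DB lags behind the cache, which is exactly the window in which a crash loses data.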

Refresh-ahead: proactively refresh hot keys before their TTL expires.

Stale data: shorter TTL, or invalidation on DB writes.
Cache stampede / thundering herd: many requests miss simultaneously, hammer DB.
- Single-flight: lock per key, others wait or use stale.
- Probabilistic early expiration (XFetch).
- Background refresh job.
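The single-flight idea can be sketched in-process as below. This is an assumption-laden illustration: SingleFlight is a hypothetical helper that only deduplicates callers inside one Python process; across several app servers the same role would be played by a per-key lock held in Redis.

```python
import threading


class SingleFlight:
    """Collapse concurrent loads of the same key into one call."""

    def __init__(self):
        self._lock = threading.Lock()
        self._inflight = {}  # key -> shared state of the call in progress

    def do(self, key, fn):
        with self._lock:
            call = self._inflight.get(key)
            leader = call is None
            if leader:
                call = {"done": threading.Event(), "value": None, "error": None}
                self._inflight[key] = call
        if leader:
            try:
                call["value"] = fn()          # only the leader hits the DB
            except Exception as exc:
                call["error"] = exc           # followers see the same failure
            finally:
                with self._lock:
                    del self._inflight[key]
                call["done"].set()            # wake up the waiting followers
        else:
            call["done"].wait()               # follower: wait for the leader
        if call["error"] is not None:
            raise call["error"]
        return call["value"]
```

A caller would wrap its DB load, for example sf.do("user:42", load_user_42), where load_user_42 is whatever function performs the query.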
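Hot key: read replicas, local L1 cache, key sharding (N copies key.1 ... key.N; each read picks one at random).
Big key (>10MB): slows replication, and evicting or deleting it blocks the server. Split it into smaller keys.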
Cache penetration: cache negative results with short TTL, or bloom filter.
Cache avalanche: add jitter to TTLs.
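Two of the mitigations above reduce to a line of arithmetic each. The sketch below shows probabilistic early expiration in the XFetch style (refresh when now - delta * beta * ln(u) >= expiry, with u uniform in (0, 1]) and TTL jitter. The function names, the default beta of 1.0, and the 10% spread are assumptions made for this example.

```python
import math
import random


def should_refresh_early(now, expires_at, delta, beta=1.0, rng=random.random):
    """XFetch-style check: True means "recompute now, before the TTL runs out".

    delta is how long a recompute takes (seconds); beta > 1 refreshes earlier.
    """
    u = 1.0 - rng()  # random.random() is in [0, 1), so u is in (0, 1]
    # log(u) <= 0, so this adds a random, delta-scaled head start to `now`.
    return now - delta * beta * math.log(u) >= expires_at


def jittered_ttl(base_ttl_s, spread=0.1, rng=random.random):
    """Spread expirations over base_ttl_s +/- spread to avoid an avalanche."""
    return base_ttl_s * (1.0 + spread * (2.0 * rng() - 1.0))
```

The closer a key is to expiry, and the more expensive its recompute, the more likely a single request refreshes it early while all other requests keep serving the cached value.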

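Single-instance: SET key value NX PX 30000, where value is a random token unique to the lock holder.
Release: Lua DEL only if value matches (avoid releasing someone else’s lock).
Redlock (N independent masters): controversial. Kleppmann’s critique: safety depends on timing assumptions (clock skew, process pauses). Use only with fencing tokens.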
For most apps, single-instance lock with idempotent operations is enough.
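A sketch of the single-instance lock follows. The acquire and release helpers use the redis-py call shapes set(name, value, nx=True, px=...) and eval(script, numkeys, *keys_and_args); FakeRedis is a hypothetical in-memory stand-in that emulates only those two calls so the example runs without a server, and its eval does not interpret Lua.

```python
import secrets
import time

# Compare-and-delete: only the holder of the matching token may release.
RELEASE_SCRIPT = """
if redis.call('GET', KEYS[1]) == ARGV[1] then
    return redis.call('DEL', KEYS[1])
end
return 0
"""


def acquire(client, name, ttl_ms=30000):
    """Try SET name token NX PX ttl_ms; return the token, or None if held."""
    token = secrets.token_hex(16)
    if client.set(name, token, nx=True, px=ttl_ms):
        return token
    return None


def release(client, name, token):
    """Delete the lock only if it still holds our token."""
    return client.eval(RELEASE_SCRIPT, 1, name, token) == 1


class FakeRedis:
    """Hypothetical stand-in emulating only the two calls used above."""

    def __init__(self):
        self._data = {}  # key -> (value, expiry on the monotonic clock)

    def _live_value(self, key):
        entry = self._data.get(key)
        if entry is not None and time.monotonic() < entry[1]:
            return entry[0]
        self._data.pop(key, None)  # missing or expired
        return None

    def set(self, key, value, nx=False, px=None):
        if nx and self._live_value(key) is not None:
            return None
        expires = time.monotonic() + px / 1000 if px is not None else float("inf")
        self._data[key] = (value, expires)
        return True

    def eval(self, script, numkeys, key, token):
        # Mimics RELEASE_SCRIPT in Python; a real server would run the Lua.
        if self._live_value(key) == token:
            del self._data[key]
            return 1
        return 0
```

The TTL bounds how long a crashed holder can block others, which is also why the protected operation should be idempotent: the lock can expire while its holder is still working.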

Object overhead: keys cost ~50-100 bytes per entry. Many small keys → bloat.
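HASH instead of separate keys when grouping fields of one object: much more compact, because small hashes use the listpack encoding (ziplist before Redis 7).
OBJECT ENCODING key shows the internal representation (e.g. listpack, hashtable).
MEMORY USAGE key reports the bytes used by one key.
redis-cli --bigkeys scans the keyspace with SCAN and reports the biggest key of each type.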

Multi-key ops: keys must be in same slot. Use hash tags: user:{42}:profile.
Lua scripts: same constraint.
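Resharding moves slots between nodes; clients follow MOVED redirects (and ASK redirects while a slot is migrating).
Failover typically takes ~10-30s, depending largely on cluster-node-timeout (default 15s).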
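The same-slot rule is easier to reason about with the slot computation in hand. Redis Cluster maps a key to CRC16(key) mod 16384, and when the key contains a non-empty {...} section only that part is hashed. The sketch below relies on Python's binascii.crc_hqx, which with an initial value of 0 computes the same CRC16 variant (XMODEM); hash_slot is a name chosen for this example.

```python
import binascii


def hash_slot(key: str) -> int:
    """Compute the Redis Cluster slot (0..16383) for a key."""
    data = key.encode()
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        # Only a non-empty tag counts; "foo{}bar" is hashed as a whole.
        if end != -1 and end != start + 1:
            data = data[start + 1:end]
    return binascii.crc_hqx(data, 0) % 16384


# Both keys hash on "42", so they share a slot and can be used together
# in MGET, MULTI/EXEC, or one Lua script.
same_slot = hash_slot("user:{42}:profile") == hash_slot("user:{42}:settings")
```

On a live cluster, CLUSTER KEYSLOT key returns the same number and is the authoritative check.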

Redis vs Memcached. Redis: more types, persistence, pub/sub, scripting, cluster. Memcached: pure KV, multithreaded.
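Implement a leaderboard: ZSET. ZADD to set scores, ZREVRANGE (or ZRANGE with REV, 6.2+) for the top N, ZREVRANK for one player’s rank.
Rate limiter for 100 req/min/user: fixed window (INCR + EXPIRE; allows bursts at window boundaries) vs sliding window (smoother; more state per user).
Atomic check-and-set across cluster: Lua script, with all keys in a single hash slot.
Cache invalidation on write: write-through vs invalidate-on-write.
In-flight commands during failover: acknowledged writes can be lost, because replication is asynchronous.
Why is KEYS * dangerous? It is a blocking O(N) walk over the whole keyspace. Use SCAN.
Find unused keys: OBJECT IDLETIME on keys found via SCAN. (--bigkeys and --memkeys find large keys, not idle ones.)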
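For the rate-limiter question, the difference between the two window types shows up at a window boundary. The sketch below keeps state in process memory to stay runnable; the class names are hypothetical. In Redis, the fixed window would typically be an INCR on a per-window key plus EXPIRE, and the sliding window a sorted set of timestamps trimmed with ZREMRANGEBYSCORE and counted with ZCARD.

```python
from collections import defaultdict, deque


class FixedWindowLimiter:
    """One counter per (user, window index)."""

    def __init__(self, limit=100, window_s=60):
        self.limit = limit
        self.window_s = window_s
        self.counts = defaultdict(int)

    def allow(self, user, now):
        bucket = (user, int(now // self.window_s))
        if self.counts[bucket] >= self.limit:
            return False
        self.counts[bucket] += 1
        return True


class SlidingWindowLimiter:
    """Log of request timestamps per user, trimmed to the last window_s."""

    def __init__(self, limit=100, window_s=60):
        self.limit = limit
        self.window_s = window_s
        self.log = defaultdict(deque)

    def allow(self, user, now):
        q = self.log[user]
        while q and q[0] <= now - self.window_s:
            q.popleft()               # drop requests older than the window
        if len(q) >= self.limit:
            return False
        q.append(now)
        return True
```

With a limit of 100 per 60 s, 100 requests at t=59 followed by 100 more at t=61 all pass the fixed-window limiter (two different windows), while the sliding-window limiter rejects the second batch because the first is still inside the last 60 seconds.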