Performance & Profiling — Theory

Latency math (Little’s Law)

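L = λ × W

where L is the average number of requests in the system (concurrency), λ is the average arrival rate (equal to throughput at steady state), and W is the average time a request spends in the system (latency).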

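Implications:

At a fixed concurrency limit, throughput is capped at L / W, so cutting latency raises the ceiling.
To sustain arrival rate λ at latency W, the system must hold λ × W requests in flight; size pools and queues accordingly.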
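As a worked instance of Little’s Law, the sketch below computes the concurrency a service needs for a given load, and the throughput ceiling a fixed pool imposes. The function names and the numbers (2,000 req/s, 50 ms, a pool of 100) are hypothetical and chosen only for illustration.

```python
def required_concurrency(arrival_rate_per_s: float, latency_s: float) -> float:
    """L = lambda * W: average number of requests in flight."""
    return arrival_rate_per_s * latency_s


def max_throughput(concurrency_limit: float, latency_s: float) -> float:
    """lambda = L / W: throughput ceiling for a fixed concurrency limit."""
    return concurrency_limit / latency_s


if __name__ == "__main__":
    # Hypothetical service: 2,000 req/s at 50 ms average latency.
    print("in flight:", required_concurrency(2000, 0.050))
    # Hypothetical pool of 100 connections at the same latency.
    print("req/s ceiling:", max_throughput(100, 0.050))
    # Halving latency doubles the ceiling for the same pool.
    print("req/s ceiling at 25 ms:", max_throughput(100, 0.025))
```

The law holds for long-run averages in a stable system, so treat the results as sizing estimates rather than guarantees for any single moment.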

C(N) = N / (1 + α(N-1) + β·N(N-1))

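This is the Universal Scalability Law: relative throughput C(N) as a function of N (workers/cores). Two coefficients: α is contention (time serialized on shared resources) and β is coherency (the cost of keeping shared state consistent across workers, which grows with the number of worker pairs).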

Real systems plateau and then regress as N grows because β > 0. More cores ≠ more throughput past a point.
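To make the plateau and regression concrete, the sketch below evaluates C(N) for a few values of N and locates the peak. The peak position N* = sqrt((1 - α) / β) follows from setting dC/dN = 0. The coefficient values are hypothetical; in practice α and β are fitted to measured throughput.

```python
import math


def usl_capacity(n: float, alpha: float, beta: float) -> float:
    """C(N) = N / (1 + alpha*(N-1) + beta*N*(N-1))."""
    return n / (1 + alpha * (n - 1) + beta * n * (n - 1))


def usl_peak(alpha: float, beta: float) -> float:
    """N at which C(N) is maximal; requires beta > 0."""
    return math.sqrt((1 - alpha) / beta)


if __name__ == "__main__":
    alpha, beta = 0.05, 0.001  # hypothetical coefficients, not measured
    for n in (1, 8, 16, 32, 64, 128):
        print(n, round(usl_capacity(n, alpha, beta), 2))
    print("peak near N =", round(usl_peak(alpha, beta), 1))
```

With these assumed coefficients, throughput at N = 128 is lower than at N = 8, which is the regression the formula predicts once the β term dominates.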

Speedup limited by serial fraction (Amdahl’s Law):

S(N) = 1 / (s + p/N)

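Here s is the serial fraction and p = 1 - s is the parallel fraction. If 5% of the work is serial, the maximum speedup is 1/0.05 = 20×, regardless of how many cores are added.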
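The 20× ceiling can be checked numerically. This is a minimal sketch of the speedup formula with s = 0.05; the function name and the chosen core counts are illustrative.

```python
def amdahl_speedup(serial_fraction: float, n: int) -> float:
    """S(N) = 1 / (s + p/N), with p = 1 - s."""
    parallel_fraction = 1.0 - serial_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n)


if __name__ == "__main__":
    s = 0.05
    for n in (1, 2, 8, 64, 1024):
        print(n, round(amdahl_speedup(s, n), 2))
    # As N grows, p/N vanishes and S(N) approaches 1/s.
    print("limit:", 1 / s)
```

Going from 64 to 1024 cores buys only a few extra multiples of speedup here, because the serial 5% already dominates the runtime.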

Profiling is the third step, not the first.

Lock contention: many threads waiting on one mutex.
Fix: shrink the critical section, switch to a lock-free data structure, or partition the state so threads stop sharing a single lock.
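One way to partition state is to split a shared counter into shards, each with its own lock, so writers touching different shards never wait on each other. The sketch below is illustrative only: `ShardedCounter` and `run_demo` are hypothetical names, and in CPython the GIL limits real parallel gains, so this shows the structure rather than a measured speedup.

```python
import threading


class ShardedCounter:
    """Counter split into shards, each guarded by its own lock."""

    def __init__(self, shards: int = 16) -> None:
        self._locks = [threading.Lock() for _ in range(shards)]
        self._counts = [0] * shards

    def increment(self, key: int) -> None:
        i = key % len(self._locks)  # pick the shard for this key
        with self._locks[i]:        # critical section is a single add
            self._counts[i] += 1

    def total(self) -> int:
        # Locks are taken one shard at a time, so the sum is exact only
        # once writers have finished; during writes it is approximate.
        total = 0
        for i, lock in enumerate(self._locks):
            with lock:
                total += self._counts[i]
        return total


def run_demo(threads: int = 8, per_thread: int = 10_000) -> int:
    counter = ShardedCounter()

    def worker(worker_id: int) -> None:
        for _ in range(per_thread):
            counter.increment(worker_id)

    pool = [threading.Thread(target=worker, args=(i,)) for i in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return counter.total()


if __name__ == "__main__":
    print(run_demo())
```

The trade-off is that reads become more expensive and only approximately consistent, which is usually acceptable for metrics-style counters.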

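Coordinated omission: closed-loop load testers send the next request only after the previous response arrives, so a slow response pauses injection and the requests that should have been sent during the stall are never measured.
This UNDER-reports tail latency.
Use wrk2 or k6’s constant-arrival-rate executor for honest results.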
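The effect can be reproduced with a small deterministic simulation. This is a simplified model under stated assumptions, not how any particular load tester is implemented: the server takes 1 ms per request, freezes once for one second, and handles the backlog concurrently afterwards. All names and numbers are hypothetical.

```python
def completion_time(start: float, stall_start: float = 1.0,
                    stall_end: float = 2.0, service: float = 0.001) -> float:
    """Server model: 1 ms service time, frozen during [stall_start, stall_end)."""
    begin = stall_end if stall_start <= start < stall_end else start
    return begin + service


def closed_loop(duration: float = 10.0, interval: float = 0.01) -> list:
    """Tester waits for each response before sending the next request."""
    latencies = []
    t = 0.0
    while t < duration:
        done = completion_time(t)
        latencies.append(done - t)
        t = max(t + interval, done)  # nothing is sent while waiting
    return latencies


def open_loop(duration: float = 10.0, interval: float = 0.01) -> list:
    """Constant arrival rate: latency is measured from the intended send time."""
    latencies = []
    for k in range(round(duration / interval)):
        intended = k * interval
        latencies.append(completion_time(intended) - intended)
    return latencies


def percentile(samples: list, q: float) -> float:
    ordered = sorted(samples)
    return ordered[min(len(ordered) - 1, int(q * len(ordered)))]


if __name__ == "__main__":
    print("closed-loop p99:", round(percentile(closed_loop(), 0.99), 3))
    print("open-loop p99:  ", round(percentile(open_loop(), 0.99), 3))
```

The closed-loop run records the stall as a single slow sample, so its p99 stays near the 1 ms service time. The open-loop run counts every request that was due during the stall, so its p99 lands in the hundreds of milliseconds.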

You typically need both: benchmark to set targets and quantify gain, profile to identify what to fix.

Compiler optimizes dead code away.
Cold caches.
JIT not warmed up.
Power scaling / thermal throttling on laptops.
Use a library that handles these (Java JMH, Go testing.B, Rust criterion, JS benchmark.js).
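For comparison, Python’s standard-library `timeit` module plays a similar but much simpler role than the harnesses listed above. The sketch below adds an explicit warmup and takes the minimum over several repeats to reduce noise. `bench` and `sum_of_squares` are hypothetical names, and this does nothing about power scaling or thermal throttling.

```python
import timeit


def sum_of_squares(n: int) -> int:
    return sum(i * i for i in range(n))


def bench(fn, *args, warmup: int = 3, repeat: int = 5, number: int = 1000) -> float:
    """Return the best observed seconds-per-call for fn(*args)."""
    for _ in range(warmup):
        fn(*args)  # warm caches and any lazy initialization
    timer = timeit.Timer(lambda: fn(*args))
    # Each entry is the total time for `number` calls; timeit disables
    # garbage collection during timing by default.
    totals = timer.repeat(repeat=repeat, number=number)
    # The minimum is the run least disturbed by other system activity.
    return min(totals) / number


if __name__ == "__main__":
    per_call = bench(sum_of_squares, 1000)
    print(f"{per_call * 1e6:.2f} us per call")
```

The measured time includes the overhead of the wrapping lambda, so it is best used for comparing alternatives rather than as an absolute figure.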

p99 latency degraded: how to investigate? Reproduce the regression; check traces for the slow span; look for resource saturation, recent changes, and downstream slowness; check GC and the DB.
CPU at 30% but latency high: why? The service is waiting, not computing: I/O bound, lock contention, slow downstream, GC stalls, or network.
Service slow only at p999: why? Rare events: GC, a slow downstream call, or a resource cliff. Look at tail samples in tracing.
GC pauses causing tail latency in Go/Java: what helps? A smaller heap, GC tuning (GOGC, -XX:MaxGCPauseMillis), allocation reduction (pooling, buffer reuse), arenas.
What’s a flame graph? Aggregated stack samples drawn as stacked frames; width is proportional to sample count, so wide = hot.
CPU vs wall-clock profile: when to use each? A CPU profile shows compute hotspots; a wall-clock profile shows where the program waits (I/O, locks).
Off-CPU profiling: what is it? A profile of time spent NOT running (blocked on I/O, a lock, or sleep). Useful for I/O-bound services.
Async event loop blocked: what are the symptoms? Latency rises for unrelated requests. Detect it with event loop lag instrumentation; fix it by offloading synchronous work to worker threads.
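Event loop lag instrumentation can be sketched with asyncio: a monitor task sleeps for short intervals and records how late each wake-up is. The handler names, the 200 ms of blocking work, and the intervals are hypothetical; `time.sleep` stands in for any synchronous call. `asyncio.to_thread` requires Python 3.9 or later.

```python
import asyncio
import time


async def measure_max_lag(duration_s: float = 0.5, interval_s: float = 0.01) -> float:
    """Sleep in short intervals and return the worst wake-up delay seen."""
    loop = asyncio.get_running_loop()
    max_lag = 0.0
    deadline = loop.time() + duration_s
    while loop.time() < deadline:
        start = loop.time()
        await asyncio.sleep(interval_s)
        lag = loop.time() - start - interval_s  # extra time beyond the sleep
        max_lag = max(max_lag, lag)
    return max_lag


def slow_sync_work() -> None:
    time.sleep(0.2)  # stand-in for blocking synchronous code


async def blocking_handler() -> None:
    await asyncio.sleep(0.05)
    slow_sync_work()  # runs on the loop thread and stalls every other task


async def offloaded_handler() -> None:
    await asyncio.sleep(0.05)
    await asyncio.to_thread(slow_sync_work)  # loop stays free meanwhile


async def run(handler) -> float:
    monitor = asyncio.create_task(measure_max_lag())
    await handler()
    return await monitor


if __name__ == "__main__":
    print("max lag, blocking :", round(asyncio.run(run(blocking_handler)), 3))
    print("max lag, offloaded:", round(asyncio.run(run(offloaded_handler)), 3))
```

In the blocking case the monitor wakes up roughly as late as the blocking call lasts, while the offloaded case keeps lag small. For CPU-bound Python code a thread may not be enough because of the GIL, and a process pool is the usual alternative.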