Microservices — Theory (interview deep-dive)

Saga pattern

Distributed transaction as a chain of local transactions. If any step fails, run compensating transactions to undo previous steps. Trades immediate consistency for eventual consistency.

Two flavors

Choreography — services react to each other’s events. No central coordinator.
- Pros: simple, decoupled.
- Cons: hard to reason about flow, cyclical dependencies, hard to debug.
Orchestration — central orchestrator calls services in order, handles compensation.
- Pros: explicit flow, easier to monitor.
- Cons: orchestrator is SPOF / additional service.
- Tools: Temporal, Cadence, AWS Step Functions, Camunda.

Compensation requirements

Each step needs an inverse (refund payment, restock inventory, cancel reservation).
Compensations must be idempotent and commutative (may run multiple times in any order).
Some actions are not truly reversible (e.g. send email) → mitigate with confirmation step or soft-delete.

Outbox pattern

Problem: dual-write — update DB and publish to broker atomically? You can’t (two-phase commit is rare/painful).

Solution:

In same DB transaction, INSERT into outbox table the event payload.
A relay process polls outbox (or uses CDC like Debezium reading the WAL) and publishes to broker.
Mark row published or delete after ack.

Guarantees at-least-once delivery. Consumers must be idempotent (dedupe by event id).
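
A minimal sketch of the steps above, assuming SQLite as the service database and a polling relay. The table layout, the `order.created` topic and the `publish` callback are hypothetical stand-ins, not any particular broker's API.

```python
import json
import sqlite3
import uuid

def init_db(conn):
    # Hypothetical schema: one business table plus the outbox table.
    conn.executescript("""
        CREATE TABLE orders (id TEXT PRIMARY KEY, total_cents INTEGER NOT NULL);
        CREATE TABLE outbox (
            event_id  TEXT PRIMARY KEY,
            topic     TEXT NOT NULL,
            payload   TEXT NOT NULL,
            published INTEGER NOT NULL DEFAULT 0
        );
    """)

def place_order(conn, order_id, total_cents):
    """Business write and outbox insert commit or roll back together."""
    event_id = str(uuid.uuid4())
    payload = json.dumps(
        {"event_id": event_id, "order_id": order_id, "total_cents": total_cents}
    )
    with conn:  # one local transaction: commit on success, rollback on error
        conn.execute(
            "INSERT INTO orders (id, total_cents) VALUES (?, ?)",
            (order_id, total_cents),
        )
        conn.execute(
            "INSERT INTO outbox (event_id, topic, payload) VALUES (?, ?, ?)",
            (event_id, "order.created", payload),
        )
    return event_id

def relay_once(conn, publish):
    """Polling relay: publish unpublished rows in insert order, then mark them."""
    rows = conn.execute(
        "SELECT event_id, topic, payload FROM outbox "
        "WHERE published = 0 ORDER BY rowid"
    ).fetchall()
    for event_id, topic, payload in rows:
        publish(topic, payload)  # a crash right after this re-sends the row later
        with conn:
            conn.execute(
                "UPDATE outbox SET published = 1 WHERE event_id = ?", (event_id,)
            )
    return len(rows)
```

The window between `publish` and the `UPDATE` is exactly where the at-least-once behaviour comes from: a relay that dies there publishes the same row again on restart.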

Inbox pattern

Mirror on consumer side: store seen event ids in inbox table inside the same txn that processes the event. On retry, INSERT IGNORE / ON CONFLICT skips duplicates.
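
A rough sketch of the consumer side, again assuming SQLite. The `inbox` and `balances` tables and the `handle_credit` handler are hypothetical; the point is only that the dedupe insert and the business write share one transaction.

```python
import sqlite3

def init_db(conn):
    # Hypothetical schema: inbox of seen event ids plus one business table.
    conn.executescript("""
        CREATE TABLE inbox (event_id TEXT PRIMARY KEY);
        CREATE TABLE balances (account TEXT PRIMARY KEY, cents INTEGER NOT NULL);
    """)

def handle_credit(conn, event_id, account, cents):
    """Apply a credit at most once per event_id.

    Returns True if the credit was applied, False if the event was a duplicate.
    """
    with conn:  # inbox insert and business write are one local transaction
        cur = conn.execute(
            "INSERT OR IGNORE INTO inbox (event_id) VALUES (?)", (event_id,)
        )
        if cur.rowcount == 0:
            return False  # already processed: skip the side effect, still ack
        conn.execute(
            "INSERT INTO balances (account, cents) VALUES (?, ?) "
            "ON CONFLICT(account) DO UPDATE SET cents = cents + excluded.cents",
            (account, cents),
        )
    return True
```

An increment like this is the kind of operation that is not naturally idempotent, which is why the inbox check matters here; a "set state" write would tolerate redelivery on its own.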

Circuit breaker

States:

Closed — normal. Track failure rate.
Open — failure threshold breached → fail fast for cooldown period.
Half-open — after cooldown, allow N probes. Success closes; failure re-opens.

Tunables: failure threshold, time window, cooldown, probe count. Per-dependency, not global.
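
A simplified sketch of the three states. It is an assumption-laden toy: it counts consecutive failures rather than a failure rate over a time window, allows a single probe in half-open, and is not thread-safe. The class and parameter names are illustrative, not taken from any of the libraries listed below.

```python
import time

class CircuitOpenError(Exception):
    """Raised when the breaker fails fast without calling the dependency."""

class CircuitBreaker:
    def __init__(self, failure_threshold=3, cooldown_s=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.cooldown_s = cooldown_s
        self.clock = clock          # injectable so tests can control time
        self.state = "closed"
        self.failures = 0
        self.opened_at = 0.0

    def call(self, fn, *args, **kwargs):
        if self.state == "open":
            if self.clock() - self.opened_at < self.cooldown_s:
                raise CircuitOpenError("failing fast during cooldown")
            self.state = "half_open"  # cooldown elapsed: let one probe through
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self._on_failure()
            raise
        self._on_success()
        return result

    def _on_failure(self):
        self.failures += 1
        # A failed probe re-opens immediately; otherwise wait for the threshold.
        if self.state == "half_open" or self.failures >= self.failure_threshold:
            self.state = "open"
            self.opened_at = self.clock()

    def _on_success(self):
        self.failures = 0
        self.state = "closed"
```

One breaker instance per dependency keeps a failing downstream from tripping calls to healthy ones.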

Implementation: Hystrix (deprecated), Resilience4j, Polly (.NET), opossum (Node), gobreaker (Go), Envoy / service mesh.

Distributed consistency models

Strong — all reads see most recent write. Linearizable. Costly.
Eventual — converges if writes stop. Default for replicated systems.
Causal — operations causally related are seen in order.
Read-your-writes — user sees their own writes immediately (sticky session, write-then-read from primary).

CAP theorem

Under network partition, you must choose Consistency or Availability.

CP systems sacrifice availability under partition: HBase, MongoDB (with majority writes), etcd, ZooKeeper.
AP systems remain available, may serve stale data: Cassandra, DynamoDB (with eventual reads), Riak.
“CA” only describes non-partitioned operation.

PACELC extends CAP: if there is a Partition, you choose Availability or Consistency; Else (normal operation), you choose Latency or Consistency.

Idempotency

A request is idempotent if executing it N times has same effect as once.

Naturally idempotent: GET, PUT, DELETE.
POST/PATCH: implement with idempotency key header — server stores response keyed by (client_id, key) for window (24h common).
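
An in-memory sketch of the idempotency-key idea, assuming a single process. `IdempotencyStore` and its method names are hypothetical; a real server would keep this in a shared store and claim the key atomically so two concurrent duplicates cannot both run the handler.

```python
import time

class IdempotencyStore:
    """Remembers responses by (client_id, key) for ttl_s seconds."""

    def __init__(self, ttl_s=24 * 3600, clock=time.time):
        self.ttl_s = ttl_s
        self.clock = clock
        self._responses = {}  # (client_id, key) -> (expires_at, response)

    def execute(self, client_id, key, handler):
        now = self.clock()
        entry = self._responses.get((client_id, key))
        if entry is not None and entry[0] > now:
            return entry[1]      # replay the stored response, skip the side effect
        response = handler()     # first request (or expired key): do the real work
        self._responses[(client_id, key)] = (now + self.ttl_s, response)
        return response
```

Scoping the key by client id means two clients that happen to pick the same key never see each other's responses.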

Failure modes to design for

Network partition.
Cascading failure — one slow service blocks others. Fix with bulkheads, timeouts, circuit breakers.
Retry storm — synchronized retries amplify load. Fix: jitter, backoff, breaker.
Thundering herd — cache miss → all clients hit DB. Fix: single-flight, request coalescing.
Slow consumer — backpressure or drop.
Poison pill — bad message blocks queue. Fix: DLQ + retry topics.
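
For the retry-storm fix, a small sketch of exponential backoff with "full jitter" (each delay drawn uniformly between zero and the capped exponential bound). The function names and default values are illustrative assumptions.

```python
import random
import time

def backoff_delays(count, base_s=0.1, cap_s=5.0, rng=random):
    """Yield `count` delays, each uniform in [0, min(cap_s, base_s * 2**attempt)]."""
    for attempt in range(count):
        yield rng.uniform(0.0, min(cap_s, base_s * (2 ** attempt)))

def retry(fn, attempts=5, sleep=time.sleep, **backoff_kwargs):
    """Call fn up to `attempts` times, sleeping a jittered delay between tries."""
    delays = list(backoff_delays(attempts - 1, **backoff_kwargs))
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise            # out of attempts: surface the last error
            sleep(delays[attempt])
```

The randomness is what de-synchronizes clients: without it, every caller that failed at the same moment retries at the same moment too.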

Service mesh — when worth it

Mesh adds latency (proxy hop) and ops complexity. Worth it when:

mTLS everywhere is required (zero trust).
Fine-grained traffic policies (canary, weighted routing) needed.
Cross-language consistency in retries / timeouts / metrics.
Observability needs more than per-service instrumentation provides.

Often overkill for <20 services. Try lighter (Linkerd) before Istio.

API versioning

URL path: /v1/, /v2/. Simple, but the version becomes part of every resource URI, so the same resource gets a different URI per version.
Header: Accept: application/vnd.acme.v2+json. Cleaner but harder to test.
Body field: version: 2. Flexible.

Strategy: never break v1. Add fields, don’t rename or remove. Use feature flags or new endpoints for breaking change. Sunset old version with notice.

Common interview questions

Monolith → microservices migration approach? Strangler fig: introduce gateway, peel off bounded contexts one by one, route traffic gradually.
How to handle cross-service queries? Avoid; if needed, use API composition (gateway aggregates) or CQRS read model materialized from events.
Service A calls B which calls C — how to debug latency? Distributed tracing (OTel). Trace id in every log.
Two services need same data? They each own a subset; one publishes events, other materializes its own view.
How to evolve event schema without breaking consumers? Schema registry (Avro/Protobuf), additive changes, never remove fields, version explicitly.
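What’s the difference between orchestration and choreography? When to use each? Choreography: services react to each other’s events with no coordinator. Orchestration: a central orchestrator issues commands, tracks state, and handles compensation. Choreography suits simple, linear flows across 2–3 services; prefer orchestration once there are ≥3 participants or branching logic.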
Outbox vs 2PC vs idempotent consumers? Outbox is pragmatic at-least-once; 2PC blocking & rare; idempotent consumers required regardless.
How to handle a slow downstream that’s flooding your queue? Backpressure, rate-limit, breaker, DLQ.
What is the database-per-service rule, and what do you do when joins are needed across? Composition at API layer or async event-driven view.

Common pitfalls

Distributed monolith — services share DB, deploy lockstep, can’t scale independently.
Synchronous chains — A→B→C→D over HTTP. Latency multiplies, single failure cascades.
Premature decomposition — splitting before you understand the domain. Start with module boundaries inside a monolith.
Ignoring data ownership — multiple services writing to same table.
No observability — debugging without distributed tracing is impossible at scale.

Deep dive — Saga orchestration vs choreography

A saga solves business transactions spanning multiple services, each with its own database (the Database per Service pattern), without using 2PC. Richardson defines it as “a sequence of local transactions. Each local transaction updates the database and publishes a message or event to trigger the next local transaction in the saga.” If any step fails, the saga executes compensating transactions that semantically undo preceding local transactions. Sagas give up Isolation from ACID — they are ACD, not ACID — so concurrent sagas can observe each other’s intermediate state (lost updates, dirty reads, fuzzy reads).

Coordination styles

Choreography — each service reacts to domain events from peers; no central brain.
Orchestration — a central orchestrator (Temporal, Camunda, AWS Step Functions, custom state machine) sends commands and tracks state.

Microsoft and Richardson both note orchestration scales better for complex flows: avoids cyclic event dependencies and centralises observability. Choreography is fine for 2–3 services and very loose coupling but quickly becomes “events flying around” with no single owner.

Diagram (orchestration, Order saga)

Client → OrderService.createOrder()
            │
            ▼
        OrderSagaOrchestrator (state: PENDING)
            │  reserveCredit
            ▼
        CustomerService ──ok──► (state: CREDIT_RESERVED)
            │  reserveInventory
            ▼
        InventoryService ──FAIL──► compensate:
                                     releaseCredit → CustomerService
                                     markRejected  → OrderService
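
The flow in the diagram can be sketched as an in-process loop. This is only an illustration under strong assumptions: step and function names are hypothetical, state is not persisted, and compensations are assumed to succeed (a real orchestrator stores saga state durably and retries compensations until they do).

```python
class SagaFailed(Exception):
    """Raised after a step failed and earlier steps were compensated."""

def run_saga(steps, ctx):
    """Run (name, action, compensation) steps in order.

    On failure, run the compensations of the already-completed steps
    in reverse order, then raise SagaFailed.
    """
    completed = []
    for name, action, compensate in steps:
        try:
            action(ctx)
        except Exception as exc:
            for _done_name, done_compensate in reversed(completed):
                done_compensate(ctx)  # semantic undo, not a DB rollback
            raise SagaFailed(f"{name} failed: {exc}") from exc
        completed.append((name, compensate))

def demo():
    """Order saga from the diagram: inventory fails, credit is released."""
    log = []

    def reserve_inventory(ctx):
        raise RuntimeError("out of stock")

    steps = [
        ("createOrder",
         lambda ctx: log.append("createOrder"),
         lambda ctx: log.append("markRejected")),
        ("reserveCredit",
         lambda ctx: log.append("reserveCredit"),
         lambda ctx: log.append("releaseCredit")),
        ("reserveInventory",
         reserve_inventory,
         lambda ctx: log.append("releaseInventory")),
    ]
    try:
        run_saga(steps, ctx={})
    except SagaFailed:
        pass
    return log
```

The failed step's own compensation is not run, because that step never completed; only `releaseCredit` and `markRejected` fire, in that order.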

Gotchas

Compensations are not rollbacks. cancelPayment ≠ undoing a DB row; it’s a new business event (refund), often visible to the customer.
Pivot / non-compensable steps (e.g. “ship parcel”) must come after all compensable steps have succeeded; any steps after the pivot must be retryable until they succeed. Microsoft: “pivot transaction = point of no return.”
Isolation anomalies require Richardson’s countermeasures: semantic lock (status flag like PENDING), commutative updates, pessimistic view (reorder steps so risky reads happen after pivot), re-read value, version file, by value.
Choreography → cyclic dependency: ServiceA listens to ServiceB which listens to ServiceA. Makes integration testing nearly impossible.
Idempotency is mandatory because retries will redeliver commands; every step must be safe to apply twice.

Q: Why not just use 2PC and avoid sagas?

2PC requires every participant to support XA, blocks all participants while holding locks during the prepare-commit window, and turns coordinator failure into a cluster-wide stall. Most modern stores (Kafka, Cassandra, DynamoDB, most REST services) don’t support XA at all. Sagas trade ACID isolation for availability and loose coupling.

Q: How do you choose orchestration vs choreography?

Default to orchestration once you have ≥3 participants or any branching logic. The orchestrator gives you a single place to read the workflow, durable state, retries, timeouts, and compensation logic. Choreography is acceptable for very simple, linear, two-service flows.

Sources: microservices.io/patterns/data/saga.html; learn.microsoft.com/azure/architecture/patterns/saga.

Deep dive — Two-Phase Commit (and why microservices avoid it)

2PC is an atomic-commit protocol with a coordinator and N participants.

Phase 1 (prepare/voting): coordinator sends PREPARE; each participant durably writes undo/redo log entries, takes locks, replies YES or NO.
Phase 2 (commit): if all voted YES the coordinator writes a commit record and broadcasts COMMIT; otherwise ABORT. Participants apply or roll back, release locks, ack.

The X/Open XA specification is the standard interface, implemented by some RDBMS and traditional message brokers (IBM MQ, Tibco).

Avoided in microservices because:

Blocking — once a participant has voted YES it must hold locks until it hears the decision; if the coordinator dies between phases the participant is stuck “in-doubt” indefinitely.
Availability — every participant must be up at commit time; cluster availability is the product of participant availabilities.
Performance — extra round trips plus held locks crater throughput.
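
A quick worked example of the availability point, assuming participant failures are independent (an idealisation). The helper names are made up for illustration.

```python
import math

def combined_availability(availabilities):
    """Availability of a commit that needs every participant up at once."""
    return math.prod(availabilities)

def downtime_hours_per_year(availability):
    """Expected unavailable hours in a 365-day year."""
    return (1.0 - availability) * 365 * 24

single = 0.999                                   # one participant at "three nines"
five_together = combined_availability([single] * 5)
```

Under that assumption, five participants at 99.9% each give roughly 99.5% for the transaction as a whole: about 44 hours of expected unavailability per year instead of about 9.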

Still appears in intra-database scenarios (Postgres PREPARE TRANSACTION, SQL Server MSDTC) and in legacy banking stacks running XA over Tuxedo/MQ.

Coordinator                 P1            P2
    │── PREPARE ────────────►│             │
    │── PREPARE ──────────────────────────►│
    │◄──── YES (locked) ─────│             │
    │◄────────────────── YES (locked) ─────│
    │ write COMMIT record (durable)
    │── COMMIT ─────────────►│             │
    │── COMMIT ──────────────────────────►│
    │◄──── ACK ──────────────│             │
    │◄────────────────── ACK ─────────────│

Gotchas

Coordinator failure between phases = in-doubt transactions; recovery requires coordinator log to come back.
Heuristic decisions — admins manually committing/aborting in-doubt txns can split-brain the data.
No XA support across heterogeneous brokers: Kafka, RabbitMQ (AMQP 0-9-1), HTTP, gRPC don’t speak XA.
Lock duration = network RTT × 2 + slowest participant; under load this serializes the system.
Even where XA exists, cloud-managed services (RDS, Aurora) frequently disable it.

Deep dive — Outbox + Inbox patterns

The dual-write problem: a service must update its DB and publish a message — two systems, so a crash between them leaves them inconsistent. Richardson: “it is not viable to use a traditional distributed transaction (2PC) that spans the database and the message broker.”

Outbox pattern fix: insert the event into an outbox table inside the same local DB transaction as the business write. A separate message relay ships outbox rows to the broker and marks them sent. Two flavours:

Polling Publisher — SELECT … WHERE sent_at IS NULL.
Transaction Log Tailing / CDC — Debezium reads Postgres WAL or MySQL binlog and streams INSERTs to Kafka. No polling load, ordered by commit LSN.

The relay only guarantees at-least-once delivery (can publish then crash before marking sent), so consumers must be idempotent.

Inbox / Idempotent Consumer pattern — dual on the consumer side: maintain a processed_messages(subscriber_id, message_id) table with a unique constraint; insert the message ID inside the same local transaction as the business write; unique constraint causes duplicates to fail and roll back harmlessly.

Gotchas

Polling adds DB load and latency; CDC is preferred at scale but adds operational complexity (Kafka Connect, replication slots).
Postgres replication slots leak WAL if the connector is offline — disk fills up, DB stops accepting writes.
Ordering — per-aggregate ordering preserved if you key Kafka messages by aggregate_id; cross-aggregate ordering is not.
Outbox table grows forever — schedule a cleanup job for rows older than broker retention.
Schema change on outbox payload still needs same compatibility discipline as event topics.

Q: Why not just publish to Kafka right after the DB commit?

The process can crash between the commit and the publish — DB has the order, broker doesn’t. Or the publish succeeds and the commit fails (if you publish first), leaving downstream consumers acting on a phantom event. Outbox makes the publish atomic with the business write because both are rows in the same local transaction. Alternative when both sides are Kafka: Kafka transactions (KIP-98, exactly-once semantics) let a producer atomically write to multiple topics + commit consumer offsets in one transaction — but it only spans Kafka, not Kafka + your RDBMS.

Q: Outbox gives at-least-once. How do you handle duplicates downstream?

Inbox pattern — every consumer keeps a processed_messages(consumer_id, message_id) table with a unique constraint and inserts inside the same local txn as its business write. Duplicates fail the insert, rollback, message is acked. Alternative: design business operations to be naturally idempotent (PUT semantics, set state rather than increment), or store the message ID on the affected aggregate.

Sources: microservices.io/patterns/data/transactional-outbox.html, …/transaction-log-tailing.html, …/communication-style/idempotent-consumer.html.

Deep dive — Service discovery

Two flavours.

Client-side discovery (Netflix Eureka + Ribbon, HashiCorp Consul + smart client) — the client queries the registry, gets a list of healthy instances, load-balances itself. Pros: fewer hops, client picks best instance. Cons: registry-coupling in every language; per-language client libraries.

Server-side discovery (AWS ELB/ALB, Kubernetes Service, Istio) — the client hits a stable virtual IP/DNS name; an intermediary consults the registry and routes. Pros: language-agnostic, simple client. Cons: extra hop, LB itself needs HA.

Registration: self-registration (instance heartbeats itself into Eureka/Consul — fragile if process crashes) or third-party / platform registration (platform watches lifecycle events and registers automatically — Kubernetes does this via kubelet → API server → Endpoints/EndpointSlice → kube-proxy).

Kubernetes is server-side + platform-registered: a Service of type ClusterIP gets a stable VIP and a DNS name of the form <service>.<namespace>.svc.cluster.local; kube-proxy programs iptables (or IPVS) on every node. Headless services (clusterIP: None) skip the VIP and return all Pod IPs via DNS; StatefulSets require one for stable per-Pod DNS names.

Gotchas

Stale registrations — Eureka self-preservation mode keeps dead instances; tune heartbeat/eviction.
kube-proxy iptables scales poorly past ~5–10k services; switch to IPVS or use a service mesh.
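DNS caching in the JVM: networkaddress.cache.ttl defaults to -1 (cache forever) only when a security manager is installed; otherwise the default is implementation-specific (30s in OpenJDK). Set it explicitly (30s or lower) so clients never pin stale Pod IPs.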
ClusterIP is L4 only; for retries/circuit breaking/mTLS use a mesh or smart client.
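Headless service consumers must implement their own LB; gRPC clients need the dns:/// resolver plus the round_robin load-balancing policy to spread calls across multiple A records (the default pick_first sticks to a single address).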

Deep dive — API Gateway + BFF

An API Gateway is the single entry point that does routing, request fan-out/composition, protocol translation (HTTP↔gRPC), authn/authz (often JWT validation), rate limiting, request/response transformation, and observability injection. Richardson notes it shields clients from “how the application is partitioned into microservices.”

Common implementations: Kong, AWS API Gateway, Apigee, Envoy/Istio gateway, Spring Cloud Gateway.

BFF (Backend For Frontend), coined by Phil Calçado at SoundCloud, popularised by Newman — a specialisation: instead of one omnibus gateway, each UI gets its own gateway tuned to its needs (web-bff, ios-bff, android-bff, partner-bff). Newman: “one experience, one BFF.” Mobile networks have higher latency and tighter screen budgets, so the mobile BFF aggregates more aggressively and returns trimmer payloads. Owned by the same team that owns the UI.

Gotchas

Distributed monolith trap — gateway accumulates business logic; teams must change the gateway for every feature. Keep gateways “thin” (Newman’s three-strikes rule before extracting shared code).
Code duplication across BFFs is acceptable; sharing too much creates accidental coupling.
Auth at the gateway only — downstream services still need to verify the token (zero-trust); never trust gateway-stripped auth blindly.
Aggregation latency ≈ max(downstream latency) only when the calls run in parallel (sequential calls add up instead); use parallel calls + per-call timeouts + circuit breakers.
Don’t expose internal service IDs through the gateway — translate to public-stable identifiers.
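
To make the aggregation gotcha concrete, a small asyncio sketch of parallel fan-out with a per-call timeout. `fake_downstream` stands in for real HTTP/gRPC calls, and returning None for a failed section is just one possible degradation policy; all names are hypothetical.

```python
import asyncio

async def fake_downstream(delay_s, value):
    """Stand-in for a downstream call that takes delay_s seconds."""
    await asyncio.sleep(delay_s)
    return value

async def call_with_timeout(name, coro, timeout_s):
    """Return (name, result), or (name, None) if the call times out or fails."""
    try:
        return name, await asyncio.wait_for(coro, timeout_s)
    except Exception:
        return name, None  # degrade this section instead of failing the page

async def aggregate(calls, timeout_s=0.5):
    """calls: dict of name -> coroutine. All calls run concurrently."""
    pairs = await asyncio.gather(
        *(call_with_timeout(name, coro, timeout_s) for name, coro in calls.items())
    )
    return dict(pairs)
```

With this shape the slowest call (or the timeout, whichever is smaller) bounds the total latency, rather than the sum of all calls.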

Q: Where do you put cross-cutting concerns like auth, rate limiting, tracing?

At the gateway/BFF for the coarse layer — token validation, IP rate limits, trace context injection, request logging. But downstream services must independently re-validate the JWT (zero-trust) and apply their own fine-grained authz (RBAC/ABAC) because the gateway shouldn’t know domain rules. Tracing should be propagated end-to-end via W3C traceparent so the gateway is just the first span.

Deep dive — Distributed tracing (OpenTelemetry)

OpenTelemetry (OTel) is the CNCF standard for traces/metrics/logs. A trace is the path of a request across services; a tree of spans. Each span has a name, start/end timestamps, a SpanContext (immutable: trace ID + span ID + trace flags + trace state), key-value attributes, point-in-time events, links to other spans (for async fan-out), and a status (Unset/Ok/Error).

Context propagation uses the W3C traceparent header: 00-<32-hex trace-id>-<16-hex span-id>-<2-hex flags>, e.g. traceparent: 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01 (the trailing 01 = sampled). The companion baggage header carries app-level key/values (tenant ID, feature flag).
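
A small parser for the header layout described above, as a sketch. It only handles the four-field layout and is not a substitute for an OpenTelemetry propagator; the type and function names are made up for illustration.

```python
import re
from typing import NamedTuple, Optional

_TRACEPARENT = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)

class TraceParent(NamedTuple):
    version: str
    trace_id: str
    parent_id: str   # span id of the caller
    flags: int

    @property
    def sampled(self) -> bool:
        return bool(self.flags & 0x01)  # lowest bit is the sampled flag

def parse_traceparent(header: str) -> Optional[TraceParent]:
    """Parse a traceparent value; return None if it is malformed."""
    match = _TRACEPARENT.match(header.strip())
    if match is None:
        return None
    version, trace_id, parent_id, flags = match.groups()
    # Version ff and all-zero ids are invalid per the W3C spec.
    if version == "ff" or trace_id == "0" * 32 or parent_id == "0" * 16:
        return None
    return TraceParent(version, trace_id, parent_id, int(flags, 16))

def child_traceparent(parent: TraceParent, new_span_id: str) -> str:
    """Header to send downstream: same trace id and flags, new span id."""
    return f"00-{parent.trace_id}-{new_span_id}-{parent.flags:02x}"
```

The same `child_traceparent` value is what you would place into a message header when crossing an async boundary by hand.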

Sampling

Trades fidelity for cost:

Head-based (decided at the root span by trace-ID hash + percentage) is cheap, deterministic, stateless — but you can’t say “always keep error traces” because you don’t know yet.
Tail-based sees the whole trace before deciding (keep all errors, all >1s, sample 1% of the rest), giving you the spans you actually need — but the collector must buffer every trace until complete (stateful, expensive, “dozens or hundreds of compute nodes” per OTel docs).

Production systems often combine: head-sample at the edge to protect the pipeline, then tail-sample in the collector. Collectors speak OTLP as the vendor-neutral wire format. Tools: Jaeger, Grafana Tempo, Honeycomb, Datadog APM, AWS X-Ray.

Correlation ID vs Trace ID

A correlation ID (a.k.a. request ID) is a single opaque ID minted at the gateway, propagated as X-Request-Id and logged on every line — great for grep across logs even when tracing is disabled. A trace ID is structured (W3C 16-byte) and ties spans together. Best practice: emit both, with the trace ID also being your log correlator (pino, zap, structlog auto-inject it from active span context).

Gotchas

Header propagation breaks at every async boundary (Kafka/SQS) — manually inject traceparent into message headers and extract on consume.
Sampling at the edge changes downstream sampling because of the sampled bit; an unsampled root means children won’t be exported either (parent-based sampler).
High-cardinality attributes (user ID on every span) blow up backend storage cost; stick to low-cardinality dimensions.
Clock skew between hosts can make child spans appear to start before parents — durations get weird.
Don’t log PII as span attributes — they get shipped to a third-party APM.

Deep dive — Cross-service contracts

Three layers + a verification layer.

OpenAPI / Swagger for sync REST

OpenAPI 3.x is the spec for HTTP APIs. Contract-first (write the YAML, generate server stubs and client SDKs with openapi-generator) is preferred for cross-team work because the contract is reviewable independently of implementation. Code-first (annotate handlers, generate the spec) is faster for solo teams but risks the spec drifting from reality. SDK codegen yields language-native clients sharing a single source of truth.

gRPC + Protobuf — schema-first, strict evolution rules

Protobuf is the schema source of truth; servers and clients are generated. Per protobuf.dev, the cardinal rule: never reuse a field number. Deletions must use reserved.

Safe changes: add new optional fields, add enum values (with an UNKNOWN = 0 default), widen integer types within the varint-compatible set (int32 ↔ int64 ↔ uint32 ↔ uint64 ↔ bool), string ↔ bytes (UTF-8). Unsafe: change a field’s number, change wire type outside the safe table, change repeated ↔ singular semantically.

Trap: sint32/sint64 use zigzag varint encoding and are wire-compatible only with each other — never swap them with int32/int64 even though both are varint family.
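
To see why the swap corrupts data, here is a hand-rolled sketch of zigzag and base-128 varint encoding. It only illustrates the wire-level idea and is not the protobuf library; the helper names are my own.

```python
def zigzag_encode(n: int, bits: int = 32) -> int:
    """Map signed to unsigned so small magnitudes stay small: 0,-1,1,-2 -> 0,1,2,3."""
    return ((n << 1) ^ (n >> (bits - 1))) & ((1 << bits) - 1)

def zigzag_decode(z: int) -> int:
    """Inverse of zigzag_encode."""
    return (z >> 1) ^ -(z & 1)

def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint (low 7 bits first)."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)

def decode_varint(data: bytes) -> int:
    """Decode one varint that occupies all of `data`."""
    result = 0
    for index, byte in enumerate(data):
        result |= (byte & 0x7F) << (7 * index)
    return result

# A writer with `sint32 delta = 1;` sends -1 as zigzag(−1) = 1.
wire = encode_varint(zigzag_encode(-1))
as_sint32 = zigzag_decode(decode_varint(wire))  # reader agrees on sint32
as_int32 = decode_varint(wire)                  # reader wrongly declares int32
```

Both readers parse the bytes without error, which is the trap: the sint32 reader recovers -1 while the int32 reader silently sees 1.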

Schema Registry for events (Confluent — Avro/Protobuf/JSON Schema)

Producers register a schema; the registry returns an ID; producers write [magic byte][schema id][payload]. Consumers fetch the schema by ID and deserialize.

Compatibility modes:

Mode	Meaning	Allowed changes	Upgrade order
`BACKWARD` (default)	New schema can read old data	Add optional fields, delete fields	Consumers first
`FORWARD`	Old schema can read new data	Add fields, delete optional	Producers first
`FULL`	Both	Add/delete optional fields only	Either order
`*_TRANSITIVE`	Same, but vs all prior versions	—	—
`NONE`	No checks	Anything	Coordinated
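
The table's rules can be reproduced with a toy model. Assume a schema is just a mapping from field name to "has a default" and ignore type changes entirely; this is a deliberate simplification for intuition, not how the registry itself checks schemas.

```python
def can_read(reader, writer):
    """True if data written with `writer` can be read with `reader`.

    Schemas are dicts: field name -> True when the field has a default.
    A reader field missing from the data is only fine if it has a default;
    extra fields in the data are ignored.
    """
    return all(
        has_default
        for name, has_default in reader.items()
        if name not in writer
    )

def is_backward(new, old):
    """New schema can read data written with the old schema."""
    return can_read(reader=new, writer=old)

def is_forward(new, old):
    """Old schema can read data written with the new schema."""
    return can_read(reader=old, writer=new)

def is_full(new, old):
    return is_backward(new, old) and is_forward(new, old)

old = {"id": False, "note": True}                     # `note` has a default
add_optional = {"id": False, "note": True, "email": True}
add_required = {"id": False, "note": True, "sku": False}
drop_required = {"note": True}
drop_optional = {"id": False}
```

In this model adding a field without a default passes FORWARD but fails BACKWARD, and deleting a field without a default does the opposite, matching the "Allowed changes" column.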

Confluent’s argument: a registry is the only thing standing between you and a 15TB reprocess when someone changes a date format.

Consumer-Driven Contract Testing — Pact

Pact inverts integration testing: the consumer writes a unit test that records what it expects from the provider (request → response). Pact generates a JSON pact file; the file is uploaded to the Pact Broker. The provider runs verify in its own CI: stands up just itself, replays each recorded request, checks the response shape matches. Per Pact docs: “only parts of the communication actually used by consumers get tested” — providers can change unused fields freely.

The killer feature is can-i-deploy: before deploying, the pipeline calls the broker to ask “is the version I’m about to ship verified-compatible with everything currently in prod?” Exit code 0 → deploy; exit code 1 → block. This breaks the bottleneck of coordinated multi-service releases.

Gotchas (contract layer overall)

OpenAPI annotations drift — enforce CI check that regenerates the spec and diffs.
Protobuf field-number reuse is a classic cause of silent data corruption in gRPC services; make reserved a code-review blocker.
Schema Registry default BACKWARD — fine for consumers-first rollout, but if producers deploy first (common in monorepos) you need FORWARD or FULL.
Pact ≠ functional test. Verifies shape, not business correctness. Don’t drop integration tests for happy-path business flows.
Pact provider state requires the provider to have a hook to set up that state — frequently forgotten and pacts pass for the wrong reason.

Q: How would you stop one team breaking another team’s clients when shipping a backward-incompatible change?

Three layers. (1) Schema enforcement at build time: OpenAPI/protobuf in a shared repo, lint rule fails on field-number reuse, removed-without-reserved, removed required field. (2) Schema Registry for Kafka topics, with BACKWARD_TRANSITIVE set as the team default: the registry rejects registration of an incompatible schema, so the producer fails before it can publish (plain Kafka brokers do not validate schemas). (3) Pact + can-i-deploy in CI: consumer’s recorded expectations must verify against provider’s HEAD before either side merges, and the Pact Broker gates production deploys. Combination means a breaking change fails at PR time, not in prod.

Q: When would you use Pact vs end-to-end integration tests?

Pact for inter-service contracts (request/response shape, headers, status codes) — fast, isolated, runs in each service’s pipeline without standing up the whole stack, breaks at PR time on the team that introduced the change. E2E for critical user journeys where business correctness across services matters. Pact verifies “you can talk to me”; E2E verifies “we deliver value together.”

Deep dive — Versioning strategies

Two axes: how you signal the version and how long you support the old one.

Four signaling styles:

URL path (/v1/orders) — operationally simplest, easy to route at the gateway, cache, grep in logs; common for internal APIs.
Custom header / date-based (Stripe-Version: 2024-06-20, X-GitHub-Api-Version) — keeps URLs stable, lets a single client pin a date.
Media-type versioning (Accept: application/vnd.github.v3+json) — HATEOAS-pure; couples version to content negotiation.
Query param (?v=2) — discouraged; easy to miss in routing, logging, cache keys.

Whichever you pick, the rule is never break existing clients: only additive changes inside a major version; new major version runs alongside the old one for a deprecation window.

For events, the Tolerant Reader pattern (Fowler, citing Postel’s law: “be conservative in what you do, be liberal in what you accept”) is the pragmatic way to keep evolving without forcing lockstep upgrades: consumers ignore unknown fields, never assume strict schema, deserialize into maps where possible. Combine with the Schema Registry mode that matches your rollout risk.
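
A minimal tolerant-reader sketch, assuming events arrive as already-decoded dicts. The `OrderCreated` event and its fields are hypothetical.

```python
from dataclasses import dataclass, fields

@dataclass
class OrderCreated:
    order_id: str
    total_cents: int
    currency: str = "USD"  # default covers older producers that omit the field

def read_tolerantly(cls, payload: dict):
    """Build cls from payload, silently ignoring keys the consumer does not know."""
    known = {f.name for f in fields(cls)}
    return cls(**{key: value for key, value in payload.items() if key in known})

# A newer producer added `loyalty_tier`; this consumer keeps working.
event = read_tolerantly(
    OrderCreated,
    {"order_id": "o-1", "total_cents": 1250, "loyalty_tier": "gold"},
)
```

The consumer takes only what it needs, so a producer can add fields without a coordinated release; removing a field the consumer requires would still fail loudly, which is the desired behaviour.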

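Communicate deprecation explicitly. RFC 8594 defines the Sunset HTTP response header and the sunset link relation. Pair it with the Deprecation header (RFC 9745, which carries a structured-field date written as @ followed by Unix seconds; earlier drafts used an HTTP-date) and a Link header pointing to migration docs. In the example, @1780272000 is 2026-06-01T00:00:00Z.

HTTP/1.1 200 OK
Deprecation: @1780272000
Sunset:      Tue, 01 Jun 2027 00:00:00 GMT
Link: <https://api.example.com/docs/migrate-v2>; rel="sunset"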

Gotchas

No version forever — define the deprecation window up front (typical: 6–12 months) and publish it. Otherwise v1 lives forever.
Don’t bump major versions for additive changes — #1 reason teams end up at v17.
Sunset is a hint, not a guarantee — RFC 8594 explicitly says clients may not honour it; communicate via email + portal too.
Tolerant readers + strict codegen don’t mix — generated client that throws on unknown fields breaks the model. Configure deserializer to ignore extras (Jackson FAIL_ON_UNKNOWN_PROPERTIES=false).
Versioning event payloads in the topic name (orders.v1, orders.v2) is sometimes cleaner than schema versioning when the change is genuinely breaking.

Deep dive — Anti-patterns (Fowler/Newman/Hard Parts)

Anti-pattern	What it looks like	Why it bites
Distributed monolith	Services that must be deployed together; a release plan that lists 6 services in order.	All the operational cost of microservices with none of the autonomy benefit.
Shared database	Two services read/write the same tables.	Schema changes ripple invisibly; replaced API coupling with worse hidden coupling.
Chatty interfaces	One UI action causes 40 inter-service HTTP calls.	Fowler’s First Law of Distributed Object Design (“don’t distribute your objects”) exists because “you can’t encapsulate the remote/in-process distinction.” Latency stacks.
Sync RPC for everything	Every state change is a chain of blocking HTTP calls.	Cascading failures, head-of-line blocking, requires every service up at once.
No observability	Logs but no traces; no correlation ID; no SLOs.	First production incident is unreviewable.
Saga without idempotency	Compensations re-run on retry and double-refund.	At-least-once is reality; idempotency is the price of admission.
Versionless events	Producer adds a field, every consumer breaks.	Schema Registry + tolerant readers + compatibility mode are not optional.
Stamp coupling (Hard Parts)	Service A sends a giant object to B and B uses two fields.	Any change to the bag breaks B even though B doesn’t care.
Reuse via service (Hard Parts ch. 8)	A “shared utilities” service every other service calls.	Adds a hop and a single point of failure for a library-shaped concern.
Joint ownership creep (Hard Parts ch. 9)	Two services writing the same tables “just for now.”	Schema changes need both teams in lockstep.
No fitness functions (Fundamentals ch. 6)	Coupling/perf/security rules exist as Confluence pages.	Decays silently between reviews. Encode in CI or it doesn’t exist.

Deep dive — Service granularity (Hard Parts)

“Microservice = small service” is the trap. Hard Parts frames sizing as a tug between granularity disintegrators (forces that argue for breaking a service smaller) and granularity integrators (forces that argue for keeping it bigger). Pick the smallest service that survives the integrators.

Disintegrators — reasons to split

Service scope and function — single service does several unrelated things (single-responsibility violated). Look for “and” in the service name.
Code volatility — one part churns weekly while the rest is stable; split out the churn so deploys don’t risk the stable surface.
Scalability and throughput — one operation needs to scale to 10× the rest.
Fault tolerance — one operation (PDF render, external integration) crashes the process; isolate to keep the rest alive.
Security access — one operation handles PCI/PII and the rest doesn’t; separation reduces audit surface.
Extensibility — adding new variants means edits everywhere; split so each variant is a service.

Integrators — reasons to keep together

Database transactions — if two operations must commit atomically, splitting them buys you a saga (with all the isolation anomalies above).
Workflow / orchestration — two services that always call each other in lockstep add a network hop with no autonomy gain.
Shared code volatility — both services change together; one logical change ≈ two PRs.
Data dependencies — both operations read the same data; splitting means inter-service joins.

Trade-off framing (interview-ready): “The right granularity is the smallest service where the integrator forces don’t outweigh the disintegrator forces. When in doubt, start coarser and split when a disintegrator becomes painful — splitting is reversible; coupling created by a premature split usually isn’t.”

Gotchas

Splitting on noun (entity) is not always right. Hard Parts: split by function (verb) at least as often as by entity.
Distributed monolith is what you get when disintegrators are weak but you split anyway.
Reuse ≠ a service. Hard Parts ch. 8: shared logic belongs in a library or sidecar before a service.

Q: How small is too small?

Too small is when every business operation becomes a distributed transaction across 3+ services. The grain test: if you can’t describe one service’s responsibility in a single sentence without “and,” it’s too big; if a typical user request fans out to 6+ services for a single state change, you’re too small. Iterative — start at the bounded-context grain (Evans, DDD) and split when a disintegrator force becomes concrete.

Deep dive — Data ownership patterns (Hard Parts)

Once you split services, the next decision is who owns which tables. Hard Parts catalogues four ownership patterns:

Pattern	Definition	Use when	Cost
Sole ownership	One table, one service writes.	Default. Clean DDD alignment.	Forces other services to read via API or projection.
Joint ownership	Multiple services write the same table.	Two services unavoidably co-own (rare).	Locking, schema-change coordination — usually a smell.
Common ownership	Reference / lookup table read by many, written by one (or batch).	Static or slow-moving reference data (country codes, tax rates).	Cache invalidation; staleness window.
No ownership / “data domain”	A domain of tables is owned by a service group, not a single service.	Legacy migration.	Cross-service coupling re-enters by the back door.

The honest hierarchy: prefer sole ownership for everything; promote to common ownership for reference data with read-mostly access; treat joint and no-ownership as transitional states during decomposition.

Eight saga shapes (Hard Parts ch. 12)

Three axes — communication (sync/async), consistency (atomic/eventual), coordination (orchestrated/choreographed):

Epic Saga (sync, atomic, orchestrated) — closest to 2PC; high coupling, high reliability, low scalability.
Phone Tag Saga (sync, atomic, choreographed) — atomic point-to-point chain; very fragile.
Fairy Tale Saga (sync, eventual, orchestrated) — common “default” saga. Good balance.
Time Travel Saga (sync, eventual, choreographed) — works for simple flows.
Fantasy Fiction Saga (async, atomic, orchestrated) — async messages but you want atomicity; usually impossible.
Horror Story (async, atomic, choreographed) — what teams accidentally build. Avoid.
Parallel Saga (async, eventual, orchestrated) — orchestrator fans out to parallel async steps. Production-grade.
Anthology Saga (async, eventual, choreographed) — pure event-driven. High autonomy, hardest to debug.

In practice the choice is almost always Fairy Tale (the default) or Parallel (when fan-out matters).
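
The three axes can be written down as a lookup table, which makes it easy to check that the eight names cover every combination. A minimal sketch; the function name saga_shape and the string labels for the axes are illustrative choices, not terminology from the book:

```python
from itertools import product

# Keys are (communication, consistency, coordination), as in the list above.
SAGA_SHAPES = {
    ("sync", "atomic", "orchestrated"): "Epic Saga",
    ("sync", "atomic", "choreographed"): "Phone Tag Saga",
    ("sync", "eventual", "orchestrated"): "Fairy Tale Saga",
    ("sync", "eventual", "choreographed"): "Time Travel Saga",
    ("async", "atomic", "orchestrated"): "Fantasy Fiction Saga",
    ("async", "atomic", "choreographed"): "Horror Story",
    ("async", "eventual", "orchestrated"): "Parallel Saga",
    ("async", "eventual", "choreographed"): "Anthology Saga",
}


def saga_shape(communication: str, consistency: str, coordination: str) -> str:
    """Return the saga name for one point in the three-axis space."""
    return SAGA_SHAPES[(communication, consistency, coordination)]


# Sanity check: the table covers all 2 x 2 x 2 combinations of the axes.
ALL_COMBINATIONS = set(product(("sync", "async"),
                               ("atomic", "eventual"),
                               ("orchestrated", "choreographed")))
assert ALL_COMBINATIONS == set(SAGA_SHAPES)
```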

Data access without ownership — three reads

When service A needs data owned by B:

(a) Inter-service call (sync API or query) — simplest, but couples availability and adds latency.
(b) Column schema replication (B publishes events, A maintains a local read model — a derived store in DDIA’s language) — best latency and isolation; requires schema-evolution discipline.
(c) Data Domain (shared schema/database between a small group of cooperating services) — last resort; gives up most service autonomy.
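
Option (b) can be sketched as a small projection that service A updates from B's events and can rebuild by replaying the log. This is an in-memory illustration only; the event types (CustomerUpserted, CustomerDeleted), the field names, and the class name are assumptions, and a real consumer would read from a broker and persist its offset.

```python
from dataclasses import dataclass, field


@dataclass
class CustomerProjection:
    """Service A's local read model of customer data owned by service B."""
    rows: dict = field(default_factory=dict)  # customer id -> {"name", "tier"}
    last_offset: int = -1                     # highest log offset applied so far

    def apply(self, offset: int, event: dict) -> None:
        if offset <= self.last_offset:
            return  # already applied, so redelivery is harmless
        if event["type"] == "CustomerUpserted":
            self.rows[event["id"]] = {"name": event["name"], "tier": event["tier"]}
        elif event["type"] == "CustomerDeleted":
            self.rows.pop(event["id"], None)
        # Unknown event types are skipped so B can add new ones without breaking A.
        self.last_offset = offset


def rebuild(log: list) -> CustomerProjection:
    """Rebuild path: replay the retained log from the earliest offset."""
    projection = CustomerProjection()
    for offset, event in enumerate(log):
        projection.apply(offset, event)
    return projection


log = [
    {"type": "CustomerUpserted", "id": "c1", "name": "Ada", "tier": "gold"},
    {"type": "CustomerUpserted", "id": "c2", "name": "Bob", "tier": "basic"},
    {"type": "CustomerDeleted", "id": "c2"},
]
projection = rebuild(log)  # rows == {"c1": {"name": "Ada", "tier": "gold"}}
```

Skipping unknown event types is one way to get the schema-evolution tolerance the option calls for; stricter consumers might prefer to fail loudly instead.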

Gotchas

Shared DB across services is not “joint ownership” — it’s the shared-database anti-pattern. Joint ownership means deliberate, table-scoped, governed.
Derived stores drift — without a rebuild path (replay topic from earliest), a projection that misses messages becomes wrong forever. Always keep events retained long enough to rebuild, or snapshot the projection.
Cross-service queries (“join orders + customers in one request”) are the #1 reason teams revert to a shared DB. Solve with server-side composition in a BFF backed by per-service caches, not with cross-DB joins.
Reference data via API on every request kills latency. Replicate it to a local table or in-memory cache with a TTL or invalidation event.

Q: Two services need the same data. Who owns it?

One of them. Pick by who writes it — write authority defines ownership. The other gets it via (a) a derived projection updated from events, (b) an API call, or (c) a shared read-only common-ownership table for true reference data. Never let both write — that’s joint ownership and it bites at schema-migration time.

Q: Walk me through how you’d split a monolithic order DB.

(1) Map components → services using bounded contexts (DDD). (2) Identify each table’s sole owner by who writes most. (3) For each cross-service read, decide: API, replicated projection, or remain co-owned temporarily. (4) Implement the Strangler Fig at the data layer — outbox-publish from the monolith, new service consumes its projection, traffic shifts gradually behind a BFF. (5) Cut the FK once consumers are on the projection. Iterative — never big-bang.

Deep dive — Architecture characteristics, fitness functions, ADRs (Fundamentals)

Architecture is the set of decisions that are hard to change later.

Architecture characteristics (-ilities) — non-functional requirements that drive structure: availability, scalability, elasticity, performance, security, deployability, testability, observability, fault tolerance, recoverability, evolvability, configurability, interoperability, learnability, compliance. The book’s framing: pick the top three for your system — you cannot optimise for all. For a regulated/enterprise system the natural top three are often auditability, availability, and evolvability — driving choices like event-sourced audit logs, multi-AZ Postgres, and strict contracts.

Fitness functions (Ford’s Building Evolutionary Architectures) — automated, objective integrity checks for an architecture characteristic; a unit test for the architecture itself:

Performance fitness — load test in CI that fails the build if p95 latency exceeds 200 ms.
Coupling fitness — ArchUnit / dependency-cruiser rule that fails if domain.orders imports domain.billing directly.
Security fitness — CI step that fails if any new endpoint lacks auth middleware.
Resilience fitness — chaos test that kills a pod and asserts SLO holds.

If a characteristic has no fitness function, it’s a wish, not a requirement.
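
As an illustration of the coupling check, here is a hand-rolled version using only the standard library. It is a sketch of the idea, not how ArchUnit or any other tool works internally; the helper name forbidden_imports and the sample module source are hypothetical, and relative imports are deliberately ignored.

```python
import ast


def forbidden_imports(source: str, forbidden_prefix: str) -> list:
    """Return the absolute imports in `source` that fall under `forbidden_prefix`."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # "from domain import billing" is checked as "domain.billing".
            names = [f"{node.module}.{alias.name}" for alias in node.names]
        else:
            continue  # not an import, or a relative import (skipped in this sketch)
        hits += [name for name in names
                 if name == forbidden_prefix
                 or name.startswith(forbidden_prefix + ".")]
    return hits


def test_orders_does_not_import_billing():
    # Hypothetical source of a module in domain.orders; CI would read real files.
    orders_source = "from domain.shared import Money\nimport domain.orders.pricing\n"
    assert forbidden_imports(orders_source, "domain.billing") == []
```

Run under a test runner in CI, a non-empty result fails the build, which is what turns the rule from documentation into a fitness function.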

Architecture Decision Records (ADRs) — Michael Nygard’s lightweight markdown format: Context, Decision, Status (Proposed/Accepted/Superseded), Consequences, and optionally Alternatives Considered. They live next to the code (/docs/adr/0007-saga-orchestrator-temporal.md) and accumulate as the record of why.

Gotchas

“All the -ilities” is a non-answer. If everything is critical, nothing is. Force a ranked top three.
A fitness function that isn’t in CI is documentation. It must fail the build.
ADRs decay without a Superseded by link — keep them immutable but cross-linked when the decision changes.
Coupling fitness vs DRY — sometimes duplicating a bit of code is the right call to keep services decoupled.

Q: How do you justify an architecture decision to a sceptical stakeholder six months later?

An ADR with Context and Consequences sections written at decision time. The Context captures constraints that existed (regulatory deadline, team skills, traffic estimate); the Consequences are the trade-offs accepted. If the world changed, you write a new ADR that Supersedes the old one — never edit history.

Q: How do you stop architectural decay?

Fitness functions in CI. If a characteristic matters, write an automated test for it that runs on every PR: coupling rules (ArchUnit/dependency-cruiser), perf budgets (Lighthouse, k6), security scans (Semgrep, Trivy), contract checks (Pact can-i-deploy). The system can only stay evolvable if the rules that keep it evolvable are enforced by a robot, not by review-time vigilance.

Deep dive — End-to-end correctness (DDIA)

Three ideas from DDIA elevate the outbox / saga story.

Stream-table duality

Every database table is the snapshot of a log of changes; every log can be folded into a table by replaying it. DDIA quotes Pat Helland on this: “The truth is the log. The database is a cache of a subset of the log.” Practical implication: when service A needs data owned by B, the right architecture is for B to publish its change log and for every consumer to materialise its own view. The log is durable, replayable, and ordered.
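
The "fold a log into a table" half of the duality fits in a few lines. A minimal sketch, assuming the log is a list of (key, value) records and that a None value means delete, in the spirit of a tombstone on a compacted topic:

```python
def fold(log):
    """Fold a change log of (key, value) records into a latest-value-per-key table."""
    table = {}
    for key, value in log:
        if value is None:
            table.pop(key, None)  # tombstone: the key is deleted
        else:
            table[key] = value    # a later record for the same key wins
    return table


change_log = [
    ("acct-1", {"balance": 100}),
    ("acct-2", {"balance": 50}),
    ("acct-1", {"balance": 70}),
    ("acct-2", None),
]

table = fold(change_log)  # {"acct-1": {"balance": 70}}
```

Replaying a prefix of the log gives the table as it stood at that point, which is what makes a rebuild or a point-in-time view possible.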

Exactly-once is end-to-end

“Exactly-once” inside a broker (Kafka idempotent producer + transactions, KIP-98) only protects Kafka→Kafka. The moment a write touches another system (your DB, an email API, a payment gateway), the only way to get exactly-once effect is idempotency at every step + transactional outbox (Postgres → Kafka) + idempotent consumer / inbox (Kafka → Postgres). Kleppmann’s framing: pretend exactly-once delivery is impossible (it is), and design every step to be safe to apply twice.
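
The consumer half of that chain (idempotent consumer / inbox) can be sketched with SQLite standing in for Postgres. The table names, column names, and event shape below are assumptions for illustration; the point is that the inbox insert and the business write share one transaction.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with an inbox table and one account."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE inbox (event_id TEXT PRIMARY KEY)")
    db.execute("CREATE TABLE balances (account TEXT PRIMARY KEY, amount INTEGER NOT NULL)")
    db.execute("INSERT INTO balances VALUES ('acct-1', 0)")
    db.commit()
    return db


def handle_credit(db: sqlite3.Connection, event: dict) -> bool:
    """Apply a credit event at most once; return False for a duplicate delivery."""
    with db:  # one transaction: commits on success, rolls back on an exception
        cur = db.execute("INSERT OR IGNORE INTO inbox (event_id) VALUES (?)",
                         (event["event_id"],))
        if cur.rowcount == 0:
            return False  # event id already recorded, so the effect already happened
        db.execute("UPDATE balances SET amount = amount + ? WHERE account = ?",
                   (event["amount"], event["account"]))
        return True
```

Delivering the same event twice then changes the balance once: the second call finds the event id in the inbox and returns without touching balances.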

The end-to-end argument (Saltzer/Reed/Clark 1984)

Reliability properties belong at the endpoints, not in the middle. TCP gives ordered byte delivery but doesn’t help if your app’s parsing is wrong; mTLS authenticates the workload but doesn’t help if your handler trusts a header; broker exactly-once doesn’t help if the consumer double-charges. The implication: don’t trust the platform/mesh/broker to give you correctness — verify at the receiver, every time. Idempotency keys, message IDs in an inbox table, and final-state verification (read-your-write) are the actual guarantee.

Gotchas

“We turned on Kafka EOS, so we’re exactly-once” — false. EOS spans Kafka. The DB write is separate; you still need outbox/inbox.
Replaying a topic to rebuild a projection requires the consumer to be deterministic — no Date.now() in projection logic, no calls to non-idempotent external services.
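Idempotency keys must be supplied by the client or derived deterministically from the request, not generated fresh on the server for each attempt; otherwise a retry gets a new key and can’t be matched to the original.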
Tombstones (null-value Kafka records on a compacted topic) are how deletes propagate — consumers must handle nulls.
End-to-end correctness ≠ end-to-end latency. Idempotency + replay get you correct outcomes but can take seconds to converge.
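
One way to get a key that survives retries is to hash a canonical form of the request. A sketch under the assumption that the request is JSON-serialisable; the function name and the field names are illustrative.

```python
import hashlib
import json


def idempotency_key(client_id: str, operation: str, payload: dict) -> str:
    """Derive a stable key from the request so that a retry maps to the same key."""
    canonical = json.dumps(
        {"client": client_id, "op": operation, "payload": payload},
        sort_keys=True,          # key order in the payload must not change the hash
        separators=(",", ":"),   # no whitespace differences either
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

A pure content hash treats two intentionally identical requests (the same transfer made twice) as one, so in practice the payload usually carries a client-generated request id that the client reuses on retry.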

Q: Build a “transfer money between accounts at two services” feature. What guarantees can you give?

Eventual consistency with idempotent compensating actions. Concretely: an orchestrated saga (Temporal) calling DebitA, then CreditB. Each step writes to its service’s DB + outbox in one transaction; the orchestrator retries on transient failure with the same idempotency key, so steps are safe to apply twice. If CreditB fails permanently, run the compensation for the step that already committed (CreditA, a refund), which is also idempotent. The client sees an acknowledgment that the transfer is in progress (not “done”) and a webhook/poll endpoint to confirm. The result is an exactly-once effect on the ledger with no Kafka EOS / 2PC needed. Trade-off: temporary visibility of intermediate state, mitigated by a PENDING status flag (Richardson’s semantic lock).
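
The control flow of that answer, reduced to an in-memory sketch. Everything here is a stand-in: Account.applied plays the role of each service's inbox, TransferSaga.outcomes plays the role of the durable saga state a real orchestrator would keep, and transient-failure retries, outbox writes, and the PENDING flag are left out.

```python
class Account:
    """One service's ledger; `applied` stands in for its inbox table."""

    def __init__(self, balance: int):
        self.balance = balance
        self.applied = set()

    def apply(self, key: str, delta: int) -> None:
        if key in self.applied:
            return  # same idempotency key again: no second effect
        if self.balance + delta < 0:
            raise ValueError("insufficient funds")
        self.balance += delta
        self.applied.add(key)


class TransferSaga:
    """Orchestrator: debit A, credit B, refund A if the credit fails for good."""

    def __init__(self):
        self.outcomes = {}  # transfer id -> final status

    def run(self, transfer_id, a, b, amount, credit=None):
        if transfer_id in self.outcomes:
            return self.outcomes[transfer_id]  # finished sagas are not re-run
        credit = credit or b.apply             # injectable to simulate a failure
        try:
            a.apply(f"{transfer_id}:debit", -amount)
        except ValueError:
            status = "REJECTED"                # nothing committed, nothing to undo
        else:
            try:
                credit(f"{transfer_id}:credit", amount)
                status = "COMPLETED"
            except Exception:
                a.apply(f"{transfer_id}:refund", amount)  # idempotent compensation
                status = "COMPENSATED"
        self.outcomes[transfer_id] = status
        return status
```

Each step gets its own key derived from the transfer id, so re-sending a debit, credit, or refund is harmless, and recording the outcome stops a compensated transfer from being driven forward again on a later retry.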

Q: Why is exactly-once delivery a “myth”?

Networks lose acks, processes die between commit and ack, retries are mandatory under partition — therefore at-least-once is the realistic delivery guarantee. What you can give is exactly-once effect by making every operation idempotent and providing end-to-end deduplication (idempotency keys + inbox).

Deep dive — Service mesh

A service mesh moves cross-cutting L7 concerns (mTLS, retries, timeouts, circuit breaking, traffic splitting for canary/blue-green releases, per-route auth, observability) out of application code and into a sidecar proxy (Envoy for Istio, the Rust linkerd2-proxy microproxy for Linkerd) injected into every Pod.

Two planes:

Data plane — sidecars handling actual traffic.
Control plane — istiod (Istio) or Linkerd’s destination/identity/policy services, distributing config, certs, and discovery data to the sidecars (over xDS in Istio; Linkerd uses its own gRPC API).

The sell: language-agnostic networking. Go, Python, and Node services all get mTLS, retries, and traces without writing TLS or retry code. The cost: every Pod runs two containers, every hop adds roughly 1–2 ms of latency, and the control plane is a non-trivial cluster citizen. Ambient mode (Istio ambient: no sidecar, a node-level ztunnel plus a per-namespace waypoint proxy) and proxyless gRPC (xDS directly in the gRPC client) are 2024–2026 efforts to reduce that cost.

Mesh vs gateway vs SDK

API Gateway = north-south edge (untrusted clients → mesh boundary).
Service mesh = east-west inside the cluster (service ↔ service).
Client SDK (Netflix Ribbon-era) = mesh concerns embedded in each language’s library.

Gotchas

mTLS != authz. Mesh mTLS proves workload identity (SPIFFE SVID); you still need AuthorizationPolicy (Istio) / Server+ServerAuthorization (Linkerd) for who-can-call-whom.
Sidecar startup race — app can start before its sidecar is ready and fail outbound calls; use holdApplicationUntilProxyStarts: true (Istio) or native sidecar containers (K8s 1.28+).
Retry storms — mesh retries layered on app retries multiply; budget retries at exactly one layer.
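Header propagation — the mesh cannot correlate an inbound request with the outbound calls it triggers, so it does not carry traceparent across service boundaries; the app must copy it from the inbound request onto its outbound requests.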
Migrating a brownfield cluster — gradual namespace injection beats big-bang.
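
The retry-storm gotcha is plain multiplication: each attempt made by an outer layer can itself be retried by the layer below it. A small sketch of the worst case, assuming every layer retries independently and every attempt fails; the function name is illustrative.

```python
def worst_case_attempts(retries_per_layer):
    """Upper bound on calls reaching the dependency when every layer retries."""
    attempts = 1
    for retries in retries_per_layer:
        attempts *= 1 + retries  # the original attempt plus its retries
    return attempts


app_only = worst_case_attempts([3])       # 4: one call plus three retries
app_and_mesh = worst_case_attempts([3, 2])  # 12: each of 4 app attempts tried 3 times
```

Three retries in the app on top of two in the mesh turns one failing request into up to twelve calls against a dependency that is already struggling, which is why the budget belongs at exactly one layer.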

Q: When would you not adopt a service mesh?

Small team, <10 services, single language — SDK approach (gRPC interceptors, OTel auto-instrumentation, retry libraries) is cheaper. Adopt a mesh when (a) you have polyglot services and want uniform networking guarantees, (b) compliance demands mTLS-everywhere, or (c) you need fine-grained traffic shifting (canary by header, mirrored traffic, fault injection) painful to build in-app.

Closing principle

Newman and Richardson agree: microservices are an organisational strategy disguised as an architecture. They pay off when you have multiple teams who need to deploy independently, and they cost you when you don’t. The patterns above exist to recover the guarantees you gave up when you stopped having one database and one deploy.

Ford & Richards’s Hard Parts makes the same point sharper: every microservice decision is a trade-off, and the engineer’s job is to surface those trade-offs explicitly (granularity disintegrators vs integrators, data ownership patterns, the eight saga shapes), encode the chosen properties as fitness functions, and record the decisions as ADRs. The answer to “manage cross-dependencies across multiple service domains” is schema/contract governance (OpenAPI + protobuf + Schema Registry + Pact + can-i-deploy), backed by clear service ownership (Database per Service, sole-ownership default, derived projections for cross-service reads), and end-to-end correctness via idempotency + outbox + inbox rather than blind faith in broker-level “exactly-once.”