Testing Strategy — Basics

        /\            E2E (few, slow, brittle)
       /  \
      /----\          Integration / Contract (some)
     /------\
    /--------\        Unit (many, fast, isolated)

Idea: many fast unit tests at the base, fewer mid-level integration tests, very few slow E2E tests at the top.

Modern variants: the Testing Honeycomb (Spotify) and the Testing Trophy (Kent C. Dodds) weight integration tests more heavily than the pyramid suggests, because in microservices much of the logic lives at service boundaries.

Unit tests

  • One function/class in isolation. No I/O.
  • Fast (milliseconds). Many. Run on every save.
  • Mock external collaborators if needed; prefer pure logic.

Integration tests

  • Multiple modules working together. Hits a real DB / Redis / queue (often via Testcontainers).
  • Slower (100 ms to 1 s). Catches wiring, schema, and transaction bugs.
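As a sketch of the integration bullets above, the test below runs a hypothetical `transfer` function against a real SQL engine instead of a mock, so the schema constraint and the transaction rollback are actually exercised. In-memory SQLite is an assumption made to keep the example self-contained; a real suite would more likely connect to the production database engine started through Testcontainers. The `accounts` table and all function names are illustrative.

```python
import sqlite3

SCHEMA = """
CREATE TABLE accounts (
    id      TEXT PRIMARY KEY,
    balance INTEGER NOT NULL CHECK (balance >= 0)
);
"""


def open_test_db():
    """Create a fresh database per test so tests stay independent."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO accounts (id, balance) VALUES (?, ?)",
        [("alice", 100), ("bob", 20)],
    )
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst in a single transaction."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
        )
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
        )


def balances(conn):
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))


def test_transfer_moves_money():
    conn = open_test_db()
    transfer(conn, "alice", "bob", 30)
    assert balances(conn) == {"alice": 70, "bob": 50}


def test_overdraft_rolls_back_both_updates():
    conn = open_test_db()
    try:
        transfer(conn, "bob", "alice", 500)
    except sqlite3.IntegrityError:
        pass  # the CHECK constraint rejected the negative balance
    else:
        raise AssertionError("expected the overdraft to be rejected")
    # The credit to alice ran first; the rollback must have undone it.
    assert balances(conn) == {"alice": 100, "bob": 20}
```

A mocked repository could not catch either failure mode here: the constraint lives in the schema and the rollback lives in the driver, which is why this layer needs a real database.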

Contract tests

  • Verifies that a producer satisfies a consumer’s expected request/response shape, without running both services at once.
  • Tools: Pact, Spring Cloud Contract.
  • Critical in microservices to prevent breaking downstream consumers.
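The idea behind the contract bullets above can be sketched by hand. This is not the Pact or Spring Cloud Contract API; it is a simplified stand-in in which the consumer publishes its expectations as plain data and the producer's test suite replays them against its own handler. `USER_CONTRACT`, `provider_handle`, and `breaking_handle` are hypothetical names.

```python
# Expectations published by the consumer (a hypothetical "orders" service
# that calls a "users" service). Only the fields the consumer reads are listed.
USER_CONTRACT = {
    "request": {"method": "GET", "path": "/users/42"},
    "response": {"status": 200, "body": {"id": int, "email": str, "active": bool}},
}


def provider_handle(method, path):
    """Stand-in for the producer's real request handler."""
    if method == "GET" and path.startswith("/users/"):
        user_id = int(path.rsplit("/", 1)[1])
        # Extra fields such as "plan" are fine: the consumer ignores them.
        return 200, {"id": user_id, "email": "a@example.com", "active": True, "plan": "free"}
    return 404, {}


def breaking_handle(method, path):
    """A producer change that renames a field the consumer depends on."""
    return 200, {"id": 42, "email_address": "a@example.com", "active": True}


def contract_violations(contract, handler):
    """Replay the contract's request; return a list of violations (empty = OK)."""
    req, expected = contract["request"], contract["response"]
    status, body = handler(req["method"], req["path"])
    problems = []
    if status != expected["status"]:
        problems.append(f"status: expected {expected['status']}, got {status}")
    for field, expected_type in expected["body"].items():
        if field not in body:
            problems.append(f"missing field: {field}")
        elif not isinstance(body[field], expected_type):
            problems.append(f"wrong type for field: {field}")
    return problems


def test_provider_honours_user_contract():
    assert contract_violations(USER_CONTRACT, provider_handle) == []
```

Running the same check against `breaking_handle` reports the missing `email` field, so the rename fails in the producer's pipeline before any consumer is deployed against it.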

End-to-end (E2E) tests

  • Full system, real or near-real environment. Often through the UI.
  • Slow, flaky. Use sparingly for golden paths.
  • Tools: Playwright, Cypress, Selenium.

Other test types

  • Smoke: quick post-deploy checks (“is it up?”).
  • Acceptance / behavior: written in business language (Cucumber, Gherkin).
  • Property-based: generate inputs, check invariants (fast-check, hypothesis, jqwik).
  • Mutation testing: change the code, see whether the tests catch it (Stryker, mutmut, PIT).
  • Load / performance: k6, Locust, JMeter, wrk.
  • Chaos: inject failures (Chaos Mesh, Toxiproxy, Gremlin).
  • Security: static analysis (SAST), dynamic analysis (DAST), dependency scanning (Snyk, Dependabot).
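To make the property-based bullet above concrete, here is a minimal hand-rolled sketch using only the standard library. It is an assumption-laden simplification: libraries such as hypothesis supply much richer input generation and shrink failing inputs to small counterexamples, which this `check_property` helper (a hypothetical name) does not attempt. The run-length encoder is just an example subject.

```python
import random


def rle_encode(text):
    """Run-length encode: 'aaab' -> [('a', 3), ('b', 1)]."""
    runs = []
    for ch in text:
        if runs and runs[-1][0] == ch:
            runs[-1] = (ch, runs[-1][1] + 1)
        else:
            runs.append((ch, 1))
    return runs


def rle_decode(runs):
    return "".join(ch * n for ch, n in runs)


def random_text(rng, max_len=30):
    # Tiny alphabet so that repeated characters, the interesting case, are common.
    return "".join(rng.choice("ab ") for _ in range(rng.randint(0, max_len)))


def check_property(prop, generator, runs=500, seed=0):
    """Evaluate prop on generated inputs; return the first counterexample or None."""
    rng = random.Random(seed)  # fixed seed keeps failures reproducible
    for _ in range(runs):
        value = generator(rng)
        if not prop(value):
            return value
    return None


# Invariant 1: decoding an encoding gives back the original text.
def roundtrip_holds(text):
    return rle_decode(rle_encode(text)) == text


# Invariant 2: two neighbouring runs never share a character.
def adjacent_runs_differ(text):
    runs = rle_encode(text)
    return all(a[0] != b[0] for a, b in zip(runs, runs[1:]))


def test_rle_properties():
    assert check_property(roundtrip_holds, random_text) is None
    assert check_property(adjacent_runs_differ, random_text) is None
```

The tests state what must always be true rather than listing input/output pairs, so edge cases such as the empty string or one long run are covered without being written out by hand.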

Test doubles

Type     Purpose
-----    -------
Dummy    Passed but not used (filler)
Stub     Returns a canned response
Spy      Stub that also records calls
Mock     Preprogrammed with expectations; verifies them
Fake     Working but simplified implementation (e.g. in-memory DB)

Prefer fakes for collaborators when feasible (fast, real-ish behavior). Use mocks sparingly — they couple tests to implementation.
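The trade-off between fakes and mocks can be sketched as follows. `register`, `InMemoryUserRepo`, and the mailer interface are hypothetical; `Mock` and `assert_called_once_with` come from the standard library's `unittest.mock`.

```python
from unittest.mock import Mock


class InMemoryUserRepo:
    """Fake: a working, simplified repository backed by a dict."""

    def __init__(self):
        self._by_email = {}

    def add(self, email, name):
        if email in self._by_email:
            raise ValueError(f"duplicate email: {email}")
        self._by_email[email] = name

    def find(self, email):
        return self._by_email.get(email)


def register(repo, mailer, email, name):
    """Code under test: store the user, then send a welcome mail."""
    repo.add(email, name)
    mailer.send(to=email, subject="Welcome")


def test_register_with_fake_repo():
    repo, mailer = InMemoryUserRepo(), Mock()
    register(repo, mailer, "a@example.com", "Ada")
    # State-based assertion: survives changes in how register() talks to the repo.
    assert repo.find("a@example.com") == "Ada"
    # The fake also enforces a real-ish rule (uniqueness) that a mock would not.
    try:
        register(repo, mailer, "a@example.com", "Ada again")
    except ValueError:
        pass
    else:
        raise AssertionError("expected the duplicate email to be rejected")


def test_register_with_mocks():
    repo, mailer = Mock(), Mock()
    register(repo, mailer, "a@example.com", "Ada")
    # Interaction-based assertions: these pin the exact calls, so they break
    # if register() changes its call signature even when behavior is the same.
    repo.add.assert_called_once_with("a@example.com", "Ada")
    mailer.send.assert_called_once_with(to="a@example.com", subject="Welcome")
```

The mock remains a reasonable choice for the mailer, where the interaction itself (one mail, right recipient) is the behavior being specified.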

Test-driven development (TDD): Red → Green → Refactor.

  1. Write a failing test capturing the next behavior.
  2. Write minimum code to pass.
  3. Refactor without changing behavior.

Benefits: forces you to design from the caller’s perspective, leaves a test behind for every behavior you add, and gives confidence to refactor.
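One pass through the cycle might look like the sketch below. The `slugify` function and its requirements are hypothetical, chosen only to show the order in which the pieces get written.

```python
import re


# Step 1 (red): this test is written first. It fails until slugify() exists
# and produces the expected result.
def test_slugify_lowercases_and_joins_words_with_hyphens():
    assert slugify("Hello World") == "hello-world"


# The next behavior gets its own failing test, added only once the first is green.
def test_slugify_drops_punctuation():
    assert slugify("Hello, World!") == "hello-world"


# Step 2 (green): the first passing version could be as crude as
#     return title.lower().replace(" ", "-")
# which satisfies the first test but not the second.
# Step 3 (refactor): with both tests in place, the implementation was reshaped
# into the version below without changing what the tests observe.
def slugify(title):
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)
```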

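Coverage

  • Statement / line coverage = baseline metric.
  • Branch coverage = better.
  • 100% is not a goal: returns diminish. Target roughly 80%, with judgment about which code matters most.
  • Mutation score is more meaningful than coverage: it measures whether tests fail when the code is wrong, not just whether the code ran.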
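A small worked example shows why mutation score says more than coverage. The mutant here is written by hand for illustration; tools such as mutmut generate such changes automatically. All names are hypothetical.

```python
def is_adult(age):
    return age >= 18


def is_adult_mutant(age):
    # The kind of edit a mutation tool makes: ">=" becomes ">".
    return age > 18


def weak_suite(fn):
    """Runs every line of fn (100% line coverage) but never tests the boundary."""
    return fn(30) is True and fn(5) is False


def strong_suite(fn):
    """Adds the boundary value, the only input where the mutant differs."""
    return weak_suite(fn) and fn(18) is True


def mutation_score(suite, original, mutants):
    """Fraction of mutants killed, i.e. on which the suite fails."""
    assert suite(original), "the suite must pass on the unmodified code"
    killed = sum(1 for mutant in mutants if not suite(mutant))
    return killed / len(mutants)
```

Both suites reach full line coverage of `is_adult`, yet `weak_suite` scores 0.0 against this mutant while `strong_suite` scores 1.0.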

Test design and hygiene

  • Arrange / Act / Assert (AAA) or Given / When / Then.
  • Descriptive name: creates_user_when_email_is_valid, not test1.
  • One concept per test.
  • Independent: any order, any subset.
  • Test fixtures: factories for realistic data (factory_bot, faker).
  • Snapshot tests: serialize output, diff against a committed snapshot. Useful for HTML, large JSON, GraphQL responses.
  • Golden file tests for ETL / data pipelines.
  • Test parallelism: separate DB schemas/transactions per worker.
  • Flaky test policy: quarantine, fix, then re-add. Don’t ignore flaky tests.
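Several of the points above (AAA layout, a descriptive name, one concept, independence, a factory for fixtures) fit in one short sketch. `User`, `make_user`, and `deactivate` are hypothetical names; the factory is hand-written rather than taken from a fixture library.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class User:
    id: int
    email: str
    active: bool = True


_ids = count(1)


def make_user(**overrides):
    """Test factory: valid defaults; a test overrides only what it cares about."""
    user_id = next(_ids)
    fields = {"id": user_id, "email": f"user{user_id}@example.com", "active": True}
    fields.update(overrides)
    return User(**fields)


def deactivate(user):
    """Code under test: return a deactivated copy of the user."""
    return User(id=user.id, email=user.email, active=False)


def test_deactivate_marks_user_inactive_and_keeps_email():
    # Arrange: build exactly the data this test needs, nothing shared.
    user = make_user(email="ada@example.com")

    # Act: one call to the behavior under test.
    result = deactivate(user)

    # Assert: both checks describe the same concept, a correct deactivation.
    assert result.active is False
    assert result.email == "ada@example.com"
```

Because each test builds its own data through the factory, it can run alone, in any order, or in parallel with others.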

CI

  • Run unit and most integration tests on every push.
  • Run E2E and heavy load tests on the main branch or nightly.
  • Fail fast: lint → unit → integration → e2e.
  • Cache dependencies for speed.
  • Test against the same DB version as prod.
  • Keep failures reproducible locally: containerized DB, and the same test commands as CI.
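The fail-fast ordering can be sketched as a small stage runner. This is an illustration of the idea, not a replacement for a CI system's own configuration; the stage commands below are placeholders that only simulate an exit code, where a real pipeline would invoke its linter and test runner.

```python
import subprocess
import sys


def run_stages(stages):
    """Run (name, argv) pairs in order.

    Returns the name of the first failing stage, or None if all pass.
    """
    for name, argv in stages:
        print(f"== {name} ==", flush=True)
        if subprocess.run(argv).returncode != 0:
            return name  # fail fast: later, slower stages never start
    return None


def _py(code):
    # Placeholder command that runs a one-line Python snippet.
    return [sys.executable, "-c", code]


# Cheapest checks first, so most failures surface within seconds.
DEMO_STAGES = [
    ("lint", _py("raise SystemExit(0)")),
    ("unit", _py("raise SystemExit(1)")),  # simulated failure
    ("integration", _py("raise SystemExit(0)")),
    ("e2e", _py("raise SystemExit(0)")),
]
```

With `DEMO_STAGES`, `run_stages` stops at the unit stage and the integration and e2e stages are never launched. Running the same script locally and in CI is one way to keep the test commands aligned.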