Testing Strategy — Basics
The test pyramid
        /\          E2E (few, slow, brittle)
       /  \
      /----\        Integration / Contract (some)
     /------\
    /--------\      Unit (many, fast, isolated)

Idea: many fast unit tests at the base, fewer mid-level integration tests, very few slow E2E tests at the top.
Modern variant: Honeycomb / Trophy. Both put more weight on integration tests than the pyramid suggests, because in microservices much of the logic lives at service boundaries. (The “Testing Trophy” is Kent C. Dodds’ model.)
Test types
Unit
- One function/class in isolation. No I/O.
- Fast (ms). Many. Run on every save.
- Mock external collaborators if needed; prefer pure logic.
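A minimal sketch of what "pure logic, no I/O" looks like in practice. The function `apply_discount` and its rules are hypothetical, chosen only for illustration; the tests are plain assert-based functions of the kind a runner such as pytest would collect.

```python
from decimal import Decimal


def apply_discount(price: Decimal, percent: int) -> Decimal:
    """Pure function (hypothetical example): no I/O, no clock, no globals."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return (price * (100 - percent) / 100).quantize(Decimal("0.01"))


def test_apply_discount_reduces_price_by_percentage():
    assert apply_discount(Decimal("200.00"), 25) == Decimal("150.00")


def test_apply_discount_rejects_percent_above_100():
    try:
        apply_discount(Decimal("10.00"), 101)
    except ValueError:
        return
    raise AssertionError("expected ValueError")
```

Because the function takes everything it needs as arguments, no test double is required at all, which is usually the cheapest kind of unit test to maintain.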
Integration
- Multiple modules working together. Hits real DB / Redis / queue (often via Testcontainers).
- Slower (100ms-1s). Catches wiring, schema, transaction bugs.
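The sketch below shows the kind of bug an integration test catches: a uniqueness rule that lives in the schema, not in application code. `UserRepository` and its table are hypothetical. It uses the standard library's in-memory SQLite only so that it runs without setup; in a real suite the connection would more likely point at a containerized instance of the production database engine.

```python
import sqlite3


class UserRepository:
    """Hypothetical repository; the SQL and schema are illustrative."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn

    def create_schema(self) -> None:
        self.conn.execute(
            "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL UNIQUE)"
        )

    def add(self, email: str) -> int:
        with self.conn:  # commits on success, rolls back on exception
            cur = self.conn.execute("INSERT INTO users (email) VALUES (?)", (email,))
        return cur.lastrowid

    def count(self) -> int:
        return self.conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]


def make_repo() -> UserRepository:
    # Each test gets its own database, so tests stay independent.
    repo = UserRepository(sqlite3.connect(":memory:"))
    repo.create_schema()
    return repo


def test_duplicate_email_is_rejected_by_the_schema():
    repo = make_repo()
    repo.add("a@example.com")
    try:
        repo.add("a@example.com")
    except sqlite3.IntegrityError:
        pass
    else:
        raise AssertionError("expected IntegrityError")
    # The failed insert was rolled back, so only one row exists.
    assert repo.count() == 1
```

A unit test with a mocked connection would pass even if the `UNIQUE` constraint were missing; only a test that runs real SQL exercises the wiring between code and schema.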
Contract (consumer-driven)
- Verifies that a producer satisfies a consumer’s expected request/response shape, without running both services at the same time.
- Tools: Pact, Spring Cloud Contract.
- Critical in microservices to avoid breaking downstream consumers.
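To make the idea concrete, here is a hand-rolled sketch of a consumer-driven contract check. It is not how Pact or Spring Cloud Contract work internally. The contract format, `ORDER_CONTRACT`, `violations`, and `get_order` are all assumptions made for illustration.

```python
# Contract published by the consumer: the fields it reads and their types.
# (Hypothetical format; real tools typically share this as a generated file.)
ORDER_CONTRACT = {"id": int, "status": str, "total_cents": int}


def violations(contract: dict, response: dict) -> list[str]:
    """Return the ways a response fails the consumer's expectations."""
    problems = []
    for field, expected_type in contract.items():
        if field not in response:
            problems.append(f"missing field: {field}")
        elif not isinstance(response[field], expected_type):
            problems.append(f"wrong type for {field}: {type(response[field]).__name__}")
    return problems  # extra fields are allowed; the consumer ignores them


def get_order(order_id: int) -> dict:
    """Stand-in for the provider's real handler (hypothetical)."""
    return {"id": order_id, "status": "paid", "total_cents": 1250, "currency": "EUR"}


def test_provider_satisfies_order_consumer_contract():
    # Runs in the provider's test suite; the consumer service is never started.
    assert violations(ORDER_CONTRACT, get_order(42)) == []
```

The design point is the direction of ownership: the consumer states what it needs, and the provider's build fails if a change would break that expectation.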
End-to-end (E2E)
- Full system, real or near-real environment. Often through UI.
- Slow, flaky. Use sparingly for golden paths.
- Tools: Playwright, Cypress, Selenium.
Other test types
- Smoke: quick post-deploy checks (“up?”).
- Acceptance / behavior: written in business language (Cucumber, Gherkin).
- Property-based: generate inputs, check invariants (fast-check, hypothesis, jqwik).
- Mutation testing: change the code, see whether the tests catch it (Stryker, mutmut, PIT).
- Load / performance: k6, Locust, JMeter, wrk.
- Chaos: inject failures (Chaos Mesh, Toxiproxy, Gremlin).
- Security: static analysis (SAST), dynamic analysis (DAST), dependency scanning (Snyk, Dependabot).
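As an illustration of the property-based idea from the list above, the sketch below generates random inputs and checks invariants instead of hand-picked cases. It deliberately uses only the standard library's `random` so it runs without dependencies; libraries such as hypothesis add smarter generation and shrinking of failing inputs. The run-length encoder is a hypothetical subject under test.

```python
import random


def rle_encode(s: str) -> list[tuple[str, int]]:
    """Run-length encode: 'aaab' -> [('a', 3), ('b', 1)]."""
    runs: list[tuple[str, int]] = []
    for ch in s:
        if runs and runs[-1][0] == ch:
            runs[-1] = (ch, runs[-1][1] + 1)
        else:
            runs.append((ch, 1))
    return runs


def rle_decode(runs: list[tuple[str, int]]) -> str:
    return "".join(ch * n for ch, n in runs)


def test_decode_inverts_encode_for_random_strings():
    rng = random.Random(1234)  # fixed seed keeps any failure reproducible
    for _ in range(500):
        # A two-letter alphabet makes repeated characters (the hard case) common.
        s = "".join(rng.choice("ab") for _ in range(rng.randint(0, 20)))
        runs = rle_encode(s)
        # Invariant 1: the round trip returns the input.
        assert rle_decode(runs) == s
        # Invariant 2: adjacent runs never share a character.
        assert all(a[0] != b[0] for a, b in zip(runs, runs[1:]))
```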
Test doubles (Meszaros taxonomy)
| Type | Purpose |
|---|---|
| Dummy | passed but not used (filler) |
| Stub | returns canned response |
| Spy | stub + records calls |
| Mock | preprogrammed expectations; verifies them |
| Fake | working impl, simplified (in-memory DB) |
Prefer fakes for collaborators when feasible (fast, real-ish behavior). Use mocks sparingly — they couple tests to implementation.
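The differences between the doubles are easier to see side by side. In this sketch every class and the function `notify_total` are hypothetical. Note that `unittest.mock.Mock` verifies expectations after the call instead of being preprogrammed with them, so in Meszaros's strict terms it sits between a spy and a mock.

```python
from unittest.mock import Mock


class StubRates:
    """Stub: returns a canned response and records nothing."""
    def rate(self, currency: str) -> float:
        return 2.0


class SpyMailer:
    """Spy: behaves like a stub and records calls for later inspection."""
    def __init__(self) -> None:
        self.sent: list[tuple[str, str]] = []

    def send(self, to: str, body: str) -> None:
        self.sent.append((to, body))


class FakeUserStore:
    """Fake: a working but simplified implementation (dict instead of a DB)."""
    def __init__(self) -> None:
        self._emails: dict[int, str] = {}

    def save(self, user_id: int, email: str) -> None:
        self._emails[user_id] = email

    def email_of(self, user_id: int) -> str:
        return self._emails[user_id]


def notify_total(user_id, amount, currency, store, rates, mailer) -> None:
    """Hypothetical code under test: convert an amount and email the user."""
    total = amount * rates.rate(currency)
    mailer.send(store.email_of(user_id), f"Total: {total:.2f}")


def test_with_fake_stub_and_spy():
    store, mailer = FakeUserStore(), SpyMailer()
    store.save(1, "a@example.com")
    notify_total(1, 10, "EUR", store, StubRates(), mailer)
    assert mailer.sent == [("a@example.com", "Total: 20.00")]


def test_with_mock():
    store = FakeUserStore()
    store.save(1, "a@example.com")
    mailer = Mock()  # the assertion is on the interaction itself
    notify_total(1, 10, "EUR", store, StubRates(), mailer)
    mailer.send.assert_called_once_with("a@example.com", "Total: 20.00")
```

The mock-based test breaks if `notify_total` later switches to a different mailer method with the same observable effect, which is the coupling to implementation the paragraph above warns about.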
TDD (test-driven development)
Red → Green → Refactor.
- Write a failing test capturing the next behavior.
- Write minimum code to pass.
- Refactor without changing behavior.
Benefits: forces design from the caller’s perspective, keeps new code covered by tests as it is written, and gives confidence to refactor.
Coverage
- Statement / line coverage = baseline metric.
- Branch coverage = better.
- 100% is not a goal — diminishing returns. Target ~80% with judgment.
- Mutation score is more meaningful than coverage: it measures whether the tests fail when behavior changes, not just whether lines were executed.
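A small worked example of why full line coverage can still miss bugs. The mutant here is written by hand to mimic the kind of change a mutation tool applies automatically; all names are hypothetical.

```python
def is_adult(age: int) -> bool:
    return age >= 18


def is_adult_mutant(age: int) -> bool:
    # The kind of change a mutation tool makes: >= becomes >.
    return age > 18


def weak_suite(fn) -> bool:
    """Executes every line of fn (100% line coverage) but skips the boundary."""
    return fn(30) is True and fn(5) is False


def strong_suite(fn) -> bool:
    """Adds the boundary case, the only input that distinguishes >= from >."""
    return weak_suite(fn) and fn(18) is True


# Both suites pass on the real code.
assert weak_suite(is_adult) and strong_suite(is_adult)
# The weak suite also passes on the mutant: the mutant "survives".
assert weak_suite(is_adult_mutant)
# The strong suite fails on the mutant: the mutant is "killed".
assert not strong_suite(is_adult_mutant)
```

Both suites report identical line coverage; only the mutation result tells them apart.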
Test naming & structure
- Arrange / Act / Assert (AAA) or Given / When / Then.
- Descriptive name: `creates_user_when_email_is_valid`, not `test1`.
- One concept per test.
- Independent — any order, any subset.
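The points above can be sketched in a single test. `Cart` is a hypothetical class under test; the comments mark the three phases.

```python
class Cart:
    """Hypothetical class under test."""

    def __init__(self) -> None:
        self._items: dict[str, int] = {}

    def add(self, sku: str, qty: int = 1) -> None:
        self._items[sku] = self._items.get(sku, 0) + qty

    def total_items(self) -> int:
        return sum(self._items.values())


def test_adding_same_sku_twice_accumulates_quantity():
    # Arrange: the test builds its own cart, so it can run in any order.
    cart = Cart()
    cart.add("book")

    # Act: exactly one action under test.
    cart.add("book", qty=2)

    # Assert: one concept, the accumulated quantity.
    assert cart.total_items() == 3
```

When this test fails, its name alone says which behavior broke, without anyone having to read the body.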
Production-grade tactics
- Test fixtures: factories for realistic data (factory_bot, faker).
- Snapshot tests: serialize output, diff against committed snapshot. Useful for HTML, large JSON, GraphQL responses.
- Golden file tests for ETL / data pipelines.
- Test parallelism: separate DB schemas/transactions per worker.
- Flaky test policy: quarantine, fix, re-add. Don’t ignore.
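A minimal sketch of the snapshot technique from the list above, assuming JSON-serializable output. The helper `assert_matches_snapshot` and its record-on-first-run behavior are assumptions for illustration, not the API of any particular snapshot library.

```python
import json
import tempfile
from pathlib import Path


def assert_matches_snapshot(value, path: Path, update: bool = False) -> None:
    """Compare a JSON-serializable value with a stored snapshot file."""
    # sort_keys and indent keep the file stable and the diff readable.
    actual = json.dumps(value, indent=2, sort_keys=True) + "\n"
    if update or not path.exists():
        path.write_text(actual, encoding="utf-8")  # record; review before committing
        return
    expected = path.read_text(encoding="utf-8")
    assert actual == expected, f"snapshot mismatch for {path.name}"


def test_user_payload_matches_snapshot():
    # A temporary directory stands in for the repository's snapshot folder so
    # the sketch is self-contained; a real suite would use a committed path.
    with tempfile.TemporaryDirectory() as tmp:
        snapshot = Path(tmp) / "user_payload.json"
        assert_matches_snapshot({"id": 1, "name": "Ada"}, snapshot)  # records
        assert_matches_snapshot({"name": "Ada", "id": 1}, snapshot)  # key order is irrelevant
```

Serializing deterministically matters more than the comparison itself: unsorted keys or embedded timestamps are a common source of the flaky tests the policy above is meant to catch.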
CI considerations
- Run unit + most integration on every push.
- Run E2E + heavy load tests on main branch / nightly.
- Fail fast: lint → unit → integration → e2e.
- Cache deps for speed.
- Test against the same DB version as prod.
- Reproduce locally: run a containerized DB and keep local test commands aligned with the ones CI runs.