GraphQL — Theory (interview deep-dive)

The N+1 problem

Resolvers run independently per field. Consider a query like:

query { posts { id author { name } } }

If posts returns 100 items, the author resolver runs 100 times, which naively means 100 separate DB queries on top of the one that fetched the posts. Classic N+1.

Solution: DataLoader

Batches calls made within a single tick of the event loop.
Caches per-request.
Each resolver calls loader.load(id) returning a Promise; loader collects all ids, makes ONE call, distributes results.

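import DataLoader from "dataloader";

// The batch function must return one result per id, in the same order as `ids`.
const userLoader = new DataLoader(async (ids: readonly string[]) => {
  const users = await db.users.findMany({ id: { in: [...ids] } });
  const byId = new Map(users.map(u => [u.id, u] as const)); // O(1) lookup per id
  return ids.map(id => byId.get(id) ?? null);
});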

// Resolver
author: (parent, _, ctx) => ctx.loaders.user.load(parent.authorId),

Always create DataLoader instances per request, in the context factory. Never reuse them across requests: the cache would leak data between users.
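To make the batching mechanism concrete, here is a minimal sketch of a loader that collects keys and flushes them in one call. It is a simplified stand-in, not the real DataLoader library: it schedules the flush with a microtask, and the names TinyLoader, usersTable, batchCalls and createContext are all hypothetical.

```typescript
// Simplified stand-in for DataLoader's batching and per-instance cache; NOT the real library.
type BatchFn<K, V> = (keys: readonly K[]) => Promise<(V | null)[]>;
type Pending<K, V> = {
  key: K;
  resolve: (value: V | null) => void;
  reject: (reason: unknown) => void;
};

class TinyLoader<K, V> {
  private batchFn: BatchFn<K, V>;
  private queue: Pending<K, V>[] = [];
  private cache = new Map<K, Promise<V | null>>();

  constructor(batchFn: BatchFn<K, V>) {
    this.batchFn = batchFn;
  }

  load(key: K): Promise<V | null> {
    const cached = this.cache.get(key);
    if (cached) return cached; // same key requested twice: reuse the promise

    const promise = new Promise<V | null>((resolve, reject) => {
      // The first key queued schedules a single flush for all keys queued synchronously.
      if (this.queue.length === 0) Promise.resolve().then(() => this.flush());
      this.queue.push({ key, resolve, reject });
    });
    this.cache.set(key, promise);
    return promise;
  }

  private async flush(): Promise<void> {
    const batch = this.queue;
    this.queue = [];
    try {
      const values = await this.batchFn(batch.map((item) => item.key));
      batch.forEach((item, i) => item.resolve(values[i] ?? null));
    } catch (err) {
      batch.forEach((item) => item.reject(err));
    }
  }
}

// Hypothetical in-memory "table" plus a counter to observe how many batch calls happen.
type User = { id: string; name: string };
const usersTable: User[] = [
  { id: "1", name: "Ada" },
  { id: "2", name: "Linus" },
];
let batchCalls = 0;

// Called once per incoming request, so every request gets fresh loaders and a fresh cache.
function createContext() {
  return {
    loaders: {
      user: new TinyLoader<string, User>(async (ids) => {
        batchCalls += 1;
        const byId = new Map(usersTable.map((u) => [u.id, u] as const));
        return ids.map((id) => byId.get(id) ?? null);
      }),
    },
  };
}
```

Three load calls made in the same tick (two of them for the same id) should reach the batch function once with two keys. A second call to createContext() starts with an empty cache, which is what keeps one user's data out of another user's request.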

Query complexity / depth limiting

GraphQL allows arbitrarily deep queries, whether hostile or accidental:

query { user { posts { author { posts { author { posts { ... } } } } } } }

Mitigations:

Depth limit (e.g., 7).
Query complexity scoring — assign cost per field/list, reject above threshold.
Persisted queries — only allow pre-registered queries from clients.
Cost analysis libs: graphql-cost-analysis, GraphQL Armor (@escape.tech/graphql-armor).
Rate limit by calculated query cost rather than by request count (Shopify’s approach).
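The depth and complexity checks above can be sketched on a simplified selection tree. This is only an illustration under stated assumptions: a real server would walk the parsed GraphQL AST, and the Selection shape, the listSize hint and the scoring rule (1 point per field, multiplied by the assumed size of enclosing lists) are all hypothetical choices.

```typescript
// Simplified selection tree; a stand-in for the parsed query AST.
interface Selection {
  name: string;
  listSize?: number; // assumed page size when the field returns a list
  children?: Selection[];
}

// Depth = length of the longest chain of nested fields.
function depthOf(sel: Selection): number {
  const childDepths = (sel.children ?? []).map(depthOf);
  return 1 + Math.max(0, ...childDepths);
}

// Cost = 1 per field; children of a list field are counted once per assumed list item.
function costOf(sel: Selection): number {
  const childCost = (sel.children ?? []).reduce((sum, child) => sum + costOf(child), 0);
  return 1 + (sel.listSize ?? 1) * childCost;
}

// Returns the list of violated limits; an empty list means the query is accepted.
function validateQuery(root: Selection, maxDepth: number, maxCost: number): string[] {
  const problems: string[] = [];
  const depth = depthOf(root);
  const cost = costOf(root);
  if (depth > maxDepth) problems.push(`depth ${depth} exceeds limit ${maxDepth}`);
  if (cost > maxCost) problems.push(`cost ${cost} exceeds limit ${maxCost}`);
  return problems;
}

// Models: query { user { posts { author { posts { id } } } } }
const nestedQuery: Selection = {
  name: "user",
  children: [
    {
      name: "posts",
      listSize: 10,
      children: [
        {
          name: "author",
          children: [{ name: "posts", listSize: 10, children: [{ name: "id" }] }],
        },
      ],
    },
  ],
};
```

With these assumptions the sample query has depth 5 and cost 122, so it passes a depth limit of 7 but is rejected by a cost limit of 100. That is why depth limits alone are usually not enough: nested lists multiply cost long before the depth looks suspicious.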

Caching

HTTP-level caching is hard because GraphQL requests are typically POSTs to a single URL.
Use persisted queries with GET + query hash → CDN cacheable.
Client-side normalized cache (Apollo / Relay) — caches by entity id.
Server-side response cache via @cacheControl(maxAge: 60) directive (Apollo).

Normalized caching needs a stable id per object; the Relay-style Node interface is the recommended convention:

interface Node { id: ID! }
type User implements Node { id: ID! ... }
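To show why a stable id matters for a client-side normalized cache, here is a minimal sketch of normalization. It is loosely modeled on how clients such as Apollo key entities by typename and id, but it is not their implementation; the normalize function and the store shape are illustrative assumptions.

```typescript
type Entity = { __typename: string; id: string; [field: string]: unknown };
type Store = Map<string, Record<string, unknown>>;

function isEntity(value: object): value is Entity {
  const candidate = value as Entity;
  return typeof candidate.__typename === "string" && typeof candidate.id === "string";
}

// Flattens a response into a store keyed by "Typename:id".
// Nested entities are replaced by { __ref: key } so each object is stored exactly once.
function normalize(value: unknown, store: Store): unknown {
  if (Array.isArray(value)) return value.map((item) => normalize(item, store));
  if (typeof value !== "object" || value === null) return value;

  const fields: Record<string, unknown> = {};
  for (const [name, fieldValue] of Object.entries(value)) {
    fields[name] = normalize(fieldValue, store);
  }

  if (!isEntity(value)) return fields; // no identity: keep it inline
  const key = `${value.__typename}:${value.id}`;
  store.set(key, { ...(store.get(key) ?? {}), ...fields }); // merge with fields cached earlier
  return { __ref: key };
}

// Two responses that mention the same user end up sharing one cache entry.
const store: Store = new Map();
normalize(
  {
    __typename: "Post",
    id: "p1",
    title: "Hello",
    author: { __typename: "User", id: "u1", name: "Ada" },
  },
  store,
);
normalize({ __typename: "User", id: "u1", name: "Ada Lovelace" }, store);
```

After the second response, the entry User:u1 holds the updated name and the post still points to it by reference, so any view reading the post's author sees the new value. Without a stable id the cache has no key to merge on.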

Mutations

Conventionally return a payload type, not just the modified entity:

type CreatePostPayload {
  post: Post
  errors: [UserError!]!
}

Errors-as-data pattern: domain errors (validation, business rules) go in the payload’s errors field; only system failures surface in the top-level errors array.
Mutations are not automatically idempotent — same as POST. Use idempotency keys for retries.
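A minimal sketch of both ideas together, assuming an in-memory store and a client-supplied idempotencyKey argument (both hypothetical; a real service would persist keys in a database, usually with an expiry):

```typescript
type Post = { id: string; title: string };
type UserError = { field: string; message: string };
type CreatePostPayload = { post: Post | null; errors: UserError[] };

// Hypothetical in-memory stores standing in for a database.
const savedPosts: Post[] = [];
const resultsByKey = new Map<string, CreatePostPayload>();

function createPost(input: { title: string; idempotencyKey: string }): CreatePostPayload {
  // A retry with the same key returns the original result instead of creating a duplicate.
  const previous = resultsByKey.get(input.idempotencyKey);
  if (previous) return previous;

  // Domain error: returned as data in the payload, not thrown.
  if (input.title.trim() === "") {
    return { post: null, errors: [{ field: "title", message: "Title must not be empty" }] };
  }

  const post: Post = { id: String(savedPosts.length + 1), title: input.title };
  savedPosts.push(post);
  const payload: CreatePostPayload = { post, errors: [] };
  resultsByKey.set(input.idempotencyKey, payload);
  return payload;
}
```

Calling createPost twice with the same key creates one post and returns the same payload both times. In this sketch validation failures are not stored under the key, so a client can correct the input and retry with the same key; whether that is desirable is a design choice.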

Subscriptions

Two common transports:
- graphql-ws (WebSocket; modern, replaces the deprecated subscriptions-transport-ws).
- SSE (server-sent events): simpler, one-way, runs over regular HTTP.
Backed by a pub/sub layer (Redis, Kafka, RabbitMQ) so events reach whichever server instance holds the connection.
Subscription resolvers return async iterators.
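As a rough sketch of what "return an async iterator" means, the block below implements a tiny in-process pub/sub whose subscribe method is an async generator. TinyPubSub, postEvents and the postCreated field are hypothetical names, and a production setup would normally use a pub/sub library backed by one of the brokers above rather than this class.

```typescript
// Minimal in-memory pub/sub; a single-process stand-in for Redis, Kafka or RabbitMQ.
class TinyPubSub<T> {
  private listeners = new Set<(value: T) => void>();

  publish(value: T): void {
    for (const listener of this.listeners) listener(value);
  }

  // Async generator: yields values published after iteration has started
  // (that is, after the first next() call).
  async *subscribe(): AsyncGenerator<T, void, undefined> {
    const buffer: T[] = [];
    let wake: (() => void) | null = null;
    const listener = (value: T) => {
      buffer.push(value);
      wake?.();
    };
    this.listeners.add(listener);
    try {
      while (true) {
        if (buffer.length === 0) {
          // Park until the next publish wakes us up.
          await new Promise<void>((resolve) => {
            wake = resolve;
          });
          wake = null;
        }
        yield buffer.shift() as T;
      }
    } finally {
      this.listeners.delete(listener); // runs when the consumer stops iterating
    }
  }
}

const postEvents = new TinyPubSub<string>();

// Resolver-map shape used by schema-first servers: subscribe returns the iterator,
// resolve maps each published payload to the field's value.
const subscriptionResolvers = {
  Subscription: {
    postCreated: {
      subscribe: () => postEvents.subscribe(),
      resolve: (title: string) => title,
    },
  },
};
```

Somewhere else in the server, a mutation would call postEvents.publish(...) after writing to the database, and every active iterator would yield that value to its client.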

Federation (Apollo)

Compose subgraphs into one supergraph.

Each service owns its types and fields. Fields can be extended across services.
Gateway parses query, plans which subgraphs to call, merges results.
Use @key, @external, @requires, @provides directives.
Alternatives: schema stitching (older), GraphQL Mesh, Hive.

When to use: many backend teams that need to expose a single GraphQL surface. Federation adds real operational complexity, so for 2-3 services it is usually not worth it.
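For orientation, this is roughly what two subgraphs sharing an entity look like, written as TypeScript constants. It is a sketch, not a complete setup: the schemas omit the federation link/import boilerplate, the users and reviews subgraph names and the data are hypothetical, and the resolver map assumes a server library that supports the __resolveReference convention.

```typescript
// Subgraph "users": owns User and its core fields.
const usersTypeDefs = /* GraphQL */ `
  type User @key(fields: "id") {
    id: ID!
    name: String!
  }
`;

// Subgraph "reviews": contributes a field to User, identified by the same key.
const reviewsTypeDefs = /* GraphQL */ `
  type Review {
    id: ID!
    body: String!
  }

  type User @key(fields: "id") {
    id: ID!
    reviews: [Review!]!
  }
`;

type UserRef = { __typename: "User"; id: string };
const userRows = [{ id: "1", name: "Ada" }]; // hypothetical data

const usersResolvers = {
  User: {
    // Invoked when the gateway hands this subgraph a { __typename, id } reference
    // that originated in another subgraph's response.
    __resolveReference(ref: UserRef) {
      return userRows.find((user) => user.id === ref.id) ?? null;
    },
  },
};
```

For a query that asks for a review's author name, the gateway would first call the reviews subgraph, collect the User keys it returned, then ask the users subgraph to resolve those references and merge the results.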

Authorization patterns

Field-level auth via directives: @auth(requires: ADMIN).
Type-level in resolver: check ctx.user.role.
Per-row: filter at data source layer (DB row policies, ORM scopes).
Don’t rely on the client to omit fields; the server must enforce access.

Risk: a sensitive field may be reachable through an indirect path (for example, via a relation on another type). Audit the schema.
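One way to make field-level checks hard to bypass is to wrap the resolver of the sensitive field itself, so every path that reaches the field goes through the check. A minimal sketch, assuming a context that carries the authenticated user; requireRole and the types below are hypothetical helpers, not part of any library:

```typescript
type Role = "USER" | "ADMIN";
type Context = { user: { id: string; role: Role } | null };
type Resolver<P, R> = (parent: P, args: Record<string, unknown>, ctx: Context) => R;

// Wraps a resolver so the role check runs on the server before any data is returned.
function requireRole<P, R>(role: Role, resolve: Resolver<P, R>): Resolver<P, R> {
  return (parent, args, ctx) => {
    if (!ctx.user || ctx.user.role !== role) {
      throw new Error(`Forbidden: requires ${role}`);
    }
    return resolve(parent, args, ctx);
  };
}

type UserRow = { id: string; name: string; email: string };

const resolveName: Resolver<UserRow, string> = (parent) => parent.name;

const userFieldResolvers = {
  name: resolveName,
  // email is sensitive: the guard sits on the field, so it applies whether the User
  // was reached via Query.user, Post.author, or any other relation.
  email: requireRole<UserRow, string>("ADMIN", (parent) => parent.email),
};
```

A directive such as @auth(requires: ADMIN) typically ends up doing the same thing: a schema transform wraps the annotated field's resolver with a check like this one.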

Type-system “gotchas”

Nullability: making every field nullable forces clients to be defensive everywhere, but non-null cascades: one error in a non-null field nulls the whole branch up to the nearest nullable ancestor. Default to nullable for top-level fields, non-null for required scalars.
Interfaces vs unions: an interface declares shared fields that every implementing type must have; a union is “this or that”, with no shared fields required.
Input types must be declared with the input keyword; object types cannot be used as arguments.
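The non-null cascade is easier to see in a small simulation. The sketch below models only objects and scalars (no lists) and is not how any server implements it; FieldDef, BUBBLE and complete are hypothetical names chosen for illustration.

```typescript
// Simplified model of a field: is it non-null, and does it have sub-fields?
interface FieldDef {
  nonNull: boolean;
  fields?: Record<string, FieldDef>;
}

// Sentinel meaning "a non-null field was null and the null must propagate upward".
const BUBBLE = Symbol("null-bubble");

function complete(def: FieldDef, value: unknown, path: string, errors: string[]): unknown {
  if (value === null || value === undefined) {
    if (!def.nonNull) return null;
    errors.push(`Cannot return null for non-null field ${path}`);
    return BUBBLE;
  }
  if (!def.fields) return value; // scalar leaf

  const source = value as Record<string, unknown>;
  const result: Record<string, unknown> = {};
  for (const [name, child] of Object.entries(def.fields)) {
    const completed = complete(child, source[name], `${path}.${name}`, errors);
    // A child could not be null: this object becomes null if it is nullable,
    // otherwise the null keeps bubbling to its own parent.
    if (completed === BUBBLE) return def.nonNull ? BUBBLE : null;
    result[name] = completed;
  }
  return result;
}

// Models: user: User (nullable) with name: String!, profile: Profile!, and bio: String!
const strictUserDef: FieldDef = {
  nonNull: false,
  fields: {
    name: { nonNull: true },
    profile: { nonNull: true, fields: { bio: { nonNull: true } } },
  },
};

const fieldErrors: string[] = [];
const strictResult = complete(
  strictUserDef,
  { name: "Ada", profile: { bio: null } },
  "user",
  fieldErrors,
);
```

Here a single missing bio wipes out the whole user, because profile and bio are both non-null and user is the nearest nullable ancestor. If profile were nullable instead, only profile would become null and name would still reach the client.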

Performance traps

Over-fetching internally — resolver fetches whole document but client requested 2 fields. Pass requested fields down (info.fieldNodes) or use field-aware ORM.
Hot path resolvers — running multiple loaders sequentially when parallel is possible. Use Promise.all.
Bulky payloads: mitigate with pagination, fragments, and projection.
Resolvers awaiting inside loops, which serializes work that could be batched or run concurrently.
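The sequential-versus-parallel trap is easy to show side by side. In this sketch fetchAuthor and fetchComments are hypothetical data-source calls that record when they are started, which makes the difference observable without timers.

```typescript
// Hypothetical async fetchers; `started` records the order in which calls begin.
const started: string[] = [];

function fetchAuthor(id: string): Promise<{ id: string }> {
  started.push(`author:${id}`);
  return Promise.resolve({ id });
}

function fetchComments(postId: string): Promise<string[]> {
  started.push(`comments:${postId}`);
  return Promise.resolve([`first comment on ${postId}`]);
}

type PostRow = { id: string; authorId: string };

// Sequential: the second fetch does not start until the first has resolved.
async function postDetailsSequential(post: PostRow) {
  const author = await fetchAuthor(post.authorId);
  const comments = await fetchComments(post.id);
  return { author, comments };
}

// Parallel: both fetches start immediately; latency is roughly the slower of the two.
async function postDetailsParallel(post: PostRow) {
  const [author, comments] = await Promise.all([
    fetchAuthor(post.authorId),
    fetchComments(post.id),
  ]);
  return { author, comments };
}

// Same idea for loops: map to promises first, then await them together,
// instead of awaiting inside a for loop.
async function authorsFor(postRows: PostRow[]) {
  return Promise.all(postRows.map((post) => fetchAuthor(post.authorId)));
}
```

Promise.all only helps when the calls are independent. If the second call needs the result of the first, the sequential form is correct and the fix lies elsewhere, for example in batching with a loader.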

Common interview questions

N+1 — explain and solve. DataLoader batching/caching per request.
How does GraphQL caching differ from REST? No URL-based HTTP cache; client-side normalized cache by entity id; persisted queries enable HTTP/CDN caching.
When NOT to use GraphQL? Single client + simple CRUD. Public APIs needing HTTP semantics. File uploads.
How to prevent denial-of-service via deep/complex queries? Depth limit, complexity scoring, persisted queries, rate-limit by cost.
Where does authorization live? Field-level directives or central middleware that inspects info.fieldNodes; data-layer enforcement.
Mutations vs queries: what’s the actual difference? Top-level mutation fields execute serially, in order; query fields may execute in parallel. Both can return data.
How would you test resolvers? Unit-test resolver functions with mocked context; integration-test via real schema with graphql.execute.
What is a schema registry, and why use one? It tracks schema versions, flags breaking changes, and records downstream consumers.
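For the resolver-testing question, a unit test with a mocked context might look like the sketch below. It uses no test framework, only thrown errors; mockContext and testAuthorResolver are hypothetical helpers, and the resolver mirrors the author resolver from the N+1 section.

```typescript
type User = { id: string; name: string };
type Ctx = { loaders: { user: { load(id: string): Promise<User | null> } } };

// Resolver under test.
const postResolvers = {
  author: (parent: { authorId: string }, _args: unknown, ctx: Ctx) =>
    ctx.loaders.user.load(parent.authorId),
};

// Hand-rolled mock context that records which ids were requested.
function mockContext(users: User[]) {
  const requested: string[] = [];
  const ctx: Ctx = {
    loaders: {
      user: {
        load: async (id) => {
          requested.push(id);
          return users.find((user) => user.id === id) ?? null;
        },
      },
    },
  };
  return { ctx, requested };
}

async function testAuthorResolver(): Promise<void> {
  const { ctx, requested } = mockContext([{ id: "u1", name: "Ada" }]);
  const author = await postResolvers.author({ authorId: "u1" }, {}, ctx);
  if (author === null || author.name !== "Ada") throw new Error("expected Ada");
  if (requested.join(",") !== "u1") throw new Error("expected exactly one load, for u1");
}
```

Because the resolver only depends on its arguments, no schema or server is needed here. The integration test mentioned above is where schema wiring, directives and error formatting get exercised.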

REST vs GraphQL vs gRPC (one-liner each)

REST — resource oriented, HTTP semantics, cacheable.
GraphQL — flexible field selection, single endpoint, type-system contract.
gRPC — binary, fast, schema-first, ideal for internal service-to-service calls.