MongoDB — Theory
MongoDB — Theory (interview deep-dive)
Section titled “MongoDB — Theory (interview deep-dive)”When document model wins / loses
Section titled “When document model wins / loses”Wins:
- Polymorphic / sparse data — fields vary per record.
- Hierarchical data accessed together (orders w/ line items, blog posts w/ comments).
- High write throughput on simple CRUD.
- Rapid iteration when schema isn’t settled.
Loses:
- Complex multi-entity transactions and joins.
- Strong relational integrity / FK constraints.
- Reporting / analytics with arbitrary joins.
- When data fundamentally is relational.
Schema design — embed vs reference
Section titled “Schema design — embed vs reference”Embed when:
- Data accessed together always.
- Child cannot exist independently.
- Subdoc count is bounded.
- Read-heavy.
Reference when:
- Many-to-many relationships.
- Subdoc grows unbounded (comments → millions).
- Subdoc accessed independently.
- Write contention high (entire doc rewrites).
Patterns: subset pattern (embed N most recent, reference rest), bucket pattern (group time-series), computed pattern (precompute aggregates).
Storage engine — WiredTiger
Section titled “Storage engine — WiredTiger”- Default since 3.2. Document-level locking, snapshot isolation, MVCC.
- Compression: snappy (default), zlib, zstd.
- Cache: 50% of RAM minus 1GB.
- Checkpoints every 60s. Journal for durability between checkpoints.
Replica sets — internals
Section titled “Replica sets — internals”- Primary accepts writes, replicates to secondaries via oplog (capped collection in
localDB). - Secondaries pull oplog, apply ops idempotently.
- Election: secondary detects primary down → starts election → majority votes → new primary. Raft-like protocol since 3.2.
- Need majority of voting members alive to elect / accept writes.
- Read from secondary: eventual consistency unless using majority read concern.
Write concern & read concern
Section titled “Write concern & read concern”- Write concern
w: 'majority', j: true: ack only when majority replicated AND journaled. Production default for important data. - Write concern
w: 1: primary ack only. Risk: rolled back on failover. - Read concern
majority: read data acked by majority, can’t be rolled back. - Read concern
linearizable: wait until all writes seen by all members. Slow.
Sharding deep-dive
Section titled “Sharding deep-dive”- Each shard is a replica set.
- Chunks of contiguous shard key range, default 128MB. Balancer moves chunks.
- Choosing shard key:
- Cardinality: many distinct values.
- Frequency: low — no value dominates.
- Monotonicity: avoid (timestamps, ObjectId) → all writes hit one shard. Use hashed instead.
- Query alignment: queries should include shard key for routed (single-shard) queries; otherwise scatter-gather.
- Cannot change shard key easily (5.0+ refinable, but limited).
- Zone sharding: pin chunks to specific shards (e.g., regional data residency).
Indexes — gotchas
Section titled “Indexes — gotchas”- Index intersection rare; usually single index used per query.
- Compound index: leftmost-prefix wins.
{a:1, b:1, c:1}covers{a},{a,b},{a,b,c}, NOT{b}or{b,c}. - ESR rule: Equality, then Sort, then Range field order in compound.
- Multikey index can’t include another array field.
- Each write updates every index — too many = slow writes.
- Index builds:
background: true(default in 4.2+), or rolling builds in replica sets.
Aggregation tips
Section titled “Aggregation tips”- Push
$matchand$projectearly to reduce data through pipeline. $lookupcan be slow if foreign collection unindexed — index the joined field.- Use
$facetto compute multiple aggregations in one pass. - Allow disk use for big sorts:
{ allowDiskUse: true }. - View execution stats:
db.coll.aggregate([...]).explain('executionStats').
Transactions
Section titled “Transactions”- Multi-doc since 4.0 (replica) / 4.2 (sharded).
- Costly — runs at snapshot isolation, requires majority write concern, default 60s timeout.
- Best practice: design data to avoid needing them (single-doc atomic updates, embed).
- Retry on
TransientTransactionError.
Common interview Qs
Section titled “Common interview Qs”- When NOT to use Mongo? Heavy joins, strict ACID across collections, relational reporting.
- How do you model 1:N where N is unbounded (e.g. user → posts)? Reference, not embed.
- Why is
$skipfor pagination slow at large offsets? Linear scan. Use range queries on indexed sort key (keyset). - How does Mongo guarantee durability? Journal (write-ahead log), then checkpoint to data files.
- What’s a covered query? All fields needed are in index — no document fetch.
- Replica set with 3 nodes: 1 dies — still writable? Yes (2/3 majority). 2 die — read-only.
- How to do a zero-downtime schema change? Add new field, dual-write, backfill, switch reads, drop old.
- Sharding with date-based key — what’s wrong? Monotonic — hot shard. Use hashed.
- Why would
$lookupbe discouraged in OLTP path? Latency unpredictable; no index intersection. Better to embed or denormalize.
Compared to PostgreSQL
Section titled “Compared to PostgreSQL”| PostgreSQL | MongoDB | |
|---|---|---|
| Schema | strict, evolvable | flexible |
| Joins | native, fast | $lookup, slower |
| Transactions | mature, fast | mature 4.0+, costlier |
| Indexing | rich (GIN, partial, expr, BRIN) | rich (compound, multikey, text, geo) |
| Replication | streaming + logical | replica set, automatic failover |
| Sharding | manual / Citus | native |
| JSONB | indexable, queryable | native model |
PG with jsonb covers many Mongo use cases. Pick Mongo when document model + automatic sharding are first-class needs.
Deep dive — ACID in MongoDB
Section titled “Deep dive — ACID in MongoDB”MongoDB became ACID for single documents always, multi-document transactions on replica sets in 4.0 (June 2018), and distributed multi-document transactions on sharded clusters in 4.2.
Transactions tie to a session, default to a 60-second timeout, and prefer to be single-shard — cross-shard transactions use 2-phase commit and are materially slower. The recommended callback API (session.withTransaction) auto-retries TransientTransactionError and UnknownTransactionCommitResult.
Defaults are weak. MongoDB transactions inherit readConcern:"local" and w:1 from the operation. You must set readConcern:"snapshot" + writeConcern:"majority" per transaction for true snapshot isolation. Jepsen 4.2.6 (2020) found read-skew/G1c anomalies even then, later attributed to a retry bug (SERVER-48307).
Retryable writes (separate from transactions, default since 3.6) automatically retry single-document writes that hit network errors — they make at-most-once delivery practical on flaky networks.
Gotchas
- MongoDB transactions default to
readConcern:"local"+w:1— they ignore collection/db-level safety settings. Set them per transaction. readConcern:"snapshot"only gives true snapshot isolation when paired withwriteConcern:"majority".- Default Mongo transaction timeout is 60 seconds; long-running ones abort.
- Mongo cross-shard transactions are slow (2PC) — design schema so transactions are single-shard whenever possible.
Q: Is MongoDB ACID?
Yes, since 4.0 for replica-set multi-doc transactions and 4.2 for sharded clusters, but with caveats: weak defaults, snapshot isolation requires explicit snapshot + majority write concern, and Jepsen 4.2.6 found read-skew/G1c anomalies later attributed to a retry bug. Single-document writes have always been atomic.
Q: How would you implement money transfer in MongoDB?
Either single-document (embed both balances — fine for joint accounts), or a multi-doc transaction with readConcern:"snapshot", writeConcern:"majority", and the callback API for retries. Make sure both accounts live on the same shard (same shard-key prefix) to avoid cross-shard 2PC. For high throughput, prefer event-sourced ledgers in Postgres.
Sources: mongodb.com/docs transactions, Jepsen MongoDB 4.2.6.
Deep dive — ESR rule (Equality, Sort, Range)
Section titled “Deep dive — ESR rule (Equality, Sort, Range)”MongoDB’s compound-index design rule is ESR: list Equality predicates first, then Sort fields, then Range predicates.
Reasoning:
- Equality keys collapse the index scan to a contiguous range immediately.
- Sort keys placed after equality keys let the index serve the sort directly (no in-memory
SORTstage). - Range keys placed last let the scan terminate cleanly without breaking the sort.
Reverse the order and you either re-sort in memory or scan more index entries than necessary.
When $in is combined with .sort(), fewer than 201 elements are “exploded” into separate index scans merged via SORT_MERGE (equality-like for ESR); ≥201 elements falls back to a range-style scan and the sort is no longer index-served. $ne, $nin, $regex, $gt/$gte/$lt/$lte are ranges.
Index types
Section titled “Index types”| Type | Notes |
|---|---|
| Single-field | Default |
| Compound | Up to 32 fields, leftmost-prefix rule |
| Multikey | Created automatically when indexed field is an array. Compound index can contain at most one array field |
| Text | Full-text, one per collection |
| 2dsphere | Geospatial on GeoJSON |
Wildcard ({ "$**": 1 }) | Indexes all fields. Heavy on writes. Can’t sort, can’t be shard key |
| Hashed | Used for sharding to break monotonic-key hot spots. Only one hashed field per compound index |
Index intersection exists but the planner rarely picks it over a well-designed compound index — don’t rely on it.
Gotchas
- Putting a range field before sort = in-memory sort (and a 32 MB cap unless
allowDiskUse). - Compound indexes hold at most one array field — multikey can’t combine two arrays.
$inflips between equality and range at 201 elements — large$inlists silently degrade ESR plans.- Wildcard indexes don’t support sorting and can’t be used as shard keys; they’re heavy on writes.
- If you have both
{a:1}and{a:1, b:1}, drop{a:1}— the compound serves alla-prefix queries.
Q: Walk me through ESR with an example.
Query: find all orders for tenantId=42 with total > 100, sorted by createdAt. Index: { tenantId:1, createdAt:1, total:1 }. tenantId (equality) shrinks to one tenant’s keys; createdAt (sort) is already ordered in the index, so no in-memory sort; total (range) filters at scan time and stops cleanly. Reversing to {tenantId:1, total:1, createdAt:1} would force MongoDB to sort in memory.
Q: When does ESR break down?
When the range predicate is so selective it cuts the working set far more than the sort matters. Then ERS (range before sort) wins because you sort fewer documents — but you eat an in-memory sort. Verify with explain("executionStats") comparing totalKeysExamined, totalDocsExamined, and whether a SORT stage appears.
Sources: mongodb.com ESR rule, mongodb.com compound indexes.
Deep dive — aggregation pipeline
Section titled “Deep dive — aggregation pipeline”The aggregation pipeline is a sequence of stages where each stage’s output is the next stage’s input.
Common stages
Section titled “Common stages”| Stage | Purpose |
|---|---|
$match | Filter — push as early as possible to use indexes |
$project / $addFields / $set | Reshape/compute |
$group | Aggregate by key; needs all data in memory unless allowDiskUse:true (100 MB per-stage limit) |
$sort | Uses an index if first stage or follows index-using $match |
$limit / $skip | |
$lookup | Left-outer join from another collection |
$unwind | Deconstruct arrays into multiple documents |
$facet | Run several sub-pipelines on the same input — “search results + counts + facets” |
$bucket / $bucketAuto | Histograms |
$graphLookup | Recursive lookup (org chart) |
$merge / $out | Write results back to a collection. $merge upserts, $out replaces |
Pipeline optimization
Section titled “Pipeline optimization”The query planner does meaningful pipeline optimization:
- Reorders
$matchahead of$project/$addFieldswhen filters don’t depend on computed fields. - Moves
$matchahead of$sort. - Coalesces adjacent
$limits (takes the smaller), sums adjacent$skips. - Fuses
$sort + $limitinto a top-N with bounded memory. - Coalesces
$lookup + $unwind + $matchinto a single$lookupwith an inner pipeline to avoid materializing huge intermediate arrays.
Since 5.2 / 6.0 the slot-based execution engine (SBE) accelerates $group and certain $lookup shapes. The biggest practical pitfall is $lookup on an unindexed foreign key on the foreign collection — it becomes O(N×M).
Gotchas
$lookupwithout an index on the foreign field’s join key is O(N×M) — always index it.$unwindafter$lookupcan multiply documents enormously; use$lookup’s inline pipeline +$unwindimmediately so the planner can coalesce.$groupis memory-bound (100 MB per stage) — useallowDiskUse:truefor big cardinalities.$outreplaces the target collection atomically — wrong target = wiped collection. Prefer$mergefor incremental updates.$sortonly uses an index if it can run as the first stage or directly follows an index-using$match; once a$group/$project/$unwindhas run, sorts go to memory/disk.
Q: How do you make a slow $lookup fast?
Index the foreign collection on the join field (and any inner-pipeline filter fields, ESR-style). Push filtering into the $lookup’s pipeline so the planner coalesces with the following $unwind/$match. If the foreign collection is huge, consider denormalizing — embedding a snapshot of the joined fields and updating them on writes is the Mongo-idiomatic answer.
Q: Read this explain — what stages should I look for?
winningPlan.stage of IXSCAN (good) vs COLLSCAN (bad). totalKeysExamined close to nReturned (selective). No SORT stage if you indexed correctly. For $lookup, the new EQ_LOOKUP stage means SBE accelerated it. Watch executionTimeMillis per stage.
Sources: mongodb.com aggregation-pipeline-optimization, MongoDB University M121.
Deep dive — sharding details
Section titled “Deep dive — sharding details”MongoDB is built around the replica set: one primary, multiple secondaries, automatic failover via a Raft-like election (must have a majority of voting members up).
Sharding distributes a collection across shards by shard key — chosen up front and historically immutable, though 4.4 added refinable shard keys (you can append fields) and 5.0 added full resharding.
Three shard-key properties matter:
- High cardinality — else few chunks.
- Low frequency of any single value — else jumbo chunks.
- Non-monotonic — else the active chunk is always on one shard (hot shard).
Hashed shard keys solve the monotonic problem at the cost of range-query locality.
Gotchas
- Mongo shard-key choice is the single biggest performance decision and was historically immutable — 4.4 added refinable, 5.0 added resharding, but resharding is expensive on large collections.
- Monotonic shard keys (
_idObjectId, timestamps, sequences) cause hot-shard write bottlenecks — use hashed sharding or a compound key. - Reading from secondaries gives stale data; for read-your-writes, use
readConcern:"majority"andwriteConcern:"majority"together, or read from primary.
Q: I’ve sharded on _id (an ObjectId) and writes are bottlenecked on one shard. Why?
ObjectId is monotonically increasing (timestamp + counter), so all current writes target the chunk holding the max range, which lives on one shard. The fix is hashed sharding on _id (or the natural key), or a compound shard key whose leading field has high enough cardinality to spread writes (e.g. {tenantId:1, _id:1}). Since 5.0 you can reshardCollection without downtime, but it’s expensive — prefer to get this right initially.
Sources: mongodb.com shard-key, Kleppmann DDIA ch. 5–6.