Skip to content

ORM (Sequelize) — Theory

Source: sequelize.org/docs/v6/other-topics/migrations

Sequelize gives you two ways to bring schema state into existence: sequelize.sync() and sequelize-cli migrations. sync() introspects your model definitions and issues CREATE TABLE IF NOT EXISTS; sync({ alter: true }) diffs and patches; sync({ force: true }) drops and recreates. None of those operations is versioned, none is reversible, and none survives a model file refactor — the moment you remove an attribute from a model, sync forgets the column ever existed. That’s why force: true is for ephemeral dev/test DBs only.

Migrations solve that by making the schema a sequence of immutable, ordered scripts under migrations/. Under the hood, sequelize-cli delegates to umzug, a generic migration runner that records executed migration filenames in a SequelizeMeta table. Each file exports up(queryInterface, Sequelize) and down(queryInterface, Sequelize). queryInterface is a dialect-aware DDL/DML facade — createTable, addColumn, changeColumn, removeColumn, addIndex, addConstraint, bulkInsert. CLI commands: db:migrate, db:migrate:undo, db:migrate:undo:all --to <filename>, plus seeders. Seeders are not tracked in SequelizeMeta by default — re-running re-inserts unless you set seederStorage: 'sequelize'.

Source: sequelize.org/docs/v6/core-concepts/paranoid

Setting paranoid: true on a model converts .destroy() from a DELETE into an UPDATE "table" SET "deletedAt" = NOW(). Read queries automatically gain a WHERE "deletedAt" IS NULL predicate, so soft-deleted rows disappear from normal finders. Column name defaults to deletedAt, configurable. Paranoid requires timestamps: true; turning timestamps off disables paranoid silently.

Soft deletion gives reversible delete (Model.restore()), audit trail of removal time, and preserved referential integrity (FKs still resolve). Escape hatches: force: true on .destroy() for a real DELETE, and paranoid: false on any finder to include tombstoned rows. The latter is essential for admin/audit screens.

Source: sequelize.org/docs/v6/other-topics/hooks

Hooks are functions Sequelize invokes around model operations. Instance hooks pair around each operation — beforeValidate/afterValidate, then beforeCreate/afterCreate (or beforeUpdate/afterUpdate, beforeDestroy/afterDestroy, beforeSave/afterSave) — and the bulk hooks wrap the bulk operation as a whole.

Registration: in model init options (hooks: {}), via Model.addHook('beforeCreate', fn) (named, removable), or via Model.beforeCreate(fn) shortcut.

Bulk operations do NOT fire instance hooks by default. bulkCreate([...]), Model.update({...}, { where }), and Model.destroy({ where }) only fire bulk hooks. To force per-row hooks pass individualHooks: true — but Sequelize loads every matching row into memory and iterates, which is catastrophic for large updates.

Hooks also do not fire for raw queries (sequelize.query()), DB-level cascades (ON DELETE CASCADE), DB-level SET NULL/SET DEFAULT triggers, or queryInterface calls. Always thread the transaction (options.transaction) through to inner queries so side effects participate in atomicity.

Source: sequelize.org/docs/v6/advanced-association-concepts/eager-loading

N+1 is the canonical ORM performance bug: fetch N parents (1 query), access parent.children in a loop, trigger N more queries. Sequelize’s defense is include, which translates to a JOIN. By default include is a LEFT OUTER JOIN; adding required: true (or any where inside the include) promotes it to INNER JOIN. Use attributes to project columns; use as to disambiguate aliased associations.

The trap: eager loading multiple hasMany associations creates a Cartesian explosion. User.findAll({ include: [Post, Comment] }) where each user has 100 posts and 50 comments yields users × 100 × 50 rows; Sequelize de-duplicates client-side. The cure is separate: true: one query per association using WHERE parentId IN (...). Works for hasMany, supports limit/order per child, essentially a controlled N+1 collapsed to N=1.

For top-level filtering on associated columns without forcing INNER JOIN, use the '$alias.col$' syntax in the outer where.

Source: sequelize.org/docs/v6/other-topics/transactions

Two transaction styles. Managed transactions take a callback: sequelize.transaction(async (t) => {...}) auto-commits on resolve, auto-rollbacks on throw — the correct default. Unmanaged (const t = await sequelize.transaction(); ... await t.commit()) hands you the lifecycle for cases where the commit point isn’t lexically scoped (e.g. a saga awaiting external confirmation). In both styles you must pass { transaction: t } to every query — otherwise it runs on a separate connection and isn’t atomic.

To avoid the threading nightmare, opt into CLS (continuation-local storage): Sequelize.useCLS(namespace) causes any query inside the async context of an active transaction to auto-attach.

Isolation: Transaction.ISOLATION_LEVELS.{READ_UNCOMMITTED, READ_COMMITTED, REPEATABLE_READ, SERIALIZABLE}. Row-level locking: { lock: t.LOCK.UPDATE } (SELECT ... FOR UPDATE) and t.LOCK.SHARE; add skipLocked: true for work-queue consumers. Nested sequelize.transaction() calls inside an existing transaction become savepoints — partial rollback without aborting the outer.

Source: sequelize.org/docs/v6/core-concepts/raw-queries

sequelize.query(sql, options) is the escape hatch when the query builder can’t express what you need. By default returns [results, metadata]; specifying type: QueryTypes.SELECT (or INSERT, UPDATE, BULKUPDATE, RAW) reshapes the return. Hydrate rows into model instances with { model: User, mapToModel: true }. Use { raw: true } for plain objects at max throughput.

Parameterization is where security lives:

  • Replacements (:name or ? + replacements) — escaped client-side by Sequelize, interpolated into the SQL string before transmission.
  • Bind parameters ($name or $1 + bind) — sent to the DB as separate values via the wire protocol; SQL string never contains user input.

Always prefer bind over replacements for values. Replacements still exist because they cover dynamic identifiers (table/column names) that bind parameters can’t substitute.

Reach for raw when the ORM is the wrong tool: window functions (ROW_NUMBER() OVER (...)), recursive CTEs, full-text search (to_tsvector @@ to_tsquery), LATERAL joins, PostGIS geo, ON CONFLICT DO UPDATE upserts, bulk ops that shouldn’t enter Node memory.

Source: sequelize.org/docs/v6/other-topics/connection-pool

Sequelize uses sequelize-pool (fork of generic-pool). Five knobs:

OptionDefaultMeaning
pool.max5Max concurrent connections this pool will open
pool.min0Min connections kept warm
pool.acquire60000 msHow long a query waits for a free connection before throwing SequelizeConnectionAcquireTimeoutError
pool.idle10000 msHow long an idle connection lives before being released
pool.evict1000 msHow often the eviction sweeper runs
pool.maxUsesInfinityRecycle a connection after N uses (good for cluster rebalancing)

The cardinal rule: pool.max × number of app instances ≤ database max_connections − headroom. Postgres defaults to 100 max_connections; 10 pods × max: 20 = 200, half will hit acquire timeouts. For high-fanout, put PgBouncer in transaction mode in front, set pool.max modestly per app.