DSA — Theory

DSA — Theory (interview-grade)

Approach template

State this in interview before coding:

Restate the problem. Confirm input shape, output, edge cases.
Walk through 1-2 small examples by hand.
Brute force first (state O()).
Observe waste; identify pattern.
Pick optimization; state new O().
Code.
Trace through example.
Edge cases + tests.

Recursion mental model

Base case + recursive case.
Each call shrinks problem.
Stack depth = recursion depth.
Convert to iteration via explicit stack when depth is too high.

DP mental model

Overlapping subproblems + optimal substructure.
Define state precisely: what does dp[i] mean?
State transition: dp[i] = f(dp[i-1], ...).
Base case.
Order of evaluation (top-down with memo or bottom-up).
Space optimization: do you need full table?

Standard DP problems and patterns:

Knapsack 0/1 — dp[i][w] = max(skip, take).
Unbounded knapsack — coin change.
LCS / LIS — 2D / 1D.
Partition / subset sum.
Edit distance.
Matrix path / min cost.
Interval DP — burst balloons, matrix-chain mult.
Bitmask DP — TSP-like, subset enumeration.
DP on trees — root + post-order.

BFS vs DFS — when

BFS: shortest path in unweighted, level traversal, multi-source spread.
DFS: connectivity, topo, cycle detect, “find any path”, backtracking, generating combinations.

Dijkstra essentials

Priority queue of (dist, node).
Pop smallest; relax neighbors.
O((V+E) log V).
Doesn’t work with negative weights — use Bellman-Ford for that.
For grids, often more elegant with BFS (if uniform cost) or A* (with heuristic).

Topological sort

Two approaches:

DFS-based: post-order; reverse for topo order.
Kahn’s BFS: in-degree queue. Detect cycle: if not all nodes emitted.

Union-Find (DSU)

class DSU:
    def __init__(self, n):
        self.p = list(range(n))
        self.r = [0]*n
    def find(self, x):
        while x != self.p[x]:
            self.p[x] = self.p[self.p[x]]   # path compression
            x = self.p[x]
        return x
    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb: return False
        if self.r[ra] < self.r[rb]: ra, rb = rb, ra
        self.p[rb] = ra
        if self.r[ra] == self.r[rb]: self.r[ra] += 1
        return True

Use for: connectivity queries, MST (Kruskal), grouping.

Sliding window pattern

left, right = 0, 0
state = init
while right < n:
  add A[right]; right += 1
  while window violates:
    remove A[left]; left += 1
  update best with window

Used for: max/min subarray of length K, longest substring with property, fewest replacements, etc.

Two pointers

Sorted array + pair sum, palindrome check, partition, dedupe in place.

Monotonic stack / queue

Track “next greater element” / sliding window max in O(n).

# next greater
stack = []
res = [-1]*n
for i, x in enumerate(arr):
    while stack and arr[stack[-1]] < x:
        res[stack.pop()] = x
    stack.append(i)

Bit manipulation tricks

x & (x-1) — clear lowest set bit. Used to count bits.
x & -x — isolate lowest set bit.
x ^ x = 0, x ^ 0 = x — find unique among pairs.
(x >> i) & 1 — read bit i.
Subsets via 2^n bitmask iteration.

String techniques

Hashing (rolling hash, Rabin-Karp).
KMP failure function for pattern match O(n+m).
Trie for prefix-related.
Z-array for prefix matches.
Suffix array advanced.

Common interview Qs (patterns to spot)

Two sum → hashmap.
Three sum → sort + two pointers.
Longest substring without repeat → sliding window.
Min window substring → sliding window with counts.
Top K frequent → heap or bucket sort.
Kth largest → quickselect or min-heap of size K.
Merge K sorted lists → heap.
LRU cache → hashmap + doubly linked list.
Word ladder → BFS.
Number of islands → DFS / BFS / union-find.
Course schedule → topo sort.
Coin change → DP.
Longest palindromic substring → expand around center.
Word break → DP.
Edit distance → 2D DP.
Trapping rain water → two pointers / monotonic stack.
Median of two sorted arrays → binary search.
Sliding window max → monotonic deque.
Serialize/deserialize binary tree → BFS / DFS with markers.
Merge intervals → sort + sweep.

Complexity proof techniques

Master theorem for T(n) = aT(n/b) + f(n).
Amortized analysis (e.g., dynamic array, splay tree).
Potential method (less common in interviews).

Mistakes to avoid

Coding before clarifying.
Over-engineering for unmentioned scale.
Wrong recursion exit condition (off-by-one).
Not handling empty / single / duplicate inputs.
Mutating input without warning.
Inefficient string concat in loops.
Recomputing in DP instead of memoizing.
Forgetting visited set in graph traversals.