Skip to content

DSA — Theory

State this in interview before coding:

  1. Restate the problem. Confirm input shape, output, edge cases.
  2. Walk through 1-2 small examples by hand.
  3. Brute force first (state O()).
  4. Observe waste; identify pattern.
  5. Pick optimization; state new O().
  6. Code.
  7. Trace through example.
  8. Edge cases + tests.
  • Base case + recursive case.
  • Each call shrinks problem.
  • Stack depth = recursion depth.
  • Convert to iteration via explicit stack when depth is too high.
  • Overlapping subproblems + optimal substructure.
  • Define state precisely: what does dp[i] mean?
  • State transition: dp[i] = f(dp[i-1], ...).
  • Base case.
  • Order of evaluation (top-down with memo or bottom-up).
  • Space optimization: do you need full table?

Standard DP problems and patterns:

  • Knapsack 0/1dp[i][w] = max(skip, take).
  • Unbounded knapsack — coin change.
  • LCS / LIS — 2D / 1D.
  • Partition / subset sum.
  • Edit distance.
  • Matrix path / min cost.
  • Interval DP — burst balloons, matrix-chain mult.
  • Bitmask DP — TSP-like, subset enumeration.
  • DP on trees — root + post-order.
  • BFS: shortest path in unweighted, level traversal, multi-source spread.
  • DFS: connectivity, topo, cycle detect, “find any path”, backtracking, generating combinations.
  • Priority queue of (dist, node).
  • Pop smallest; relax neighbors.
  • O((V+E) log V).
  • Doesn’t work with negative weights — use Bellman-Ford for that.
  • For grids, often more elegant with BFS (if uniform cost) or A* (with heuristic).

Two approaches:

  • DFS-based: post-order; reverse for topo order.
  • Kahn’s BFS: in-degree queue. Detect cycle: if not all nodes emitted.
class DSU:
def __init__(self, n):
self.p = list(range(n))
self.r = [0]*n
def find(self, x):
while x != self.p[x]:
self.p[x] = self.p[self.p[x]] # path compression
x = self.p[x]
return x
def union(self, a, b):
ra, rb = self.find(a), self.find(b)
if ra == rb: return False
if self.r[ra] < self.r[rb]: ra, rb = rb, ra
self.p[rb] = ra
if self.r[ra] == self.r[rb]: self.r[ra] += 1
return True

Use for: connectivity queries, MST (Kruskal), grouping.

left, right = 0, 0
state = init
while right < n:
add A[right]; right += 1
while window violates:
remove A[left]; left += 1
update best with window

Used for: max/min subarray of length K, longest substring with property, fewest replacements, etc.

Sorted array + pair sum, palindrome check, partition, dedupe in place.

Track “next greater element” / sliding window max in O(n).

# next greater
stack = []
res = [-1]*n
for i, x in enumerate(arr):
while stack and arr[stack[-1]] < x:
res[stack.pop()] = x
stack.append(i)
  • x & (x-1) — clear lowest set bit. Used to count bits.
  • x & -x — isolate lowest set bit.
  • x ^ x = 0, x ^ 0 = x — find unique among pairs.
  • (x >> i) & 1 — read bit i.
  • Subsets via 2^n bitmask iteration.
  • Hashing (rolling hash, Rabin-Karp).
  • KMP failure function for pattern match O(n+m).
  • Trie for prefix-related.
  • Z-array for prefix matches.
  • Suffix array advanced.
  1. Two sum → hashmap.
  2. Three sum → sort + two pointers.
  3. Longest substring without repeat → sliding window.
  4. Min window substring → sliding window with counts.
  5. Top K frequent → heap or bucket sort.
  6. Kth largest → quickselect or min-heap of size K.
  7. Merge K sorted lists → heap.
  8. LRU cache → hashmap + doubly linked list.
  9. Word ladder → BFS.
  10. Number of islands → DFS / BFS / union-find.
  11. Course schedule → topo sort.
  12. Coin change → DP.
  13. Longest palindromic substring → expand around center.
  14. Word break → DP.
  15. Edit distance → 2D DP.
  16. Trapping rain water → two pointers / monotonic stack.
  17. Median of two sorted arrays → binary search.
  18. Sliding window max → monotonic deque.
  19. Serialize/deserialize binary tree → BFS / DFS with markers.
  20. Merge intervals → sort + sweep.
  • Master theorem for T(n) = aT(n/b) + f(n).
  • Amortized analysis (e.g., dynamic array, splay tree).
  • Potential method (less common in interviews).
  • Coding before clarifying.
  • Over-engineering for unmentioned scale.
  • Wrong recursion exit condition (off-by-one).
  • Not handling empty / single / duplicate inputs.
  • Mutating input without warning.
  • Inefficient string concat in loops.
  • Recomputing in DP instead of memoizing.
  • Forgetting visited set in graph traversals.