# PLAN — Brief 02: Sheet Engine

A mini-spreadsheet with a real formula engine: **tokenizer → Pratt parser → AST → evaluator →
dependency graph → incremental topological recalc**, wrapped in a vanilla-TS interactive grid UI.

The brief is correctness- and architecture-heavy (A1 = 0.30 of Tier A, hidden suite is large), so the
engine is the centre of gravity. Everything is built so the engine module (`src/lib/engine.ts`) is
**pure** (no DOM) and independently testable; the UI is a thin shell over it.

## Architecture & data flow

```
raw cell input ──▶ tokenizer ──▶ parser ──▶ AST
                                              │  (static walk extracts precedents)
                                              ▼
Sheet ── precedents/dependents maps ── incremental recalc (Kahn topo sort)
   │                                          │
   │                                  evaluator(AST, resolver) ──▶ value
   ▼
get/getFormula/dependents  ◀── cached cell values ──▶  Grid UI (render + keyboard)
```

**Recalc model (the heart of F4/F5).** On `set(ref, content)`:
1. Re-parse the cell; diff its precedent edges and update the `precedents`/`dependents` adjacency.
2. `recalc(seeds = {ref})`:
   - Collect the **affected set** = `{ref}` ∪ all transitive *dependents* (DFS over `dependents`).
     This is what makes recalc incremental — untouched cells are never re-evaluated.
   - **Kahn topological sort** restricted to the affected set (in-degree = precedents that are also
     in the affected set). Evaluate in that order so every cell sees fresh precedent values.
   - Any cell that never reaches in-degree 0 is in a cycle → value `#CIRC!` (F5, no hang — Kahn is
     O(V+E), never recurses into a loop).
3. Errors thrown by the evaluator are caught per-cell and stored as the cell's value
   (`#DIV/0!`, `#REF!`, `#ERR!`); downstream cells read the error string and propagate it (F6).

## File layout (each file one concern, kept < ~250 LOC for the A3 band)

```
src/
  lib/
    errors.ts      FormulaError class + error-code constants
    coords.ts      colToIndex / indexToCol / parseRef / refToString / expandRange
    ast.ts         AST node discriminated union
    tokenizer.ts   string -> Token[] (numbers, strings, refs, idents, operators, ranges)
    parser.ts      precedence-climbing parser Token[] -> AST (+ precedent extraction)
    evaluator.ts   evaluate(AST, ctx) -> CellValue ; numeric/boolean coercion
    functions.ts   SUM/AVERAGE/MIN/MAX/COUNT/IF + stretch (ROUND/ABS/CONCAT/AND/OR/NOT/...)
    engine.ts      Sheet class + dependency graph + recalc; RE-EXPORTS colToIndex/indexToCol
  ui/
    grid.ts        virtual-free 100x100 grid render, selection, inline editor
    keyboard.ts    arrow/enter/tab/escape navigation + edit triggers
  main.ts          bootstraps the app, wires the formula bar, installs window.__buildoff
  style.css
tests/             engine.test.ts, parser.test.ts, coords.test.ts, recalc.test.ts, functions.test.ts
```

## Key decisions & justification

- **No parser/runtime libraries.** Hand-rolled tokenizer + precedence-climbing parser. Justification:
  the grammar is tiny and fixed; a dependency (e.g. a PEG generator) would *hurt* A5 (bundle ≤ 50 KB =
  10 pts) and A3 (dependency hygiene) while adding nothing. Zero runtime deps keeps the gzipped bundle
  tiny and the build fast (A4: build ≤ 10 s = 10 pts).
- **Precedence climbing over recursive descent per-level.** One compact `parseExpr(minBp)` with a
  binding-power table — lowest code/complexity for full precedence (`= <> < > <= >=` < `+ -` < `* /` <
  `^`(right-assoc) < unary `-` < primary), which helps the A2 cyclomatic-complexity band.
- **Eager incremental recalc with an explicit scheduler** (not lazy memoized eval). The brief explicitly
  asks for a *recalc-scheduler* and `dependents()` introspection; an eager Kahn pass over only the
  affected subgraph is both correct and demonstrably incremental, and gives clean `#CIRC!` semantics.
- **Single shared inline editor** over a div-grid (not 10 000 `<input>`s). Display cells are plain
  divs; editing pops one positioned `<input>`. Keeps the DOM light → better A5 heap and A4 FPS on the
  100×100 grid (F10).
- **`window.__buildoff`** drives a real dependent **chain** of ~10 000 cells in a headless engine
  instance and times genuine `set`→recalc cycles (A4 recalc throughput).

## Semantics locked to the contract

- `get` → `number | string | boolean`; empty cell → `""`; errors → `"#…!"` strings.
- Empty cell = 0 in arithmetic; `COUNT`/`AVERAGE` consider numbers only; ranges valid only inside
  functions. `=1/0` → `#DIV/0!`. Cycles → `#CIRC!`. Absolute refs (`$A$1`) resolve identically to
  relative. `dependents(ref)` = direct downstream refs.

## Risks & mitigations

- **Cycle handling correctness** — mitigated by Kahn (leftover = cyclic) + a dedicated recalc test
  suite (self-ref, 2-cycle, cycle feeding an acyclic tail).
- **Edge-diff bugs on re-edit** (stale `dependents`) — always remove old precedent edges before adding
  new ones; covered by an "edit changes dependency structure" test.
- **Bundle/heap from a full 100×100 DOM** — div grid + one editor; if heap is high, the grid is small
  enough that it stays well under the 40 MB "good" band.
- **Lint/format (A2)** — copy the shared configs verbatim and run `prettier`/`eslint` before finishing.

## Stretch (rubric upside, after core)

F11 formula bar, F13 extra functions (ROUND/ABS/SQRT/MOD/POWER/CONCAT/AND/OR/NOT/COUNTA), F14 perf hook.
Undo/redo (F12) only if time permits.
