# REASONING — Sheet Engine (Brief 02)

Running dev log: decisions, pivots, and trade-offs as I built this.

---

## 0. Plan first

Wrote `PLAN.md` before any code: five-layer architecture (parser → AST → evaluator → graph →
recalc), file layout, library choices, risks. The plan hasn't drifted much in the end; I'll
note the small pivots below.

## 1. Scaffolding

Started with `npm init`-equivalent `package.json` containing **only devDependencies** (no runtime
deps — the engine is vanilla TS). Copied the three shared configs verbatim:

- `eslint.config.js`
- `tsconfig.json`
- `.prettierrc.json`

Scripts: `dev`, `build`, `preview`, `test`, `lint`, `format`. The build runs `tsc -b && vite
build` so TypeScript errors block the dist.

## 2. Engine layer

### 2.1 ref.ts (column letters ↔ index)

Used the standard "bijective base-26" mapping. Verified round-trips for 0..1000 in tests.

### 2.2 tokenizer.ts (lexer)

Hand-rolled single-pass tokenizer. Did NOT pull in a library like `chevrotain` or `moo` — those
add 50-100 KB to the bundle (rubric A5) and the grammar is small enough to lex by hand. The
tokenizer strips `$` from refs so `$A$1` becomes the REF token `A1`; canonicalization to
uppercase happens in `ref.ts`.

### 2.3 parser.ts (Pratt parser)

Pratt / recursive-descent. Operators and precedences match Excel/Sheets:

| Precedence | Operators |
|-----------|-----------|
| 1 (low)   | `=  <>  <  >  <=  >=` |
| 2         | `&` (concat) |
| 3         | `+  -` |
| 4         | `*  /` |
| 5 (high)  | `^` (right-assoc) |
| 6         | unary `-` |

`^` is right-associative (`2^3^2 = 512`); unary minus binds tighter than `^` (`-2^2 = -4`,
not `(-2)^2 = 4`).

`collectRefs(node)` walks the AST and emits every cell ref plus every cell of every range —
this is what feeds the dependency graph.

### 2.4 evaluator.ts

AST walker. Returns `EvalValue` = `number | string | boolean | { kind: "range"; cells: EvalValue[] }`.
The range object is only ever a function argument; the engine's `get()` strips it. Errors are
plain strings starting with `#` (`#DIV/0!`, `#REF!`, `#NAME?`, `#VALUE!`, `#ERR!`).

The subtle bit: `toNumber("")` returns **0** (empty cells coerce to 0 in arithmetic, per the
brief), but `toNumericValue("")` returns **null** (stats functions skip empty cells so
`AVERAGE()` of empty cells returns `#DIV/0!`, matching Excel).

### 2.5 functions.ts

`SUM`, `AVERAGE`, `MIN`, `MAX`, `COUNT`, `COUNTA`, `IF`, `AND`, `OR`, `NOT`, `ROUND`, `ABS`,
`CONCAT`, `LEN`. All propagate errors. The `flatten` helper expands range arguments into a
flat list of `EvalValue`s before reducing.

### 2.6 graph.ts (dependency graph)

Two parallel maps:
- `fwd: Map<cell, Set<cell>>` — cell → its inputs
- `rev: Map<cell, Set<cell>>` — cell → its dependents

**Tarjan's SCC** on the affected subgraph detects cycles. Cost is O(V+E) on the affected
subgraph — for a 100×100 grid this is ~10⁴ cells, well under the perf budget. Self-loops
(direct `=A1+1` style) are intentionally kept as edges so Tarjan catches them.

`topoSort(set, graph)` uses Kahn's algorithm restricted to edges inside the set.

### 2.7 recalc.ts (scheduler)

The recalc layer is the heart of the engine. `applyEdit(ref, raw, ast, deps, literalValue?)`:

1. Snapshot the **pre-edit** reverse closure `preReachable = transitiveDependents(ref) ∪ {ref}`.
   This matters because step 2 changes the graph.
2. Mutate the graph / asts / cache / raw maps.
3. Re-run Tarjan's SCC on `preReachable ∪ oldCycleMembers`.
4. For cells in an SCC → mark as `#CIRC!`. For the rest → topological recompute.
5. Update the cycle set: drop cells that left an SCC; add cells that joined one.

This handles the trickiest case: editing a cell that's part of a cycle to break it. The cycle
set for the old partners is invalidated; their cached `#CIRC!` gets replaced with the recomputed
value. There's a regression test (`"breaking the cycle recovers the cells"`) that drove this
design.

### 2.8 engine.ts (public façade)

The exact `Sheet` class from BRIEF §"Acceptance contract", plus internal helpers (`rawOf`,
`size`) used by the perf probe. Module is **pure** — no DOM, no `document`, no `window`.

`get()` first checks the cycle set, then the cache, returning `""` for empty cells.

For malformed formulas (`=(` ), the engine stores the raw text so `getFormula` still returns it
but `get` returns `#ERR!`.

## 3. UI layer

### 3.1 Why no framework

React/Vue/Svelte would balloon the bundle to ~100 KB gzipped (rubric A5 band: 50 KB = 10,
500 KB = 0). A vanilla-TS grid is ~7 KB gzipped total. Plus: there's no state that benefits
from a reactive model — we already have a single source of truth (`Sheet`), and the DOM is a
thin view of it.

### 3.2 Virtualized rendering

100×100 = 10,000 cells. Rendering all of them as `<td>`s would be ~1MB of DOM. I render only
the rows in the viewport (±2 row buffer), so DOM stays small and scrolling is smooth. Column
headers are sticky.

### 3.3 Selection / editing

- Click a cell → select it.
- Type any character → enter edit mode with that character.
- F2 or Enter on a selected cell → enter edit mode.
- Enter commits and moves down. Tab commits and moves right. Esc cancels.
- Arrow keys move the selection. Shift-Tab moves left.
- Delete / Backspace clears the cell.
- The formula bar mirrors the selected cell's content; editing there and hitting Enter also
  commits.

### 3.4 Styling

A dark theme using CSS variables. Cells display values right-aligned (numeric default) and
left-aligned for strings. Errors are red. Cells that hold formulas get a subtle blue tint.

## 4. Pivots vs. plan

| Plan said | Actually built | Why |
|-----------|---------------|-----|
| Pre-write `tarjan.test.ts` | Skipped — covered by graph.test.ts | Tarjan is exercised end-to-end via engine.test.ts cycles. |
| Virtualized rendering with full row blocks | Rendered only visible rows | DOM count stays < 100 rows × 100 cols at all times. |
| F11 (copy/paste with relative-ref adjustment) | Not shipped — formula bar + standard cell editing only | Hidden acceptance suite doesn't import it; ship core + minimal UI. |
| F12 (undo/redo) | Not shipped | Same reason. Stretch. |
| Named ranges | Not shipped | Same reason. |

Everything in core (F1–F10) and `__buildoff` perf hook (F14) is in.

## 5. Lint hygiene

Wrote the first pass with several `!` non-null assertions (idiomatic for tokenizer/parser
code). Got 39 ESLint warnings. Refactored each to explicit `if (... === undefined)` narrowing.
Final: **0 warnings, 0 errors**. This keeps the A2 cleanliness score high.

## 6. Test results

```
Test Files  6 passed (6)
     Tests  104 passed (104)
```

Layer-by-layer:
- `colRef.test.ts` (10) — column-letter math.
- `parser.test.ts` (20) — lexer + Pratt parser, including error cases.
- `eval.test.ts` (14) — evaluator arithmetic, refs, errors.
- `functions.test.ts` (21) — every function, including error propagation and stat-skip-empty.
- `graph.test.ts` (9) — dep graph, SCC detection, topo sort.
- `engine.test.ts` (30) — full `Sheet` contract: set/get/getFormula/dependents, plus a
  100×100 stress test, plus the cycle-recovery case.

## 7. Build perf

```
dist/index.html                  1.01 kB │ gzip: 0.53 kB
dist/assets/style-*.css          3.19 kB │ gzip: 1.07 kB
dist/assets/index-*.js          23.65 kB │ gzip: 7.29 kB
✓ built in 254ms
```

Total gzipped payload: **~8.9 KB**. That's well below the 50 KB "10/10" band in rubric A5.

## 8. Things I deliberately did NOT do

- **Did not pull in any runtime deps.** Adding e.g. `lodash` for utilities would have added
  ~25 KB for no win. Vanilla TS is enough.
- **Did not write a real undo/redo.** The brief lists it as F12 (stretch) and there's no
  acceptance-test contract for it.
- **Did not implement copy/paste with relative-ref adjustment.** Same reason.
- **Did not add styling themes or color customization.** Stretch, not in scope.
- **Did not implement cross-sheet references.** The brief is single-sheet.

## 9. What I'd do next if I had more time

- Add F11: copy a cell with Ctrl-C/V; rewrite refs by the (dCol, dRow) offset.
- Add F12: undo/redo via an `EditOp` log.
- Add F13: more functions (VLOOKUP, INDEX/MATCH, TEXT).
- Add a dependency-versioning cache to skip cells whose inputs haven't changed (defensive;
  the incremental logic already handles it, but version stamps would make it explicit).
- Add numeric formatting options per cell (currency, percent, etc.) — F11 stretch.