Architecture

The intro called the output a test suite. This section shows how that suite is laid out on disk and which rules keep the AI from blurring the structure on a whim. It covers the POM-plus-fixtures combination, track separation, the one-to-one mapping between scenario and test, and the three agents that stand the suite up, fill it in, and verify it.

Directory structure

Tracks split at the directory level. A track you did not pick is removed by /6-cleanup-residue, so every track directory exists right after a fork and the unused ones are pruned once the decision lands.

<project>/
├── tests/
│   ├── web/                  ← Playwright
│   │   ├── pages/            ← Page Objects (locators + actions)
│   │   ├── fixtures/         ← test.extend fixtures + storageState setup
│   │   ├── specs/            ← <flow>.spec.ts (one scenario = one spec)
│   │   └── utils/
│   ├── electron/             ← Playwright _electron
│   │   ├── specs/
│   │   └── helpers/          ← main-process evaluate, dialog/menu/tray stubs
│   └── mobile/               ← Maestro or Detox
│       ├── flows/            ← Maestro .yml (or Detox e2e/)
│       └── helpers/
├── playwright.config.ts      ← web/electron tracks
├── .detoxrc.js               ← when mobile=detox
├── .maestro/                 ← when mobile=maestro (or tests/mobile/flows)
├── .github/workflows/e2e.yml ← produced by implement-ci-workflow
├── docs/design/e2e-spec.md   ← design SSOT (SUT, flows, tools, env, CI)
└── .env.test.example         ← test env var key list (zero real keys)
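
As a rough illustration of the pruning rule, the sketch below computes which track directories count as residue once the tracks are chosen. The helper name residueDirs is hypothetical and says nothing about how /6-cleanup-residue is actually implemented; only the directory names come from the tree above.

```typescript
// Hypothetical helper, not the real /6-cleanup-residue implementation.
type Track = "web" | "electron" | "mobile";

// Directory per track, taken from the tree above.
const TRACK_DIRS: Record<Track, string> = {
  web: "tests/web",
  electron: "tests/electron",
  mobile: "tests/mobile",
};

// Returns the directories of every track that was NOT chosen.
function residueDirs(chosen: readonly Track[]): string[] {
  const keep = new Set<Track>(chosen);
  return (Object.keys(TRACK_DIRS) as Track[])
    .filter((track) => !keep.has(track))
    .map((track) => TRACK_DIRS[track]);
}
```

With only the web track chosen, the electron and mobile directories come back as the ones to prune.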

Locators live only inside the Page Object classes in pages/. If selectors are scattered across spec files, a single change to the SUT’s DOM leaves you hunting through every spec for the places that need updating.
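
A minimal sketch of what "locators live only in the Page Object" can look like. The LoginPage fields, the labels, and the /login route are assumptions for illustration. LocatorLike and PageLike are small stand-ins that mirror a few methods of Playwright's Locator and Page so the sketch runs without a browser; in a real suite you would import those types from @playwright/test instead.

```typescript
// Stand-ins for Playwright's Locator and Page (assumption: only the few
// methods used below are modelled).
interface LocatorLike {
  fill(value: string): Promise<void>;
  click(): Promise<void>;
}
interface PageLike {
  goto(url: string): Promise<unknown>;
  getByLabel(text: string): LocatorLike;
  getByRole(role: string, options?: { name?: string }): LocatorLike;
}

// pages/LoginPage.ts (hypothetical): the only place these locators appear.
class LoginPage {
  private readonly page: PageLike;
  private readonly email: LocatorLike;
  private readonly password: LocatorLike;
  private readonly submit: LocatorLike;

  constructor(page: PageLike) {
    this.page = page;
    this.email = page.getByLabel("Email");
    this.password = page.getByLabel("Password");
    this.submit = page.getByRole("button", { name: "Sign in" });
  }

  // Navigate to the login screen.
  async open(): Promise<void> {
    await this.page.goto("/login");
  }

  // The one action a spec needs; no selector leaks out of this class.
  async login(email: string, password: string): Promise<void> {
    await this.email.fill(email);
    await this.password.fill(password);
    await this.submit.click();
  }
}
```

A spec would construct LoginPage from its page and call open() and login(). When the form's markup changes, only this class is edited.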

POM + fixtures

This is the 2026 recommended pairing. Fixtures manage setup, teardown, and auth; Page Objects encapsulate page interaction. The two carry separate responsibilities.

fixtures/: test.extend injects a logged-in page or prepared data into a test. The web auth.setup.ts runs as a setup project that logs in once and saves storageState, and other projects reuse that state through dependencies: ['setup'].
pages/: classes like LoginPage and DashboardPage. They hold the locators and the actions available on that page. A spec only calls the methods on these classes.
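
The setup-project wiring described for fixtures/ might look like this in playwright.config.ts. Treat it as a sketch rather than the suite's actual config: the project names, the testMatch pattern, and the storageState path are assumptions.

```typescript
// playwright.config.ts (fragment). Names and paths are illustrative.
import { defineConfig, devices } from "@playwright/test";

export default defineConfig({
  testDir: "./tests/web",
  projects: [
    // Runs auth.setup.ts once: it logs in and writes the storageState file.
    { name: "setup", testMatch: /auth\.setup\.ts/ },
    {
      name: "chromium",
      use: {
        ...devices["Desktop Chrome"],
        // Every test in this project starts from the saved, logged-in state.
        storageState: "playwright/.auth/user.json",
      },
      // Do not start any spec until the setup project has finished.
      dependencies: ["setup"],
    },
  ],
});
```

The setup file itself would finish by saving the browser context's storage state to the same path, so the path has to match on both sides.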

The point is to avoid walking the login UI again on every test. Logging in each time is slow and flaky. Web uses storageState, mobile uses a pre-login subflow, and electron uses session injection: log in once, reuse it. The strategy is decided in analyze-and-design, and the per-track implementation lives in tracks.

Track separation and the one-to-one with scenarios

web, electron, and mobile split not just by directory but also by config and CI job. The only thing they share is the track-agnostic utilities in utils/, and cross-track sharing stays minimal. electron runs single-instance with workers: 1 while web runs with fullyParallel: true, so putting both into one config breaks the assumptions of each.
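
To make the conflict concrete, here is a sketch of the two sets of settings side by side. Both objects are shown in one block only for comparison; in practice each would be the default export of its own config file. That file split and the testDir values are assumptions, not something the source spells out.

```typescript
import { defineConfig } from "@playwright/test";

// Web: spec files, and the tests inside them, may run in parallel.
export const webConfig = defineConfig({
  testDir: "./tests/web",
  fullyParallel: true,
});

// Electron: a single worker, so only one app instance is alive at a time.
export const electronConfig = defineConfig({
  testDir: "./tests/electron",
  workers: 1,
});
```

workers is a top-level setting in Playwright rather than a per-project one, which is one reason a single shared config cannot easily satisfy both tracks. A CI job can select a config file with `npx playwright test --config <file>`.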

The SSOT for test scope is docs/design/e2e-spec.md §2. One line of critical flow written there converts into one test (or one spec). Do not add tests that the spec never named. Left unconstrained, an AI invents off-spec cases to pad coverage, and those cases become the maintenance bill.
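
The one-to-one rule can be checked mechanically. The sketch below is a hypothetical helper, not part of the workflow described here, and it assumes each test title repeats the flow name from §2 verbatim. It reports flows with no test, flows with more than one test, and tests the spec never named.

```typescript
interface CoverageReport {
  missing: string[];    // flows that have no test
  duplicated: string[]; // flows that have more than one test
  offSpec: string[];    // tests that match no flow in the spec
}

function checkOneToOne(
  flows: readonly string[],
  testTitles: readonly string[],
): CoverageReport {
  // Start every flow at zero matching tests.
  const counts = new Map<string, number>();
  for (const flow of flows) counts.set(flow, 0);

  const offSpec: string[] = [];
  for (const title of testTitles) {
    const seen = counts.get(title);
    if (seen === undefined) offSpec.push(title); // the spec never named this
    else counts.set(title, seen + 1);
  }

  const missing = flows.filter((flow) => counts.get(flow) === 0);
  const duplicated = flows.filter((flow) => (counts.get(flow) ?? 0) > 1);
  return { missing, duplicated, offSpec };
}
```

For example, with flows "sign up" and "checkout" and tests titled "sign up" and "sign up with emoji name", the report lists "checkout" as missing and "sign up with emoji name" as off-spec.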

The phase flow and the three agents

The workflow is Atomic Phase. /start is the entry router; it detects the stage, sets the pace, and runs the phases in order from the first.

graph LR
  start["Start project<br/>/start"] --> pace["/1-pace"]
  pace --> setup["/2-setup-base"]
  setup --> analyze["/3-analyze-target"]
  analyze --> design["/4-design-suite"]
  design --> impl["/5-implement-suite"]
  impl --> cleanup["/6-cleanup-residue"]
  cleanup --> run["/run-suite"]
  run --> review["/review-e2e"]
  review -->|changes_requested| impl

When /review-e2e does not pass, control returns to that track’s implement-* step. This loop runs at most two iterations.
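
One way to read the two-iteration cap, sketched as a small driver. The function and parameter names are hypothetical: review stands in for /review-e2e and implement for the track's implement-* step. The sketch assumes "two iterations" means at most two trips back to implement, and it simply returns the last verdict when the cap is hit, since the source does not say what happens next.

```typescript
type Verdict = "passed" | "changes_requested";

interface LoopResult {
  verdict: Verdict;
  fixRounds: number; // how many times implement was re-run
}

async function reviewLoop(
  review: () => Promise<Verdict>,
  implement: () => Promise<void>,
  maxFixRounds = 2,
): Promise<LoopResult> {
  let fixRounds = 0;
  let verdict = await review();
  // Go back to implement only while changes are requested and the cap allows.
  while (verdict === "changes_requested" && fixRounds < maxFixRounds) {
    await implement();
    fixRounds += 1;
    verdict = await review();
  }
  return { verdict, fixRounds };
}
```

The cap keeps a builder and reviewer that disagree from looping indefinitely; whatever is still unresolved after the last round surfaces as the returned verdict.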

In Claude Code, three subagents take the heavy seats of this flow in separate contexts.

graph TB
  spec["e2e-spec.md §2<br/>critical flows"] --> architect["e2e-architect<br/>skeleton + tool wiring (once)"]
  architect --> builder["e2e-builder<br/>POM, fixtures, specs, CI"]
  builder --> reviewer["e2e-reviewer<br/>six-dimension review"]
  reviewer -->|changes_requested| builder
  reviewer -->|passed| done["passed"]

  auth["auth reuse<br/>storageState / subflow / session injection"] -.shared.- builder

e2e-architect: stands up the directories, playwright.config.ts, .detoxrc.js or .maestro/, and the CI skeleton, once. Bodies stay empty — just the bones.
e2e-builder: fills the skeleton with the real flows, POMs, and fixtures. implement-web-suite, implement-electron-suite, implement-mobile-suite, implement-auth-fixtures, and implement-ci-workflow are the actual tools.
e2e-reviewer: only verifies, across six dimensions. It does not touch the code. Fixing is the builder’s job, so the verifier and the fixer stay separate.

The three agents run in separate contexts to protect the main context. Pile per-track implementation detail into the main conversation and the decision context gets buried. Outputs hand off by file path, and only notes land in .claude/state/builder-<track>.md.

ADR

Decisions that are expensive to undo stack one page at a time in docs/adr/. Picking a tool off the recommended path — Cypress instead of Playwright for web, Detox instead of Maestro for mobile — gets its reasoning recorded here. Code carries what was done; ADRs carry why. Which decisions earn an ADR is covered in analyze-and-design.

Tracks comes next. It goes track by track through the web Playwright config with the Supabase localStorage and Stripe cross-origin traps, the electron main-process stub, and the mobile Maestro declarative flows with the target matrix.
