
Your first test

In this step, the one-line critical flow you wrote in Choose your target becomes one test that actually runs. The AI writes the code. Your job is to make a few decisions and confirm the result with your own eyes.

This step splits into two skills. /4-design-suite decides “how to test,” and /5-implement-suite writes the tests to match that decision.

When /4-design-suite runs, it asks a few things. The track was already set in the previous step, so here it pins the tools and environment on top of that.

  • Track tool: the defaults are Playwright for web, Playwright _electron for electron, and Maestro for mobile. Unless you have a specific reason to do otherwise, follow the default. This decision is expensive to undo, so it is asked one question at a time even in the experienced rhythm.
  • Auth and environment: how the tests handle login, and which environment they reach.
  • CI: whether to run them automatically on GitHub later. No need to decide right now. Leaving it as “later” and moving on is fine.

For decisions like tool choice that are a hassle to change once set, the AI records the reasoning as a short note called an ADR (architecture decision record). As a non-developer, all you need to know is where that note gets written: the docs/adr/ folder. When you later wonder “why did we pick this tool again,” that is where to look.

When you finish answering, the decisions get pinned into e2e-suite-config.json and the back half of docs/design/e2e-spec.md fills in.

Here is the spot that confuses non-developers most, flagged in advance. The tests do not walk through the login screen every single time.

If you have five critical flows and each one types in an ID and password to log in, the tests slow down, and the moment the login screen changes even a little, all five break at once. So this base logs in once, saves that state, and has the other tests reuse it.

The mechanism differs slightly by track.

  • web: log in once, save that session state (storageState) to a file, and the other tests load that file to start already logged in.
  • mobile: run the login flow once on its own, and the other flows continue on top of it.
  • electron: inject the session and launch in a logged-in state.

The AI wires this part up for you. All you need to know is “it logs in once instead of every time.”
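For the curious, the web-track mechanism can be sketched as a Playwright configuration. This is a minimal sketch, not the file the AI will actually generate. The file path `playwright/.auth/user.json`, the `tests/web` test directory, and the project names `setup` and `web` are assumptions made for illustration. The idea is that a `setup` project runs a login test once (which would save the session with `page.context().storageState({ path })`), and the other project declares a dependency on it and loads the saved file through the `storageState` option.

```typescript
// playwright.config.ts (sketch). Paths and project names are hypothetical.
const authFile = 'playwright/.auth/user.json'; // where the saved session state lives

const config = {
  testDir: 'tests/web',
  projects: [
    // Runs first: a setup test logs in once and writes authFile.
    { name: 'setup', testMatch: /.*\.setup\.ts/ },
    {
      name: 'web',
      dependencies: ['setup'],         // wait for the one-time login to finish
      use: { storageState: authFile }, // every test starts already logged in
    },
  ],
};

export default config;
```

If the login screen changes, only the single setup test needs fixing. The other tests never touch the login screen, so they keep working.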

Once the design is done, /5-implement-suite implements the tests to match the decisions. Start with the one flow you marked to automate first (the smoke candidate).

Here the AI turns a one-line flow into one test. For the flow “a logged-in user changes their profile name in settings and it saves,” the AI writes a test that opens the settings page, finds the name field and types a new value, clicks save, and confirms it saved.
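To give a feel for what such a test contains, here is a rough sketch of those four steps in TypeScript. It is written against two small interfaces, `PageLike` and `LocatorLike`, which are hypothetical names introduced here. They list only the Playwright methods the sketch uses (`goto`, `getByLabel`, `getByRole`, `fill`, `click`, `inputValue`), so a real Playwright page should fit them. The `/settings` route and the “Name” and “Save” labels are assumptions; your app's may differ, and the test the AI writes will likely look different in detail.

```typescript
// Minimal shapes for the Playwright calls used below (hypothetical helper types).
interface LocatorLike {
  fill(value: string): Promise<void>;
  click(): Promise<void>;
  inputValue(): Promise<string>;
}

interface PageLike {
  goto(url: string): Promise<unknown>;
  getByLabel(text: string): LocatorLike;
  getByRole(role: 'button', options: { name: string }): LocatorLike;
}

// Assumed route of the settings page.
const SETTINGS_URL = '/settings';

// Walks the flow and returns the name shown after saving and reopening the page.
async function changeProfileName(page: PageLike, newName: string): Promise<string> {
  await page.goto(SETTINGS_URL);                            // open the settings page
  await page.getByLabel('Name').fill(newName);              // find the name field, type a new value
  await page.getByRole('button', { name: 'Save' }).click(); // click save
  await page.goto(SETTINGS_URL);                            // reopen so the value comes from what was saved
  return page.getByLabel('Name').inputValue();              // read it back for the final check
}
```

Inside a real spec file, the test body would call `changeProfileName(page, 'New Name')` with Playwright's `page` and assert that the returned value equals `'New Name'`.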

Progress lines scroll past on screen showing which files are being created. On the web track, a .spec.ts file lands under tests/web/specs/; on mobile, a .yml file lands under tests/mobile/flows/. It is fine if you miss the filenames. The next step, actually running the test, shows right away whether it was written well.

What healthy looks like: the AI creates the test file and moves on with guidance like “implementation complete, shall we run it with /run-suite?”

Where it pauses: the AI might ask you something like “how do I find the name input on the settings page?” The AI has never seen your app’s screen, so it cannot be sure which button or input field is which. When that happens, just tell it the label visible on that screen (for example, “Name”, “Save”). The AI prefers to find elements by visible text or by a button’s role.

Do not ask for too much at once: rather than telling it to write all five flows in one shot, take one all the way through (write → run → pass) first. Once the first one runs, the rest follow the same pattern and go faster.

Once the first test exists, you need to actually run it and see whether it passes. Head to Run and read the report. If the AI got stuck on the same spot twice along the way, see When the AI gets stuck.