Architecture

The intro flagged the two tracks and the gates. This section looks at how they are organized into 8 agents and a phase flow, and how multiple videos are kept isolated from one another. Two things are worth watching: each gate is owned by a single agent, with user approval as the link between gates, and outputs are kept as files in _workspace/ rather than in chat.

graph TB
start[User input]
start -->|no URL, character/scenario| g0[Generative]
start -->|URL attached| r0[Remix]
g0 --> ga[Gate A plan - plan.md]
ga --> gb1[Gate B-1 scenes - scenes.json, 5 structures]
gb1 --> gb2[Gate B-2 images - Codex candidates]
gb2 --> gc[Gate C video - ffmpeg draft]
gc --> gpub[publish metadata - publish.md]
r0 --> ra[Gate A analysis - yt-dlp, Whisper, hook]
ra --> rb[Gate B editing - auto-cut ffmpeg]
rb --> rpub[publish metadata - publish.md]

Generative has four gates (A, B-1, B-2, C) and publish; remix, as drawn in the flow above, has two gates (A, B) and publish. At each gate a human approves, edits, or rolls back. Neither track publishes automatically (S8).
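As a rough illustration of the flow above, the routing and gate order can be sketched in a few lines. This is a minimal sketch, not the project's code: the names `GATES`, `route_track`, and `phase_sequence` are hypothetical, the URL check is a simple assumption about what "URL attached" means, and the gate lists are copied from the flow diagram.

```python
import re

# Gate order per track, copied from the flow diagram (hypothetical structure).
GATES = {
    "generative": ["A", "B-1", "B-2", "C"],
    "remix": ["A", "B"],
}

# Assumption: "URL attached" means the input contains an http(s) link.
URL_PATTERN = re.compile(r"https?://\S+")


def route_track(user_input: str) -> str:
    """Pick a track: a URL means remix, anything else means generative."""
    return "remix" if URL_PATTERN.search(user_input) else "generative"


def phase_sequence(track: str) -> list[str]:
    """Gates in order, followed by the publish-metadata step.

    The last step only writes publish.md; nothing is published automatically.
    """
    return [f"Gate {gate}" for gate in GATES[track]] + ["publish metadata"]
```

For example, `phase_sequence(route_track("a sleepy robot's morning"))` walks the generative gates and ends at the publish-metadata step.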

| Agent | Track | Responsibility |
| --- | --- | --- |
| content-strategist | Generative | planning, channel strategy |
| scene-planner | Generative | scene breakdown, filling the 5-structure slots |
| image-director | Generative | Codex image generation (Codex calls happen only here) |
| video-editor | Generative | ffmpeg video composition |
| voice-director | Generative | voice synthesis |
| source-analyst | Remix | yt-dlp collection, Whisper transcription, hook detection |
| remix-editor | Remix | auto-editing, cutting |
| publish-copywriter | Shared | publish metadata (no auto-publish) |

blog-writer is an optional branch in the generative track. Codex CLI calls are gathered in one place, image-director (S7): calling Codex from several agents would break style consistency and make cost hard to control. The remix track doesn’t generate images.
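One way to picture the single-caller rule is a small guard that every Codex call site would pass through. This is only a sketch under assumptions: `CODEX_ALLOWED_AGENTS` and `assert_codex_allowed` are hypothetical names, and the source does not say how the rule is actually enforced.

```python
# Hypothetical allow-list: the only agent permitted to call the Codex CLI.
CODEX_ALLOWED_AGENTS = frozenset({"image-director"})


def assert_codex_allowed(agent_name: str) -> None:
    """Raise if an agent other than image-director tries to call Codex."""
    if agent_name not in CODEX_ALLOWED_AGENTS:
        raise PermissionError(
            f"{agent_name} may not call Codex; route image work through image-director"
        )
```

Keeping the allow-list in one constant means that style and cost decisions have a single place to change.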

If a human stops to approve at every gate, the time spent waiting on those round-trips adds more to wall-clock time than the actual compute does. fast-preview, the default in preview mode, auto-passes the planning-stage gates (A, B-1, B-2) and runs through to a video draft in one go. A human then reviews the actual video, not intermediate JSON, once at Gate C.

This does not break the gate philosophy. All outputs remain in _workspace/, so at Gate C the run can be rolled back to any stage (“redo from planning,” “just scene 3,” “redo the image”), and the no-auto-publish rule stays. A final render ignores fast-preview: final runs only after an explicit Gate C approval.
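The fast-preview rule can be sketched as a small decision function for the generative track. This is a sketch under assumptions: the function and constant names are hypothetical, the mode strings `"preview"` and `"final"` are illustrative, and "a final render ignores fast-preview" is read here as "every gate waits for a human in final mode".

```python
# Planning-stage gates of the generative track that fast-preview auto-passes.
GENERATIVE_PLANNING_GATES = frozenset({"A", "B-1", "B-2"})


def needs_human_approval(gate: str, mode: str, fast_preview: bool = True) -> bool:
    """Return True when the run must stop and wait for a human at this gate."""
    if mode == "preview" and fast_preview and gate in GENERATIVE_PLANNING_GATES:
        # Auto-pass: outputs are still written to _workspace/ for later rollback.
        return False
    # Gate C always waits, and so does every gate outside fast-preview.
    return True


def can_start_final_render(gate_c_approved: bool) -> bool:
    """A final render starts only after an explicit Gate C approval."""
    return gate_c_approved
```

With these definitions, `needs_human_approval("B-1", "preview")` is `False`, while `needs_human_approval("C", "preview")` and any call with `mode="final"` return `True`.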

One channel can have several videos in progress at once. Each video is isolated in its own folder, and videos.json at the _workspace/ root is the index.

graph TB
idx[_workspace/videos.json - index + active_slug]
ch[channel.json - channel identity, global]
cache[.cache/ - sheet/image/subtitle/BGM hashes, shared globally]
idx --> v1[robot-pov-ep01/ - generative]
idx --> v2[drama-clip-30s-01/ - remix]
v1 --> v1a[plan.md, scenes.json, images/]
v1 --> v1b[video_drafts/, publish.md, progress.md]
v2 --> v2a[source/ - video, transcript, license.json]
v2 --> v2b[publish.md, progress.md, license_responsibility.json]

videos.json holds the video list and the active video (active_slug). Every agent and skill reads this index to decide its working folder. Partial reruns like “redo the image for scene 3” or “redo the hook” also apply to the active video. channel.json (channel identity) and globalization_decision.json (globalization) are global assets independent of any video, and .cache/ is where character sheets, images, and BGM are cached for reuse across videos.
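The lookup that each agent performs might look like the sketch below. The actual videos.json schema lives in AI_AUTOMATION.md and is not reproduced here, so this assumes only that `active_slug` is a top-level key; the function name `active_video_dir` is hypothetical.

```python
import json
from pathlib import Path


def active_video_dir(workspace: Path) -> Path:
    """Resolve the working folder of the active video from videos.json."""
    index = json.loads((workspace / "videos.json").read_text(encoding="utf-8"))
    slug = index["active_slug"]  # assumption: stored as a top-level key
    video_dir = workspace / slug
    if not video_dir.is_dir():
        raise FileNotFoundError(f"active video folder is missing: {video_dir}")
    return video_dir
```

A partial rerun such as “redo the image for scene 3” would then operate on paths under the returned folder, for example `active_video_dir(Path("_workspace")) / "scenes.json"`.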

_workspace/
├── videos.json                          index + active_slug
├── channel.json                         channel identity (global)
├── globalization_decision.json          globalization flag (global)
├── .cache/                              hash cache (shared globally)
└── {video_slug}/
    ├── brief.md                         Phase 1 requirement
    ├── plan.md                          Phase 2 plan (generative)
    ├── source/                          source (remix: video, transcript, license.json)
    ├── scenes.json                      scene breakdown (generative)
    ├── images/{scene}/candidate_*.png   image candidates (generative)
    ├── video_drafts/draft_v{n}.mp4      video
    ├── publish.md                       publish metadata
    ├── progress.md                      progress source of truth
    ├── timings.json                     performance measurement
    └── license_responsibility.json      explicit responsibility acceptance (remix)

Files outside _workspace/ (.claude/, AI_AUTOMATION.md) are not modified or deleted without human approval (S2).
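A write guard for the S2 rule could be sketched as follows. The helper names are hypothetical and the `approved` flag stands in for whatever approval mechanism the project actually uses, which the source does not describe.

```python
from pathlib import Path


def is_inside_workspace(target: Path, workspace: Path) -> bool:
    """True if target resolves to the workspace itself or a path beneath it."""
    target = target.resolve()
    workspace = workspace.resolve()
    return target == workspace or workspace in target.parents


def check_write_allowed(target: Path, workspace: Path, approved: bool = False) -> None:
    """Raise unless the write stays in the workspace or a human approved it."""
    if not is_inside_workspace(target, workspace) and not approved:
        raise PermissionError(f"human approval required to modify {target}")
```

Resolving both paths first means that a path such as `_workspace/../.claude/agents` is treated as outside the workspace rather than slipping through on a prefix match.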

The automation entry point points at one place

.claude/agents/ and .claude/skills/ are where Claude Code keeps its agents and skills, and AGENTS.md is the Codex entry point. Both read AI_AUTOMATION.md. Track routing, the scene-type spec, the videos.json schema, and license-policy levels live in that one file, so the phase and output contracts are the same whichever runtime you work in.

The next section covers tracks: how generative’s template filling differs from remix’s hook discovery, and where the trigger forks.