Prerequisites

You only need to do this step once. scene-studio has more external-tool dependencies than any other base, so this step is heavier than the others. You only install the tools for the track you'll use, so deciding on a track first lightens the load.

Which track will you use?

Decide your track first. The tools to install differ.

Generative only — ffmpeg, OPENAI_API_KEY, codex CLI
Remix only — ffmpeg, OPENAI_API_KEY, yt-dlp, whisper-cli (optional)
Both — all of the above

If you can't decide, install the tools for both. If you'd rather start with one track, the tools for the other can be installed later, when you start that track.
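
The track-to-tool mapping above can be checked with a small preflight script. This is a minimal sketch and not part of scene-studio: the function name `missing_tools` and the track labels `generative`, `remix`, and `both` are made up for illustration, and it assumes the executables are named `ffmpeg`, `codex`, and `yt-dlp` as in the list above. It does not check OPENAI_API_KEY or the optional whisper-cli.

```shell
#!/bin/sh
# Hypothetical helper: print each required tool that is NOT on PATH for a track.
missing_tools() {
  mt_track="$1"
  case "$mt_track" in
    generative) mt_tools="ffmpeg codex" ;;
    remix)      mt_tools="ffmpeg yt-dlp" ;;
    both)       mt_tools="ffmpeg codex yt-dlp" ;;
    *) echo "unknown track: $mt_track" >&2; return 2 ;;
  esac
  for mt_tool in $mt_tools; do
    # command -v succeeds only when the tool resolves on PATH
    command -v "$mt_tool" >/dev/null 2>&1 || echo "$mt_tool"
  done
}

# With no argument, check the tools for both tracks.
missing_tools "${1:-both}"
```

Empty output means every required tool for that track was found. Any name it prints still needs to be installed using the sections below.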

OS-by-OS differences

Item	macOS	Windows	Linux
Terminal you use	`Terminal.app` or iTerm2	Ubuntu on WSL2 recommended	the shell your distro ships
Verified state	verified	install WSL2 first, then run every command inside it	verified

On Windows, finish the WSL2 install guide first, then come back here.

1) ffmpeg — both tracks

The core tool for video composition. Needed on both tracks.

# macOS
brew install ffmpeg

# Debian/Ubuntu, including Ubuntu on WSL2 (on other distros, use your package manager)
sudo apt update && sudo apt install -y ffmpeg

Verify

ffmpeg -version    # a line like ffmpeg version ... means it's fine

2) OPENAI_API_KEY — both tracks

Used for Codex image generation and the Whisper API. Either export it in a shell startup file (~/.zshrc, etc.) or keep it in .env; if you use .env, make sure the file is excluded from git tracking so the key is never committed.

export OPENAI_API_KEY="sk-..."

Put this line in ~/.zshrc (or ~/.bashrc if your shell is bash, as on most Linux distros and Ubuntu on WSL2) and open a new terminal. The key is picked up automatically from then on.
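
To confirm that a new terminal actually sees the key, something like the following can help. It is a sketch that checks only that the variable is set and non-empty. It does not validate the key against OpenAI, and it deliberately avoids printing the value. The function name `check_openai_key` is illustrative.

```shell
#!/bin/sh
# Hypothetical helper: report whether OPENAI_API_KEY is set, without echoing it.
check_openai_key() {
  if [ -n "${OPENAI_API_KEY:-}" ]; then
    # Print only the length so the secret never lands in scrollback or logs
    echo "OPENAI_API_KEY is set (${#OPENAI_API_KEY} characters)"
  else
    echo "OPENAI_API_KEY is not set; add the export line to your shell startup file" >&2
    return 1
  fi
}
```

Paste the function into a freshly opened terminal and run `check_openai_key`. A non-zero exit status means the export line has not taken effect in that shell.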

3) Generative track — codex CLI

Image generation on the generative track requires the OpenAI Codex CLI. Without it, you cannot start the generative track. Follow the official OpenAI Codex CLI docs to install it.

codex --version    # a version means it's fine

REPLICATE_API_TOKEN and FAL_KEY are optional. They enable the fast image backend used in preview mode. Without them, image generation uses only codex, which is a bit slower.

4) Remix track — yt-dlp and whisper

The remix track fetches a source video and transcribes it.

# yt-dlp (source download)
brew install yt-dlp          # macOS
# Linux: pip install yt-dlp or your package manager

# whisper-cli (optional, local transcription)
brew install whisper-cpp     # macOS, fast on M-series

Without whisper-cli, it falls back to the Whisper API via OPENAI_API_KEY. To use a local model, fetch the model file once.

mkdir -p ~/.whisper/models
curl -L -o ~/.whisper/models/ggml-base.en.bin \
  https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.en.bin
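
The fallback rule described above (local whisper-cli when available, otherwise the Whisper API) can be sketched as a small decision function. This illustrates the logic only and is not scene-studio's actual implementation. The function name `pick_transcriber` is hypothetical, the default model path is taken from the download command above, and requiring the model file for the local path is an assumption.

```shell
#!/bin/sh
# Hypothetical sketch of the fallback: local whisper-cli first, then the Whisper API.
pick_transcriber() {
  pt_model="${1:-$HOME/.whisper/models/ggml-base.en.bin}"
  if command -v whisper-cli >/dev/null 2>&1 && [ -f "$pt_model" ]; then
    # Both the binary and the model file are present
    echo "local"
  elif [ -n "${OPENAI_API_KEY:-}" ]; then
    # No usable local setup, but an API key is available
    echo "api"
  else
    echo "none"
    return 1
  fi
}
```

Running `pick_transcriber` with no argument checks the default model location. A result of `none` means neither transcription path is ready yet.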

5) Claude Code

The official install guide is at docs.claude.com/en/docs/claude-code/quickstart.

claude --version    # a version number means you're ready

You can use the OpenAI Codex CLI instead of Claude Code. In that case, the AGENTS.md at the repository root serves as the entry point.

6) Fonts are optional

Subtitles use Pretendard for Korean and Inter for English. If these fonts are not installed, rendering falls back to sans-serif automatically, so you don't strictly need them. For cleaner subtitles, install both fonts on your system.
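
On systems with fontconfig, a quick check like the one below can tell you whether the two fonts are already present. This is a sketch under assumptions: it relies on the `fc-list` command, which ships with most Linux desktops but is not installed by default on macOS, and it uses a loose case-insensitive substring match, so a family such as "Inter Display" also counts as "Inter". The function name `has_font` is illustrative.

```shell
#!/bin/sh
# Hypothetical helper: succeed if fontconfig lists a family containing the given name.
has_font() {
  fc-list : family | grep -qiF -- "$1"
}

if command -v fc-list >/dev/null 2>&1; then
  for font in Pretendard Inter; do
    if has_font "$font"; then
      echo "$font: installed"
    else
      echo "$font: not found, subtitles will fall back to sans-serif"
    fi
  done
else
  echo "fc-list not available; check your system's font manager instead" >&2
fi
```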

Once the tools for your track are in place, go to clone-and-install.