Prerequisites

You only need to do this step once. scene-studio depends on more external tools than any other base, so setup takes longer here than elsewhere. You install only the tools for the track you plan to use, so decide on a track first:

  • Generative only — ffmpeg, OPENAI_API_KEY, codex CLI
  • Remix only — ffmpeg, OPENAI_API_KEY, yt-dlp, whisper-cli (optional)
  • Both — all of the above

If you can’t decide, install both. Alternatively, set up one track now and install the other track’s tools later, when you start using it.
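As a rough way to see what is still missing for a given track, the tool lists above can be turned into a small check. This is only a sketch: `missing_commands` and `tools_for_track` are hypothetical helper names, not part of scene-studio, and the check covers only command-line tools (not `OPENAI_API_KEY`, and not the optional `whisper-cli`).

```shell
#!/bin/sh
# Print each named command that is not on PATH, one per line.
missing_commands() {
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
  done
  return 0
}

# Required command-line tools per track, taken from the list above.
# whisper-cli is optional, so it is deliberately left out.
tools_for_track() {
  case "$1" in
    generative) echo "ffmpeg codex" ;;
    remix)      echo "ffmpeg yt-dlp" ;;
    both)       echo "ffmpeg codex yt-dlp" ;;
    *)          echo "unknown track: $1" >&2; return 1 ;;
  esac
}

# Example: report what is still missing for the generative track.
# Word splitting of the tool list is intended here.
missing_commands $(tools_for_track generative)
```

An empty result means every required command for that track was found on `PATH`.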

| Item | macOS | Windows | Linux |
| --- | --- | --- | --- |
| Terminal you use | Terminal.app or iTerm2 | Ubuntu on WSL2 recommended | the shell your distro ships |
| Verified state | verified | install WSL2 first, then run every command inside it | verified |

On Windows, finish the WSL2 install guide first, then come back here.
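If you are unsure whether your current shell is actually running inside WSL, a common heuristic is to look for "microsoft" in `/proc/version`. The sketch below assumes that heuristic holds for your WSL kernel; `is_wsl` is a hypothetical helper name, and the optional argument exists only so the check can be pointed at another file.

```shell
#!/bin/sh
# Succeeds when the kernel version string mentions "microsoft",
# which WSL kernels typically do. $1 overrides the file to inspect.
is_wsl() {
  grep -qi microsoft "${1:-/proc/version}" 2>/dev/null
}

if is_wsl; then
  echo "running inside WSL"
else
  echo "not WSL (or not detectable)"
fi
```

On macOS, `/proc/version` does not exist, so the check simply reports that WSL was not detected.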

ffmpeg is the core tool for video composition and is needed on both tracks.

```sh
# macOS
brew install ffmpeg
# Linux and WSL2
sudo apt update && sudo apt install -y ffmpeg
```

```sh
ffmpeg -version # a line like ffmpeg version ... means it's fine
```

OPENAI_API_KEY is used for Codex image generation and the Whisper API. Prefer exporting it from a shell startup file (~/.zshrc, etc.) over putting it in .env. If you do keep it in .env, make sure that file is excluded from git tracking.

```sh
export OPENAI_API_KEY="sk-..."
```

Put this line in ~/.zshrc (or the startup file of whichever shell you use) and open a new terminal; from then on the key is picked up automatically.
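To confirm the key is visible to new shells and that a `.env` file would not end up in a commit, something like the following can help. It is a sketch, not a scene-studio command: `api_key_state` and `env_file_state` are hypothetical helper names, and the second one assumes you run it from inside the cloned repository with git installed.

```shell
#!/bin/sh
# Report whether the key is exported, without ever printing its value.
api_key_state() {
  if [ -n "${OPENAI_API_KEY:-}" ]; then echo "set"; else echo "missing"; fi
}

# Report how git treats a file in the current repository.
env_file_state() {
  file="${1:-.env}"
  if git ls-files --error-unmatch "$file" >/dev/null 2>&1; then
    echo "tracked"      # already in the index: untrack it with git rm --cached
  elif git check-ignore -q "$file" 2>/dev/null; then
    echo "ignored"      # covered by a .gitignore rule
  else
    echo "not ignored"  # add it to .gitignore before storing a key in it
  fi
}

echo "OPENAI_API_KEY: $(api_key_state)"
echo ".env: $(env_file_state .env)"
```

Outside a git repository both git commands fail, so the second line reports "not ignored"; treat that as "unknown" rather than as a verdict.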

The generative track’s image generation needs the OpenAI Codex CLI. Without it, you cannot start the generative track. Follow the official OpenAI Codex CLI docs to install it.

```sh
codex --version # a version means it's fine
```

REPLICATE_API_TOKEN and FAL_KEY are optional. They enable the fast image backend in preview mode; without them, preview uses only codex, which is a bit slower.

The remix track downloads a source video and transcribes it.

```sh
# yt-dlp (source download)
brew install yt-dlp # macOS
# Linux: pip install yt-dlp or your package manager
# whisper-cli (optional, local transcription)
brew install whisper-cpp # macOS, fast on M-series
```

Without whisper-cli, it falls back to the Whisper API via OPENAI_API_KEY. To use a local model, fetch the model file once.

```sh
mkdir -p ~/.whisper/models
curl -L -o ~/.whisper/models/ggml-base.en.bin \
  https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.en.bin
```
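The fallback described above can be checked ahead of time with a short script. This is a sketch of the decision as the text describes it, not scene-studio's actual logic: `pick_transcriber` is a hypothetical helper name, and it assumes local transcription needs both `whisper-cli` on `PATH` and the model file at the path used above.

```shell
#!/bin/sh
# Decide which transcription path looks usable on this machine.
# $1: path to a local model file (defaults to the download location above).
pick_transcriber() {
  model="${1:-$HOME/.whisper/models/ggml-base.en.bin}"
  if command -v whisper-cli >/dev/null 2>&1 && [ -f "$model" ]; then
    echo "local"   # whisper-cli and the model file are both present
  elif [ -n "${OPENAI_API_KEY:-}" ]; then
    echo "api"     # fall back to the Whisper API
  else
    echo "none"    # neither path is available yet
  fi
}

echo "transcription backend: $(pick_transcriber)"
```

If this prints "none", either install `whisper-cli` and fetch the model, or export `OPENAI_API_KEY`.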

Claude Code is the default entry point. Its official install guide is at docs.claude.com/en/docs/claude-code/quickstart.

```sh
claude --version # a version number means you're ready
```

You can use the OpenAI Codex CLI as the entry point instead of Claude Code. In that case, start from AGENTS.md at the repository root.

Subtitles use Pretendard for Korean and Inter for English. If the fonts are not installed, subtitles fall back to sans-serif automatically, so they are not strictly required. For clean subtitles, install both fonts on your system.
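On systems with fontconfig, you can check whether the two fonts are installed with `fc-list`. The sketch below assumes `fc-list` is available (it usually is on Linux, but often is not on a stock macOS install); `font_listed` is a hypothetical helper, and it does a loose substring match, so a family such as "Inter Tight" would also count as "Inter".

```shell
#!/bin/sh
# Succeeds if the given name appears in the font list read from stdin
# (case-insensitive, fixed-string match).
font_listed() {
  grep -qiF -- "$1"
}

if command -v fc-list >/dev/null 2>&1; then
  for family in Pretendard Inter; do
    if fc-list : family | font_listed "$family"; then
      echo "$family: installed"
    else
      echo "$family: not found, subtitles will fall back to sans-serif"
    fi
  done
else
  echo "fc-list not available; check your system font manager instead"
fi
```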

Once the tools for your track are in place, go to clone-and-install.