
What It Is
A Claude Code skill from the browser-use team that turns video editing into a conversation. Point it at a folder of raw clips, describe what you want, get final.mp4. Handles filler-word removal, color grading, audio fades, subtitles, and overlay animations.
Language
Python (~75%). Uses ffmpeg for cutting and ElevenLabs Scribe for word-level transcript timestamps. Optional yt-dlp for pulling reference footage.
Install
git clone https://github.com/browser-use/video-use.git
ln -s "$(pwd)/video-use" ~/.claude/skills/video-use
pip install -r video-use/requirements.txt
export ELEVENLABS_API_KEY=...
Requires ffmpeg on PATH.
Value
The clever design choice is "text + on-demand visuals": instead of feeding raw frames to the LLM, it reasons over the audio transcript with timestamps, and only renders visual composites when needed. Keeps token usage low and lets FFmpeg do the heavy work.
Capabilities worth noting:
- Removes filler words and dead space between takes
- Adds 30ms audio fades at cut points to prevent artifacts
- Burns in customizable subtitles
- Generates overlays via Manim, Remotion, or PIL
- Self-evaluates rendered output at cut boundaries before handing back