Make any media accessible.

Upload a video or audio file and get captions, AI-powered audio descriptions, and sign language overlays — all in one pipeline.


Open Source · 21 Languages · 10 Sign Languages · Self-Hostable

She had nothing she could do anymore. She couldn't drive, she couldn't read. But she would still try to watch as much TV as she could, because that was her habit.

This project was built because someone we loved was losing her connection to the world. Audio descriptions could have narrated what she couldn't see anymore. Some tools offer captions, but we found none that bundles all three accessibility modalities (captions, audio descriptions, and sign language) in a single open source pipeline.

Until now.

How it works

Upload

Drop any video or audio file. MP4, WebM, MOV, MP3, WAV — we handle it.

Process

AI transcribes speech, generates scene descriptions, and creates sign language gloss tokens.

Download

Get WebVTT/TTML captions, audio descriptions, and sign overlays. Share via a player link.

What you get

Captions

AI-powered transcription in WebVTT and TTML formats. Standard, simplified, or verbatim styles. Translatable to 21 languages.
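For readers curious what the WebVTT output looks like, here is a minimal Python sketch of turning timed transcript segments into a WebVTT file. The function names and the (start, end, text) tuple shape are assumptions made for illustration, not this project's actual code; only the file layout (a WEBVTT header followed by cues with HH:MM:SS.mmm timestamps) comes from the WebVTT format itself.

```python
def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as a WebVTT timestamp (HH:MM:SS.mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def to_webvtt(cues):
    """Render (start, end, text) tuples as a WebVTT document.

    The tuple shape is a hypothetical stand-in for whatever the
    transcription step actually produces.
    """
    lines = ["WEBVTT", ""]  # required header, then a blank line
    for start, end, text in cues:
        lines.append(f"{format_timestamp(start)} --> {format_timestamp(end)}")
        lines.append(text)
        lines.append("")  # a blank line ends each cue
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_webvtt([(0.0, 2.5, "Hello."), (3.0, 5.25, "Welcome back.")]))
```

Saved with a .vtt extension, the result can be attached to an HTML video element through a track element.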

Audio Descriptions

AI-generated narration of visual content during dialogue gaps — so people who can't see the screen still know what's happening.
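One way to find those dialogue gaps is to scan the caption timings for stretches with no speech. The Python sketch below illustrates the idea under stated assumptions: the function name, the (start, end) cue shape, and the 2-second minimum are all hypothetical choices for this example, not the project's actual implementation or defaults.

```python
def find_dialogue_gaps(cues, duration, min_gap=2.0):
    """Return (start, end) spans with no speech lasting at least min_gap seconds.

    cues:     iterable of (start, end) speech intervals in seconds (assumed shape)
    duration: total media length in seconds
    min_gap:  shortest silence worth narrating (2.0 is an illustrative value)
    """
    gaps = []
    cursor = 0.0  # end of the latest speech seen so far
    for start, end in sorted(cues):
        if start - cursor >= min_gap:
            gaps.append((cursor, start))
        # max() handles overlapping cues, e.g. two people talking at once
        cursor = max(cursor, end)
    if duration - cursor >= min_gap:
        gaps.append((cursor, duration))  # silence after the last line of dialogue
    return gaps


if __name__ == "__main__":
    speech = [(1.0, 4.0), (3.5, 6.0), (10.0, 12.0)]
    print(find_dialogue_gaps(speech, duration=15.0))
```

Each returned span is a window where a narrated description could play without talking over the dialogue.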

Sign Language

Gloss tokens and sign cards for 10 sign languages. Six overlay themes, including high-contrast and kid-friendly modes.

Self-Hostable

Run it on your own hardware with local models — whisper.cpp, Ollama, Piper TTS. Zero API costs. Full privacy. Docker ready.