Make any media accessible.
Upload a video or audio file and get captions, AI-powered audio descriptions, and sign language overlays — all in one pipeline.
She had nothing she could do anymore. She couldn't drive, she couldn't read. But she would still try to watch as much TV as she could, because that was her habit.
This project was built because someone we loved was losing her connection to the world. Audio descriptions could have narrated what she could no longer see. Some tools offer captions, but nobody bundles all three accessibility modalities (captions, audio descriptions, and sign language) in a single open-source pipeline.
Until now.
How it works
Upload
Drop any video or audio file. MP4, WebM, MOV, MP3, WAV — we handle it.
Process
AI transcribes speech, generates scene descriptions, and creates sign language gloss tokens.
Download
Get WebVTT/TTML captions, audio descriptions, and sign overlays. Share via a player link.
What you get
Captions
AI-powered transcription in WebVTT and TTML formats. Standard, simplified, or verbatim styles. Translatable to 21 languages.
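As a rough illustration of what the WebVTT output looks like, here is a minimal sketch that serializes timed transcript segments into a WebVTT document. The `Cue` type and the function names are hypothetical and chosen only for this example. They are not the project's actual API, and the real pipeline may structure its output differently.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    """Hypothetical transcript segment: times in seconds plus caption text."""
    start: float
    end: float
    text: str


def format_timestamp(seconds: float) -> str:
    """Render seconds as a WebVTT timestamp (hh:mm:ss.ttt)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def to_webvtt(cues: list[Cue]) -> str:
    """Build a WebVTT document: header, blank line, then one block per cue."""
    lines = ["WEBVTT", ""]
    for cue in cues:
        lines.append(f"{format_timestamp(cue.start)} --> {format_timestamp(cue.end)}")
        lines.append(cue.text)
        lines.append("")  # a blank line terminates each cue
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_webvtt([Cue(0.0, 2.5, "Hello"), Cue(2.5, 5.0, "Welcome back.")]))
```

The resulting text can be saved with a `.vtt` extension and attached to an HTML5 video through a `<track>` element.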
Audio Descriptions
AI-generated narration of visual content during dialogue gaps — so people who can't see the screen still know what's happening.
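Placing narration "during dialogue gaps" implies finding the stretches where nobody is speaking. The sketch below shows one simple way this could be done, assuming the transcription step yields speech segments as `(start, end)` pairs in seconds. The function name and the 2-second minimum gap are assumptions made for this example, not values taken from the project.

```python
def find_dialogue_gaps(
    speech: list[tuple[float, float]],
    duration: float,
    min_gap: float = 2.0,
) -> list[tuple[float, float]]:
    """Return silent intervals of at least `min_gap` seconds.

    `speech` holds (start, end) times of spoken segments; segments may
    overlap or arrive unsorted. `duration` is the media length in seconds.
    """
    gaps = []
    cursor = 0.0  # end of the latest speech seen so far
    for start, end in sorted(speech):
        if start - cursor >= min_gap:
            gaps.append((cursor, start))
        # max() handles overlapping or nested speech segments
        cursor = max(cursor, end)
    # Trailing silence after the last spoken segment
    if duration - cursor >= min_gap:
        gaps.append((cursor, duration))
    return gaps


if __name__ == "__main__":
    segments = [(1.0, 4.0), (3.5, 6.0), (10.0, 12.0)]
    print(find_dialogue_gaps(segments, duration=15.0))
```

Each returned interval is a candidate slot for a description. A real system would also need to check that the synthesized narration fits within the slot.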
Sign Language
Gloss tokens and sign cards for 10 sign languages. Six overlay themes, including high-contrast and kid-friendly modes.
Self-Hostable
Run it on your own hardware with local models: whisper.cpp, Ollama, and Piper TTS. Zero API costs. Full privacy. Docker-ready.