Vosk

An open-source speech recognition toolkit that supports multiple languages and can be integrated into AI video platforms for offline transcription and voice commands.

Detailed overview

Vosk is a speech recognition toolkit that runs entirely offline, including on lightweight devices such as Raspberry Pi boards and Android and iOS phones. It supports more than 20 languages and dialects, among them English, German, French, Spanish, Chinese, Russian, Japanese, and Korean. The toolkit provides a streaming API that returns results as audio arrives, and it ships bindings for several programming languages, including Python, Java, C#, Swift, and Node.js. Portable per-language models are approximately 50 MB each, and larger server models are also available. The recognition vocabulary can be reconfigured quickly to improve accuracy, and Vosk supports speaker identification in addition to plain speech recognition. It is intended for developers integrating speech recognition into applications on Android, iOS, Raspberry Pi, and servers. The Python package can be installed with pip3 install vosk.
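The streaming API and the vocabulary reconfiguration described above can be illustrated with a minimal Python sketch. It assumes the vosk package is installed, that a model has been downloaded and unpacked locally, and that the input is a mono 16-bit PCM WAV file. The helper names (transcribe_stream, read_wav_chunks, transcribe_file) and the paths are hypothetical and chosen for illustration; only Model, KaldiRecognizer, AcceptWaveform, Result, and FinalResult come from the Vosk Python bindings.

```python
import json
import wave


def transcribe_stream(recognizer, chunks):
    """Feed audio chunks to a Vosk-style recognizer and join the recognized text."""
    pieces = []
    for chunk in chunks:
        # AcceptWaveform returns True when an utterance has just been completed.
        if recognizer.AcceptWaveform(chunk):
            text = json.loads(recognizer.Result()).get("text", "")
            if text:
                pieces.append(text)
    # Flush whatever audio is still buffered once the stream ends.
    tail = json.loads(recognizer.FinalResult()).get("text", "")
    if tail:
        pieces.append(tail)
    return " ".join(pieces)


def read_wav_chunks(path, frames_per_chunk=4000):
    """Yield raw PCM chunks from a WAV file (hypothetical helper)."""
    with wave.open(path, "rb") as wf:
        while True:
            data = wf.readframes(frames_per_chunk)
            if not data:
                break
            yield data


def transcribe_file(wav_path, model_dir, phrases=None):
    """Transcribe a WAV file offline; `phrases` optionally restricts the vocabulary."""
    # Imported here so the helpers above can be used without vosk installed.
    from vosk import Model, KaldiRecognizer

    with wave.open(wav_path, "rb") as wf:
        rate = wf.getframerate()

    model = Model(model_dir)
    if phrases:
        # A JSON list of allowed phrases; "[unk]" absorbs out-of-vocabulary words.
        grammar = json.dumps(list(phrases) + ["[unk]"])
        recognizer = KaldiRecognizer(model, rate, grammar)
    else:
        recognizer = KaldiRecognizer(model, rate)
    return transcribe_stream(recognizer, read_wav_chunks(wav_path))
```

A call such as transcribe_file("command.wav", "model", phrases=["play", "pause", "stop"]) would restrict recognition to a small command set, which is one way the vocabulary can be narrowed for voice commands. Both file names here are placeholders, and phrase lists of this kind are expected to work with the small portable models rather than every model type.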
