What is the best Python library for continuous speech recognition?

Lan_Vanten, 2020-01-21 19:08:59

Python

Here is the crux of the matter. I work in Python and am looking for a way to recognize long, continuous speech.
The speech itself is audio stories; the goal is to transcribe them back into text.
So far I have settled on pocketsphinx (with a Russian dictionary), but frankly it works poorly.
First, it is too slow.
Second, it is fine for short phrases with clear pauses between them, but it does badly on continuous speech.
Which libraries would you recommend trying for this task?

4 answers

Ranwise, 2020-01-22
@Ranwise

Try Vosk: https://github.com/alphacep/vosk-api
It was in the news recently.
Vosk does local (offline) continuous speech recognition and supports Russian.
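
For reference, a minimal sketch of streaming a WAV file through Vosk from Python. It assumes `pip install vosk`, a Russian model downloaded and unpacked into a local directory (the name `model` below is only a placeholder), and 16-bit mono PCM input. `make_vosk_recognizer` and `transcribe_wav` are illustrative helper names, not part of the Vosk API.

```python
import json
import wave


def make_vosk_recognizer(model_dir, sample_rate):
    """Build a Vosk recognizer. model_dir is a placeholder path to an unpacked model."""
    # Imported lazily so the rest of the module loads without Vosk installed.
    from vosk import Model, KaldiRecognizer
    return KaldiRecognizer(Model(model_dir), sample_rate)


def transcribe_wav(wav_path, make_recognizer=make_vosk_recognizer,
                   model_dir="model", chunk_frames=4000):
    """Feed a 16-bit mono PCM WAV to the recognizer in chunks and return the text."""
    pieces = []
    with wave.open(wav_path, "rb") as wf:
        if wf.getnchannels() != 1 or wf.getsampwidth() != 2:
            raise ValueError("expected 16-bit mono PCM WAV")
        rec = make_recognizer(model_dir, wf.getframerate())
        while True:
            data = wf.readframes(chunk_frames)
            if not data:
                break
            # AcceptWaveform returns a true value once an utterance is complete;
            # Result() then yields that utterance as a JSON string.
            if rec.AcceptWaveform(data):
                pieces.append(json.loads(rec.Result()).get("text", ""))
        # Flush whatever is still buffered after the last chunk.
        pieces.append(json.loads(rec.FinalResult()).get("text", ""))
    return " ".join(p for p in pieces if p)
```

With a model in place, something like `print(transcribe_wav("story.wav", model_dir="model"))` should print the transcript (`story.wav` is a made-up file name). Audio in other formats can be converted first, for example with `ffmpeg -i in.mp3 -ar 16000 -ac 1 out.wav`. Because the file is read in chunks, long recordings do not need to fit in memory.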

Andrey, 2020-01-21
@anerev

Even Apple and Google are bad at this kind of thing, so I doubt anything better exists.

Zhenya, 2020-01-30
@iq1

Mozilla recently released a new version of its DeepSpeech library with many improvements. It also supports different languages, including Russian. https://hacks.mozilla.org/2019/12/deepspeech-0-6-m...
https://github.com/mozilla/DeepSpeech

Vladimir Olohtonov, 2020-01-22
@sgjurano

I have heard of people trying Google's Cloud Speech-to-Text: https://cloud.google.com/speech-to-text/
For Russian, you should definitely try Yandex SpeechKit, which is currently the best in this segment: https://cloud.yandex.ru/services/speechkit
In practice, speech-to-text works only so-so, and in production such models are usually configured to detect specific keywords rather than to do full speech recognition.