How to check spoken speech against a known text?
The task is to determine, in a mobile application, whether a person has correctly read aloud the text shown on the device's screen.
Right now I have implemented it like this: the default Google speech recognizer built into the smartphone recognizes the speech, and I compare its result with the expected text. In the long run this is a poor option, because the recognizer sometimes gets the text blatantly wrong.
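To make the comparison step concrete, here is a rough sketch of a tolerant check rather than an exact string match. It is not my production code: the function names and the 0.85 threshold are assumptions chosen for illustration, and only Python's standard library is used.

```python
import re
from difflib import SequenceMatcher


def normalize(text: str) -> list[str]:
    # Lowercase and keep only word characters, so that case and
    # punctuation differences do not count as reading mistakes.
    return re.findall(r"\w+", text.lower())


def similarity(expected: str, recognized: str) -> float:
    # Word-level similarity in [0, 1] between the two texts.
    a, b = normalize(expected), normalize(recognized)
    if not a or not b:
        return 0.0
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def read_correctly(expected: str, recognized: str, threshold: float = 0.85) -> bool:
    # threshold is a hypothetical value; it would need tuning on real recordings.
    return similarity(expected, recognized) >= threshold
```

With a check like this, one misrecognized word in a nine-word sentence still passes, while an unrelated phrase does not. It does nothing about the recognizer being wrong in the first place; it only makes the comparison less brittle.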
There are also the Google Speech API and a similar service from Amazon. They are smarter and can return several alternative versions of the recognized text when in doubt, but they get expensive if you recognize a lot of audio.
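If a recognizer does return several alternatives, one way to use them is to score each against the expected text and keep the closest. The sketch below assumes the alternatives have already been obtained as a plain list of strings; it does not call any real API, and the function names are hypothetical.

```python
import re
from difflib import SequenceMatcher


def word_similarity(expected: str, candidate: str) -> float:
    # Compare as lowercase word lists so punctuation and case are ignored.
    a = re.findall(r"\w+", expected.lower())
    b = re.findall(r"\w+", candidate.lower())
    if not a or not b:
        return 0.0
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def best_alternative(expected: str, alternatives: list[str]) -> tuple[str, float]:
    # Return the alternative closest to the expected text, with its score.
    # An empty list yields ("", 0.0).
    if not alternatives:
        return "", 0.0
    scored = [(alt, word_similarity(expected, alt)) for alt in alternatives]
    return max(scored, key=lambda pair: pair[1])


# Example with made-up recognizer output:
text, score = best_alternative(
    "The cat sat on the mat.",
    ["the cat sad on the mat", "the cat sat on the mat", "a cat sat on a mat"],
)
```

Picking the best-scoring alternative biases the check in the reader's favour, which may or may not be what you want when the goal is to catch reading mistakes.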
What other technologies could be used to solve a problem like this? Perhaps other third-party APIs or open-source libraries? Or might it be easier to write my own neural network, since matching speech against a known text should, in theory, be much easier than general speech recognition?