When recognizing speech, play audio files associated with certain words from the speech text?


Speech recognition

Gordon__Freeman, 2019-08-21 16:10:26

TASK:
1. A person speaks some text.
2. Some software recognizes this text in real time.
3. This software has its own database in which each word corresponds to a sound (audio file).
4. As soon as the software recognizes a familiar word from its database, it immediately plays that audio file
and continues recognizing.
EXAMPLE:
1. A person says, "Eat some more of those soft French buns and drink some tea."
2. The software plays:
/soft/ == audio1.ogg
/drink/ == audio2.ogg
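
A minimal sketch of the lookup from step 3, assuming the recognizer hands over plain text. The table `SOUNDS` and the function `sounds_for` are hypothetical names used only for illustration; the file names are the ones from the example above, and actual playback is left out.

```python
import re

# Hypothetical word -> audio file table (the "database" from step 3).
SOUNDS = {
    "soft": "audio1.ogg",
    "drink": "audio2.ogg",
}

def sounds_for(text, sounds=SOUNDS):
    """Return the audio files for known words, in the order they were spoken."""
    # Split on anything that is not a letter, so punctuation does not block a match.
    words = re.findall(r"[^\W\d_]+", text.lower())
    return [sounds[w] for w in words if w in sounds]

print(sounds_for("Eat some more of those soft French buns and drink some tea."))
```

The match here is on exact word forms, so "drinks" or "softer" would not trigger anything unless they are added to the table as well.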
POSSIBLE SOLUTIONS
Maybe something based on the Google Speech API (cloud.google.com/speech-to-text/) for PHP.
THE PROBLEM WITH SIMILAR SOLUTIONS
Chatbots do something similar, for example on dialogflow.com.
But the problem with chatbots is that you have to say a start phrase every time, like "Hey, Google", followed by "Play a sound if you know one of these words". Then, for the chatbot to give a result, the person has to stop talking. The chatbot processes the text and plays a sound. And so on.
My task is to play these sounds every time the chatbot recognizes a familiar word from its database, continuously, for as long as the person is speaking (for example, just reading a poem).
Thank you!

3 answer(s)

Dimonchik, 2019-08-21
@dimonchik2013

This software has its own database where each word corresponds to a sound (audio file)

You assemble the database yourself, from Google Translate.
Connect one of the 6 speech APIs to convert speech into text, compare the text, play the file.
PROFIT
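
One way to sketch the "compare the text, play the file" step for continuous speech, assuming the speech API delivers a growing interim transcript of the current utterance and marks the last one as final (many streaming APIs behave roughly like this, but the exact shape varies). `WordTrigger`, `feed`, and the `play` callback are hypothetical names; connecting to the API and real playback are not shown.

```python
import re

class WordTrigger:
    """Fire a callback once for each newly recognized known word."""

    def __init__(self, sounds, play):
        self.sounds = sounds  # word -> audio file
        self.play = play      # callback that starts playback of a file
        self.seen = 0         # words of the current utterance already handled

    def feed(self, transcript, is_final=False):
        """Handle a (possibly interim) transcript of the current utterance."""
        words = re.findall(r"[^\W\d_]+", transcript.lower())
        # Only look at words that appeared since the previous call.
        for word in words[self.seen:]:
            if word in self.sounds:
                self.play(self.sounds[word])
        self.seen = len(words)
        if is_final:
            self.seen = 0  # the next utterance starts from an empty transcript

# Usage: the list stands in for an audio player.
played = []
trigger = WordTrigger({"soft": "audio1.ogg", "drink": "audio2.ogg"}, played.append)
trigger.feed("eat some more of those soft")
trigger.feed("eat some more of those soft french buns and drink")
trigger.feed("eat some more of those soft french buns and drink some tea", is_final=True)
print(played)
```

Interim hypotheses can be revised by the recognizer, so a sound may fire for a word that is later corrected; waiting for final results avoids that at the cost of a delay.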

Sergey Sokolov, 2019-08-21
@sergiks

It is more reliable and simpler to drop the text step (the textual meaning of the words) from the chain.
The microphone listens to everything, and certain sound patterns that the system has been trained on trigger an action (playing an audio file).
This is how voice commands work in dashcams, for example the Xiaomi 70mai, which costs about 1500 rubles. They listen constantly but "understand" only a few commands. Russified firmware also makes these recognized commands Russian-language.

Gordon__Freeman, 2019-08-22
@Gordon__Freeman

Thanks for the recommendations, I'll try them.
I also found what seems to be exactly the same question, but with other technologies: "Speech recognition, continuously reading the signal from the microphone?"