How to train a neural network to recognize speech?

krll-k, 2016-11-07 04:30:28

JavaScript

I read an article in which a neural network written in JavaScript recognizes handwritten text. As an example, the article uses only handwritten digits (0 to 9). I had the idea of making the example harder: teaching the network to recognize the same digits, but by ear.
Today I hit the first stumbling block. The network needs to be trained on data, but first the data has to be brought into a uniform format. With an image this is clear: the pixels are laid out one by one and passed through the network. But what about sound?
Sound is not like an image, so how, and with what, should the audio be converted before it is sent to the network?

1 answer(s)

Nicholas, 2016-11-07
@healqq

For sound, a transform (Fourier or wavelet) combined with threshold filtering is usually applied to turn the raw signal into a set of numeric features that can be fed to the network. I would recommend looking at the second option (wavelet).
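
To make the idea concrete, here is a minimal sketch of the Fourier variant (not the wavelet one), assuming the recording is already available as an array of mono samples in the range -1..1 (in a browser this could come from the Web Audio API's AudioBuffer.getChannelData). The function names, the frame count, the frame size and the threshold value are all made up for illustration. The point is only that recordings of any length end up as a vector of the same size, which is what the network input needs.

```javascript
// Hypothetical helper: magnitudes of the first N/2 DFT bins of one frame.
// A naive O(N^2) DFT is used for clarity; a real project would use an FFT.
function dftMagnitudes(frame) {
  const n = frame.length;
  const half = n / 2;
  const mags = new Float64Array(half);
  for (let k = 0; k < half; k++) {
    let re = 0;
    let im = 0;
    for (let t = 0; t < n; t++) {
      // Hann window to reduce spectral leakage at the frame edges.
      const w = 0.5 - 0.5 * Math.cos((2 * Math.PI * t) / (n - 1));
      const angle = (2 * Math.PI * k * t) / n;
      re += w * frame[t] * Math.cos(angle);
      im -= w * frame[t] * Math.sin(angle);
    }
    mags[k] = Math.hypot(re, im);
  }
  return mags;
}

// Hypothetical helper: maps a signal of any length to a fixed-length vector
// of numFrames * frameSize / 2 values in the range 0..1.
// numFrames, frameSize and threshold are illustrative defaults, not tuned values.
function audioToFeatures(samples, numFrames = 16, frameSize = 64, threshold = 0.05) {
  const binsPerFrame = frameSize / 2;
  const features = new Float64Array(numFrames * binsPerFrame);
  // Spread the frames evenly over the recording, so that short and long
  // recordings both produce exactly numFrames frames.
  const lastStart = Math.max(0, samples.length - frameSize);
  for (let f = 0; f < numFrames; f++) {
    const start = numFrames > 1 ? Math.round((f * lastStart) / (numFrames - 1)) : 0;
    const frame = new Float64Array(frameSize); // stays zero-padded if the signal is short
    for (let i = 0; i < frameSize && start + i < samples.length; i++) {
      frame[i] = samples[start + i];
    }
    features.set(dftMagnitudes(frame), f * binsPerFrame);
  }
  // Normalize to 0..1, then apply threshold filtering: weak bins become 0.
  let max = 0;
  for (const v of features) {
    if (v > max) max = v;
  }
  for (let i = 0; i < features.length; i++) {
    const v = max > 0 ? features[i] / max : 0;
    features[i] = v < threshold ? 0 : v;
  }
  return features;
}
```

With the defaults above, audioToFeatures(samples) always returns 512 numbers, so the network's input layer would need 512 inputs. A wavelet transform could be swapped in for dftMagnitudes without changing the rest of the pipeline.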