How to split audio at word boundaries?
I have a lot of mp3 and wav files (dictation recordings), each about an hour long. Each one needs to be split into 20-30 minute pieces, but in a way that makes it clear on which word one piece ends and the next begins. Can this be done with ffmpeg or other open source software?
One idea was to read the waveform values within a given interval and cut based on those positions, but I did not find suitable libraries. Can anyone suggest which direction to dig in? WPF, WinForms, or Java are all fine.
For the audio cutting itself, there is an existing answer: Trim an Audio File (.wav, .mp3).
The hardest part is making it clear on which word a file ends and begins. That already calls for speech recognition.
Either cut at the nearest silence within a given region, or actually parse the recording into words, with the start and end position of each word in the audio stream.
If the recording contains only silence and a voice, cut in the silence.
If there is background noise, music, or anything else that is not speech, it gets harder.
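The "cut at the nearest silence" idea can be sketched in plain Python. This is only an illustration under several assumptions: the input is 16-bit mono WAV, "silence" means the window with the lowest RMS level, and the names `quietest_cut`, `window_rms`, `read_mono_16bit`, `search_s` and `window_s` are made up for this example. It will not work well on recordings with music or steady background noise.

```python
import array
import math
import sys
import wave


def window_rms(samples, start, length):
    """Root-mean-square level of samples[start:start + length]."""
    chunk = samples[start:start + length]
    if not chunk:
        return 0.0
    return math.sqrt(sum(s * s for s in chunk) / len(chunk))


def quietest_cut(samples, rate, target_s, search_s=60.0, window_s=0.3):
    """Return the time in seconds of the centre of the quietest window
    found within +/- search_s of target_s (hypothetical helper)."""
    win = max(1, int(window_s * rate))
    lo = max(0, int((target_s - search_s) * rate))
    hi = min(len(samples) - win, int((target_s + search_s) * rate))
    if hi < lo:
        raise ValueError("search range lies outside the recording")
    step = max(1, win // 2)  # half-overlapping windows
    best_start, best_level = lo, float("inf")
    for start in range(lo, hi + 1, step):
        level = window_rms(samples, start, win)
        if level < best_level:
            best_start, best_level = start, level
    return (best_start + win / 2) / rate


def read_mono_16bit(path):
    """Load a 16-bit mono WAV file as (samples, sample_rate)."""
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2 or wf.getnchannels() != 1:
            raise ValueError("this sketch only handles 16-bit mono WAV")
        rate = wf.getframerate()
        samples = array.array("h")
        samples.frombytes(wf.readframes(wf.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    return samples, rate
```

For an hour-long file, something like `quietest_cut(samples, rate, target_s=25 * 60)` would give a candidate cut time near the 25-minute mark, which can then be passed to whatever tool does the actual trimming. ffmpeg also ships a `silencedetect` audio filter (options `noise` and `duration`) that reports silence intervals, which may be enough without writing any code.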
You do not have to recognize the whole piece: it is enough to recognize the first word at the beginning of each piece and the last word at its end.
For recognition, see the answer: Voice/Speech to text.
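If a recognizer does return start and end positions for each word, choosing the cut point becomes simple. The sketch below assumes the timings are already available as a list of `(text, start_s, end_s)` tuples sorted by time. That format and the name `cut_between_words` are hypothetical, not the output of any particular speech-to-text library. It picks the longest pause between two words near the target time and reports which words sit on either side.

```python
def cut_between_words(words, target_s, search_s=60.0):
    """words: list of (text, start_s, end_s) tuples sorted by start time.

    Returns (cut_time_s, last_word_before, first_word_after) for the
    longest pause whose midpoint lies within search_s of target_s.
    """
    best = None
    for (prev, _, prev_end), (nxt, nxt_start, _) in zip(words, words[1:]):
        gap_mid = (prev_end + nxt_start) / 2
        if abs(gap_mid - target_s) > search_s:
            continue  # this pause is too far from where we want to cut
        gap = nxt_start - prev_end
        if best is None or gap > best[0]:
            best = (gap, gap_mid, prev, nxt)
    if best is None:
        raise ValueError("no word boundary inside the search range")
    return best[1], best[2], best[3]
```

The returned pair of words answers the original requirement directly: the first piece ends on `last_word_before` and the next one starts on `first_word_after`.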