What is audio from a program's point of view?

Audio

max_mara, 2013-09-29 07:50:50

Good afternoon colleagues,

I am working on a very interesting algorithm for processing audio samples, but unfortunately I have not yet managed to express it in program form.

An uncompressed image is a matrix of size [x, y] whose elements are color values in (r, g, b) or some other color space, and various algorithms already work on such matrices, applying the necessary transformations. But what audio is in uncompressed form, I cannot figure out.

Let's say I write a C++ program that records a 10-second sample from a microphone, and I get an array of amplitudes. Is that uncompressed audio? Just an array of amplitudes? And how do I then compress this audio into MP3 or WAV, for example?

In general, any information, documentation and code examples are welcome.

Thanks in advance.

2 answer(s)

Teivaz, 2013-09-29
@max_mara

In a simple color image, each point is represented by a combination of three primary colors, each with a value from 0 to some maximum set by the bit depth (2^8, 2^12, it does not matter). In audio, each moment in time is described by a single value, the amplitude, which, roughly speaking, also ranges from 0 to some maximum (2^8, 2^16). So music can ultimately be represented as a one-dimensional array in which each element corresponds to a certain moment in time, whereas an image is a two-dimensional array of color triples (or, equivalently, a three-dimensional array of numbers) in which each triple corresponds to a specific (x, y) coordinate.
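As a rough illustration of the "one-dimensional array of amplitudes" idea, here is a minimal C++ sketch that fills such an array with a pure tone. The function name `makeSine`, the signed 16-bit sample type, and the 44100 Hz rate used in the note below are assumptions chosen for the example, not something the question or the answer prescribes.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: fills a one-dimensional array with a sine tone.
// Element i holds the amplitude at time i / sampleRate seconds.
// Assumption: samples are signed 16-bit integers, a common PCM layout.
std::vector<std::int16_t> makeSine(double freqHz, double seconds, int sampleRate) {
    assert(sampleRate > 0 && seconds >= 0.0);
    const double pi = std::acos(-1.0);
    const std::size_t count = static_cast<std::size_t>(seconds * sampleRate);
    std::vector<std::int16_t> samples(count);
    for (std::size_t i = 0; i < count; ++i) {
        const double t = static_cast<double>(i) / sampleRate;
        // Scale the sine from [-1, 1] to the signed 16-bit range.
        const double value = 32767.0 * std::sin(2.0 * pi * freqHz * t);
        samples[i] = static_cast<std::int16_t>(std::lround(value));
    }
    return samples;
}
```

With these assumptions, `makeSine(440.0, 10.0, 44100)` would return 441000 elements, which is what a 10-second mono recording at that rate looks like in memory.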
You can also expand the sound in a basis of sines and cosines by applying the Fourier transform. The sound is then represented as an array of pairs (sine amplitude, cosine amplitude), where each pair corresponds to a certain frequency rather than a moment in time. Alternatively, the same information can be written as pairs of (amplitude, phase).
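The frequency-domain view can be sketched with a naive discrete Fourier transform. This is only an illustration under simple assumptions: the names `dft` and `binFrequencyHz` are made up for the example, the input is taken as `double` samples, and a real program would normally use an FFT library instead of this O(N^2) loop.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Naive O(N^2) discrete Fourier transform, for illustration only.
// bins[k].real() is the cosine coefficient and bins[k].imag() is the
// negated sine coefficient of frequency bin k.
std::vector<std::complex<double>> dft(const std::vector<double>& x) {
    const std::size_t n = x.size();
    const double pi = std::acos(-1.0);
    std::vector<std::complex<double>> bins(n);
    for (std::size_t k = 0; k < n; ++k) {
        std::complex<double> sum(0.0, 0.0);
        for (std::size_t t = 0; t < n; ++t) {
            const double angle = -2.0 * pi * static_cast<double>(k) *
                                 static_cast<double>(t) / static_cast<double>(n);
            sum += x[t] * std::complex<double>(std::cos(angle), std::sin(angle));
        }
        bins[k] = sum;
    }
    return bins;
}

// Frequency in Hz that bin k of an n-point transform corresponds to.
double binFrequencyHz(std::size_t k, std::size_t n, double sampleRate) {
    assert(n > 0 && k < n);
    return static_cast<double>(k) * sampleRate / static_cast<double>(n);
}
```

The (amplitude, phase) form mentioned above is obtained from each bin with `std::abs(bins[k])` and `std::arg(bins[k])`.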
There are other representations of music as well. It can be decomposed into separate tracks or instruments. MIDI files, for example, roughly speaking store which note should sound on which instrument, at what moment, and for how long.
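To make the note-based representation concrete, here is a simplified C++ sketch. It is not the actual MIDI file format: the `NoteEvent` struct and both helper functions are hypothetical, and the only MIDI convention borrowed is the note numbering, where note 69 is A4 at 440 Hz in standard tuning.

```cpp
#include <cassert>
#include <cmath>
#include <string>
#include <vector>

// Hypothetical, simplified note event: which note sounds on which
// instrument, when it starts, and for how long.
struct NoteEvent {
    double startSeconds;
    double durationSeconds;
    int midiNote;            // 0..127, where 69 is A4
    std::string instrument;
};

// Equal-temperament pitch of a MIDI note number (A4 = 440 Hz).
double noteFrequencyHz(int midiNote) {
    assert(midiNote >= 0 && midiNote <= 127);
    return 440.0 * std::pow(2.0, (midiNote - 69) / 12.0);
}

// Time at which the last note of the track stops sounding.
double trackLengthSeconds(const std::vector<NoteEvent>& track) {
    double end = 0.0;
    for (const NoteEvent& e : track) {
        const double noteEnd = e.startSeconds + e.durationSeconds;
        if (noteEnd > end) end = noteEnd;
    }
    return end;
}
```

Unlike the array of amplitudes, this representation contains no sound at all; a synthesizer has to turn each event into samples.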

Finesse, 2013-09-29
@Finesse

According to the Kotelnikov theorem (also known as the Nyquist-Shannon sampling theorem), an audio signal is sampled at discrete moments: its instantaneous value is read at regular intervals set by the sampling frequency (you have probably heard this phrase somewhere), and the result is a vector (array). If the sampling frequency is more than twice the highest frequency present in the signal, the original continuous signal can be restored from this array on output. Depending on the format, the array is then stored as-is or compressed in different ways.
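For the simplest format, the sampled array can be stored without any compression. The sketch below writes mono 16-bit PCM samples into a WAV container using the canonical 44-byte header. The function names `putLE` and `writeWavMono16` are made up for this example, and mono 16-bit audio is an assumption. MP3 is a different matter: it needs a lossy encoder, usually taken from an existing library such as LAME, and is not shown here.

```cpp
#include <cassert>
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Writes the low `bytes` bytes of v in little-endian order,
// regardless of the host byte order.
void putLE(std::ostream& out, std::uint32_t v, int bytes) {
    for (int i = 0; i < bytes; ++i) {
        out.put(static_cast<char>((v >> (8 * i)) & 0xFF));
    }
}

// Hypothetical helper: stores mono 16-bit samples as uncompressed PCM.
void writeWavMono16(std::ostream& out,
                    const std::vector<std::int16_t>& samples,
                    std::uint32_t sampleRate) {
    // Chunk sizes in a WAV header are 32-bit fields.
    assert(samples.size() <= (0xFFFFFFFFu - 36u) / 2u);
    const std::uint32_t dataBytes = static_cast<std::uint32_t>(samples.size() * 2);

    out.write("RIFF", 4);
    putLE(out, 36 + dataBytes, 4);   // size of everything after this field
    out.write("WAVE", 4);

    out.write("fmt ", 4);
    putLE(out, 16, 4);               // size of the fmt chunk body
    putLE(out, 1, 2);                // audio format 1 = PCM
    putLE(out, 1, 2);                // number of channels
    putLE(out, sampleRate, 4);
    putLE(out, sampleRate * 2, 4);   // bytes per second
    putLE(out, 2, 2);                // bytes per sample frame
    putLE(out, 16, 2);               // bits per sample

    out.write("data", 4);
    putLE(out, dataBytes, 4);
    for (std::int16_t s : samples) {
        putLE(out, static_cast<std::uint16_t>(s), 2);
    }
}
```

To write a file, pass a `std::ofstream` opened with `std::ios::binary` as `out`; the same function works with a `std::ostringstream` for testing.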