Notes on Audio (Channels, surround sound, audio files, etc)

December 30, 2015

Trying to get a grasp of what all of this audio business is. Channels, different types of audio files, compression: all of it was very foreign to me.

So all this 5.1, 7.1 business just refers to the number and type of speakers in a surround sound setup.

In the actual production, they’re recording what are just referred to as tracks (certain voices, certain sound effects, ambiance, etc) and then in post-production they will assign each track to a channel, so that it’s played at a certain location in the room.

A “channel” is simply a stream of audio information. So, if something were two channel (called stereo), then that would mean there are two streams of audio in that recording (most likely a left and a right).
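To make the "stream" idea concrete, here is a minimal sketch of how two channels can live in one recording. Uncompressed PCM formats such as WAV store the samples interleaved (left, right, left, right, ...). The function names `interleave` and `deinterleave` are made up for this illustration and don't come from any library.

```python
def interleave(left, right):
    """Merge two equal-length channels into one L, R, L, R, ... sample list."""
    if len(left) != len(right):
        raise ValueError("channels must have the same number of samples")
    frames = []
    for l_sample, r_sample in zip(left, right):
        frames.extend((l_sample, r_sample))
    return frames


def deinterleave(frames, num_channels):
    """Split interleaved samples back into one list per channel."""
    return [frames[ch::num_channels] for ch in range(num_channels)]


left = [1, 2, 3]      # hypothetical sample values for the left channel
right = [10, 20, 30]  # hypothetical sample values for the right channel

stereo = interleave(left, right)
print(stereo)                   # [1, 10, 2, 20, 3, 30]
print(deinterleave(stereo, 2))  # [[1, 2, 3], [10, 20, 30]]
```

The same slicing trick works for 6 channels (5.1): every sixth sample belongs to the same speaker.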

Each speaker is supposed to get its own “channel”, and 5.1 means you have 5 normal audio channels (left, center, right, left surround, right surround) and 1 subwoofer channel (for dat bass). So you need six speakers to appropriately play 5.1 audio. Also, you may not actually have 6 channels, because the .1 (the subwoofer) may just get all the low frequency stuff from the other channels.
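The arithmetic behind the naming is just "full-range channels, dot, subwoofer channels". A tiny sketch, assuming the layout is written in the plain "X.Y" form (the helper name `speaker_count` is made up, and it does not handle three-part names like "7.1.4"):

```python
def speaker_count(layout):
    """Count speakers for a layout written as 'X.Y', e.g. '5.1'."""
    full_range, _, subwoofers = layout.partition(".")
    return int(full_range) + int(subwoofers or 0)


print(speaker_count("5.1"))  # 6
print(speaker_count("7.1"))  # 8
print(speaker_count("2.0"))  # 2
```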

7.1 just means you have two extra channels (in a typical home setup, left and right rear surround, placed behind you).

Then you also have mono audio, of course, which is just one “channel”.

Human voice is apparently a mono instrument, so it shouldn’t be split among channels, not even across two (stereo). Moving sounds are a different story: they’ll make a plane’s sound go from one speaker behind you to the one across the room, to make you feel immersed.
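One way to picture the plane moving between two speakers is to fade one mono sound out of the first speaker while fading it into the second. The sketch below uses constant-power panning (cosine and sine gains), which is a common textbook technique. I'm assuming it here for illustration, and it is not necessarily what film mixers actually use. The names `pan_gains` and `sweep` are made up.

```python
import math


def pan_gains(position):
    """Gains for two speakers; 0.0 = all in speaker A, 1.0 = all in speaker B."""
    if not 0.0 <= position <= 1.0:
        raise ValueError("position must be between 0.0 and 1.0")
    angle = position * math.pi / 2
    # cos^2 + sin^2 = 1, so the combined power stays constant during the move.
    return math.cos(angle), math.sin(angle)


def sweep(mono_samples):
    """Move a mono sound steadily from speaker A to speaker B."""
    last_index = max(len(mono_samples) - 1, 1)
    speaker_a, speaker_b = [], []
    for i, sample in enumerate(mono_samples):
        gain_a, gain_b = pan_gains(i / last_index)
        speaker_a.append(sample * gain_a)
        speaker_b.append(sample * gain_b)
    return speaker_a, speaker_b


# A constant "sound" of five samples, just to show the gains changing.
channel_a, channel_b = sweep([1.0] * 5)
for a, b in zip(channel_a, channel_b):
    print(f"A={a:.2f}  B={b:.2f}")
```

The sound starts entirely in speaker A and ends entirely in speaker B, with both speakers sharing it in between.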

When they record, they could be using multiple recording devices all around the scene and put each of those on one channel, or they have fancy methods of splitting up audio into components, or they could add sounds in post-production for effect (you know, perhaps some leaves crunching, some birds, or wind).

Subwoofers are special low-frequency (bass) speakers. For surround sound, they’ll split it up so that only the subwoofer has the bass (as it may be the only speaker that can handle it).

Stereo means two channels.

Multichannel music has slowly started to become more popular since 1999. Pink Floyd was apparently the first to do a surround sound concert.

For converting the binary audio data to an actual electrical signal on a wire, you need a DAC (digital-to-analog converter). This signal travels along the wire until it hits the speaker, which uses it to actually cause motion (via magnetism: a coil moving in a magnet’s field), which produces sound.

Now, to drive large speakers, you may need to amplify the signal being sent from the DAC (which lives in your phone, or in the sound card in your computer, etc). Amps do need their own power source, though, I believe. The gain of an amp is often measured in decibels (below).

Decibel: a measure of the intensity of sound (sound power per unit area). 0dB is near total silence, 15dB is a whisper, and 60dB is a normal conversation. But remember, it’s a log scale, so a sound 10 times more powerful than near-silence is 10dB, and one 100 times more powerful is 20dB. The reason for the log scale is that when a sound’s intensity increases 10-fold, it sounds roughly twice as loud. The dB displayed on receivers may not correspond to this, though. There, 0dB seems to be how high you can go before there’s distortion, and the exact numbers displayed seem pretty arbitrary.
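The log scale is easier to see as arithmetic: decibels are 10 times the base-10 logarithm of an intensity ratio. Here is a small sketch of that formula, plus the "10dB more sounds about twice as loud" rule of thumb from above. The function names are made up, and the loudness function is only that rule of thumb, not a real psychoacoustic model.

```python
import math


def intensity_ratio_to_db(ratio):
    """Convert 'how many times more intense' into decibels."""
    return 10 * math.log10(ratio)


def db_to_intensity_ratio(db):
    """Convert decibels back into an intensity ratio."""
    return 10 ** (db / 10)


def loudness_factor(db_change):
    """Rule of thumb only: every +10 dB sounds roughly twice as loud."""
    return 2 ** (db_change / 10)


for ratio in (10, 100, 1000):
    print(f"{ratio}x the intensity is {intensity_ratio_to_db(ratio):.0f} dB")

# A 60 dB conversation versus a 15 dB whisper is a 45 dB difference.
print(f"intensity ratio: about {db_to_intensity_ratio(45):.0f}x")
print(f"perceived loudness: roughly {loudness_factor(45):.0f}x")
```

So a huge jump in physical intensity turns into a fairly modest jump in how loud something seems, which is why the log scale is convenient.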

Apparently one decibel is the “just noticeable difference” for the human ear.

A high intensity low frequency sound will sound as loud as a lower intensity high frequency sound (to humans). They’ve graphed these equal loudness curves. As a sound’s intensity decreases, the loudness changes more rapidly with a changing frequency (to humans).

As for audio files: very similar to video files. You have an audio coding format, which is how the actual audio data is represented, and an audio codec (coder/decoder), which is the thing that encodes and decodes that data. You package this coded audio data into an audio file format (e.g. WAV, AIFF, MP3, AAC, FLAC, etc). Apparently most audio file formats only support one type of audio coding data. WAV and AIFF are examples of uncompressed audio formats, FLAC is lossless compressed, and MP3, AAC, and others are lossy compressed. Mostly, you’ll see lossy compressed.
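For the uncompressed case you can poke at the pieces directly, since Python's standard library has a `wave` module for WAV files. The sketch below builds a one-second stereo tone in memory and reads its properties back. The tone, the 8000 Hz sample rate, and the helper names `make_stereo_wav` and `describe_wav` are all arbitrary choices for this example.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 8000  # samples per second, per channel (arbitrary choice)
DURATION_S = 1
FREQ_HZ = 440


def make_stereo_wav():
    """Build a short 16-bit stereo WAV file in memory and return its bytes."""
    frames = bytearray()
    for n in range(SAMPLE_RATE * DURATION_S):
        value = int(32767 * 0.5 * math.sin(2 * math.pi * FREQ_HZ * n / SAMPLE_RATE))
        # One frame = one sample per channel; the same value goes left and right.
        frames += struct.pack("<hh", value, value)
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_out:
        wav_out.setnchannels(2)
        wav_out.setsampwidth(2)  # 2 bytes = 16 bits per sample
        wav_out.setframerate(SAMPLE_RATE)
        wav_out.writeframes(bytes(frames))
    return buffer.getvalue()


def describe_wav(data):
    """Read the basic properties back out of WAV bytes."""
    with wave.open(io.BytesIO(data), "rb") as wav_in:
        channels = wav_in.getnchannels()
        width = wav_in.getsampwidth()
        frame_count = wav_in.getnframes()
        return {
            "channels": channels,
            "sample_width_bytes": width,
            "sample_rate": wav_in.getframerate(),
            "frames": frame_count,
            "pcm_bytes": frame_count * channels * width,
        }


print(describe_wav(make_stereo_wav()))
```

The raw audio size is just frames x channels x bytes per sample, which is why uncompressed files get big quickly and why the compressed formats exist.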

Okay, so MP3 is actually an exception to the usual pattern. AAC is the raw compressed audio data, but it’s typically stored in a file with an m4a extension, for instance, which is a container that also includes other metadata (title, etc). MP3 files, on the other hand, are the actual raw compressed audio data (an example of an MP3 codec: LAME), and there’s a hack called “ID3”, which attaches tags (like title, etc) to the front of the file (or to the end, in the older version of the scheme), and then they just hope that whatever MP3 player plays the file is going to recognize this part as not being valid audio and skip it.
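A player that does know about ID3 doesn't have to guess. An ID3v2 tag at the front of a file starts with a 10-byte header: the letters "ID3", two version bytes, a flags byte, and a 4-byte size that uses only the low 7 bits of each byte (a "syncsafe" integer, so the tag can't be mistaken for the start of an MP3 frame). Here is a minimal sketch of skipping it. The function name `id3v2_tag_length` and the fake file bytes are made up for the example, and it ignores the older ID3v1 tag at the end of the file.

```python
def id3v2_tag_length(data):
    """Return the total length of a leading ID3v2 tag, or 0 if there is none."""
    if len(data) < 10 or data[:3] != b"ID3":
        return 0
    flags = data[5]
    size_bytes = data[6:10]
    if any(byte & 0x80 for byte in size_bytes):
        return 0  # the top bit must be clear in a syncsafe integer
    size = 0
    for byte in size_bytes:
        size = (size << 7) | byte  # 7 useful bits per byte
    footer = 10 if flags & 0x10 else 0  # optional 10-byte footer (ID3v2.4)
    return 10 + size + footer  # the size field excludes the header itself


# Fake file: a 10-byte header, 257 bytes of tag data, then pretend audio.
header = b"ID3" + bytes([3, 0, 0]) + bytes([0, 0, 2, 1])  # size = 2 * 128 + 1
fake_mp3 = header + b"\x00" * 257 + b"\xff\xfb" + b"pretend audio"

skip = id3v2_tag_length(fake_mp3)
print(skip)                      # 267
print(fake_mp3[skip:skip + 2])   # b'\xff\xfb'
```

After skipping that many bytes, the next thing in the file should be the actual MP3 frame data.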