F.4 CIT – Sound
Introduction
Sound is produced by the vibration of particles and travels as a sound wave. For example, when we talk, our vocal cords set the surrounding air particles vibrating. The number of vibrations of a single tone in 1 second is called the frequency of the tone, measured in hertz (1 Hz = 1 s^-1). The sounds we hear every day are composed of waves of many different frequencies. A sound is described by its pitch (frequency) and loudness (amplitude). The range of frequencies audible to humans is from about 20 Hz to 20 kHz.
Figure 1. A typical waveform of speech (the spoken words are "Pay attention. This is a test.")
Figure 2. Waveform of music played by a piano (~10s) (Music extracted from “2046” by Eric Kwok)
Figure 3. Detail of a short period (~0.04s) of the waveform in Figure 2
Digitization of Sound
Sound must be converted to an electrical signal before it can be digitized. A microphone performs this conversion. The electrical signal is then sent to the sound card of a computer, where it is digitized. In electric string instruments, the vibration of the strings produces an electrical signal directly, which can then be digitized. Digitization involves two steps: the electrical signal is sampled and then quantized.
1. Sampling
The electrical signal is sampled first. The number of samples taken in 1 second is called the sampling rate or sampling frequency, and it is measured in Hz (or, more commonly, kHz).
A higher sampling frequency produces higher sound quality.
Quality                  CD or near CD         FM Radio          AM Radio
Sampling Rate            44.1 kHz              22.05 kHz         11.025 kHz
Example of Application   Audio CD Production   Audio Streaming   Voice Transmission
Table 1. The quality and application of different sampling rates
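The idea of sampling can be sketched in a few lines of Python. This is only an illustration, not part of the original notes: the function name sample_sine and the 440 Hz test tone are assumptions, and a pure sine wave stands in for the electrical signal coming from a microphone.

```python
import math

def sample_sine(frequency_hz, sampling_rate_hz, duration_s, amplitude=1.0):
    """Take regularly spaced samples of a sine tone (a stand-in for a real signal)."""
    n_samples = int(round(sampling_rate_hz * duration_s))
    return [
        amplitude * math.sin(2 * math.pi * frequency_hz * n / sampling_rate_hz)
        for n in range(n_samples)
    ]

# The same 0.01 s of a 440 Hz tone at two of the sampling rates in Table 1.
samples_cd = sample_sine(440, 44100, 0.01)   # CD quality
samples_am = sample_sine(440, 11025, 0.01)   # AM radio quality
print(len(samples_cd), len(samples_am))      # 441 110
```

The higher sampling rate stores about four times as many samples for the same duration, which is why it can follow the waveform more closely.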
2. Quantization
The number of bits used to store 1 sample is the sample size of the audio. A sample size of n bits gives 2^n quantization levels. The most common sample size is 16 bits, as used in CD audio, which gives 2^16 or 65536 quantization levels. The minimum acceptable sample size is 8 bits, which is normally used for applications such as voice communication, where some distortion can be tolerated.
3. Channel
The number of channels determines whether a recording produces 1 waveform (mono) or 2 waveforms (stereo). Stereo sound carries more information and gives a richer experience, but its file size is double that of mono sound.
4. Bit rate
Bit rate is the amount of information (in bits) transferred in 1 second, measured in bps (bits per second). For sampled audio, the bit rate is the number of bits used to store the samples taken in 1 second.
No. of samples of 1 channel in 1s = Sampling Frequency [Hz]
Total no. of samples in 1s = Sampling Frequency * No. of channels
Bit Rate = Sample Size * Total no. of samples in 1s
Bit Rate = Sampling Freq. * Sample Size * No. of Channels
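As a rough check of the quantization and bit rate formulas, the following Python sketch may help. The function names are made up for this illustration, and the quantize function assumes the signal has been scaled to the range -1 to 1, which is one possible convention rather than the only one.

```python
def quantization_levels(sample_size_bits):
    """A sample size of n bits gives 2^n quantization levels."""
    return 2 ** sample_size_bits

def quantize(value, sample_size_bits):
    """Map a value in [-1, 1] to the nearest level index (assumed scaling)."""
    levels = quantization_levels(sample_size_bits)
    return round((value + 1) / 2 * (levels - 1))

def bit_rate_bps(sampling_freq_hz, sample_size_bits, channels):
    """Bit Rate = Sampling Freq. * Sample Size * No. of Channels."""
    return sampling_freq_hz * sample_size_bits * channels

print(quantization_levels(8))        # 256
print(quantization_levels(16))       # 65536
print(quantize(-1.0, 8), quantize(1.0, 8))  # 0 255
print(bit_rate_bps(11025, 8, 1))     # 88200 (8-bit mono at 11.025 kHz)
print(bit_rate_bps(22050, 16, 2))    # 705600 (16-bit stereo at 22.05 kHz)
```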
e.g. Conventional CD recording uses 16 bits to store each sample, taken at 44.1 kHz. Super Audio CD uses a newer technology known as DSD (Direct Stream Digital). The sampling rate of DSD is 64 times that of conventional CD and the sample size is reduced to 1 bit. Calculate the bit rates of conventional CD recording and of DSD. Give your answers in bps. (Information from http://www.sel.sony.com/SEL/consumer/dsd/dsd.pdf)
5. Wave File Size (Uncompressed)
The uncompressed file size of an audio recording is determined by the number of samples stored. Samples are stored in the file as an array: a 1-dimensional array for mono sound and a 2-dimensional array for stereo sound.
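The difference between the two array shapes can be pictured with ordinary Python lists. The sample values below are invented for illustration, and count_samples is a hypothetical helper, not something defined by the wave format.

```python
# Mono: a 1-dimensional array, one value per sampling instant.
mono = [0, 1200, 2300, 1200]

# Stereo: a 2-dimensional array, one [left, right] pair per sampling instant.
stereo = [[0, 0], [1200, 900], [2300, 1800], [1200, 900]]

def count_samples(audio):
    """Count the stored sample values in a mono (flat) or stereo (nested) array."""
    if audio and isinstance(audio[0], list):
        return sum(len(frame) for frame in audio)
    return len(audio)

print(count_samples(mono))    # 4
print(count_samples(stereo))  # 8
```

For the same duration and sampling rate, the stereo array holds twice as many values as the mono one.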
Calculation of File Size:
No. of samples of 1 channel in 1s = Sampling Frequency [Hz]
Total no. of samples in 1s = Sampling Frequency * No. of channels
Total no. of samples in the file = Total no. of samples in 1s * Duration [s]
Size of all samples [bits] = Total no. of samples in the file * Sample Size [bits]
Size of File [bytes] = Size of all samples [bits] / 8
File Size [bytes] = (Sampling Freq. * Sample Size * No. of Channels * Duration) / 8
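The file size formula translates directly into a short Python function. The name wave_data_size_bytes is an assumption made for this sketch, and the result covers the sample data only; a real .wav file also carries a small header, which is ignored here.

```python
def wave_data_size_bytes(sampling_freq_hz, sample_size_bits, channels, duration_s):
    """File Size = (Sampling Freq. * Sample Size * No. of Channels * Duration) / 8."""
    total_bits = sampling_freq_hz * sample_size_bits * channels * duration_s
    return total_bits / 8

# 1 minute of 16-bit stereo audio sampled at 22.05 kHz.
size = wave_data_size_bytes(22050, 16, 2, 60)
print(size)  # 5292000.0 bytes

# 10 seconds of 8-bit mono audio sampled at 11.025 kHz.
print(wave_data_size_bytes(11025, 8, 1, 10))  # 110250.0 bytes
```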
e.g. Calculate the maximum number of 3.5-minute songs that can be stored on a 650 MB CD using the CD audio coding standard.
Common Audio File Types
1. Wave Format (*.wav)
Wave is the standard format for uncompressed audio on a PC. Because a wave file holds uncompressed data, as close a copy of the original analogue signal as possible, it is much larger than the same recording in a compressed format such as MP3 or RealAudio. Audio CDs store their audio in, essentially, the wave format. Your audio needs to be in this format to be edited in a wave editor or burned to an audio CD that will play in a home stereo.
2. MIDI (*.mid or *.midi)
MIDI stands for Musical Instrument Digital Interface. It is a standard for communicating information about music between different electronic devices, such as computers and sound synthesizers. A MIDI file is generally much smaller than a wave file because it stores codes for the instruments and notes instead of the actual digitized sound. Because the MIDI standard does not specify how the sounds are produced, one file can sound different on different computers. General MIDI was introduced to solve this problem; it specifies 128 standard voices.
 1 Acoustic Grand Piano      44 Contrabass           87 Synth Lead 7
 2 Bright Acoustic Piano     45 Tremolo Strings      88 Synth Lead 8
 3 Electric Grand Piano      46 Pizzicato Strings    89 Synth Pad 1
 …                           …                       …
Table 2. General MIDI voice numbers
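To see why MIDI data is so compact, the sketch below builds the raw bytes of three MIDI messages: one selecting a General MIDI voice and two that start and stop a note. The helper names are invented for this illustration. It shows messages only, not a complete .mid file, which would also need a file header and timing information.

```python
def program_change(channel, gm_voice_number):
    """Select a General MIDI voice (1-128 as in Table 2) on a channel (0-15)."""
    if not 0 <= channel <= 15:
        raise ValueError("channel must be 0-15")
    if not 1 <= gm_voice_number <= 128:
        raise ValueError("voice number must be 1-128")
    # Voice numbers are listed from 1, but the message stores them from 0.
    return bytes([0xC0 | channel, gm_voice_number - 1])

def note_on(channel, note, velocity):
    """Start a note (60 is middle C); velocity is how hard the key is struck."""
    return bytes([0x90 | channel, note, velocity])

def note_off(channel, note):
    """Stop a note."""
    return bytes([0x80 | channel, note, 0])

# Acoustic Grand Piano (voice 1) playing middle C.
events = program_change(0, 1) + note_on(0, 60, 100) + note_off(0, 60)
print(events.hex())  # c000903c64803c00
print(len(events))   # 8
```

Eight bytes describe the note no matter how long it sounds, whereas one second of CD-style sampled audio needs tens of thousands of sample values.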
3. Compressed Audio
By compressing the audio signal, the file size can be greatly reduced. Moreover, most compressed audio formats support streaming: the listener can play an audio file over a network while the audio data is still being transmitted. However, since different formats use different
compression algorithms, a specific codec (coder-decoder) must be installed before the audio stream can be played.
MP3 (*.mp3)
MP3 stands for MPEG-1 Audio Layer 3. It is a standard format that compresses audio into small files by discarding the parts of the sound signal to which human hearing is least sensitive. The more compression is applied, the more noticeable the loss of quality becomes. However, for voice data (such as a recording of a lecture), very high compression can be used and the speaker's voice will remain recognisable. A typical file size for MP3 is about 1 MB per minute for near-CD-quality audio. MP3 files can be played with Winamp (www.winamp.com) or Windows Media Player.
Real Audio (*.ra)
Real Audio was developed by Real Networks (www.realnetworks.com). Like MP3, Real Audio is a lossy format that supports streaming. It can be played only with the codec developed by Real Networks. The most common player is Real Player.
Sound Recording and Editing
Wavepad (Freeware): http://www.nch.com.au/wavepad/index.html
Exercise:
1. Record a 20s voice clip or trim a piece of music to 20s.
2. Try different playback settings (sampling frequency, sample size and no. of channels).
3. Fade the sound in or out.
4. Mix the sound with other audio files.
Figure 4. Screenshot of Wavepad
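What fading and mixing do to the stored samples can be sketched in Python. This is a simplified illustration on plain lists of sample values, using a linear fade and an average-based mix as assumptions; it is not a description of how Wavepad itself processes audio.

```python
def fade_in(samples):
    """Scale samples linearly from silence up to full volume."""
    n = len(samples)
    if n < 2:
        return list(samples)
    return [s * i / (n - 1) for i, s in enumerate(samples)]

def fade_out(samples):
    """Scale samples linearly from full volume down to silence."""
    return fade_in(samples[::-1])[::-1]

def mix(a, b):
    """Mix two recordings by averaging them; the shorter one is padded with silence."""
    length = max(len(a), len(b))
    a = list(a) + [0] * (length - len(a))
    b = list(b) + [0] * (length - len(b))
    return [(x + y) / 2 for x, y in zip(a, b)]

tone = [100, 100, 100, 100, 100]
print(fade_in(tone))           # [0.0, 25.0, 50.0, 75.0, 100.0]
print(fade_out(tone))          # [100.0, 75.0, 50.0, 25.0, 0.0]
print(mix([100, 200], [300]))  # [200.0, 100.0]
```

Averaging when mixing keeps the result within the original range of sample values, so the mixed sound does not exceed the largest value the sample size can store.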