COMPACT DISCS AND CODING THEORY AL-HASAN AL-AZZAWI
1. Introduction A Compact Disc (also known as a CD) is an optical disc used to store digital data. It was developed to store music at the start, but later it also allowed the storing of other kinds of data.The compact disc digital audio system standard currently in use was developed by Philips and Sony in an agreement signed in 1979. CDs have been commercially available to consumers since October 1982. The original audio CD was one of the most successful new electronic products ever introduced; everyone was surprised by its rapid acceptance by music lovers. In 2009, they are still the standard physical medium for commercial audio recordings. The Compact Disk is an example of the Optical Disk which was invented in 1958. An Optical Disc is a flat, usually circular, disc which can contain audio, video or data encoded in microscopic depressions called pits (or bumps) on a special material (often aluminum) on one of its flat surfaces.The encoding pattern follows a continuous, spiral path covering the entire disc surface and extending from the innermost track to the outermost track. The data is stored on the disc with a laser or stamping machine forming pits on the track.To access the information, a laser diode is used. A laser beam following the spiral track determines where changes in height in the spiral track occur by detecting changes in the intensity of the light reflected by the compact disc. In this way, a binary string of digits is produced, each change in height corresponding to the digit 1, an absence of a change in height being a 0 digit [1], see figure 1. Imperfections of the disc will produce errors in the recovered data. The principles of Coding Theory is used to correct those errors [4]. DVDs and Blue-ray Discs are also common examples of the Optical Disk. A compact disc is 120 mm in diameter. On each disc is one spiral track, approximately 5 km in length which is optically scanned by a laser, with wavelength approximately 0.8 µm, operating at a constant speed of about 1.25 m/s. The speed of rotation of the disc varies from Date: July, 30 2009. 1
2
AL-HASAN AL-AZZAWI
Figure 1. Taken from [1]
approximately 8 rev/s for the inner portion of the track to 3.5 rev/s for the outer portion. This storage technique is used for both digital and analog information, like music for instance. In the case of dealing with analog signals, an analog-to digital conversion of that signal is done by means of pulse-code modulation (PCM). The analog audio signal is sampled at a rate of 44.1 kHz, that is sampling takes place at the rate of 44 100 pairs of samples per second. This is done over the right and left channels of a stereo audio signal. Each sample is represented by a binary string of length 32, that is, 16 for the left and 16 for the right audio channel [2]. For more information on the physical properties of the Compact Disk, see [4]. 2. Compact Disc Coding The compact disc system was the first example of the introduction of Reed-Solomon codes in a consumer product [5]. This system has been designed from the point of view of a communication system, where the transmitter is the recording process, the channel is the disk and the receiver is the CD reader . The digitized information is then protected against noise by using RS codes, which in the case of the errorcorrection coding for the CD are two shortened versions of the RS code (255, 251), which operate over the Galois field GF (28 ), able to correct error patterns of size t = 2 or less each, implemented in concatenated (binary) form. The channel is a practical example of channels with burst errors, whose effect is diminished in this coding technique by using concatenated RS codes and interleaving. The two shortened versions of the RS code are concatenated by using an interleaver, giving form to the Cross-Interleaved Reed-Solomon code (CIRC). Here is noted the importance of the interleaving/deinterleaving procedure, which causes a given burst error pattern, which is essentially a long chain of consecutive errors, be distributed over many different consecutive code vectors, each one containing a small number of bytes in error. The example next uses one form of interleaving called Block Interleaving. Example 2.1. 3 codewords of length 4 is input into a 4 × 3 matrix.
COMPACT DISCS AND CODING THEORY
3
c1 c2 c3 c4 c5 c6 c7 c8 c9 c10 c11 c12 To interleave, a bit stream is formed reading down the columns: c1 , c5 , c9 , c2 , c6 , c10 , c3 , c7 , c11 , c4 , c8 , c12 Suppose that a burst error of 3 occurs erasing c2 , c6 , c10 , so we have the stream: c1 , c5 , c9 , ∗, ∗, ∗, c3 , c7 , c11 , c4 , c8 , c12 The bitstream is now deinterleaved by inputing the bitstream down the columns and reading across the rows thus producing the 3 word: c1 ∗ c3 c4 c5 ∗ c7 c8 c9 ∗ c11 c12 In this case, a single-error correcting code can correct the single errors in each word. However, if the codewords were not interleaved and a burst error of 3 occured then the code would not have been able to correct one of the codewords. This is the essence of interleaving; that is, the interleaving somehow randomizes and distributes a burst over many received vectors, reducing the size of the error event per received vector [3]. In the case of the compact disc, due to the 4-frame delay interleaving and the ability of C1 to correct four erasures, a burst error covering 16 consecutive 588bit strings on the disc can be corrected. Such a burst is approximately 2.8 mm in length along the track! 2.1. Encoding Procedure. As noted earlier, each sample of data is 16 bits, this is converted to 2 symbols in GF (28 ). Hence every symbol in GF (28 ) is a byte, and every sample is 2 bytes [3]. Example 2.2. The sample [1, 0, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1, 0, 1, 1] can be represented by [1 + α2 + α3 + α4 , 1 + α + α3 + α4 + α6 + α7 ] which is the same as [α8 , α177 ]. The analog signal is sampled by taking six samples from each of the audio channels, the right and left channels, thus forming a group of 12 vectors of 16 bits each, that is, a vector of 24 bytes. This information is first processed by an interleaver, and then input to the first encoder, which is a shortened version (28, 24) of the RS code (255, 251), which we will denote C1 . This encoder generates an output of 28 bytes that
4
AL-HASAN AL-AZZAWI
Figure 2
is input to another interleaver, which in turn passes these 28-byte interleaved vectors to a second encoder, another shortened version RS code (32, 28), which we will denote C2 [3]. This is illustrated in figure 2. A table of the codewords in the code (255, 251) would be enormous. However, the first page of the table (or set of codewords at the top of the table) would contain codewords with 0s only in the most significant message positions. Therefore a shortened RS code can be formed by taking this page of codewords and deleting the positions that are 0s in all the codewords on this page [3]. For example in the case of the shortened version (28, 24) of the RS code (255, 251), encoding is done using the polynomial form of the word. Then when converting to the vector form, all the extra 0s after the 28th position are removed leaving us with a 28-byte vector. 2.1.1. Interleaver 1. The bytes are encoded in the following manner. The vector we have here is of 24 bytes consisting of 12 samples (L and R).The bytes in the odd numbered positions in the codewords in are all moved along 32 positions, so they are mixed with the bytes in the even numbered positions of the following codeword. This is done to improve the chances of correcting single errors with C2 since now 2 consecutive errors affect different codewords. We can view a frame as: L1 , R1 , L2 , R2 , L3 , R3 , L4 , R4 , L5 , R5 , L6 , R6 Where Li and Ri are samples of 2 bytes representing the left and right channels. The odd-numbered samples L1 , R1 , L3 , R3 , L5 , R5 are grouped with the even-numbered samples L2 ∗, R2 ∗, L4 ∗, R4 ∗, L6 ∗, R6 ∗ taken from two frames later. Then, these new frames are rearranged internally into 24 bytes by separating the odd-numbered samples from the even-numbered samples. So we are now looking at a new frame of 24 bytes: L1 , R1 , L3 , R3 , L5 , R5 , L2 ∗, R2 ∗, L4 ∗, R4 ∗, L6 ∗, R6 ∗
COMPACT DISCS AND CODING THEORY
5
Figure 3. Taken from [3]
2.1.2. Encoder C1 . The 24-byte message vector is encoded with RS code (255,251), then the codeword is shortened to (28, 24).The RS code is shortened by removing the extra zeros after the 28th symbol in the case of C1 and after the 32nd symbol in C2 . This encoder produces four bytes of redundancy [2]. 2.1.3. Interleaver 2. The four bytes of redundancy, that is, two pairs P1 and P2 each with two bytes of parity are then placed in the middle of the vector further separating the odd-numbered samples from the even-numbered samples. Then we have: L1 , R1 , L3 , R3 , L5 , R5 , P1 , P2 , L2 ∗, R2 ∗, L4 ∗, R4 ∗, L6 ∗, R6 ∗ Thus from C1 we produce a string of 28-byte codewords. Then we use a form of interleaving which is known as 4-frame delay interleaving. in Figure 3, for ci,j , i is the codeword number and j is the byte number. So c1,1 represents the first byte of the first codeword. The codewords are arranged in a matrix as in Figure 3. Then we transmit the new words reading down column by column. 2.1.4. Encoder C2 . The 28-byte message vector in C1 is again encoded with C2 (32, 28) to produce a 32-byte message vector. 2.1.5. Interleaver 3. The codewords are regrouped with the odd-numbered symbols of one codeword grouped with the even-numbered symbols of the next codeword. This regrouping is another form of interleaving which further breaks up short bursts that may still be present after the 4-frame delay interleaving. The regrouped bytes are written consecutively in one long stream. We now re-divide this long string into segments of 32 bytes, with 16 bytes from one C2 codeword and 16 bytes from another C2 codeword because of the above regrouping. At the end of each of these segments a 33rd byte is added which contains control and display information. Thus each frame of six samples eventually leads to 33 bytes of data [3].
6
AL-HASAN AL-AZZAWI
Figure 4. A part of the EFM table.Taken from [5]
2.1.6. EFM. Physical limitations of the laser tracking make it desirable that changes in height in the spiral track do not occur too close together nor too far apart. It was therefore determined that in the binary representation of each codeword, between any two 1s there should be at least two and at most ten 0s. The Reed-Solomon codewords do not have this property. However there are exactly 267 binary words of length 14 that do have this property. The 256 field elements are matched to 256 of these words using a table look up, 11 of the 267 words being discarded, see figure 4. This process is called eight to fourteen modulation (EFM). Then, to make sure that the property holds between the words of length 14, 3 further bits of 0s are added. So now, the binary representation of each codeword has length 33 ∗ 17 = 561 [1]. Finally, to each codeword, a binary word of length 27 is added for synchronization purposes, which also has the above property. Therefore in total, audio information from 6 consecutive ticks is initially stored as a binary vector of length 24 x 8 = 196, and after all the processes are complete, appear on the compact disc as a binary word of length 588 [1]. 2.2. Decoding Procedure. There are two kinds of errors: those that are distributed randomly among the individual bits, called random errors, and those that occur in groups that cover hundreds or even thousands of bits, called burst errors. Burst errors, caused by dropouts, are usually the result of surface contamination from fingerprints and
COMPACT DISCS AND CODING THEORY
7
scratches on the disc [5]. Generally, C2 is used to correct single random errors while C1 is used to correct burst errors. If the first decoder C2 detects multiple errors in the received vector, then it erases all its positions, deinterleaves the erased received vector and passes it to the second decoder C1 . The second decoder knows which positions are erased, and therefore knows the positions of the possible errors. The system of four syndrome equations of the second decoder is then able to determine the error values in up to four of these error positions, thus performing the correction of error patterns of size t = 4 or less [3]. 2.2.1. Decoding with C2 . First, the synchronization bits, control and display bits, and merging bits are removed. Then the remaining binary strings are converted from EFM form into byte form, a process called demodulation, using table look-up in figure 4; we now have our data as a stream of bytes. Next we undo the scrambling done in the encoding process. The stream is divided into segments of 32 bytes. Each of these 32-byte segments contains odd-numbered bytes from one codeword (with possible errors, of course) and even-numbered bytes from the next. The bytes in the segments are regrouped to restore the positions in order and are passed on to the decoder for C2 . Note that if a short burst error had occurred on the disc, the burst may be split up into shorter bursts by the regrouping. As C2 is a (32, 28, 5) code over GF (28 ), it can correct two errors. However, it is only used to correct single errors or detect the presence of multiple errors, including all errors of size two or three and some of larger size. This is done to marginally reduce the chance of decoding to a different codeword than the original codeword sent. First, notice that the only way that the decoding using C2 can go wrong is if the received word is within distance 1 of a C2 codeword that is not the right codeword. There are very few error patterns that will do this! That is about 1.9 × 10−6 chance. If C2 were used to its full error-correcting capability by correcting all double errors, then the probability that three or more errors would go undetected is about 7.5 × 10−3 . This is why no more than single error correction is used with C2 . This correction of single error by C2 is designed to cope with small random errors caused by inaccuracies in the coating and cutting of the compact discs, see [1] and [2]. If the decoder for C2 determines that no errors in a 32-byte string are found, the 28-byte message is extracted and passed on to the next stage. If the decoder for C2 detects a single error, the error is corrected and the 28-byte message is passed on. If the decoder detects more than
8
AL-HASAN AL-AZZAWI
two errors, it passes on a 28-byte string with all components flagged as erasures [2]. 2.2.2. Decoding with C1 . The effects of Interleaver 2 are removed. C1 can correct four erasures. Since the locations of errors are already known, the error locator polynomial and the error evaluator polynomial are easy to find. Then error values can easily be determined using Forney’s algorithm. Next, the effects of interleaver 1 are removed and now we are hopefully left with the original frame. 2.2.3. Errors that survive. It is possible that there are samples that cannot be corrected by the use of C2 and C1 but are detected as errors and hence remain erased. One technique used is to conceal the error. Recall that consecutive samples are separated by two frames before any encoding was performed. When the final decoding is completed and these samples are brought back to their correct order, it is likely that the neighboring samples were correct or had been corrected. If this is the case, then the erased sample is replaced by an approximation obtained by linear interpolation using the two reliable samples on either side of the sample in error. Listening tests have shown that this process is essentially undetectable. If the neighbors are unreliable, implying a burst is still present, so that interpolation is not possible, then muting is used. Both linear interpolation and muting mask clicks that may otherwise occur [2]. References [1] D.C. Hankerson, Gary Hoffman, D.A. Leonard, Charles C. Lindner, K.T. Phelps, C.A. Rodger and J.R. Wall , Coding Theory and Cryptography: The Essentials, Marcel Dekker, 1st edition (1991), 182-184. [2] W. Cary Huffman and Vera Pless, Fundamentals of Error-Correcting Codes, Cambridge University Press (2003), 203-208. [3] Jorge C. Moreira and Patrick G. Farrell, Essentials of Error-Control Coding, Wiley (2006), 136-152. [4] Ken C. Pohlmann, The Compact Disc Handbook, A-R Editions (1992), xiii. [5] Stephen B. Wicker and Vijay K. Bhargava, Reed-Solomon Codes and Their Applications, Wiley-IEEE Press (1999), 41-59. University of British Columbia, Department of Mathematics E-mail address:
[email protected]