On the Calculation of the Error Probability for a Multilevel Modulation Scheme Using QAM-Signaling
Karin Engdahl, Student Member, IEEE, and Kamil Sh. Zigangirov, Member, IEEE
Abstract: We analyze the quadrature amplitude modulation (QAM) version of the multilevel modulation scheme with multistage decoding using a suboptimal metric, when transmission takes place over a memoryless Gaussian channel. The upper bounds on the decoding error probabilities are functions of the Chernoff bounding parameter Z. We argue that the conventional approximation of Z is not adequate, and we give new values of Z that tighten the error bounds without causing them to lose their validity. The capacity of this system is also calculated, leading to the conclusion that the use of the suboptimal metric in multistage decoding causes very little degradation in capacity compared to using the optimal metric in each decoding stage.

Index Terms: Multilevel modulation, multistage decoding, QAM-signaling, set partitioning.
This work was supported in part by the Swedish Research Council for Engineering Sciences under Grant 95-164. Part of this work was presented at the IEEE International Symposium on Information Theory, Ulm, Germany, June 29 to July 4, 1997. Both authors are with the Department of Information Technology, Telecommunication Theory Group, Lund University, Box 118, S-221 00 Lund, Sweden. Phone: +46 46 222 3450 and +46 46 222 4460. Fax: +46 46 222 4714. E-mail: [email protected] and [email protected].
I. INTRODUCTION

The principle of trellis-coded modulation was described by Ungerböck in his 1982 paper [8], and the concept of multilevel modulation was introduced by Imai and Hirakawa [6]. In this method the channel signal set is successively binary partitioned using the set partitioning rule, and the binary labels of the branches from one level of the partition chain to the next are encoded by independent binary codes. The multilevel scheme enables the use of a suboptimal multistage decoder [6], [7], which performs almost as well as an optimum joint maximum likelihood sequence estimator over all levels [4], but is much less complex.

In this paper we study a multilevel modulation scheme using QAM-signaling, transmitting over a discrete memoryless Gaussian channel and employing a multistage decoder. To further reduce complexity we use a suboptimal metric in each decoding stage. This is shown to cause only a very slight performance loss in terms of capacity.

In the following section we introduce the QAM multilevel modulation scheme, which is a modified version of the one proposed in [6]. The channel characteristics and the decoding procedure are also given. In Section III we upperbound the block and burst error probabilities, for block and convolutional codes respectively. The calculation of upper bounds for these probabilities reduces to the calculation of code generating functions with the Chernoff bounding parameter Z as argument. This parameter is a function of the intra-set squared Euclidean distance $\delta_k^2$ on level k, the noise variance $\sigma^2$, and the number of signal points on the corresponding level of set partitioning.
For 2-QAM transmission (the last level of the multilevel QAM scheme), $Z = \exp(-\delta_k^2/8\sigma^2)$. This value is a lower bound on Z for the other levels of the scheme, and it is often used as an approximation of Z, see for example [1], although this approximation causes the bounds on error probability to lose their validity. The inaccuracy increases with the ratio $\delta_k^2/\sigma^2$. An accurate upper bound on Z for any level of the QAM scheme is $Z = 4\exp(-\delta_k^2/8\sigma^2)$. This is a consequence of the "nearest neighbor error events principle" [4], [7] and the fact that each signal point of an M-QAM set has at most four nearest neighbors. This value of Z yields, especially for small values of $\delta_k^2/\sigma^2$, loose bounds on the error probabilities. In the case of 16-QAM the average number of nearest neighbors is 3, so here the estimate $Z = 3\exp(-\delta_k^2/8\sigma^2)$ can be used [5], and for 4-QAM each signal point has 2 nearest neighbors, so $Z = 2\exp(-\delta_k^2/8\sigma^2)$ is an approximation of Z.

In Sections IV and V we calculate better upper bounds on Z that tighten the error bounds without causing them to lose their rigor. This is done using the Chernoff bounding method, which gives exponentially tight bounds [11]. The last but one level, when the signal set is 4-QAM, is considered in Section IV, and in Section V we calculate Z for a QAM signal set with an infinite number of points. The assumption of M-QAM with $M = \infty$ is a mathematical abstraction, but it is a good approximation of an M-QAM multilevel coded modulation scheme with $M < \infty$, at least for large M. Finally, in Section VI we calculate the capacity of this multilevel modulation scheme using QAM-signaling and this type of suboptimal decoder, and compare it to the capacity of the same scheme using the optimal metric in each decoding stage.
II. SYSTEM DESCRIPTION

The transmitter and receiver described in Figure 1 are a generalization of the scheme of Imai and Hirakawa [6] to multilevel QAM. A binary information sequence u is partitioned into K binary subsequences $u^{(1)}, u^{(2)}, \ldots, u^{(K)}$, where each subsequence is encoded by an independent binary component code $C_k$ (block or convolutional). A set of K bits, $\{v^{(1)}(n), v^{(2)}(n), \ldots, v^{(K)}(n)\}$, one bit from each code sequence $v^{(1)}, v^{(2)}, \ldots, v^{(K)}$, is synchronously mapped onto one of the $2^K$ QAM signal points, $s(n)$. The K-level partitioning of the signal set is a sequence of K partitions $S^{(0)}/S^{(1)}/\cdots/S^{(K)}$. The mapping by set partitioning is illustrated in Figure 2 for K = 4. The squared Euclidean intra-set distance on the kth level of partitioning is $\delta_k^2 = 2^{k-1}\delta_1^2$, $k = 1, 2, \ldots, K$. Assuming equiprobable signal points, the total average signal energy per channel use is $E_s = (2^K - 1)\,\delta_1^2/6$ when K is even, and $E_s = (2^{K+1} - 1)\,\delta_1^2/12$ when K is odd.

When passed through the discrete memoryless Gaussian channel, the complex input sequence $s = s(1), s(2), \ldots, s(n), \ldots$ (where $s(n) = a(n) + jb(n)$ is the channel input at the nth moment of time, with $a(n), b(n) \in \{\pm 1, \pm 3, \ldots, \pm(2^{K/2}-1)\}\,\delta_1/2$ when K is even, and $a(n), b(n) \in \{\pm 1, \pm 3, \ldots, \pm(2^{(K+1)/2}-1)\}\,\delta_1/2$ such that $a(n) + b(n) \in \{0, \pm 4, \pm 8, \ldots, \pm 4(2^{(K-1)/2}-1)\}\,\delta_1/2$ when K is odd) is corrupted by the error sequence $e = e(1), e(2), \ldots, e(n), \ldots$, such that the complex received sequence is $r = s + e$. Here $e(n) = e^{(I)}(n) + je^{(Q)}(n)$, where $e^{(I)}(n)$ and $e^{(Q)}(n)$ are independent Gaussian random variables with zero mean and variance $\sigma^2$.
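As a quick numerical sanity check on the energy expression, the following sketch (a non-authoritative illustration; the function name and the restriction to even K, i.e., the square constellation, are our own) averages $|s(n)|^2$ over the $2^K$ signal points and compares it with $(2^K - 1)\,\delta_1^2/6$:

```python
import numpy as np

def square_qam_energy(K, delta1=1.0):
    """Average energy of the 2^K-point square QAM constellation (K even),
    with coordinates a, b in {+-1, +-3, ...} * delta1 / 2."""
    m = 2 ** (K // 2)                                 # points per dimension
    levels = np.arange(-m + 1, m, 2) * delta1 / 2.0   # odd multiples of delta1/2
    a, b = np.meshgrid(levels, levels)
    return np.mean(a ** 2 + b ** 2)

for K in (2, 4, 6, 8):
    print(K, square_qam_energy(K), (2 ** K - 1) / 6.0)  # the two columns agree (delta1 = 1)
```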
The multistage decoder consists of a set of suboptimal decoders matched to the codes used on the corresponding levels of encoding. Each decoding stage consists of the calculation of distances (metrics) from the received sequence r to all possible codewords on the corresponding level of set partitioning. The side information from the previous decoding stages determines, according to the set partitioning structure (illustrated in Figure 2), the signal set upon which the metrics are calculated.

When calculating the metrics, the decoder uses the following suboptimal principle. Suppose that a binary block code of length N (the extension to convolutional codes is straightforward) is used on the kth level of the encoding, and that the decoding in the previous (k-1) decoding stages determines the subsets $S^{(k-1)}(1), S^{(k-1)}(2), \ldots, S^{(k-1)}(N)$ to which the transmitted symbols of the codeword $v^{(k)} = v^{(k)}(1), v^{(k)}(2), \ldots, v^{(k)}(N)$, $v^{(k)}(n)\in\{0,1\}$, belong. Let $s^{(k)}(n)\in S^{(k-1)}(n)$, $n = 1, 2, \ldots, N$, and $s^{(k)} = s^{(k)}(1), s^{(k)}(2), \ldots, s^{(k)}(N)$. Let $S_0^{(k-1)}(n)$ and $S_1^{(k-1)}(n)$ be the subsets of $S^{(k-1)}(n)$ corresponding to transmission of $v^{(k)}(n) = 0$ and $v^{(k)}(n) = 1$, respectively. Finally, let $S^{(k-1)}_{v^{(k)}} = S^{(k-1)}_{v^{(k)}(1)}(1), S^{(k-1)}_{v^{(k)}(2)}(2), \ldots, S^{(k-1)}_{v^{(k)}(N)}(N)$ be the sequence of subsets corresponding to transmission of the codeword $v^{(k)}$. Then the distance (metric) between the received sequence $r = r(1), r(2), \ldots, r(N)$ and the codeword $v^{(k)}$ is determined as

$$\mu\left(r, v^{(k)}\right) = \min_{s^{(k)}\in S^{(k-1)}_{v^{(k)}}} d_E\left(r, s^{(k)}\right), \qquad (1)$$

where $d_E(x, y)$ denotes the squared Euclidean distance between the N-dimensional vectors x and y. The decoding consists of choosing the codeword $v^{(k)}$ for which the metric $\mu(r, v^{(k)})$ is minimal.
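To make the decoding rule concrete, here is a minimal sketch of the stage metric (1); the function names and data layout are our own, and a practical decoder would search the code trellis rather than enumerate all codewords:

```python
import numpy as np

def stage_metric(r, subset_seq):
    """Suboptimal metric of equation (1): for each position n, the squared
    Euclidean distance from r(n) to the *nearest* point of the subset
    selected by the codeword bit (and by earlier-stage decisions),
    summed over the block of length N."""
    return sum(np.min(np.abs(r[n] - np.asarray(pts)) ** 2)
               for n, pts in enumerate(subset_seq))

def decode_stage(r, candidate_subset_seqs):
    """Choose the codeword (index) whose subset sequence minimizes (1)."""
    metrics = [stage_metric(r, seq) for seq in candidate_subset_seqs]
    return int(np.argmin(metrics))
```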
III. ERROR PERFORMANCE

The performance of a multilevel coded modulation system which employs a multistage decoder is commonly estimated by the average bit error probability of each component code [1], [4], [7], or by the block error probability and the burst error probability for block and convolutional coding, respectively [1], [4]. In this section we present upper bounds for the last two probabilities. When considering the overall performance of the multilevel system, it should be noted that the overall decoding error rate is upperbounded by the sum, over all levels, of the probability that a decoding error occurs at one level conditioned on all previous levels having been correctly decoded [3].

Consider the kth decoding level, $k = 1, 2, \ldots, K$. Suppose that a linear binary block code of length $N^{(k)}$, with L codewords $v_0^{(k)}, v_1^{(k)}, \ldots, v_{L-1}^{(k)}$, is used on this level. Let the transmitted codeword be $v_0^{(k)}$. If $\mu(r, v_0^{(k)}) \ge \mu(r, v_l^{(k)})$ we say that the lth codeword, $l = 1, 2, \ldots, L-1$, causes a decoding error, and we denote this event $\varepsilon_l^{(k)}$. The probability of decoding error $P(\varepsilon^{(k)})$ on the kth level is then upperbounded by the union bound [9], [11]

$$P\left(\varepsilon^{(k)}\right) \le \sum_{l=1}^{L-1} P\left(\varepsilon_l^{(k)}\right). \qquad (2)$$

In the binary case $P(\varepsilon_l^{(k)})$ is a function of the Hamming distance $w_l^{(k)}$ between $v_0^{(k)}$ and $v_l^{(k)}$. The main term of the bound decreases exponentially with $w_l^{(k)}$, i.e., the bound has the form

$$P\left(\varepsilon_l^{(k)}\right) < \left(Z^{(k)}\right)^{w_l^{(k)}}, \qquad (3)$$

where $Z^{(k)}$ is a function of $\delta_k^2$ and $\sigma^2$. In this paper we use the Chernoff bounding method for upperbounding $P(\varepsilon_l^{(k)})$. This method gives, as is well known, an exponentially tight bound [11]. From (2) and (3) we get for linear block codes [10]

$$P\left(\varepsilon^{(k)}\right) \le \sum_{l=1}^{L-1}\left(Z^{(k)}\right)^{w_l^{(k)}} = \sum_{w=d_{\min}^{(k)}}^{N^{(k)}} a_w^{(k)}\left(Z^{(k)}\right)^{w} = G^{(k)}(D)\Big|_{D=Z^{(k)}}, \qquad (4)$$

where $\{a_w^{(k)}\}$ is the weight distribution of the code on the kth decoding level, i.e., $a_w^{(k)}$ is the number of codewords having weight w, $d_{\min}^{(k)}$ is the minimal Hamming distance of the code, and $G^{(k)}(D)$ is the generating function of the linear block code.

When a convolutional code is used, the burst error (first-event) probability $P(\varepsilon^{(k)})$ is upperbounded by the union-type bound [9]

$$P\left(\varepsilon^{(k)}\right) < T^{(k)}(D)\Big|_{D=Z^{(k)}}, \qquad (5)$$

where $T^{(k)}(D)$ is the generating function of the convolutional code on the kth decoding level.

We note that the commonly used upper bounds for block and burst error probabilities in the block and convolutional coding cases are both functions of the parameter $Z^{(k)}$. The same holds when considering bit error probabilities. Calculation of the minimal possible $Z^{(k)}$ therefore yields tight upperbounding of the decoding error probability on each decoding level. As mentioned in the introduction,

$$\exp\left(-\frac{\delta_k^2}{8\sigma^2}\right) \le Z^{(k)} < 4\exp\left(-\frac{\delta_k^2}{8\sigma^2}\right) \qquad (6)$$

when QAM-signaling is used. The bounds (4) and (5) are often violated through the use of the lower bound $\exp(-\delta_k^2/8\sigma^2)$ as an approximation to $Z^{(k)}$; thereby the upper bounds (4) and (5) lose their validity. On the other hand, the upper bound $4\exp(-\delta_k^2/8\sigma^2)$ sometimes gives bounds that are loose. In the following sections we calculate the values of $Z^{(k)}$ that give exponentially tight union bounds (4) and (5) without causing them to lose their validity. This is done using the Chernoff bounding method, and the analysis is performed for the last two decoding levels and for a QAM signal set with an infinite number of signal points. The application of the Chernoff bound to the situation considered is explained in Appendix A.
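For illustration, the bound (5) is trivial to evaluate once $Z^{(k)}$ is known. The sketch below uses the generating function $T(D) = D^5/(1-2D)$ of Figure 4(a) and the values $Z_2$ and $Z_4$ from Table I at $\Delta = 4.0$; the closed form of the geometric series is valid only for $D < 1/2$:

```python
def burst_error_bound(D):
    """Union-type bound (5) for T(D) = D^5 / (1 - 2D), the generating
    function used in Figure 4(a); the series converges only for D < 1/2."""
    if D >= 0.5:
        raise ValueError("T(D) diverges for D >= 1/2")
    return D ** 5 / (1.0 - 2.0 * D)

# Substituting the lower bound Z2, the "nearest neighbor" value 2*Z2,
# and the tight value Z4 (Table I, Delta = 4.0):
Z2, Z4 = 0.1353, 0.2379
for D in (Z2, 2 * Z2, Z4):
    print(D, burst_error_bound(D))
```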
IV. THE CHERNOFF BOUNDING PARAMETER FOR THE LAST TWO LEVELS

On the last decoding level each of the subsets $S_0^{(K-1)}$ and $S_1^{(K-1)}$ consists of one point, and the squared Euclidean distance between the subsets is $\delta_K^2$. It is well known that in this case the probability that the lth codeword causes a decoding error can be calculated exactly,

$$P\left(\varepsilon_l^{(K)}\right) = Q\left(\frac{\sqrt{w_l^{(K)}}\,\delta_K}{2\sigma}\right) < \exp\left(-\frac{w_l^{(K)}\delta_K^2}{8\sigma^2}\right) = \left(Z^{(K)}\right)^{w_l^{(K)}}, \qquad (7)$$

where $Q(x) = \int_x^{\infty}\exp(-t^2/2)\,dt/\sqrt{2\pi}$ and $Z^{(K)} = Z_2 = \exp(-\Delta_K^2/8)$, i.e., $Z^{(K)}$ coincides with the lower bound in (6). The parameter $Z^{(K)}$ is a function of the normalized Euclidean distance $\Delta_K = \delta_K/\sigma$, and since all expressions of $Z^{(k)}$ in the further evaluation of the error probability on the kth level will also be functions of $\Delta_k = \delta_k/\sigma$, we keep this notation. Thus, without loss of generality, we consider on the kth level of set partitioning the signal set with minimal inter-set Euclidean distance $\Delta_k$ and additive Gaussian noise with variance equal to 1. Numerical values of the Chernoff bounding parameter $Z_2$ are shown in Table I for a number of different $\Delta = \Delta_K$.

Consider now the decoding on the last but one level, when the subsets $S_0^{(K-2)}$ and $S_1^{(K-2)}$ consist of two points each. The corresponding set partitioning is presented in Figure 3. We introduce the notation $\gamma^{(k)} = d_E(r, S_0^{(k-1)}) - d_E(r, S_1^{(k-1)})$, where $d_E(r, S_i^{(k-1)})$, $i = 0, 1$, is the squared Euclidean distance from the received signal point r to the nearest point of the set $S_i^{(k-1)}$. In Appendix B it is shown that $f^{(0)}_{\gamma^{(K-1)}}(\gamma)$, the probability density function of $\gamma^{(K-1)}$ conditioned on a zero having been sent, is equal to a scaled convolution of

$$f_X^{(0)}(x) = \frac{2}{\sqrt{2\pi}}\exp\left(-\frac{x^2}{2}\right)$$

and

$$f_Y^{(0)}(y) = \frac{1}{\sqrt{2\pi}}\left[\exp\left(-\frac12\left(y - \frac{\Delta_{K-1}}{\sqrt2}\right)^2\right) + \exp\left(-\frac12\left(y + \frac{\Delta_{K-1}}{\sqrt2}\right)^2\right)\right],$$

$x, y \ge 0$. It is also proven in Appendix B that the following theorem applies.

Theorem 1: For the last but one level of suboptimal decoding of a multilevel coded QAM signal set, the Chernoff bounding parameter has the value

$$Z^{(K-1)} = Z_4 = \min_{s\ge 0}\varphi^{(K-1)}(s), \qquad (8)$$

where

$$\varphi^{(K-1)}(s) = 2\exp\left(2(s\Delta_{K-1})^2\right) Q\left(-\sqrt2\,s\Delta_{K-1}\right)\left[\exp\left(-s\Delta_{K-1}^2\right) Q\left(\frac{\Delta_{K-1}}{\sqrt2}(2s-1)\right) + \exp\left(s\Delta_{K-1}^2\right) Q\left(\frac{\Delta_{K-1}}{\sqrt2}(2s+1)\right)\right]. \qquad (9)$$

Numerical values of $Z_4$ are shown in Table I for a number of different $\Delta = \Delta_{K-1}$. We can see that for small $\Delta$ the values of $Z_4$ are close to the lower bound $Z_2$, but for large $\Delta$, $Z_4$ approaches $2Z_2$ due to the "nearest neighbor error events principle". In Figure 4 we show by two examples how the bounds on error probability are improved by the use of $Z_4$ instead of $2Z_2$.

In principle, the value of Z that gives an exponentially tight union bound on the error probability can be calculated for any decoding level of multilevel coded M-QAM, but for M > 4 the calculation becomes cumbersome. For example, when the signal set is the 8-QAM signal set, we would have to consider nine reception regions of six different types, as compared to the 4-QAM case, where we consider four regions that are all of the same type. Instead of carrying through these calculations, we perform an analysis of the QAM set with an infinite number of signal points. The resulting Chernoff bounding parameter $Z_\infty$ upperbounds the values of Z for any finite QAM signal set. To make this plausible we present the following intuitive argument. Consider an M-QAM scheme for any finite M. Since the suboptimal decoding procedure only takes the distance to the two nearest points into consideration, adding fictional points from the QAM set with an infinite number of signal points can only make the performance worse. This is because, in this case, the distance from the received point to the nearest point of the reference set competes not only with the distance from the received point to the nearest point of the opposite set, but also with the distance from the received point to the nearest fictional point.
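Theorem 1 can be evaluated numerically by a one-dimensional search over s. The following sketch (our own minimization by grid search; a proper implementation might use a scalar optimizer instead) reproduces the $Z_4$ column of Table I:

```python
from math import erfc, exp, sqrt
import numpy as np

def Q(x):
    """Gaussian tail function Q(x)."""
    return 0.5 * erfc(x / sqrt(2.0))

def phi(s, d):
    """phi^(K-1)(s) of equation (9); d is the normalized distance Delta_{K-1}."""
    fx = 2.0 * exp((s * d) ** 2) * Q(-sqrt(2.0) * s * d)
    fy = (exp(s * d * d * (s - 1.0)) * Q(d / sqrt(2.0) * (2.0 * s - 1.0))
          + exp(s * d * d * (s + 1.0)) * Q(d / sqrt(2.0) * (2.0 * s + 1.0)))
    return fx * fy

def Z4(d):
    """Equation (8): minimize phi over s >= 0 (coarse grid search)."""
    return min(phi(s, d) for s in np.linspace(0.0, 1.0, 4001))

for d in (0.5, 2.0, 4.0, 7.0):
    print(d, round(Z4(d), 4))   # 0.9984, 0.8124, 0.2379, 0.0043, as in Table I
```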
V. THE CHERNOFF BOUNDING PARAMETER FOR A QAM SET WITH AN INFINITE NUMBER OF SIGNAL POINTS

Here the Chernoff bounding parameter for a multilevel QAM set with an infinite number of signal points, i.e., the $(2\mathbb{Z}+1)^2$ lattice, will be given. In this case the subsets $S_0^{(k-1)}$ and $S_1^{(k-1)}$ consist, on each decoding level, of infinitely many points each. The corresponding set partitioning is presented in Figure 5. The probability density function of $\gamma^{(k)}$, introduced in Section IV, $f^{(0)}_{\gamma^{(k)}}(\gamma)$, is equal to a shifted version of the scaled convolution of

$$f_X^{(0)}(x) = \frac{2}{\sqrt{2\pi}}\sum_{i=-\infty}^{\infty}\exp\left(-\frac12\left(x + \sqrt2\,i\Delta_k\right)^2\right), \qquad 0 \le x \le \frac{\Delta_k}{\sqrt2},$$

with itself, as is shown in Appendix C. There the following theorem is also proven.

Theorem 2: On each level of suboptimal decoding of a multilevel coded QAM signal set with an infinite number of signal points, the Chernoff bounding parameter has the value

$$Z^{(k)} = Z_\infty = \min_{s\ge0}\left[\exp\left(-\frac{s\Delta_k}{\sqrt2}\right)\left(g^{(k)}(s)\right)^2\right], \qquad (10)$$

where

$$g^{(k)}(s) = 2\exp\left(\frac{s^2}{2}\right)\sum_{i=-\infty}^{\infty}\exp\left(-\sqrt2\,si\Delta_k\right)\left[Q\left(\sqrt2\,i\Delta_k - s\right) - Q\left(\sqrt2\,i\Delta_k - s + \frac{\Delta_k}{\sqrt2}\right)\right]$$
$$= \frac{\sqrt2}{\Delta_k}\left[\frac{\exp\left(s\Delta_k/\sqrt2\right) - 1}{s} + 2s\sum_{i=1}^{\infty}\frac{(-1)^i\exp\left(s\Delta_k/\sqrt2\right) - 1}{s^2 + \left(\sqrt2\,\pi i/\Delta_k\right)^2}\exp\left(-\left(\frac{\pi i}{\Delta_k}\right)^2\right)\right]. \qquad (11)$$

The infinite sum in the first expression for $g^{(k)}(s)$ in (11) converges faster for large $\Delta_k$ than for small $\Delta_k$. If fast convergence is desired when $\Delta_k$ is small, one should use the second expression in (11). Numerical values of $Z_\infty$ are shown in Table I for a number of different $\Delta = \Delta_k$. Here it can also be seen that $Z_\infty$ approaches $4Z_2$ for large $\Delta$, due to the "nearest neighbor error events principle". We conclude that the rough bounds $Z_2 < Z_M < 4Z_2$ have been tightened to $Z_4 < Z_M < Z_\infty$ for M > 4. In Figure 6 we show by two examples how the bounds on error probability are improved by the use of $Z_\infty$ instead of $4Z_2$.
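Theorem 2 can be checked numerically in the same way. The sketch below implements the first series in (11), truncated symmetrically (the truncation length is our own choice; terms whose Q-difference has numerically vanished are skipped, which also avoids floating-point overflow in the exponential), and reproduces the $Z_\infty$ column of Table I:

```python
from math import erfc, exp, sqrt
import numpy as np

SQ2 = sqrt(2.0)

def Q(x):
    return 0.5 * erfc(x / SQ2)

def g(s, d, terms=30):
    """First series expression for g^(k)(s) in equation (11); d = Delta_k."""
    total = 0.0
    for i in range(-terms, terms + 1):
        c = SQ2 * i * d
        qdiff = Q(c - s) - Q(c - s + d / SQ2)
        if qdiff > 0.0:          # skip numerically vanished terms
            total += exp(0.5 * s * s - s * c) * qdiff
    return 2.0 * total

def Z_inf(d):
    """Equation (10): minimize exp(-s d / sqrt(2)) * g(s)^2 over s >= 0."""
    return min(exp(-s * d / SQ2) * g(s, d) ** 2
               for s in np.linspace(0.0, d, 2001))

for d in (0.5, 2.0, 4.0, 7.0):
    print(d, round(Z_inf(d), 4))   # 1.0000, 0.9858, 0.4135, 0.0086, as in Table I
```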
VI. CAPACITY AND CUTOFF RATE OF MULTISTAGE DECODING USING SUBOPTIMAL METRIC

Analogously to [4], we consider the transmission on each level of modulation as transmission over an individual channel. But in contrast to [4], where the capacities of these individual channels were calculated for optimal decoding on each level of modulation, we calculate the capacities when the suboptimal decoding described in Section II is used. We consider the QAM signal set with an infinite number of signal points, and as the output of the individual channel we consider the sequence of statistics $\gamma^{(k)}$ introduced in Section IV.

The capacity of the kth level of the given multilevel QAM system using multistage decoding with the suboptimal metric is

$$C^{(k)} = H\left(\gamma^{(k)}\right) - H\left(\gamma^{(k)} \mid v^{(k)} = 0\right)$$
$$= -\int_{-\Delta_k^2}^{\Delta_k^2}\left[\frac12 f^{(0)}_{\gamma^{(k)}}(\gamma) + \frac12 f^{(1)}_{\gamma^{(k)}}(\gamma)\right]\log_2\left[\frac12 f^{(0)}_{\gamma^{(k)}}(\gamma) + \frac12 f^{(1)}_{\gamma^{(k)}}(\gamma)\right]d\gamma + \int_{-\Delta_k^2}^{\Delta_k^2} f^{(0)}_{\gamma^{(k)}}(\gamma)\log_2 f^{(0)}_{\gamma^{(k)}}(\gamma)\,d\gamma, \qquad (12)$$

where the superscript denotes the transmitted bit, and where the probability density function of $\gamma^{(k)}$ conditioned on a one having been transmitted satisfies $f^{(1)}_{\gamma^{(k)}}(\gamma) = f^{(0)}_{\gamma^{(k)}}(-\gamma)$ by symmetry. All signal points are assumed to be equally likely. Considering the ensemble of codes and calculating the expectation of the decoding error probability gives the cutoff rate $R_{\mathrm{comp}}^{(k)} = 1 - \log_2\left(1 + Z^{(k)}\right)$. The capacity and computational cutoff rate of a level using QAM with an infinite number of signal points are shown in Figure 7. In this figure we also show the capacity of the same system using the optimal metric at each decoding stage, calculated in [4]. We conclude that the capacity of the system using the suboptimal metric that we considered is close to the capacity of a system using the optimal metric and multistage decoding. This indicates that it is not necessary to use the optimal metric in multistage decoding of multilevel QAM; that is, a complexity reduction by use of the suboptimal metric can be achieved at a very small loss in capacity.
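The cutoff rate expression is immediate to evaluate; a small sketch (using the $Z_\infty$ values of Table I as input) gives the dashed curve of Figure 7 at a few points:

```python
from math import log2

def cutoff_rate(Z):
    """R_comp^(k) = 1 - log2(1 + Z^(k)), in bits per level and channel use."""
    return 1.0 - log2(1.0 + Z)

# Z_infinity from Table I:
for delta, Z in ((2.0, 0.9858), (4.0, 0.4135), (7.0, 0.0086)):
    print(delta, round(cutoff_rate(Z), 3))
```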
VII. CONCLUSIONS

The decoding error probability for multilevel QAM with multistage decoding using a suboptimal metric has been analyzed. Rigorous union bounds for the error probabilities of the component codes have been presented, and formulas for the parameter Z have been derived. To do this, the Chernoff bounding method, which gives exponentially tight bounds, was applied. These formulas yield values of Z that give tighter bounds than the commonly used "nearest neighbor error events principle", and the bounds do not lose their validity, as is the case when the often used approximation $Z = \exp(-\delta^2/8\sigma^2)$ is applied. Comparisons to traditional bounding techniques for the probability of decoding error have been made for 4-QAM and for a QAM signal constellation with an infinite number of signal points.

The capacity and computational cutoff rate were calculated for QAM with an infinite number of signal points. It was concluded that the capacity for decoding with the suboptimal metric is close to the capacity of the system where the optimal metric is used. Hence, in terms of capacity, the need for using the optimal metric is small when multilevel QAM with multistage decoding is considered.
APPENDIX A. DERIVATION OF THE CHERNOFF BOUNDING PARAMETER

Let $\mu(r, v_0^{(k)})$ and $\mu(r, v_l^{(k)})$ be the squared Euclidean distances (metrics) from r to the transmitted codeword and from r to the lth codeword, $l = 1, 2, \ldots, L-1$, respectively. The decision is taken in favor of the codeword whose metric is minimal. From (1) we have

$$\mu\left(r, v_l^{(k)}\right) = \sum_{n=1}^{N} d_E\left(r(n), v_l^{(k)}(n)\right), \qquad l = 0, 1, \ldots, L-1, \qquad (13)$$

where

$$d_E\left(r(n), v_l^{(k)}(n)\right) = \min_{s^{(k)}(n)\in S^{(k-1)}_{v_l^{(k)}(n)}(n)} d_E\left(r(n), s^{(k)}(n)\right). \qquad (14)$$

Then the decision criterion $\mu(r, v_0^{(k)}) - \mu(r, v_l^{(k)}) \gtrless 0$, $k = 1, 2, \ldots, K$, becomes $\gamma_l^{(k)} = \sum_{n=1}^{N}\gamma_l^{(k)}(n) \gtrless 0$, where

$$\gamma_l^{(k)}(n) = d_E\left(r(n), v_0^{(k)}(n)\right) - d_E\left(r(n), v_l^{(k)}(n)\right). \qquad (15)$$

Without loss of generality we suppose that $v_0^{(k)}$ is the all-zero codeword and that $v_l^{(k)}$ is a codeword of Hamming weight $w_l^{(k)}$. To simplify the analysis, we also change the order of the transmitted symbols such that the first $w_l^{(k)}$ symbols of $v_l^{(k)}$ are ones. Then the decision statistic $\gamma_l^{(k)} = \sum_{n=1}^{w_l^{(k)}}\gamma^{(k)}(n)$ is a sum of independent identically distributed random variables $\gamma^{(k)}(n)$.

To get an exponentially tight bound for $P(\varepsilon_l^{(k)})$, let us introduce the generating function of $\gamma_l^{(k)}$, $\Phi_l^{(k)}(s) = \int_{-\infty}^{\infty}\exp(s\gamma)\,f_{\gamma_l^{(k)}}(\gamma)\,d\gamma$, where $f_{\gamma_l^{(k)}}(\gamma)$ is the probability density function of $\gamma_l^{(k)}$. Since $\gamma_l^{(k)}$ is a sum of $w_l^{(k)}$ independent identically distributed random variables, we have $\Phi_l^{(k)}(s) = \left(\varphi^{(k)}(s)\right)^{w_l^{(k)}}$, where $\varphi^{(k)}(s) = \int_{-\infty}^{\infty}\exp(s\gamma)\,f_{\gamma^{(k)}}(\gamma)\,d\gamma$ and $f_{\gamma^{(k)}}(\gamma)$ is the probability density function of $\gamma^{(k)}(n)$. Using the Chernoff bound [11] we get

$$P\left(\varepsilon_l^{(k)}\right) = P\left(\gamma_l^{(k)} \ge 0\right) \le \min_{s\ge0}\Phi_l^{(k)}(s) = \min_{s\ge0}\left(\varphi^{(k)}(s)\right)^{w_l^{(k)}}, \qquad (16)$$

which is equivalent to (3) with

$$Z^{(k)} = \min_{s\ge0}\varphi^{(k)}(s). \qquad (17)$$
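The Chernoff machinery of (13)-(17) is easy to validate by simulation on the last decoding level, where the exact error probability (7) is available. The following Monte Carlo sketch (our own test setup: two points at normalized distance $\Delta$, unit-variance noise, a competing codeword of weight w) compares the empirical probability $P(\gamma_l^{(K)} \ge 0)$ with the exact value and with the bound $Z_2^w$:

```python
import numpy as np
from math import erfc, exp, sqrt

rng = np.random.default_rng(1)
d, w, trials = 2.0, 3, 500_000   # Delta_K = 2, competing codeword weight w = 3

# Transmitted point at 0, opposite point at d; gamma(n) is the metric
# difference (15) at the w positions where the codewords differ.
noise = rng.normal(size=(trials, w))
gamma = (noise ** 2 - (noise - d) ** 2).sum(axis=1)

p_mc = np.mean(gamma >= 0.0)
p_exact = 0.5 * erfc(sqrt(w) * d / 2.0 / sqrt(2.0))   # Q(sqrt(w) Delta / 2), eq. (7)
z_bound = exp(-d * d / 8.0) ** w                      # (Z_2)^w
print(p_mc, p_exact, z_bound)   # p_mc ~ p_exact = 0.0416 < z_bound = 0.223
```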
APPENDIX B. PROOF OF THEOREM 1

Let us consider the last but one level, Figure 3. To study the probability density function of $\gamma^{(K-1)}(n)$, we introduce the system of coordinates (x, y). In this system the four quadrants define four regions, in each of which the received point has the same nearest point from each set. For example, in the first quadrant the nearest point from $S_0^{(K-2)}$ is point 2 and the nearest point from $S_1^{(K-2)}$ is point 1. The noise components in this system of coordinates are independent Gaussian random variables with zero mean and variance equal to one. The conditional probability that the received point is in the first quadrant, given that $v_0^{(K-1)}$ was transmitted, is equal to 1/4. In this quadrant the squared Euclidean distance from the received point to the nearest point of the set $S_0^{(K-2)}$ is

$$d_E\left(r(n), v_0^{(K-1)}(n)\right) = (X(n))^2 + \left(Y(n) - \frac{\Delta_{K-1}}{\sqrt2}\right)^2 \qquad (18)$$

and the squared Euclidean distance from the received point to the nearest point of the set $S_1^{(K-2)}$ is

$$d_E\left(r(n), v_l^{(K-1)}(n)\right) = \left(X(n) - \frac{\Delta_{K-1}}{\sqrt2}\right)^2 + (Y(n))^2. \qquad (19)$$

So the difference between the squared Euclidean distances (15) becomes

$$\gamma^{(K-1)}(n) = \sqrt2\,\Delta_{K-1}\left(X(n) - Y(n)\right). \qquad (20)$$

Since the probability density function of $\gamma^{(K-1)}(n)$ does not depend on n, for $n = 1, 2, \ldots, w_l^{(K-1)}$ we drop the argument n. The random variables X and Y are independent. The conditional probability density functions of X and Y, given that $v_0^{(K-1)}$ was transmitted and that the received signal is in the first quadrant ($x \ge 0$ and $y \ge 0$), are

$$f_X^{(0)}(x) = \frac{2}{\sqrt{2\pi}}\exp\left(-\frac{x^2}{2}\right), \qquad (21)$$

$$f_Y^{(0)}(y) = \frac{1}{\sqrt{2\pi}}\left[\exp\left(-\frac12\left(y - \frac{\Delta_{K-1}}{\sqrt2}\right)^2\right) + \exp\left(-\frac12\left(y + \frac{\Delta_{K-1}}{\sqrt2}\right)^2\right)\right], \qquad (22)$$

and the conditional probability density function of $\gamma^{(K-1)}$ is a scaled convolution of the probability density functions of X and Y.

If the received signal is in the second, third, or fourth quadrant, we get formulas for the representation of $\gamma^{(K-1)}$ analogous to (20), and density functions for X and Y analogous to (21) and (22); therefore (20)-(22) are valid for all possible received signals. From (20)-(22) it follows that $\varphi^{(K-1)}(s) = \varphi_X(s)\,\varphi_Y(s)$, where

$$\varphi_X(s) = \int_0^{\infty}\exp\left(\sqrt2\,sx\Delta_{K-1}\right) f_X^{(0)}(x)\,dx = 2\exp\left((s\Delta_{K-1})^2\right) Q\left(-\sqrt2\,s\Delta_{K-1}\right) \qquad (23)$$

and

$$\varphi_Y(s) = \int_0^{\infty}\exp\left(-\sqrt2\,sy\Delta_{K-1}\right) f_Y^{(0)}(y)\,dy$$
$$= \exp\left(s\Delta_{K-1}^2(s-1)\right) Q\left(\frac{\Delta_{K-1}}{\sqrt2}(2s-1)\right) + \exp\left(s\Delta_{K-1}^2(s+1)\right) Q\left(\frac{\Delta_{K-1}}{\sqrt2}(2s+1)\right), \qquad (24)$$

and Theorem 1 follows from (17), (23), and (24).
APPENDIX C. PROOF OF THEOREM 2

Here we consider a level where the QAM signal set has infinitely many signal points, Figure 5. To study the probability density function of the statistic $\gamma^{(k)}(n)$, we introduce the system of coordinates (x, y), Figure 5. The square regions defined in this figure are regions in which the received point has the same nearest point from each set. For example, in the region $\{(x, y) : 0 < x < \Delta_k/\sqrt2,\ 0 < y < \Delta_k/\sqrt2\}$, point A is the nearest point from $S_0^{(k-1)}$ and point B is the nearest point from $S_1^{(k-1)}$. The noise components in this system of coordinates are independent Gaussian random variables with zero mean and variance equal to one.

Let us first study the region $\{(x, y) : 0 < x < \Delta_k/\sqrt2,\ 0 < y < \Delta_k/\sqrt2\}$. We suppose that the received point is in this region and that $v_0^{(k)}$ was transmitted. Then the squared Euclidean distance from the received point to the nearest reference point is

$$d_E\left(r(n), v_0^{(k)}(n)\right) = (X(n))^2 + (Y(n))^2 \qquad (25)$$

and the squared Euclidean distance from the received point to the nearest opposite point is

$$d_E\left(r(n), v_l^{(k)}(n)\right) = \left(X(n) - \frac{\Delta_k}{\sqrt2}\right)^2 + \left(Y(n) - \frac{\Delta_k}{\sqrt2}\right)^2. \qquad (26)$$

So the difference between the squared Euclidean distances (15) becomes

$$\gamma^{(k)}(n) = \sqrt2\,\Delta_k\left(X(n) + Y(n)\right) - \Delta_k^2. \qquad (27)$$

As in Appendix B, we drop the argument n, since the probability density function of $\gamma^{(k)}(n)$ does not depend on n. The random variables X and Y are independent. The conditional probability density functions of X and Y, given that $v_0^{(k)}$ was transmitted and that the received signal is in the region $\{(x, y) : 0 < x < \Delta_k/\sqrt2,\ 0 < y < \Delta_k/\sqrt2\}$, are (see Figure 8)

$$f_X^{(0)}(x) = f_Y^{(0)}(x) = \lim_{m\to\infty}\frac{\sum_{i=-m/2}^{m/2}\frac{1}{\sqrt{2\pi}}\exp\left(-\frac12\left(x + \sqrt2\,i\Delta_k\right)^2\right)}{\int_0^{\Delta_k/\sqrt2}\sum_{i=-m/2}^{m/2}\frac{1}{\sqrt{2\pi}}\exp\left(-\frac12\left(u + \sqrt2\,i\Delta_k\right)^2\right)du} = \frac{2}{\sqrt{2\pi}}\sum_{i=-\infty}^{\infty}\exp\left(-\frac12\left(x + \sqrt2\,i\Delta_k\right)^2\right), \qquad (28)$$

$0 \le x \le \Delta_k/\sqrt2$, and the conditional probability density function of $\gamma^{(k)}$ is a shifted version of the scaled convolution of the probability density functions of X and Y.

Obviously, for each of the square regions defined in Figure 5 we can introduce a system of coordinates such that the nearest reference point has coordinates (0, 0) and the nearest opposite point has coordinates $(\Delta_k/\sqrt2, \Delta_k/\sqrt2)$. The difference between the squared Euclidean distances then satisfies (27), where X and Y are the coordinates of the received point in this system of coordinates. Therefore (28) defines the conditional probability density functions of X and Y given that $v_0^{(k)}$ was transmitted, independently of which square region the received point is in.

We have that $\min_{s\ge0}\varphi^{(k)}(s) = \min_{s\ge0}\varphi^{(k)}\left(s/(\sqrt2\,\Delta_k)\right)$, and from (17), (27), and (28) it follows that $\varphi^{(k)}\left(s/(\sqrt2\,\Delta_k)\right) = \exp\left(-s\Delta_k/\sqrt2\right)\varphi_X(s)\,\varphi_Y(s)$, where

$$\varphi_X(s) = \varphi_Y(s) = \int_0^{\Delta_k/\sqrt2}\exp(sx)\,f_X^{(0)}(x)\,dx$$
$$= 2\exp\left(\frac{s^2}{2}\right)\sum_{i=-\infty}^{\infty}\exp\left(-\sqrt2\,si\Delta_k\right)\left[Q\left(\sqrt2\,i\Delta_k - s\right) - Q\left(\sqrt2\,i\Delta_k - s + \frac{\Delta_k}{\sqrt2}\right)\right] = g^{(k)}(s), \qquad (29)$$

which proves the first part of Theorem 2.

Now let us prove the second part of the theorem. We have from (28) that

$$f_X^{(0)}(x) = \frac{1}{\sqrt{2\pi}}\sum_{i=-\infty}^{\infty}\left[\exp\left(-\frac12\left(x + \sqrt2\,i\Delta_k\right)^2\right) + \exp\left(-\frac12\left(x - \sqrt2\,i\Delta_k\right)^2\right)\right],$$

which can be expressed as

$$f_X^{(0)}(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\theta(x, \tau)\exp\left(-\frac{\tau^2}{2}\right)d\tau, \qquad (30)$$

where $\theta(x, \tau) = \sum_{j=-\infty}^{\infty}\left[\delta\left(x + \tau + \sqrt2\,j\Delta_k\right) + \delta\left(x - \tau + \sqrt2\,j\Delta_k\right)\right]$ and $\delta(\cdot)$ is the Dirac delta function. The Fourier series expansion of $\theta(x, \tau)$ in $\tau$ is

$$\theta(x, \tau) = \frac{\sqrt2}{\Delta_k} + \frac{2\sqrt2}{\Delta_k}\sum_{i=1}^{\infty}\cos\left(\frac{\sqrt2\,\pi ix}{\Delta_k}\right)\cos\left(\frac{\sqrt2\,\pi i\tau}{\Delta_k}\right). \qquad (31)$$

Now, (30) and (31), together with the equality $\int_{-\infty}^{\infty}\exp\left(-p^2x^2 - qx\right)dx = \sqrt{\pi}\exp\left(q^2/4p^2\right)/p$, $p > 0$ [2], give

$$f_X^{(0)}(x) = \frac{\sqrt2}{\Delta_k} + \frac{2\sqrt2}{\Delta_k}\sum_{i=1}^{\infty}\cos\left(\frac{\sqrt2\,\pi ix}{\Delta_k}\right)\exp\left(-\left(\frac{\pi i}{\Delta_k}\right)^2\right) \qquad (32)$$

and thus

$$g^{(k)}(s) = \int_0^{\Delta_k/\sqrt2}\exp(sx)\,f_X^{(0)}(x)\,dx = \frac{\sqrt2}{\Delta_k}\left[\frac{\exp\left(s\Delta_k/\sqrt2\right) - 1}{s} + 2s\sum_{i=1}^{\infty}\frac{(-1)^i\exp\left(s\Delta_k/\sqrt2\right) - 1}{s^2 + \left(\sqrt2\,\pi i/\Delta_k\right)^2}\exp\left(-\left(\frac{\pi i}{\Delta_k}\right)^2\right)\right], \qquad (33)$$

and the proof is complete.
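The two series in (29) and (33) can be checked against each other numerically; in the sketch below (the truncation lengths are our own choice) they agree to high accuracy, with the first series converging fastest for large $\Delta_k$ and the second for small $\Delta_k$, as noted in Section V:

```python
from math import erfc, exp, sqrt, pi

SQ2 = sqrt(2.0)

def Q(x):
    return 0.5 * erfc(x / SQ2)

def g_first(s, d, terms=30):
    """First series for g^(k)(s), equation (29)."""
    total = 0.0
    for i in range(-terms, terms + 1):
        c = SQ2 * i * d
        qdiff = Q(c - s) - Q(c - s + d / SQ2)
        if qdiff > 0.0:          # skip numerically vanished terms
            total += exp(0.5 * s * s - s * c) * qdiff
    return 2.0 * total

def g_second(s, d, terms=50):
    """Second series for g^(k)(s), equation (33); requires s > 0."""
    e = exp(s * d / SQ2)
    total = (e - 1.0) / s
    for i in range(1, terms + 1):
        total += (2.0 * s * ((-1.0) ** i * e - 1.0)
                  / (s * s + (SQ2 * pi * i / d) ** 2)
                  * exp(-(pi * i / d) ** 2))
    return SQ2 / d * total

for d, s in ((0.5, 0.2), (2.0, 0.35), (7.0, 2.5)):
    print(d, g_first(s, d), g_second(s, d))   # the two columns agree
```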
References

[1] E. Biglieri, D. Divsalar, P. J. McLane, and M. K. Simon, Introduction to Trellis-Coded Modulation with Applications. Macmillan, 1991.

[2] I. S. Gradshteyn and I. M. Ryzhik, Table of Integrals, Series, and Products. Academic Press, 1980.

[3] H. Herzberg, "On the Spectrum of Distances of a Multilevel Code, Decoded by a Multistage Decoder," IEEE Trans. Inform. Theory, vol. IT-43, pp. 1736-1739, Sept. 1997.

[4] J. Huber, "Multilevel Codes: Distance Profiles and Channel Capacity," in ITG-Fachbericht 130, Oct. 1994, pp. 305-319. Conference Record.

[5] J. Huber, private communication.

[6] H. Imai and S. Hirakawa, "A New Multilevel Coding Method Using Error-Correcting Codes," IEEE Trans. Inform. Theory, vol. IT-23, pp. 371-377, May 1977.

[7] Y. Kofman, E. Zehavi, and S. Shamai, "Performance Analysis of a Multilevel Coded Modulation System," IEEE Trans. Commun., vol. COM-42, pp. 299-312, Feb./Mar./Apr. 1994.

[8] G. Ungerböck, "Channel Coding with Multilevel/Phase Signals," IEEE Trans. Inform. Theory, vol. IT-28, pp. 55-67, Jan. 1982.

[9] A. J. Viterbi and J. K. Omura, Principles of Digital Communication and Coding. McGraw-Hill, 1979.

[10] S. G. Wilson, Digital Modulation and Coding. Prentice Hall, 1996.

[11] J. M. Wozencraft and I. M. Jacobs, Principles of Communication Engineering. Wiley, 1965.
LIST OF FIGURE CAPTIONS

Figure 1: The system model showing the transmitter, the additive white Gaussian noise (AWGN) channel, and the multistage receiver.

Figure 2: Set partitioning when K = 4.

Figure 3: The signal constellation on the last but one level.

Figure 4: Two examples of the error bound (5) for the last but one level in a QAM system. The figure shows the bound when the lower bound in (6), $D = Z_2$ (dashed), and the "nearest neighbor" upper bound for 4-QAM, $D = 2Z_2$ (dash-dotted), are used, and when $D = Z_4$ (solid) is applied. (a) $T(D) = D^5/(1 - 2D)$; (b) $T(D) = D^{10}\left(D^8 - 4D^6 + 5D^4 - 4D^2 + 3\right)/\left(-D^{12} + 4D^{10} - 4D^8 + 2D^6 - 2D^4 - 2D^2 + 1\right)$.

Figure 5: The signal constellation for M-QAM with $M = \infty$.

Figure 6: Two examples of the error bound (5) for a level of an M-QAM system with $M = \infty$. The figure shows the bound when the lower and upper bounds in (6), $D = Z_2$ (dashed) and $D = 4Z_2$ (dash-dotted), are used, and when $D = Z_\infty$ (solid) is applied. The generating functions are the same as in Figure 4.

Figure 7: Capacity (solid line) and computational cutoff rate (dashed line) for QAM with an infinite number of signal points, using the suboptimal metric. The capacities for 16-QAM (stars) and a "large" QAM constellation (circles), both using the optimal metric [4], are also shown.

Figure 8: Illustration to equation (28).
LIST OF TABLE CAPTIONS

Table I: Numerical values of Z.
[Figure 1 appears here: the transmitter (partitioner and component encoders $C_1, \ldots, C_K$ feeding the $2^K$-QAM mapper), the AWGN channel, and the suboptimal multistage decoder.]

[Figure 2 appears here: set partitioning of 16-QAM for K = 4, with intra-set distances $\delta_2 = \sqrt2\,\delta_1$ (level $S^{(1)}$), $\delta_3 = 2\delta_1$ (level $S^{(2)}$), and $\delta_4 = 2\sqrt2\,\delta_1$ (level $S^{(3)}$).]
[Figure 3 appears here: the four-point constellation on the last but one level in the rotated coordinate system (x, y); filled points are reference points (in $S_0^{(K-2)}$), open points are opposite points (in $S_1^{(K-2)}$), and the distances $d_E(r(n), v_0^{(K-1)}(n))$ and $d_E(r(n), v_l^{(K-1)}(n))$ are measured from the received point to the nearest point of each set.]
Table I: Numerical values of Z.

 Delta |  Z_2   |  Z_4   | Z_inf  | Z_4/Z_2 | Z_inf/Z_2
  0.5  | 0.9692 | 0.9984 | 1.0000 |  1.03   |  1.03
  1.0  | 0.8825 | 0.9793 | 1.0000 |  1.11   |  1.13
  1.5  | 0.7548 | 0.9198 | 0.9997 |  1.22   |  1.32
  2.0  | 0.6065 | 0.8124 | 0.9858 |  1.34   |  1.63
  2.5  | 0.4578 | 0.6686 | 0.9152 |  1.46   |  2.00
  3.0  | 0.3247 | 0.5108 | 0.7740 |  1.57   |  2.38
  3.5  | 0.2163 | 0.3618 | 0.5932 |  1.67   |  2.74
  4.0  | 0.1353 | 0.2379 | 0.4135 |  1.76   |  3.06
  4.5  | 0.0796 | 0.1453 | 0.2636 |  1.82   |  3.31
  5.0  | 0.0439 | 0.0825 | 0.1545 |  1.88   |  3.52
  5.5  | 0.0228 | 0.0437 | 0.0837 |  1.92   |  3.67
  6.0  | 0.0111 | 0.0216 | 0.0420 |  1.94   |  3.78
  6.5  | 0.0051 | 0.0100 | 0.0196 |  1.96   |  3.86
  7.0  | 0.0022 | 0.0043 | 0.0086 |  1.98   |  3.91
[Figure 4 appears here: error-bound curves for the two example codes, panels (a) and (b).]

[Figure 5 appears here: the infinite QAM lattice in the rotated coordinate system (x, y); point A is the nearest reference point (in $S_0^{(k-1)}$) and point B the nearest opposite point (in $S_1^{(k-1)}$) for the received point.]

[Figure 6 appears here: error-bound curves, panels (a) and (b).]

[Figure 7 appears here: capacity and computational cutoff rate, in bits per level and channel use.]

[Figure 8 appears here: illustration to equation (28).]