Final - Volume 3 No 2

  • December 2019
  • PDF

This document was uploaded by user and they confirmed that they have the permission to share it. If you are author or own the copyright of this book, please report to us by using this DMCA report form. Report DMCA


Overview

Download & View Final - Volume 3 No 2 as PDF for free.

More details

  • Words: 41,799
  • Pages: 89
ADAPTIVE WIENER FILTERING APPROACH FOR SPEECH ENHANCEMENT M. A. Abd El-Fattah*, M. I. Dessouky , S. M. Diab and F. E. Abd El-samie # Department of Electronics and Electrical communications, Faculty of Electronic Engineering Menoufia University, Menouf, Egypt E-mails: * [email protected] , # [email protected]

ABSTRACT This paper proposes the application of the Wiener filter in an adaptive manner in speech enhancement. The proposed adaptive Wiener filter depends on the adaptation of the filter transfer function from sample to sample based on the speech signal statistics(mean and variance). The adaptive Wiener filter is implemented in time domain rather than in frequency domain to accommodate for the varying nature of the speech signal. The proposed method is compared to the traditional Wiener filter and spectral subtraction methods and the results reveal its superiority. Keywords: Speech Enhancement, Spectral Subtraction, Adaptive Wiener Filter

1

INTRODUCTION

Speech enhancement is one of the most important topics in speech signal processing. Several techniques have been proposed for this purpose like the spectral subtraction approach, the signal subspace approach, adaptive noise canceling and the iterative Wiener filter[1-5] . The performances of these techniques depend on quality and intelligibility of the processed speech signal. The improvement of the speech signal-tonoise ratio (SNR) is the target of most techniques. Spectral subtraction is the earliest method for enhancing speech degraded by additive noise[1]. This technique estimates the spectrum of the clean (noise-free) signal by the subtraction of the estimated noise magnitude spectrum from the noisy signal magnitude spectrum while keeping the phase spectrum of the noisy signal. The drawback of this technique is the residual noise. Another technique is a signal subspace approach [3]. It is used for enhancing a speech signal degraded by uncorrelated additive noise or colored noise [6,7]. The idea of this algorithm is based on the fact that the vector space of the noisy signal can be decomposed into a signal plus noise subspace and an orthogonal noise subspace. Processing is performed on the vectors in the signal plus noise subspace only, while the noise subspace UbiCC Journal - Volume 3

is removed first. Decomposition of the vector space of the noisy signal is performed by applying an eigenvalue or singular value decomposition or by applying the Karhunen-Loeve transform (KLT)[8]. Mi. et. al. have proposed the signal / noise KLT based approach for colored noise removal[9]. The idea of this approach is that noisy speech frames are classified into speech-dominated frames and noise-dominated frames. In the speech-dominated frames, the signal KLT matrix is used and in the noise-dominated frames, the noise KLT matrix is used. In this paper, we present a new technique to improve the signal-to-noise ratio in the enhanced speech signal by using an adaptive implementation of the Wiener filter. This implementation is performed in time domain to accommodate for the varying nature of the signal. The paper is organized as follows: in section II, a review of the spectral subtraction technique is presented. In section III, the traditional Wiener filter in frequency domain is revisited. Section IV, proposes the adaptive Wiener filtering approach for speech enhancement. In section V, a comparative study between the proposed adaptive Wiener filter, the Wiener filter in frequency domain and the spectral subtraction approach is presented.

1

2

SPECTRAL SUBTRACTION

Spectral subtraction can be categorized as a non-parametric approach, which simply needs an estimate of the noise spectrum. It is assume that there is an estimate of the noise spectrum that is typically estimated during periods of speaker silence. Let x(n) be a noisy speech signal :

x ( n) = s ( n) + v ( n)

(1)

where s(n) is the clean (the noise-free) signal, and v(n) is the white gaussian noise. Assume that the noise and the clean signals are uncorrelated. By applying the spectral subtraction approach that estimates the short term magnitude spectrum of the noise-free signal S (ω ) by subtraction of the estimated noise magnitude spectrum Vˆ (ω ) from

A noise-free signal estimate can then be obtained with the inverse Fourier transform. This noise reduction method is a specific case of the general technique given by Weiss, et al. and extended by Berouti , et al.[2,12]. The spectral subtraction approach can be viewed as a filtering operation where high SNR regions of the measured spectrum are attenuated less than low SNR regions. This formulation can be given in terms of the SNR defined as: 2 X (ω ) SNR = Pˆ v (ω )

Thus, equation (3) can be rewritten as: 2 2 Sˆ (ω ) = X (ω ) − Pˆ v (ω )

the noisy signal magnitude spectrum X (ω ) . It is sufficient to use the noisy signal phase spectrum as an estimate of the clean speech phase spectrum,[10]: Sˆ(ω) = ( X (ω) − Nˆ (ω) ) exp(j∠X (ω))

(2)

The estimated time-domain speech signal is obtained as the inverse Fourier transform of Sˆ (ω ) . Another way to recover a clean signal s(n) from the noisy signal x(n) using the spectral subtraction approach is performed by assuming that there is an the estimate of the power spectrum of the noise Pv (ω ) , that is obtained by averaging over multiple frames of a known noise segment. An estimate of the clean signal short-time squared magnitude spectrum can be obtained as follow [8]:

⎧X(ω) 2 −Pˆv(ω), if X(ω) 2 −Pˆv(ω) ≥0 2 ⎪ Sˆ(ω) = ⎨ ⎪ 0, ⎩

otherwise

(3)

2⎡ ≈ X (ω ) 1 +

⎢⎣

j∠X (ω ) Sˆ (ω ) = Sˆ (ω ) e

UbiCC Journal - Volume 3

(4)

⎤ SNR ⎥⎦ 1

−1

(6)

An important property of noise suppression using spectral subtraction is that the attenuation characteristics change with the length of the analysis window. A common problem for using spectral subtraction is the musicality that results from the rapid coming and going of waves over successive frames [13]. 3

WIENER FILTER IN FREQUNCY DOMAIN

The Wiener filter is a popular technique that has been used in many signal enhancement methods. The basic principle of the Wiener filter is to obtain a clean signal from that corrupted by additive noise. It is required estimate an optimal filter for the noisy input speech by minimizing the Mean Square Error (MSE) between the desired signal s(n) and the estimated signal sˆ( n) . The frequency domain solution to this optimization problem is given by[13]: H(ω) =

It is possible combine this magnitude spectrum estimate with the measured phase and then get the Short Time Fourier Transform (STFT) estimate as follows:

(5)

Ps(ω) Ps(ω) + Pv(ω)

(7)

where Ps (ω ) and Pv (ω ) are the power spectral densities of the clean and the noise signals, respectively. This formula can be derived considering the signal s and the noise signal v as

2

Pv (ω ) =σv2

uncorrelated and stationary signals. The signal-tonoise ratio is defined by[13]: SNR =

Ps (ω ) Pˆ v (ω )

(8)

This definition can be incorporated to the Wiener filter equation as follows: 1 ⎤ ⎡ H ( ω ) = ⎢1 + SNR ⎥⎦ ⎣

−1

(9)

The drawback of the Wiener filter is the fixed frequency response at all frequencies and the requirement to estimate the power spectral density of the clean signal and noise prior to filtering. 4

THE PROPOSED ADAPTIVE WIENER FILTER

This section presents and adaptive implementation of the Wiener filter which benefits from the varying local statistics of the speech signal. A block diagram of the proposed approach is illustrated in Fig. (1). In this approach, the estimated speech signal mean mx and variance

σx

2

are exploited. A priori knowledge

SpaceDegraded speech variant x(n) h(n)

(10)

Consider a small segment of the speech signal in which the signal x(n) is assumed to be stationary, The signal x(n) can be modeled by:

x(n) = mx + σxw(n) (11) where mx and σx are the local mean and standard deviation of x(n). w(n) is a unit variance noise. Within this small segment of speech, the Wiener filter transfer function can be approximated by: H (ω ) =

Ps (ω ) Ps (ω ) + Pv (ω )

=

σs

σs + σv

h( n) =

σs

2

σs + σv 2

2

2

δ ( n)

(13)

From Eq.(13), the enhanced speech sˆ( n) within this local segment can be expressed as:

Enhanced speech signal sˆ( n)

= mx +

σs

σs

2

σs + σv 2

2

δ ( n)

2

σs + σv 2

( x(n) − mx )

2

If it is assumed that mx and each sample, we can say:

A priori knowledge

2

(12) From Eq.(12), because H (ω ) is constant over the small segment of speech, the impulse response of the Wiener filter can be obtained by:

sˆ(n) = mx + ( x(n) - mx ) ∗

Measure of Local speech statistics

2

σs

(14) are updated at

σs (n) ( x(n) − mx(n)) σs (n) + σv 2

sˆ(n) = mx (n) +

2

2

(15)

Figure 1: Typical adaptive speech enhancement system for additive noise reduction

It is assumed that the additive noise v(n) is of zero mean and has a white nature with variance 2 of σv .Thus, the power spectrum Pv (ω ) can be approximated by: UbiCC Journal - Volume 3

In

Eq.(15),

the

local

mean

mx(n)

and

( x(n) − mx (n)) are modified separately from segment to segment and then the results are 2 2 combined. If σs is much larger than σv the output signal sˆ( n) is assumed to be primarily due to x(n) and the input signal x(n) is not attenuated. If 2 2 σs is smaller than σv , the filtering effect is performed. 3

Notice that mx is identical to ms when mv is zero. So, we can estimate mx (n) in Eq.(15) from x(n) by:

mˆ s (n) = mˆ x (n) =

1

n+M

(2 M +1)

k =n−M

∑ x(k ) (16)

where ( 2 M + 1) is the number of samples in the short segment used in the estimation. To measure the local signal statistics in the system of Figure 1, the algorithm developed 2 uses the signal variance σs . The specific method used to designing the space-variant h(n) is given by (17.b). 2 2 2 Since σx = σs + σv may be estimated from x(n) by:

In the first experiment , all the abovementioned algorithms are carried out on the Handle signal with different SNRs and the output PSNR results are shown in Fig. (2). The same experiment is repeated for the Laughter and Gong signals and the results are shown in Figs.(3) and (4), respectively. From these figures, it is clear that the proposed adaptive Wiener filter approach has the best performance for different SNRs. The adaptive Wiener filter approach gives about 3-5 dB improvement at different values of SNR. The nonlinearity between input SNR and output PSNR is due to the adaptive nature of the filter.

80 70

2 2 2 2 ⎧σˆx (n) − σˆv , if σˆx (n) > σˆv σˆs (n) = ⎨ otherwise ⎩0,

(17.a) Where

1 σˆx (n) = (2 M + 1) 2

n+ M

∑ ( x(k ) − mˆ (n)) x

O u tp u t P S N R (d B )

2

60 50 40 30

2

20

k =n−M

Spectral Subtraction Wiener Filter Adaptive Wiener Filter

(17.b) By this proposed method, we guarantee that the filter transfer function is adapted from sample to sample based on the speech signal statistics. 5

EXPERIMENTAL RESULTS

For evaluation purposes, we use different speech signals like the handel, laughter and gong signals. White Gaussian noise is added to each speech signal with different SNRs. The different speech enhancement algorithms such as the spectral subtraction method, the Weiner filter in frequency domain and the proposed adaptive Wiener filter are carried out on the noisy speech signals. The peak signal to noise ratio (PSNR) results for each enhancement algorithm are compared.

UbiCC Journal - Volume 3

10 0 -10

-5

0

5

10 15 20 Input SNR (dB)

25

30

35

Figure 2: PSNR results for white noise case at-10 dB to +35 dB SNR levels for Handle signal

4

reveal that the best performance is that of the proposed adaptive Wiener filter.

O u tp u t P S N R (d B )

40

A m p lit u d e

30 20

1 0 -1

-5

0

5

10 15 20 Input SNR (dB)

25

30

2000 4000 6000 8000 (a)

0 -1

0

2000 4000 6000 8000 (c) A m p lit u d e

0 -10

0

1

Spectral Subtraction Wiener Filter Adaptive Wiener Filter

10

A m p lit u d e

A m p lit u d e

50

A m p lit u d e

60

35

1 0 -1

0

2000 4000 6000 8000 (b)

0

2000 4000 6000 8000 (d)

1 0 -1

1 0 -1

0

2000 4000 6000 8000 (e) Time(msec)

Figure 3: PSNR results for white noise case at -10 dB to +35 dB SNR levels for Laughter signal

0

-20

O u tp u t P S N R (d B )

40 30

0 1000 2000 3000 4000 (a)

Spectral Subtraction Wiener Filter Adaptive Wiener Filter -5

0

5

10 15 20 Input SNR (dB)

25

30

-40 0 1000 2000 3000 4000 (c)

35

Figure 4: PSNR results for white noise case at -10 dB to +35 dB SNR levels for Gong signal

The results of the different enhancement algorithms for the handle signal with SNRs of 5, 10,15 and 20 dB in the both time and frequency domain are given in Figs. (5) to (12). These results

UbiCC Journal - Volume 3

Amplitude (dB)

10

-40 0 1000 2000 3000 4000 (b) 0

-20

-20

20

0

-20

-40

50

Amplitude (dB)

0

Amplitude (dB)

60

Amplitude (dB)

70

Amplitude (dB)

SNR = +5dB (a) original sig. (b) noisy sig. (c) spectral subtraction. (d) Wiener filtering. (e) adaptive Wiener filtering.

80

0 -10

Figure 5: Time domain results of the Handel sig. at

-40 0 1000 2000 3000 4000 (d)

0

-20 -40 0 1000 2000 3000 4000 (e) Freq.(Hz)

Figure 6:The spectrum of the Handel sig. in Fig.(5) (a) original sig. (b) noisy sig. (c) spectral subtraction. (d) Wiener filtering. (e) adaptive Wiener filtering.

5

2000 4000 6000 8000 (a)

1 0

2000 4000 6000 8000 (c)

-1

0 0

-1

0

0 1000 2000 3000 4000 (a) 0 -20

-40 0 1000 2000 3000 4000 (c)

-40 0 1000 2000 3000 4000 (b) 0 -20 -40 0 1000 2000 3000 4000 (d)

0 -20 -40 0 1000 2000 3000 4000 (e)Freq. (Hz)

Figure 8: The spectrum of the Handel sig. in Fig.(7) (a) original sig. (b) noisy sig. (c) spectral subtraction. (d) Wiener filtering. (e) adaptive Wiener filtering.

UbiCC Journal - Volume 3

0

2000 4000 6000 8000 (b)

0

2000 4000 6000 8000 (d)

1 0 -1

2000 4000 6000 8000 (e) Time(msec)

Figure 9: Time domain results of the Handel sig. at SNR = 15 dB (a) original sig. (b) noisy sig. (c) spectral subtraction. (d) Wiener filtering. (e) adaptive Wiener filtering. Amplitude (dB) Amplitude (dB)

-40

-20

0

0 -1

Amplitude (dB) Amplitude (dB)

-20

0

-1

1

2000 4000 6000 8000 (e) Time (msec)

Amplitude (dB) Amplitude (dB)

0

0

2000 4000 6000 8000 (c)

0 0

1

2000 4000 6000 8000 (a)

0

2000 4000 6000 8000 (d)

Figure 7: Time domain results of the Handel sig. at SNR = 10 dB (a) original sig. (b) noisy sig. (c) spectral subtraction. (d) Wiener filtering. (e) adaptive Wiener filtering.

Amplitude (dB)

0

1

1

-1

Amplitude (dB) Amplitude (dB)

A m p lit u d e

A m p lit u d e

2000 4000 6000 8000 (b)

1

-1

0

A m p lit u d e

0

A m p lit u d e

-1

0

1

0 -20 -40

0 1000 2000 3000 4000 (a) 0 -20

-40 0 1000 2000 3000 4000 (c) Amplitude (dB)

0

-1

A m p lit u d e

A m p lit u d e

-1

0

A m p lit u d e

0

1

A m p lit u d e

A m p lit u d e

A m p lit u d e

1

0 -20 -40 0 1000 2000 3000 4000 (b) 0 -20 -40 0 1000 2000 3000 4000 (d)

0 -20 -40 0

1000 2000 3000 4000 (e)Freq. (Hz)

Figure 10: The spectrum of the Handel sig. in Fig.(9) (a) original sig. (b) noisy sig. (c) spectral subtraction. (d) Wiener filtering. (e) adaptive Wiener filtering.

6

A m p lit u d e

A m p lit u d e

-1

0

0

REFERENCES

-1

0

0

2000 4000 6000 8000 (b)

0

2000 4000 6000 8000 (d)

1 0

-1

2000 4000 6000 8000 (c) A m p lit u d e

0

2000 4000 6000 8000 (e) Time(msec)

0

-20

-40

Amplitude (dB)

Amplitude (dB)

0

-20 0 1000 2000 3000 4000 (a) 0

-40 0 1000 2000 3000 4000 (b)

Amplitude (dB)

Amplitude (dB)

Figure 11: Time domain results of the Handel sig. at SNR = 20 dB (a) original sig. (b) noisy sig. (c) spectral subtraction. (d) Wiener filtering. (e) adaptive Wiener filtering.

0

-20

-20

Amplitude (dB)

-40 0 1000 2000 3000 4000 (c)

-40 0 1000 2000 3000 4000 (d)

0

-20 -40 0

CONCLUSION

1

0

0

-1

2000 4000 6000 8000 (a)

1

-1

1

An adaptive Wiener filter approach for speech enhancement is proposed in this papaper. This approach depends on the adaptation of the filter transfer function from sample to sample based on the speech signal statistics(mean and variance). This results indicates that the proposed approach provides the best SNR improvement among the spectral subtraction approach and the traditional Wiener filter approach in frequency domain. The results also indicate that the proposed approach can treat musical noise better than the spectral subtraction approach and it can avoid the drawbacks of Wiener filter in frequency domain .

0

A m p lit u d e

A m p lit u d e

6 1

1000 2000 3000 4000 (e)Freq. (Hz)

Figure 12: The spectrum of the Handel sig. in Fig.(11) (a) original sig. (b) noisy sig. (c) spectral subtraction. (d) Wiener filtering. (e) adaptive Wiener filtering.

UbiCC Journal - Volume 3

[1] S. F. Boll: Suppression of acoustic noise in speech using spectral subtraction, IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-27,. pp. 113-120 (1979). [2] M. Berouti, R. Schwartz, and J. Makhoul: Enhancement of speech corrupted by acoustic noise, Proc. IEEE Int. Conf. Acoust., Speech Signal Processing, pp. 208-211 (1979). [3] Y. Ephriam and H. L. Van Trees: A signal subspace approach for speech enhancement, in Proc. International Conference on Acoustic, Speech and Signal Processing, vol. II, Detroit, MI, U.S.A., pp. 355-358, May (1993). [4] Simon Haykin: Adaptive Filter Theory, Prentice-Hall, ISBN 0-13-322760-X, (1996). [5] J. S. Lim and A. V. Oppenheim.: All-pole Modelling of Degraded Speech, IEEE Trans. Acoust., Speech, Signal Processing, ASSP-26, June (1978). [6] Y. Ephraim and H. L. Van Trees, A spectrallybased signal subspace approach for speech enhancement, in IEEE ICASSP, pp. 804-807 (1995). [7] Y. Hu and P. Loizou: A subspace approach for enhancing speech corrupted by colored noise, in Proc. International Conference on Acoustics, Speech and Signal Processing, vol. I, Orlando, FL, U.S.A., pp. 573-576, May (2002). [8] A. Rezayee and S. Gazor: An adaptive KLT approach for speech enhancement, IEEE Trans. Speech Audio Processing, vol. 9, pp. 87-95 Feb. (2001). [9] U. Mittal and N. Phamdo: Signal/noise KLT based approach for enhancing speech degraded by colored noise, IEEE Trans. Speech Audio Processing, vol. 8, NO. 2, pp. 159-167,(2000). [10] John R. Deller, John G. Proakis, and John H. L. Hansen. Discrete- Time Processing of Speech

7

Signals. Prentice-Hall, ISBN 0-02-328301-7 (1997). [11] S. F. Boll: Suppression of Acoustic Noise in Speech Using Spectral Sub- traction. IEEE Trans. Acoustics, Speech, and Signal Processing. vol. ASSP-29. no. 2, pp. 113-120, April (1979). [12] M. R. Weiss, E. Aschkenasy, and T. W. Parsons: Processing Speech Signal to Attenuate Interference, in Proc. IEEE Symp. Speech Recognition, pp. 292-293, April (1974). [13] J. S. Lim and A. V. Oppenheim: Enhancement and band width compression of Noisy speech, Proc. of the IEEE, vol. 67, No..12, pp. 1586-1604, Dec. (1979).

UbiCC Journal - Volume 3

8

FREQUENCY SELECTIVITY PARAMETERS ON MULTI-CARRIER WIDEBAND WIRELESS SIGNALS Víctor M Hinostroza, José Mireles and Humberto Ochoa Institute of Engineering and Technology, University of Ciudad Juárez Valle del tigris # 3247,Ciudad Juárez Chihuahua México C. P. 32306 [email protected], [email protected], [email protected]

ABSTRACT. This work is a study of the effects of frequency selectivity on multi-carrier wideband signals in three different environments; indoors, outdoor to indoor and outdoors. The investigation was made using measurements carried out with a sounder with a 300 MHz bandwidth. The main part of this work is related to evaluate the contribution of several parameters; frequency selective fading, coherence bandwidth and delay spread on the frequency selectivity of the channel. A description of the sounder parameters and the sounded environments are given. The 300 MHz bandwidth is divided in segments of 60 kHz to perform the evaluation of frequency selective fading. Sub channels of 20 MHz for OFDM systems and 5 MHz for WCDMA were evaluated. Figures are provided for a number of bands, parameters and locations in the three environments. It is also shown the variation of the signal level due to frequency selective fading. The practical assumptions about the coherence bandwidth and delay spread are reviewed and a comparison is made with actual measurements. Statistical analysis was performed over some of the results. Keywords. modulation .

Coherence bandwidth, frequency correlation, frequency selective fading and multi-carrier

I. INTRODUCTION. To simulate and evaluate the performance of a wireless mobile system a good channel model is needed. Mobile communication systems are using larger bandwidths and higher frequencies and these characteristics impose new challenges on channel estimation. The channel models that have been developed for the mobile systems in use may not be applicable anymore. To validate that the old models can be used for future systems or to design new models, it is necessary to answer the question about how the same parameters performs at higher bandwidths? Also, we have to be able to measure and validate some parameters and compare them to well known practical assumptions. Measurements for analysis of the fading statistics at common frequencies have been performed before, but they have been performed at small bandwidths, it is necessary to update the models with higher bandwidths. As the data rate (the bandwidth) increases the communication limitations come from the Inter Symbol Interference (ISI) due to the dispersive

UbiCC Journal - Volume 3

characteristics of the wireless communications channel. The dispersive channel characteristics arise from the different propagation paths, i.e. multipath, between the receiver and the transmitter. This dispersion could be measured, if we could measure the channel impulse response (CIR). As a general rule the effects of ISI on the transmission errors is negligible if the delay spread is significantly shorter than the duration of the transmitted symbol. Due to the expected increase in demand of higher data rates, wideband multicarrier systems such as; OFDM and WCDMA are expected to be technologies of choice [1], [12] and [14]. This is because these two technologies can provide both; high data rates and an acceptable level of quality of service. However, these systems need first to address better the problem regarding channel prediction or estimation, because this condition is the main boundary for higher data rates. The study of correlation of the mobile radio channel in frequency and time domains has helped to understand the problem of channel estimation. One of this work objectives is to evaluate frequency selective fading (FSF) in several environments. This work begins with the results of

9

measurements made with a sounder that uses the chirp technique for sounding.

E{H (t1 ; f1 ) * H (t2 ; f 2 )} = ∞ ∞

∫ ∫ E{h(t ;τ ) * h(t ;τ )}e 1

Multipath fading channels are usually classified into flat fading and frequency selective fading according to their coherence bandwidth relative to the one of the transmitted signal. Coherence bandwidth is defined as the range of frequencies over which two frequency components remain in a strong amplitude correlation. Physically, it defines the range of frequencies over which the channel can be considered “flat”. The analytic issue of coherence bandwidth was first studied by Jakes [1] where by assuming homogeneous scattering, his work revealed that the coherence bandwidth of a wireless channel is inversely proportional to its root-mean-square (rms) delay spread. The same issue was subsequently studied by various authors [4], [8], [9], [10]. Since many practical channel environments can significantly deviate from the homogeneous assumption, various measurements were conducted to determine multipath delay profiles and coherence bandwidths [19], [20], [21], [22], aiming to obtain a more general formula for coherence bandwidth. In this work the variations of this formula are reviewed and compared with actual results and a comparison is provided. The rest of this document is structured as follow; in part II the theoretical foundations of the channel impulse response frequency selective fading and coherence bandwidth are reviewed. Also in this part, the characteristics of the three environments sounded are described. In part III, the frequency selective fading evaluation and analysis are presented. Plots of the dependency of fading deep and frequency separation of two specific points in the response are studied. At part IV, data about the relationship between delay spread and coherence bandwidth are provided. At the end in part V, conclusions and future work are mentioned. II. MATHEMATICAL BACKGROUND 2.1 The wideband channel model. The radio propagation channel is normally represented in terms of a time-varying linear filter, with complex low-pass impulse response, h(t, τ). Its time-varying low-pass transfer function is [4] [6] [8] [10]: ∞

H (t , f ) =

− j 2πf ∫ h(t;τ )e τ dτ

−∞

(1) Where τ represents delay, using (1) the frequency correlation function for the channel can be written as:

UbiCC Journal - Volume 3

1

2

2

− 2πf1τ 1 + j 2πf 2τ 2

e

dτ 1dτ 2

− ∞− ∞

(2) By considering the channel to have uncorrelated scattering (US) and to be wide sense stationary (WSS), the subscript for τ is eliminated and f1 and f2 can be replaced by f + ∆f and t1 and t2 replaced by t + ∆t, then: ∞

∫R

R H (∆t ; ∆f ) =

h

(∆t ;τ )e − j 2π∆fτ dτ

−∞

(3) In (3) RH and Rh represents the correlation of random variations in the channels transfer function and its impulse response respectively. If there are US, then ∆t is 0 then:

{

Rh (0;τ ) = E h (0;τ )

2

}= E {h(τ ) } 2

(4) substituting into (3) gives:

∫ E {h(τ )



R H ( ∆f ) =

2

}e

− j 2π∆fτ



−∞

(5)

{

E h (τ )

2

}

is the average Power Delay where Profile PDP of the channel. So, under the above conditions, RH is the Fourier transform of the average PDP. 2.2 Coherence bandwidth. The multipath effect of the channel, the arrival of different signals in different time delays causes the statistical properties of two signals of different frequencies to become independent if the frequency separation is large enough. The maximum frequency separation for which the signals are still strongly correlated is called coherence bandwidth (Bc). Besides to contribute to the understanding of the channel, the coherence bandwidth is useful in evaluating the performance and limitations of different modulations and diversity models. The coherence bandwidth of a fading channel is probed by sending two sinusoids, separated in frequency by ∆f = f1- f2 Hz, through the channel. The coherence bandwidth is defined as ∆f, over which the cross correlation coefficient between r1 and r2 is greater than a preset threshold, say, η0= 0.9. Namely:

10

Cov( r1, r 2) = η0 var(r1) var(r 2)

C r1,r 2 = (6)

Then, using (2) ∞



0

0

R( s, τ ) = r1r 2 = ∫

∫ r1r 2 p(r1, r 2)dr dr 1

2

(7) Where p(r1,r2) is 2π 2π

p( r1, r 2) =

∫ ∫ p(r1, r 2, θ , θ 1

2

)dθ1dθ 2

(8)

It is possible to see in this expression that the correlation decreases with frequency separation. This formula has been substituted by several practical expressions some of them are the following [4], [8], [9], [10].

BC =0.9 =

⎡ r + r ⎤ ⎛ r1r 2 λ ⎞ r1r 2 I ⎜ exp⎢− ⎟ 2 2 ⎥ 0⎜ 2 ⎟ µ (1 − λ ) ⎣ 2 µ (1 − λ ) ⎦ ⎝ µ 1 − λ ⎠ 2 1

2 2

2

Where I0(x) is the modified Bessel function of zero order. Then, substituting (8) in (7) and integrating

π

1 1 R( s,τ ) = b0 F ( − ,− ;1; λ2 ) 2 2 2 this may also be expressed as

π

⎛ λ2 ⎞ R( s, τ ) = b0 ⎜⎜1 + ⎟⎟ 2 ⎝ 4⎠

[r

2 1

(10)

R ( s, τ ) − r1 r 2 − r1

2

][ r

2 2

⎛2 λ ⎞ ⎟ R( s,τ ) = b0 (1 + λ ) E ⎜⎜ ⎟ ⎝1+ λ ⎠

− r2

BC =0.9

1

(13) BC =0.5 =

1

(14)

50σ rms 5σ rms 1 1 (15) BC = (16) = 8σ mean 2πσ rms

In general

BC =

k

σ rms

(17)

It will be shown, comparing with practical measurements that none of these expressions are accurate and it is difficult to obtain a comprehensive expression for all environments.

(9)

ρ ( s, τ ) =

(12)

0

0

=

⎛2 λ ⎞ π ⎟− (1 + λ ) E ⎜⎜ 1 + λ ⎟⎠ 2 ⎝ ρ ( s, τ ) = π 2− 2 2 J (ω τ ) = λ2 = 0 2m 2 1+ s σ

2

] (11)

Where E(x) is the complete elliptic integral of the second kind. The expansion of the hyper geometric function gives a good approximation to (9). After several reductions and considerations, the correlation coefficient becomes

2.3 Sounder systems environment description.

characteristics

and

The sounder system used to make the measurements of this work was developed at UMIST in Manchester UK and is described in [2] and [3]. This sounder uses the FMCW or chirp technique. The generated chirp consists of a linearly frequency modulated signal with a bandwidth of 300 MHz and a carrier frequency of 2.35 GHz. The chirp repetition frequency is 100 Hertz, which allows having 50-Hertz Doppler range measurements. The receiver has the same architecture than the transmitter. But in the receiver, the generated chirp is not transmitted but mixed with the incoming signal from the antenna, which are the multi-path components of the transmitted chirp. This mixing allows having the multi-path components at low frequencies, these low frequencies can be sampled, digitized and stored in a computer to perform the required analysis. The three environments where the measurements took place were the following: 1) Indoors, in

UbiCC Journal - Volume 3

11

different floors around a building in an eight stories building. Each floor form a rectangle with four long corridors; two 66 m long and two 86 m long, the width of the corridors is 3 m, the total covered area was 912 square meters and the ceiling height is 5 meters. The transmitter was static and the receiver was moved around the corridors and in different floors, measurements were taken at specific distances in each corridor. 2) From a building to different building. These two buildings are eight stories high, they have about the same high and they are separated by about 200 meters. The transmitter was located in the top of one of the buildings and the receiver was moved around specific locations inside the second building in all eight different floors. 3) An urban environment around the city center, a commercial area with several high rising buildings. Each location in each environment was sampled during one second and 100 impulse responses were stored for this specific location. When the measurements were taken, every location was sampled with 100 impulse responses, an impulse response (IR) was taken on that specific location every 10 milliseconds. III. FREQUENCY SELECTIVITY PARAMETERS To carry out the calculation of frequency selective fading, each processed IR was split in samples of 60 kHz, which was the minimum sampled frequency, each 300 MHz bandwidth was sampled 5000 times every 10 milliseconds, this means that a sample was taken every 2 microseconds or every 60 kHz. Each IR was averaged over the complete second, meaning that every frequency was averaged 100 times for each IR. The level of the frequency response of each 60 kHz segment was calculated and recorded In each environment the channel transfer function for about 50 different locations were calculated and recorded. Each transfer function has 5000 sampled frequencies, i.e. 5000 segments for each location. Every sample represents the signal level on that segment, relative to the maximum over the complete 300 MHz bandwidth. The outdoors environment has a bandwidth of 120 MHz, the indoors and indoor to outdoor environments has 300 MHz bandwidth. The fading characteristics in each of the environments were calculated. For each of the locations the fading characteristics that were calculated were; average delay spread, RMS delay spread, coherence bandwidth, channel transfer function and frequency correlation. Using the channel transfer function for each IR, specific fading characteristics for sub channels of 5 and 20

UbiCC Journal - Volume 3

MHz were calculated. Figure 1 shows a typical channel transfer function. In figure 2 are shown the fading characteristics for a 20 MHz sub channel, in this figure there are 15 different lines, each one corresponds to a 20 MHz sub channel in a 300 MHz bandwidth. To get this figure, the following has been done; first, the fading information of the complete 300 MHz bandwidth was divided in 15 segments of 20 MHz each. Then, each point on that segment, that represents a 60 kHz sample, was measured and the result was compared to the next sample, then the next sample was compared and so on up to the complete 20 MHz bandwidth was compared. After that, a second 20 MHz sub channel was applied the same procedure and so on up to the end of the 300 MHz bandwidth. To form figure 3, the same procedure was followed, but in this case the sub channel bandwidth was of 5 MHz, then this figure has 60 different lines which come from the 300 MHz total bandwidth. Figure 2 shows that the maximum fading deep within the 20 MHz sub channel is lower than 14 dB in all the bandwidth. On the other hand, in the 5 MHz bandwidth the maximum fading deep was of 18 dB. It is possible to see in figure 2, that most of the lines follows a pattern, which means that the fading in all sub channels is about the same. In figure 2, most of the lines stay below 5 dB and only a few lines go higher than 6 dB; this could mean that deep fading in this bandwidth is rare.

Figure 1. Typical channel transfer function. Figure 3 shows the calculations of fading characteristics for the second environment with 15 different 20 MHz sub channels, this figure shows that there are more dispersion of the lines, which means there are more deep fading in the responses of the IR. Since this environment is the measurement of the propagation for the penetration of the signal in different floors in a building, higher delay spread, (time dispersion) than the former environment was expected and therefore more fading.

12

Another way to look at the statistics of the fading is to calculate the CDF of this parameter. To make the calculations of these CDF’s figures, the mean of all locations in the involved environment were used. Figure 4 shows the CDF of the building-tobuilding environment for both, the 20 and 5 MHz bandwidths, this figure shows that the fading deep for a 20 MHz sub channel is below 7 dB for 90% of the time. On the other hand, for the 5 MHz sub channel, the fades are below 5 dB for 90% of the time.

Figure 2 Indoor to outdoor fading Characteristics for a 20 MHz sub channel

correlation coefficient, the coherence bandwidth (Bc) is lower than 10 MHz most of the locations. This is corroborated in figure 6, this figure shows the average Bc for all locations in the indoors environment. Figure 7, shows the RMS delay spread for all locations for the same environment. Quick calculations comparing figure 11 results and expression (13) show that, few calculated values of the versions of expression (13) match with the measured values of figure 7.

Figure 4. Fading CDF for indoor to outdoor for a 5 MHz sub channel

Figure 3. Outdoor to indoor fading Characteristics For a 5 MHz sub channel Figure 5. Coherence bandwidth for indoors IV. COHERENCE BANDWIDTH EVALUATION. Figure 5 shows the frequency correlation of all locations in the indoors environment. To make this figure the following was done; first the PDP of all locations was calculated. Then a Fourier transform was performed on the PDP, which gave us the frequency correlation for all locations. Then the frequency correlation for each location was plotted in figure 5. On this figure, the thick and dashed line is the line for the maximum coherence bandwidth, when the transmitter and receiver are connected directly. In figure 5, one can see that at 0.9

UbiCC Journal - Volume 3

13

Figure 6. Average of coherence bandwidth for indoors Figure 8 shows the frequency correlation for the outdoor to indoor environment, this figure shows that in this environment the Bc at frequency correlation of 0.9 is higher than the indoor environment, although the delay spread is not different is both environments. Figure 9, shows the average Bc for the outdoors to indoors environment, one can see in this figure, that the coherence bandwidth is higher than the indoor environment, which was an expected result, but the difference is higher than expected. In indoors the coherence bandwidth is not bigger than 20 MHz in average. In the other hand, in the outdoor to indoor, the average is about 100 MHz, here is relation of 5 to 1. The difference in RMS delay spread is 100 nS versus 200 nS, there is a relation of 2 to 1.

Figure 8. Coherence bandwidth for outdoor to indoor

Figure 9. Average of coherence bandwidth for indoor to outdoor Figure 7. RMS delay spread for indoors Figure 11 shows the frequency correlation for the outdoors environment. Figure 12, shows the Bc at frequency correlation of 0.9. In this case the Bc can not be compared to the Bc for the other two environments, since in this environment a lower bandwidth is evaluated, 120 MHz instead of 300 MHz. Despite this difference and observing figures 11 and 12, Bc is not significantly lower even when we have higher distances and higher delay spread. In outdoors the Bc is not bigger than 2 MHz in average. In the other hand, the RMS delay spread is 1.5 µS in average.

UbiCC Journal - Volume 3

Table 1, shows the comparisons of Bc for the three environments with the different versions of expressions 13 -16 and measured results. This table shows that the values of the expressions are always lower than the measured results, which induce to conclude that the expressions were underestimated, at least in these environments. Moreover, it is possible to conclude that these expressions were deduced with not enough measured results. Also, table 1 show that the relationship between delay spread and coherence bandwidth, not necessarily is a single constant.

14

Figure 11. Coherence bandwidth for outdoors

Figure 10. RMS delay spread for outdoors to indoors V. CONCLUSIONS. In this work the results of analysis of frequency selective fading on two indoor and one outdoor environment have been presented. The three environments analyzed demonstrate that the fading is within specific limits, these results could help to the designers of adaptive receivers to estimate the channel more accurately. The division of the channel impulse bandwidth in segments of 20 and 5 MHz bandwidths, allow the calculation of fading in the bandwidth of interest for OFDM and WCDMA transmission. Plots of the frequency selective fading will help for this assessment. The analysis of coherence bandwidth show that the expressions accepted in the literature for its calculation are not accurate and the accepted direct relationship between delay spread and coherence bandwidth is not simple. Also, additional work is require on try to determine how much the combined effect of Doppler spread, time variability and frequency offsets affects the transmission on multi-carrier signals as the ones on OFDM y CDMA

Figure 12. Average of coherence bandwidth for outdoors Table 1. Coherence bandwidth calculations Value from Indoors Outdoors Outdoors to indoors (13) 400 kHz 200 kHz 30 kHz (14) 4 MHz 2 MHz 300 kHz (15) 3.3 MHz 2.5 MHz 250 kHz (16) 3.2 MHz 1.6 MHz 212 kHz Measured0.9 5.3 MHz 12 MHz 300 kHz Measured0.5 19 MHz 72 MHz 5.6 MHz

Figure 13. RMS delay spread for outdoors References. 1. 2.

UbiCC Journal - Volume 3

Jakes W. C., Microwave mobile communications, (Wiley, 1974). Aurelian B, Gessler F, Queseth O, Stridh R, Unbehaun M, Wu J, Zander J, Flament

15

3.

4.

5. 6.

7.

8. 9. 10. 11. 12.

13.

14.

15.

M. “4th-Generation Wireless Infrastructures: Scenarios and Research Challenges”, IEEE Personal Communications Magazine, 8(6), 25-31, Dec 2001. Salous S, Hinostroza V,” Bi-dynamic indoor measurements with high resolution sounder”, 5th. International Symposium on wireless multimedia Communications, Honolulu Hawaii USA, October 2002. Golkap H., “ Characterization of UMTS FDD channels ”, PhD Thesis, Department of Electrical Engineering and Electronics, UMIST, UK 2002 Lee W. C. Y., Mobile Communication Engineering, (McGraw-Hill, 1998). Bello P.A., “Characterization of randomly time-variant linear channels”, IEEE Transactions on Communications Systems, December 1963, pp. 360-393. Hehn T., Schober R.m and Gerstacker W., “Optimized Delay Diversity for Frequency Selective Fading Channels”, IEEE Transaction on Wireless communications, September 2005, Vol. 4, No. 5, pp. 22892298. Hashemi H., “The indoor radio propagation channel”, IEEE Proceedings, Vol. 81, No. 81, July 1993, pp. 943-967. Lee W. C. Y., Mobile Communication Engineering, (McGraw-Hill, 1999) Rappaport T. S., Wireless communications, (Prentice-Hall, 2002, 2nd ed.) Parsons J. D., The mobile radio propagation channel, (Wiley, 2000). Morelli M., Sanguinetti L. and Mengali U., “Channel Estimation for Adaptive Frequency Domain Equalization ”, IEEE Transaction on Wireless communication, September 2005, Vol. 4, No. 5, pp. 25082518. Salous S., and Hinostroza V., “ Bidynamic UHF channel sounder for Indoor environments “, IEE ICAP 2001, pp. 583587 Biglieri E., Proakis J. and Shamai S., “Fading Channels: Information Theoretic and Communications Aspects”, IEEE Transactions on Information Theory, October 1998, Vol. 44 , No. 6, pp. 26192692. Al-Dhahir N., “Single Carrier Frequency Domain Equalization for Space-Time Blok-Coded Transmission over Frequency Selective Fading Channels”, IEEE Communications Letters, July 2001, Vol. 5, No. 7, pp. 304-306.

UbiCC Journal - Volume 3

16. Namgoong N, and Lehnert J., “ Performance of DS/SSMA Systems in Frequency Selective Fading”, IEEE Transaction on Wireless communication, April 2002, Vol. 1, No. 2, pp. 236-244. 17. TA0 X., et. al., “ Channel Modeling of Layered Space-Time Code Under Frequency Selective fading Channel”, Proceedings of ICCT2003, May 2003, Berlin Germany. 18. Shayevitz O. and Feder M.,” Universal Decoding for Frequency Selective Fading”, IEEE Transactions on Information Theory, August 2005, Vol. 51 , N0. 8, pp. 2770-2790. 19. Sánchez M and García M, RMS Delay and Coherence Bandwidth Measurements in Indoor Radio Channels in the UHF Band, IEEE Transactions on Vehicular Technology, vol. 50, no. 2, march 2001 20. Jia-Chin Lin, Frequency Offset Acquisition Based on Subcarrier Differential Detection for OFDM Communications on Doubly-Selective Fading Channels, 21. Yoo D. and Stark W. E., Characterization of WSSUS Channels: Normalized Mean Square Covariance, IEEE Transactions on Wireless Communications, vol. 4, no. 4, july 2005.

16

Continuous Reverse Nearest Neighbor Search Lien-Fa Lin*, Chao-Chun Chen Department of Computer Science and Information Engineering National Cheng-Kung University, Tainan, Taiwan, R.O.C. Department of Information Communication Southern Taiwan University of Technology, Tainan, Taiwan, R.O.C. [email protected],[email protected]

ABSTRACT The query service for the location of an object is called Location Based Services (LBSs), and Reverse Nearest Neighbor (RNN) queries are one of them. RNN queries have diversified applications, such as decision support system, market decision, query of database document, and biological information. Studies of RNN in the past, however, focused on inquirers in immobile status without consideration of continuous demand for RNN queries in moving conditions. In the environment of wireless network, users often remain in moving conditions, and sending a query command while moving is a natural behavior. Availability of such service therefore becomes very important; we refer to this type of issue as Continuous Reverse Nearest Neighbor (CRNN) queries. Because an inquirer’s location changes according to time, RNN queries will return different results according to different locations. For a CRNN query, executing RNN search for every point of time during a continuous query period will require a tremendously large price to pay. In this work, an efficient algorithm is designed to provide precise results of a CRNN query in just one execution. In addition, a large amount of experiments were conducted to verify the above-mentioned method, of which results of the experiments showed significant enhancement in efficiency. Keywords: Location Based Services, Location-Dependent Query, Continuous Query, Reverse Nearest Neighbor Query, Continuous Reverse Nearest Neighbor Query 1

INTRODUCTION

As wireless network communications and mobile device technology develop vigorously and positioning technology matures gradually, LBS is becoming a key development in the industrial as well as academic circles [2, 5, 13, 21, 26, 27]. According to the report of “IT Roadmap to a Geospatial Future” [6], LBSs will embrace pervasive computing and transform mass advertising media, marketing, and different societal facets in the upcoming decade. Despite the fact that LBSs have been existing in the traditional calculation environment (such as Yahoo! Local), its greatest development potential lies in the domain of mobile computing that provides freedom of mobility and access to information anywhere possible. LBSs shall become an indispensable application in mobile network as its required technology has matured and 3G wireless communication infrastructure is expected to be deployed everywhere. The query that answers to LBSs is referred to as

UbiCC Journal - Volume 3

Location-Dependent Query (LDQ), of which applications include Range Query, Nearest Neighbor (NN) query, K-Nearest Neighbor (KNN) query, and Reverse Nearest Neighbor (RNN) query. There are plenty of studies about NN [14, 22, 26], KNN [4, 9, 14, 23, 25], CNN [17, 3, 12, 20], and CKNN [17, 20] queries, and issues pertaining to Reverse Nearest Neighbor (RNN) Query [10, 11, 16, 18, 19, 22, 24] have been receiving attention in recent years. RNN query means finding a collection of nearest neighbor objects for S, a given collection of objects, with q, a given query object. Practical examples of RNN query are provided in [10]. If a bank is planning to open a new branch, and its clients prefer a branch on a nearest possible location, then such new branch should be established on a location where the distance to the majority of its clients is shorter than that of other banks. Taxi cabs selecting passengers is another good example. If a taxi cab uses wireless devices to find out the location of its customer, then RNN queries will be far more advantageous than NN queries from the aspect of

17

competition. Figure 1 illustrates that Customer c is the nearest neighbor for Taxi a, but that does not necessarily mean Taxi a can capture Customer c because Taxi b is even closer to Customer c. On the contrary, the best option for Taxi a should be Customer d because Taxi a is the nearest neighbor for Customer d. That is, d is the RNN for a, and a may reach d faster than any other taxi. This is an example of CRNN query for that the query object, the taxi, changes location according to time. Mobile users will be mobile in a wireless environment, and that is why the continuous query is an important issue in the wireless environment. As far as the knowledge available to the researchers is concerned, there is not yet any researcher working on this issue. Because an inquirer changes location constantly according to time, changes of location will cause RNN queries to return different results. For a CRNN query, executing RNN search for every point of time during a continuous query period will require a tremendously large price to pay. The larger the number of query objects and the shorter the time segment are, the longer the calculation time will be. In addition, due to the continuance nature of time, defining the appropriate time segment for RNN search will be a concern; if the interval between RNN searches is too short, then more CRNN queries need to be executed to complete the query, and vice versa. If a RNN search is repeated over a longer period of time to reduce the number of execution, the RNN query result for the whole time segment will lose accuracy due to insufficient frequency of sampling. In this paper, a more efficient algorithm is designed to replace processing of each and every point of time for RNN search; just one execution of CRNN query is all it takes to properly define the segment for the query time that a user is interested in, and find out the segments that share the same answer and the RNN for each of the intervals. Other than that, an index is also used to filter out unnecessary objects to reduce search space and improve CRNN search efficiency. The experiment result suggests that using index provides efficiency 20 times better than not using index when the number of objects is 1000. This Study provides major contribution in three ways: z This Study pioneers into continuous query processing methods opposite to static query regarding RNN issues. z A CRNN search algorithm is proposed; just one execution will return all CRNN results. z The proposed method allows the index which was only applicable to finding RNN for a single query point to support CRNN query to improve CRNN search efficiency. The structure of the other sections in this work:

UbiCC Journal - Volume 3

Related works about RNN search are introduced in Section 2. Concerned issues are defined and assumptions made are described in Section 3. The proposed CRNN search algorithm is introduced in Section 4 The experiment environment and evaluation parameters for experimental efficacy are described in Section 5. In the end, a conclusion and future study directions are provided in Section 6.

Figure 1: Example of RNN query.

2

RELATED WORK

RNN search concerns about finding q, a query that is the NN for some objects. Related works of study about RNN search are introduced and summarized in this section: z Index methods that support RNN search The number of objects can be infinite; if one must first find out the distance from query q to each object for identifying the RNN for query q, then the efficiency may be unacceptably low due to overwhelmingly large computation cost. To accelerate processing speed, most of studies adopt the index methods. Major index methods are introduced in this section. z RNN search of different types RNN searches in different scenarios are described and categorized according to static and moving situations of query q and the objects. 2.1

Index Methods for RNN Query

RNN search concerns about finding q, a query that is the NN for some objects, and it is necessary to find out the distance between query q and each object, or the distance from the coordinate of query q to the coordinate of an object. For a given q, not every object is its RNN, and these objects which can not be RNN may be practically left out of consideration to reduce the number of objects to be taken into consideration and accelerate processing speed for RNN search. Many studies were dedicated to the designing of an effective indexing structure for coordinates of an object. The most famous ones are R-Tree proposed by [8] and Rdnn-Tree proposed by

18

[10]. These two index methods are described below.

these child trees to its NN will not exceed MaxDnn.

2.1.1R-Tree R-Tree is an index structure developed in early years for spatial database and was used by [10] to accelerate RNN search processing. All objects are grouped and then placed on leaf nodes according to the closeness of their coordinates. That is, objects at similar coordinates are put in one group. Next, each group of objects is contained in a smallest possible rectangle, which is called Minimum Bounding Rectangle (MBR). Next, MBRs are grouped in clusters, which are contained inside a larger MBR until all objects are contained in the same MBR. What is stored on an internal node of a R-Tree is an MBR, in which all nodes underneath are contained, and the root of the R-Tree contains all objects. The size and range of an MBR is defined by its lower left coordinate (Ml,Md) and upper right coordinate (Mr, Mu). Figure 2 is an example of R-Tree. From a to l, there are total 12 objects; (a , b , c , d) belong to MBR b1, and (e,f,g) belong to MBR b2. MBR b1 and b2 belong to MBR B1, and MBR R contains all objects..

Figure 3. Data structure of Rdnn-tree 2.2 Categories of Rnn queries Depending on the static or moving status of query q and the query objects, related studies can be summarized into 4 categories. 1. If query q and the query objects are both static, then this category is called static query vs. static objects. 2. If query q is moving and the query objects are static, then this category is called moving query vs. static objects. 3. If query q is static and the query objects are moving, then this category is called static query vs. moving objects. 4. If both query q and the query objects are moving, then this category is called moving query vs. moving objects. 2.2.1 Static query vs. static objects

Figure 2: Example of R-tree Indexing 2.1.2 Rdnn-Tree Rdnn-tree (R-tree containing Distance of Nearest Neighbors) [22] improves the method of [10]. The author proposes a single index structure (Rdnn-tree) to provide solutions for NN queries and RNN queries at the same time. Rdnn-tree differs from standard Rtree structure by storing extra information about nearest neighbor of the points in each node. Information of (ptid,dnn) is stored on the leaf node of Rdnn-tree, as shown in Figure 3. ptid means an object of which the data concentrate on the dimension, denoted as d, and dnn means the distance from such object to its NN. Information of (ptr , Rect , MaxDnn) is stored on a non-leaf node, where ptr points to the address of a child node, Rect contains the MBR of all child nodes subordinate to this node, and MaxDnn means the maximum value of dnn of all objects in the child trees subordinate to this node. The maximum distance from any object contained in

UbiCC Journal - Volume 3

The scenario that both query q and query objects are static is first discussed because the query and query objects are immobile and are therefore easier for processing than other scenarios. The method proposed in [10] is now introduced. For static database, the author adopts a special R-tree, called RNN-tree, for answering RNN queries. For static database that requires being frequently updated, the author proposes a combined use of NN-tree and RNN-tree. NN of every object is stored in the RNNtree, and what are stored in the NN-tree are the objects themselves and their respective collections of NN. The author uses every object as the center of a circle, of which the radius is the distance from the object to its NN, to make a circle, and then examines every circle that contains query q to find out the answers of RNN queries. Such method, however, is very inefficient for dynamic database because the structures of NN-tree and RNN-tree must be changed whenever the database is updated. In [22], the method proposed by [10] is therefore improved. The author proposes a single index structure, Rdnn-tree, for answering NN queries and RNN queries at the same

19

time. It differs from normal R-tree; it separately stores the information of NN of every object (i.e. Distance of Nearest Neighbor), and NN of every object must be calculated in advance. 2.2.2 Static query vs. moving objects Studies mentioned above primarily assume a monochromatic situation that all objects, including query q and query objects, are of the same type. In [18], the researcher addresses this type of issues in a bichromatic situation that objects are divided into two different types; one is inquirer, and the other is query object. NN and range query techniques are used in this Paper to handle RNN issues.

q; instead, i segments of time, such as segment1, segment2 …segmenti, that have the same result, are first identified among the entire CRNN query time period. RNN result of each segment of time, such as RNN1, RNN2, …, is calculated separately, and the result is returned in the format of (q , [segment1])={RNN1 result} , … , (q , [segmenti])={RNNi result} back to the inquirer.

2.2.3 Moving query Figure 4. Example of a CRNN query This subsection discusses the situation when an inquirer is no longer static but changes his or her location according to time, and the query object can be either static or moving. That is, two categories of query: moving query with static objects and moving query with moving objects, are involved. Because the inquirer is moving, these two categories of query will return different results for the identical RNN search at different points of time. This type of issue is obviously more complicate than the issues previously discussed. As far as the knowledge available to the researcher is concerned, no related study has ever discussed about the issues of these two categories. In this Study, solutions for a moving query with static objects are pursued.

3 Problem Formulation CRNN query concerns about a period of continuing time where adjacent points in such period of time may have the identical RNN. That is, a period of time may have the same RNN unless the query q has moved beyond this period of time. Please refer to Figure 4. When a user executes CRNN query q, the time segment of the continuous query is [Ps,Pe], and the query objects are {a,b,c,d}. If time point P1 can be identified, and any given point of time in the time segment from Ps to P1, or [Ps,P1], has the same RNN result, then one-time execution of RNN search is all it needs for the time segment of [Ps ,P1]. If points of time, P2, P3, and P4, are also identified, and any given point of time in the time segment of [P1, P2] has the identical RNN result, while any given point of time in the time segment of [P2,P3] has the identical result, and any given point of time in the time segment of [P3,P4] has the identical result, then the entire CRNN query needs only one-time execution of RNN search at each time segment. For processing CRNN query, it is not necessary to execute RNN search for every point of time and return the RNN of every point of time back to query

UbiCC Journal - Volume 3

Based on the description above, CRNN query, one issue that this Study concerns, may be stated as below: Given: A collection of static objects S={O1,O2,…,On} A query point q, its current position (q.x,q.y), and moving velocity (v.x,v.y) A continuous query time [Ps , Pe]; where Ps represents the coordinate of the point of time when the query begins, and Pe represents the coordinate of the point of time when the query ends Find: The RNNs of q between any two adjacent points of time of {P1 ,P2 ,…,Pi} within [Ps ,Pe] remain constant. Such that: RNN(q,[Ps ,P1]) = {RNN1},RNN(q,[P1 ,P2]) ={RNN2},…, RNN(q,[Pi ,Pe]) = {RNNi}, where [Pi,Pe] ⊆ [Ps,Pe],{RNN1} ∈ {O1,O2,…, On}. Under the assumptions: 1. The moving direction of query q is fixed. 2. All query objects are static. As described above, two adjacent points of time may share the same RNN, or a segment of time has only one RNN unless query q moves to another segment of time that has a different RNN. The CRNN search algorithm proposed in this Study uses exactly this concept. First, the points of time that produce different RNN results within a query time are identified. These points of time divide the query time into several segments that have different RNN results, and then the RNN results are identified for each of the segments. The detailed algorithm of CRNN Search is explained in the next section.

4. CRNN Search Algorithm The detailed procedure of CRNN Search algorithm is introduced in this section. CRNN Search

20

algorithm is divided into two steps. Step 1: Finding segment points of CRNNq Points of time that produce different RNN results are identified. Based on these points of time, CRNN query is divided into several time segments that require execution of RNN search. The RNN result for any given point of time within one segment will remain constant, and different segments have different RNN results. Step 2: Calculating RNN result of each segment Separately calculate the RNN results for each of the segments that have been divided in the previous step. The entire procedure for processing CRNN Search is illustrated in Figure 5. On top of the necessary query objects and continuous query (query path), it is divided into two steps: finding segment points of CRNNq and calculating RNN result of each segment; each of the steps is described below:

As illustrated in Figure 6, if the NN of object a is b, and a circle is made using ab as the radius with a as the center point, then the distance from query q to a must be shorter than the distance from a to its NN, or object b, as long as query q falls within this circle. Therefore, during the period of time when query q remains within this circle, RNNs of object a must include a, unless query q moves out of this circle. Because the moving direction of query q is assumed to be fixed, CRNN query will form a query line (qline) from its beginning to its end. The point to which this CRNN query begins to leave this circle is the intersection S of this circle and the query line formed by CRNN query. Before intersection S, the result of RNN query must include object a; beyond intersection S, the result of RNN query will not include object a; the RNN results will be different. This intersection is referred to as a segment point. This explains why the intersection of the circle with NN as its radius and the query line is the point of time where RNN query produces different results. Making a circle by using an object itself as the center and the distance to its NN as the radius will enable all of the intersections of the circle and the query line of CRNN query to cut CRNN query into several time segments that have different results of RNN query.

Figure 5: Flow chart of CRNN query processing 4.1 Finding segment points of CRNN What CRNN query pursues is a period of continuous time; the moving distance of query objects is very short among some adjacent points of time for the query, thus possibly resulting in the same RNN result. That is, the entire period of continuous query is divided into several segments, and the RNN results in each segment are the same. If these points of time share the same RNN result, then it is not necessary to execute RNN search for each of the points of time; one-time calculation is enough. Therefore, CRNN query does not require executing RNN search for all points of time. Instead, points of time that share the same RNN result are grouped into time segments, and one-time RNN search is executed for each of the segments. RNN of query q is a collection of the objects of which the NN is query q. If the distance, or N, is realized in advance, then these objects are the RNN for query q when the distances from query q to the objects are shorter than the distances from the objects to their respective NN.

UbiCC Journal - Volume 3

Figure 6. Finding segment point of CRNN search Figure 7 illustrates the time segmentation process described above. For object a, b, and c, their respective NNs are identified first: NN(a)=b, NN(b)=a, and NN(c)=b. Next, use each object as the center of a circle, and the distance to its respective NN as the radius to make circles of a, b, and c. Then, intersections of the circles and qlines, Ps, P1, P2, P3, P4, and Pe , are sorted according to time, and every two intersection points define a time segment. The entire CRNN query is cut into five time segments, [Ps, P1] , [ P1, P2] , [P2,P3] , [P3,P4] , and [P4,Pe]. Every segment has a unique RNN query result.

21

Figure 8: Calculating RNN result of each segment. 1.

CRNN Algorithm with Index Not every object will be an answer in the processing of CRNN query. To improve RNN query efficiency, it is preferred that the objects that can not be answers are filtered out in advance to greatly reduce search space for CRNN query, size of data that requires CRNN query, and consequently, computation cost. The process that further improves CRNN query efficiency dramatically is referred to as pruning process. Figure 9 illustrates the flowchart of CRNN query processing with a pruning process added.

Figure 7: Segmenting of the CRNN query 4.2 Calculating RNN result of each segment In the previous section, intersections of qlines and the circles with the distances between the objects and their respective NNs as the radiuses are defined. With these intersections, CRNN query is cut into several time segments. The next step is to find RNNs for each of the time segments. Because the distances from query objects to their respective NNs are used as the radiuses to make circles which are coded by the objects’ numbers, if a segment falls within a certain circle, then the resulting RNN of this time segment for the CRNN query is the object collection represented by such circle. This is illustrated in Figure 8. First, intersections of qlines that represent the CRNN query and the circles of the objects are sorted by time; every two intersection points define a time segment, and there are five segments, [Ps,P1] , [ P1 , P2] , [P2 , P3] , [P3 , P4] , and [P4 , Pe]. Segment [Ps , P1] is contained only by circle a, therefore: RNN(q,[Ps ,P1]) ={a}. Next, examine segment [Ps,P1]; this segment is contained by circle a and circle b. Therefore: RNN(q, [Ps,P1]) = {a, b}. If this process is repeated, then the obtained results will be RNN(q , [P2 , P3]) = {a , b , c}, RNN(q,[P3,P4]) ={b,c}, and RNN(q,[P4,P5]) ={c}.

UbiCC Journal - Volume 3

Figure 9. Flow chart of CRNN query with index Step 2 and 3 are identical to Step 1 and 2 in CRNN search algorithm, which have been described in the previous sections, and they will not be reiterated again here. For step 1, the pruning process, an index structure for Rdnn-tree is designed to effectively execute the pruning process. The three steps of CRNN query with index are illustrated in Figure 9. The pruning process is described below. For every internal node of Rdnn-tree, the distance from query q to its node will be computed for every separation, and the distance is denoted as D(q,Rect). If D(q,Rect) of a node is larger than MaxDnn of the node, then all the objects beneath it will not be considered because the distance from query q to Rect Node will be equal to or larger than the distances from query q to all the objects underneath Rect node. When the distance from query q to Rect node is longer than MaxDnn, it is impossible that query q is closer to its NN than any other object underneath Rect node, and no object underneath can be the RNN result for query q. On the contrary, if D(q,Rect) equals the MaxDnn of such node, then the distances from some objects underneath Rect node to their respective NNs are shorter than the distance from query q to Rect. That is, some objects are the RNN results for query q. The examination continues along

22

the branch all the way to the lead node. All entries underneath such leaf node are recorded as the candidate objects for RNN query result. The collection of these candidate objects is referred to as RNNCanSet, which means the possible results for RNN query must exist within this collection, and the objects outside of RNNCanSet can not possibly be RNN query results. All that are needed to be considered when finding segment point of CRNNq of CRNN search algorithm are the objects inside RNNCanSet. This will greatly reduce the quantity of objects needed to be handled and enhance CRNN search algorithm efficiency. Figure 3 explains the pruning process. It begins with root node R. Because D(q,R) ≦MaxDnn of R, child nodes of B1 and B2 must be examined. Because the MaxDnn of MBR B1 ≦D(q,B1), all child nodes underneath B1 can be pruned. Next, D(q , B2)≦MaxDnn of B2, so child nodes b3 and b4 of B2 must be examined. D(q,b3) is equal to or smaller than the MaxDnn of b3, which is also a leaf node;

therefore, h and i are placed inside RNNCanSet. Next, b4 is examined. MaxDnn of b4 is equal to or smaller than D(q , b4); therefore, b4 can be pruned. The entire pruning process then ends. However, the CRNN query to be processed is not a RNN query of a single query point; therefore, the pruning process in [22] can not be directly used. To ensure that no possible RNN result is deleted, the criteria of pruning is changed from the condition that D(q,Rect), the distance from query point to Rect, must be longer than MaxDnn to the condition that MinD(q,Rect)>MaxDnn, where MinD(qline,Rect) represents the minimum distance from qline to Rect node. The reason why the shortest distance is selected is that if the minimum distance from the entire qline to Rect node is larger than MaxDnn, then the distance from any given point of time on the qline to Rect node must be longer than MaxDnn. Therefore, all the objects underneath Rect node can not be RNN for qline, and pruning is out of consideration. Details of the pruning algorithm are exhibited in Algorithm 1:

Algorithm 1: Pruning Algorithm. and comparison of experiment results. 5. Performance Study 5.1 Experiment Settings To evaluate the improvement which the method proposed in this Study has made in CRNN query efficiency, some experiments are designed, and this section provides descriptions of experiment environments, experimental parameters and settings,

UbiCC Journal - Volume 3

The coordinates of the objects disperse in an experiment environment of [0 , 1]×[0 , 1] plane. Because distribution density of the objects may influence efficiency, it should be taken into

23

consideration in the experiment. In the experiment, three different types of distribution are used in the generation of objects’ coordinates. The three different types of distribution are Uniform distribution, Gaussian distribution, and Zipf distribution. In uniform distribution, the objects are evenly distributed on the plane, as shown in Figure 10(a). In Gaussian distribution, most of the objects concentrate on the center of the plane, as shown in Figure 10(b). In Zipf distribution, most of the objects will distribute at the extreme left and extreme bottom of the plane. In the experiment, skew factor is set at 0.8, as shown in Figure 10(c). In addition, 30 queries are generated randomly in a [0.4 , 0.6]×[0.4 , 0.6] plane by

referring to [15], and the velocity vector of each query falls between [-0.01 , 0.01]. Because the influence of different types of object distribution on efficiency is concerned in this experiment, the queries are generated as close to the center of the plane as possible. Having executed 30 queries, the average cost of executing one CRNN search is used in determining which method is more favorable. As to the program coding of Rdnn-tree in the CRNN search algorithm, R*-tree code of GIST[7] is used in perfecting Rdnn-tree to make it match with the requirement of this experiment. 1

0.8

0.8

0.8

0.6

0.6

0.6

Y axis

Y axis

1

Y axis

1

0.4

0.4

0.4

0.2

0.2

0.2

0

0

0 0

0.2

0.4

0.6

0.8

1

0

0.2

0.4

0.6

0.8

1

0

X axis

X axis

0.2

0.4

0.6

0.8

1

X axis

Figure 10 Data sets of experiment evaluation In addition to the object distribution described above, the influences that the amount of query time (qline) and the number of objects may impose on efficiency are also considered. Three data sets of Uniform, Gaussian, and Zipf are considered in object distribution. The amount of query time (qline) changes from query length 1 to query length 10. The number of objects changes from 1K to 10K. Parameters and settings used in the experiment are listed in Table 1. Table 1: Parameter settings of experiment Parameter distribution

Description Data distribution

interval

Time interval of Query Number of Data Objects

object-no

Settings Uniform, Gaussian, Zipf 1, 2, 5, 8, 10 1, 10. 30, 50 , 100(k)

CRNN query is executing RNN algorithm for every point of time which is continuous, and it is impossible to calculate the required count of execution. Therefore, the CRNN query time must be segmented before the total execution time required for CRNN query may be calculated. The more the time is segmented, the more executions of RNN are required. If a period of time is segmented into m segments, then time complexity will be O(m×n3), and if time is not adequately segmented, then the RNN result may be erroneous. These make it an inefficient CRNN search algorithm, and it will not be compared in this experiment. Efficiency of two methods is compared in this experiment: one uses Rdnn-tree as the index, and the other uses no index. To evaluate these two methods, comparison of the time required for one CRNN search execution can be used, and this comparison is referred to as total cost in this Study. 5.4 Performance Results and Discussion

5.2 Compared Algorithms and Performance Metrics The most intuitive method for finding RNN is looking for the NN of every object. If the number of query objects is N, then time Complexity is O(n2). Next, determine which objects’ NNs are query points. If the NNs are the query points, then the objects will be the RNNs for the query points. The required time complexity for the RNN algorithm is O(n3). However, the most intuitive method for finding

UbiCC Journal - Volume 3

Based on the changes of metrics (distribution, interval, and object-no), different types of experiments have been conducted. Results are summarized by object-no and query interval in the next section. 5.4.1 The effect of object-no parameter First, the fixed query interval is set at 5. The influence imposed on efficiency by object-no

24

parameter, which is the number of objects, under different types of object distribution, will be discussed. The experiment result is shown in Figure 11. X axle represents the total cost of time required for executing one CRNN search, and Y axle represents the number of query objects. Total cost increases as the number of query objects increases. In addition, when the number of objects is 1K, the efficiency of the CRNN search that uses Rdnntree is about 300 seconds, and that of the CRNN search using no Rdnn-tree index is about 15 seconds; about 20 times faster. It is obvious that pruning some unnecessary objects by adopting Rdnn-tree as the index to reduce CRNN search space provides much higher efficiency than not adopting Rdnn-tree. The 7

7

107

10

10

CRNN without index CRNN with index

106

6

10

5

5

5

10

Total time (sec. )

3

10

2

Total time (sec. )

10

104

104 3

10

103

10

10

10

104

2

2

10

10

10

1 1K

10K

30K object-no

50K

1

1

100K

CRNN without index CRNN with index

6

CRNN without index CRNN with index

10

10

Total time (sec. )

comparison of influence from object distribution on efficiency clearly suggests Zipf distribution offers the best efficiency, followed by Uniform distribution, and Gaussian distribution offers the worst. This result can be explained as such: because Zipf distribution is located at the far left and the lowest bottom, most of the data will be pruned, and the number of objects that are included in RNNCanSet without being pruned is very small, offering the lowest total cost. On the opposite, data in Gaussian distribution concentrate in the center of the plane and very few data can be pruned, allowing many objects to remain, and causing a large RNNCanSet, thus resulting in the highest total cost.

1K

10K

30K object-no

50K

1K

100K

10K

30K object-no

50K

100K

Figure 11: Influences on different types of data distribution from changing object-no distance of MinD(qline , Rect) decreases, causing pruning efficiency to reduce. On the contrary, when the interval increases, the number of time segmentation by CRNN query increases. Consequently, the number of RNN searches for every segment increases, and total cost of CRNN query increases as well.

5.4.2 The effect of query interval parameter This section focuses on the influence from the length of query interval on each method under different types of object distribution. Results of the experiment are shown in Figure 12. Generally speaking, when the query interval is lengthened, the 105

105

104

5

10

104

4

10

3

10

102

10

Total Time (sec. )

CRNN without index CRNN with index

Total time (sec. )

Total Time (sec. )

CRNN without index CRNN with index

103

2

10

10

1 1

2

5 Query Interval

8

10

103

CRNN without index CRNN with index

2

10

10

1 1

2

5

8

Query Interval

10

1 1

2

5

8

10

Query Interval

Figure12: Effect of query interval parameter for different data distribution 6. Conclusions and Future Works An efficient CRNN search algorithm is proposed in this Study. Such algorithm requires only one execution to find out RNN results from all continuous RNN searches. The diversified experiments in this

UbiCC Journal - Volume 3

Study also prove the efficiency of the proposed method. As wireless communication and mobile device technology become mature, more and more users access information from wireless information systems through mobile devices. To process requests from more and more mobile users, data dissemination

25

through broadcast is an effective solution for scalability. The future goal of this Study is extending the issues of CRNN search to the wireless broadcasting environment. 7. References [1] AmitSingh,H.F. and Tosun,A.S. (2003) High dimensional reverse nearest neighbor queries. Proceedings of the 20th International Conference on Information and Knowledge Management (CIKM’03), NewOrleans, LA, USA, pp.91–98. [2] Barbara,D. (1999) Mobile computing and databases-a survey. IEEE Transactions on Knowledge and Data Engineering, 11, 108–117. [3] Benetis,R.,Jensen, C.S.,Karciauskas,G., and Saltenis,S. (2002) Nearest neighbor and reverse nearest neighbor queries for moving objects. International Database Engineering and Applications Symposium, Canada, July17-19, pp.44–53. [4] Chaudhuri,S. and Gravano,L. (1999) Evaluating top-k selection queries. Proceedings of the 25th IEEE International Conference on Very Large Data Bases, pp.397–410. [5] Civilis,A., Jensen,C.S., and Pakalnis,S. (2005) Techniques for efficient road-network-based tracking of moving objects. IEEE Transactions on Knowledge and Data Engineering, 17, 698– 712. [6] Computer Science and Telecommunication Board.IT Roadmap to a geospatial future, the national academies press,2003. [7] http://gist.cs.berkeley.edu/. [8] Guttman,A. (1984) R-trees:A dynamic index structure for spatial searching. Proceedings of the 1984 ACM SIGMOD international conference on Management of data, pp.47–57. [9] Hjaltason,G.R. and Samet,H. (1999) Distance browsing in spatial data bases. ACM Transactions on Database Systems (TODS), 24, 265–318. [10] Korn,F. and Muthukrishnan,S. (2000) Influence sets based on reverse nearest neighbor queries. Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, Dallas, Texas, USA, May16-18, pp.201– 212. [11] Korn,F.,Muthukrishnan, S.,and Srivastava.,D. (2002) Reverse nearest neighbor aggregates over data streams. Proceedings of the International Conference on Very Large DataBases (VLDB’02), Hong Kong, China, August, pp.91–98. [12] Korn,F., Sidiropoulos, N.,Faloutsos, C.,Siegel,E., and Protopapas,Z. (1996) Fast nearest neighbor search in medical image database. In Proceedings of the 22th International Conference on Very Large Data Bases

UbiCC Journal - Volume 3

(CLDB’96), pp. 215–226. [13] Lee,D.L., chien Lee, W.,Xu,J., and Zheng, B. (2002) Data management in location dependent services. IEEE Pervasive Computing,1, 65–72. [14] Roussopoulos,N., Kelley,S. ,and Vincent, F.(1995) Nearest neighbor queries. Proceedings of ACM Sigmod International Conference on Management of Data , Illinois, USA, June, pp.71–79. [15] Ross,S.(2000) Introduction to Probability and Statistics for Engineers and Scientists. [16] Stanoi,I., Agrawal,D., and Abbadi,A.E. (2000) Reverse nearest neighbor queries for dynamic databases. ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery, pp.44–53. [17] Song,Z. and Roussopoulos,N. (2001) K-nearest neighbor search for moving query point. Proceedings of 7th International Symposium on Advances in Spatial and Temporal Databases, LNCS2121, RedondoBeach, CA, USA, July1215, pp.79–96. [18] Stanoi,I.,Riedewald, M.,Agrawal,D., and Abbadi,A.E. (2001) Discovery of influence sets in frequently updated databases. Proceedings of the 27th VLDB Conference, Roma, Italy, pp.99– 108. [18] Tao,Y., Papadias,D., and Lian,X. (2004) Reverse knn search in arbitrary dimensionality. Proceedings of 30th Very Large Data Bases, Toronto, Canada, August29-September3, pp.279–290. [20]Tao,Y., Papadias, D., and Shen,Q. (2002) Continuous nearest neighbor search. International Conference on Very Large Data Bases, Hong Kong, China, August 20-23, pp.279–290. [21] Xu,J., Zheng,B., Lee,W.-C.,, and Lee,D.L. (2003) Energy efficient index for energy query location-dependent data in mobile environments. In Proceedings of the 19th IEEE International Conference on Data Engineering (ICDE’03), Bangalore, India, March, pp.239– 250. [22] Yang,C. and Lin,K.-I. (2001) An index structure for efficient reverse nearest neighbor queries. Proceedings of the 17th International Conference on Data Engineering, pp.485–492. [23] Yiu,M.L., Papadias,D., Manoulis,N., and Tao,Y. (2005) Reverse nearest neighbors in large graphs. Proceedings of 21st IEEE International Conference on Data Engineering (ICDE), Tokyo, Japan, April5-8, pp.186–187. [24] Yu,C.,Ooi,B.C., Tan,K.-L., and Jagadish,H.V. (2001) Indexing the distance: An efficient method to knn processing. Proceedings of the 27th VLDB Conference, Roma, Italy, pp. 421– 430. [25] Zheng,B.,Lee, W.-C., and Lee,D.L. (2003)

26

Search k nearest neighbors on air. Proceedings of the 4th International Conferenceon Mobile Data Management, Melbourne, Australia, January, pp.181–195. [26] Zheng,B., Xu,J., chien Lee, W., and Lee,D.L. (2004) Energy conserving air indexes for nearest neighbor search. Proceedings of the 9th International Conference on Extending Database Technology (EDBT’04), Heraklion,

UbiCC Journal - Volume 3

Crete, Greece,March, pp.48–66. [27] Zhang,J., Zhu,M., Papadias,D., Tao,Y., and Lee,D.L. (2003) Location-based spatial queries. In Proceedings of the 2003 ACM SIGMOD international conference on Management of data, SanDiego, California, USA, June9-12, pp.443–454.

27

REDUCTION OF INTERCARRIER INTERFERENCE IN OFDM SYSTEMS R.Kumar

Dr. S.Malarvizhi

* Dept. of Electronics and Comm. Engg., SRM University, Chennai, India-603203 [email protected]

ABSTRACT Orthogonal Frequency Division Multiplexing (OFDM) is a promising technique for the broadband wireless communication system. However, a special problem in OFDM is its vulnerability to frequency offset errors due to which the orthogonality is destroyed that result in Intercarrier Interference (ICI). ICI causes power leakage among subcarriers thus degrading the system performance. This paper will investigate the effectiveness of MaximumLikelihood Estimation (MLE), Extended Kalman Filtering (EKF) and Self-Cancellation (SC) technique for mitigation of ICI in OFDM systems. Numerical simulations of the ICI mitigation schemes will be performed and their performance will be evaluated and compared in terms of bit error rate (BER), bandwidth efficiency and computational complexity. Keywords: Orthogonal Frequency Division Multiplexing (OFDM), Intercarrier Interference (ICI), Carrier Frequency Offset (CFO), Carrier to Interference Ratio (CIR), Maximum Likelihood (ML), Extended Kalman Filtering (EKF).

1. Introduction Orthogonal frequency division multiplexing (OFDM), because of its resistance to multipath fading, has attracted increasing interest in recent years as a suitable modulation scheme for commercial high-speed broadband wireless communication systems. OFDM can provide large data rates with sufficient robustness to radio channel impairments. It is very easy to implement with the help of Fast Fourier Transform and Inverse Fast Fourier Transform for demodulation and modulation respectively [1]. It is a special case of multi-carrier modulation in which a large number of orthogonal, overlapping, narrow band sub-channels or subcarriers, transmitted in parallel, divide the available transmission bandwidth [2]. The separation of the subcarriers is theoretically minimal such that there is a very compact spectral utilization. These subcarriers have different frequencies and they are orthogonal to each other [3]. Since the bandwidth is narrower, each sub channel requires a longer symbol period. Due to the increased symbol duration, the ISI over each channel is reduced. However, a major problem in OFDM is its vulnerability to frequency offset errors between the transmitted and received signals, which may be caused by Doppler shift in the channel or by the difference between the transmitter and receiver local oscillator frequencies [4]. In such situations, the orthogonality of the carriers is no longer maintained, which results in Intercarrier Interference (ICI). ICI results from the other sub-channels in the same data block of the same user. ICI problem would become more complicated when the multipath fading is present [5]. If ICI is not properly compensated it results in power leakage among the subcarriers, thus degrading the system performance.

In [6], ICI self-cancellation of the data-conversion method was proposed to cancel the ICI caused by frequency offset in the OFDM system. In [7], ICI self-cancellation of the data-conjugate method was proposed to minimize the ICI caused by frequency offset and it could reduce the peak average to power ratio (PAPR) than the data-conversion method. In [8], self ICI cancellation method which maps the data to be transmitted onto adjacent pairs of subcarriers has been described. But this method is less bandwidth efficient. In [9], the joint Maximum Likelihood symbol-time and carrier frequency offset (CFO) estimator in OFDM systems has been developed. In this paper, only carrier frequency offset (CFO) is estimated and is cancelled at the receiver. In addition, statistical approaches have also been explored to estimate and cancel ICI [10]. Organization: This paper is organized as follows: In section 2, the standard OFDM system has been described. In section 3, the ICI mitigation schemes such as SelfCancellation (SC), Maximum Likelihood Estimation (MLE) and Extended Kalman Filtering (EKF) methods have been described. In section 4, simulations and results for the three methods has been shown and are compared in terms of bandwidth efficiency, bit error rate (BER) performance. Section 5 concludes the paper and inference has been given. 2. System Description The block diagram of standard OFDM system is given in figure 1. In an OFDM system, the input data stream is converted into N parallel data streams each with symbol period Ts through a serial-to-parallel Port. When the parallel symbol streams are generated, each stream would be modulated and carried over at different center frequencies. The sub-carriers are spaced by 1/NTs in frequency, thus they are orthogonal over the interval (0, Ts). Then, the N symbols are mapped to bins of an inverse fast Fourier transform (IFFT). These IFFT [11] bins correspond to the orthogonal sub-carriers in the OFDM symbol. Therefore, the OFDM symbol can be expressed as

x ( n) =

1 N

N −1

∑X e m

j 2πnm / N

(1)

m =0

where the Xm’s are the base band symbols on each sub-carrier. The digital-to-analog (D/A) converter then creates an analog time-domain signal which is transmitted through the channel. At the receiver, the signal is converted back to a discrete N point sequence y(n), corresponding to each subcarrier. This discrete signal is demodulated using an N-point Fast Fourier Transform (FFT) operation at the receiver. S/P

IFFT

P/S

D/A

Channel

UbiCC Journal - Volume 3

w(n)

28

Figure 1: OFDM System Model The demodulated symbol stream is given by: N −1

Y (m) = ∑ y (n)e − j 2πnm / N + w(m)

(2)

n=0

N-1 where w (m) corresponds to the FFT of the samples of w (n), which is the Additive White Gaussian Noise (AWGN) introduced in the channel.

3. ICI Mitigation Schemes 3.1 Self-Cancellation (SC) Scheme In this scheme, data is mapped onto group of subcarriers with predefined coefficients. This results in cancellation of the component of ICI within that group due to the linear variation in weighting coefficients, hence the name self- cancellation. The complex ICI coefficients S (l-k) are given by (3) Sin(π (l + ε − k )) Sin(l − k ) =

NSin(π (l + ε − k ) / N )

exp( jπ (1 − 1 / N )(l + ε − k ))

3.1.1 ICI Canceling Modulation The ICI self-cancellation scheme requires that the transmitted signals be constrained such that X (1) = - X (0), X (3) = - X (2) …X (N-1) = - X (N-2).The received signal on subcarriers k and k + 1 to be written as

Y ' (k ) =

N −2

∑ X (l )[S (l − k ) − S (l + 1 − k )] + n

k

(4)

l = 0 , 2 , 4 , 6 ,..

Y ' (k + 1) =

N −2

∑ X (l )[S (l − k − 1) − S (l − k )] + n

Figure 2: Comparison between | S ' ' (l-k)|, |S ' (l-k)| and |S (l-k)| It is seen from figure 2 that |S ' (l-k)| << |S (l-k)| for most of the l-k values. Hence, the ICI components are much smaller Also, the total number of interference signals is halved in as opposed to since only the even subcarriers are involved in the summation.

3.1.2 ICI Canceling Demodulation ICI modulation introduces redundancy in the received signal since each pair of subcarriers transmit only one data symbol. This redundancy can be exploited to improve the system power performance, while it surely decreases the bandwidth efficiency. To take advantage of this redundancy, the received signal at the (k + 1) th subcarrier, where k is even, is subtracted from the kth subcarrier. This is expressed mathematically as Y '' (k) = Y '(k) -Y ' (k+1)

=

N−2

∑X (l)[−S(l − k −1) + 2S(l − k) − S(l + k +1)]+ n − n k

(7) Subsequently, the ICI coefficients for this received signal becomes S '' (l-k) = – S (l-k-1) + 2S (l-k) – S (l-k+1) (8) When compared to the two previous ICI coefficients |S (l-k)| for the standard OFDM system and |S'(l-k)| for the ICI canceling modulation, |S''(l-k)| has the smallest ICI coefficients, for the majority of l-k values, followed by |S' (l-k)| and |S (l-k)|. This is shown in Figure 2 for N = 64 and ε = 0.5. The combined modulation and demodulation method is called the ICI self-cancellation scheme. The reduction of the ICI signal levels in the ICI self-cancellation scheme leads to a higher CIR. The theoretical CIR is given by

k +1

l = 0 , 2 , 4 , 6 ,..

(5) where nk and nk+1 is the noise added to it. And the ICI coefficient S ' (l-k) is denoted as S '(l-k) = S (l-k) – S (l+1-k)

(6)

k +1

l =0, 2,4,..

CIR =

− S (−1) + 2 S (0) − S (1)

2

N −1

∑ − S (l − 1) + 2S (l ) − S (l + 1)

2

(9)

l = 2 , 4 , 6 ,..

As mentioned previously, the redundancy in this scheme reduces the bandwidth efficiency by half. There is a tradeoff between bandwidth and power tradeoff in the ICI selfcancellation scheme.

3.2 Maximum Likelihood Estimation The second method for frequency offset correction in OFDM systems was suggested by Moose in [12]. In this approach, the frequency offset is first statistically estimated using a maximum likelihood algorithm and then cancelled at the receiver. This technique involves the replication of an OFDM symbol before transmission and comparison of the

UbiCC Journal - Volume 3

29

phases of each of the subcarriers between the successive symbols. When an OFDM symbol of sequence length N is replicated, the receiver receives, in the absence of noise, the 2N point sequence i.e., {r (n)} given by

r ( n) =

1 N

K

∑ X ( k ) H ( k )e

j 2πn ( k +ε ) / N

(10)

k =− K

where {X(k)} are the 2K+1 complex modulation values used to modulate 2K+1 subcarriers, The first set of N symbols are demodulated using an N-point FFT to yield the sequence R1(k), and the second set is demodulated using another N-point FFT to yield the sequence R2(k). The frequency offset is the phase difference between R1 (k) and R2 (k), that is R2 (k) = R1 (k) ej2πε

(11)

Adding the AWGN yields Y1 (k) = R1 (k) + W1 (k) (12) Y2 (k) = R1 (k) ej2πε + W2 (k) k = 0, 1 ...N – 1 The maximum likelihood estimate of the normalized frequency offset is given by:

1 ε= tan − 1 2π ∧

⎧ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎨ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪⎩

K

∑ Im Y (k )Y * (k )

k =− K K

2

1

∑ Re Y (k )Y * (k )

k =− K

2

1

⎫ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎬ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪⎭

(13)

This maximum likelihood estimate is a conditionally unbiased estimate of the frequency offset and was computed using the received data. Once the frequency offset is known, the ICI distortion in the data symbols is reduced by multiplying the received symbols with a complex conjugate of the frequency shift and applying the FFT, X (n) = FFT {y (n) e-j2π nε / N} (14) 3.3 Extended Kalman Filtering A state space model of the discrete Kalman filter is defined as z(n) = a(n) d(n) + v(n) (15) In this model, the observation z(n) has a linear relationship with the desired value d(n). By using the discrete Kalman filter, d(n) can be recursively estimated based on the observation of z(n) and the updated estimation in each recursion is optimum in the minimum mean square sense. The received symbols in OFDM System are y(n) = x(n) ej 2 π n ε(n) / N + w(n) (16) where y(n) the received symbol and x(n) is the FFT of transmitted symbol. It is obvious that the observation y(n) is in a nonlinear relationship with the desired value ε(n), i.e y(n) = f(ε(n)) + w(n) (17) where f(ε(n)) = x(n) ej 2 π n ε(n) / N (18) In order to estimate ε(n) efficiently in computation, we build an approximate linear relationship using the firstorder Taylor’s expansion: y(n)≈f(εˆ(n-1))+f'(εˆ(n-1))[ε(n)-εˆ(n-1)]+w(n) (19) where εˆ(n-1) is the estimate of ε(n-1). To Define z(n) = y(n) – f(εˆ(n-1) d(n) = ε(n) - εˆ(n-1) and the following relationship z(n) = f'(ε(n-1)) d(n) + w(n)

UbiCC Journal - Volume 3

(20) (21) (22)

z(n) is linearly related to d(n). Hence the normalized frequency offset ε (n) can be estimated in a recursive procedure similar to the discrete Kalman filter. As linear approximation is involved in the derivation, the filter is called the extended Kalman filter (EKF). The EKF provides a trajectory of estimation for ε(n). The error in each update decreases and the estimate becomes closer to the ideal value during iterations.

4.2 ICI Cancellation There are two stages in the EKF scheme to mitigate the ICI effect: the offset estimation scheme and the offset correction scheme.

4.2.1

Offset Estimation Scheme

To estimate the quantity ε (n) using an EKF in each OFDM frame, the state equation is built as ε(n) = ε (n-1) (23) i.e., in this case we are estimating an unknown constant ε. This constant is distorted by a non-stationary process x(n), an observation of which is the preamble symbols preceding the data symbols in the frame. The observation equation is y(n) = x(n) e j2 π n ε(n) / N + w(n)

(24)

where y(n) denotes the received preamble symbols distorted in the channel, w(n) the AWGN, and x(n) the IFFT of the preambles X(k) that are transmitted, which are known at the receiver. Assume there are Np preambles preceding the data symbols in each frame are used as a training sequence and the variance σ2 of the AWGN w(n) is stationary.

4.2.2

Offset Correction Scheme

The ICI distortion in the data symbols x(n) that follow the training sequence can then be mitigated by multiplying the received data symbols y(n) with a complex conjugate of the estimated frequency offset and applying FFT, i.e. xˆ(n) = FFT{ y(n) e -j 2 π n ε(n) / N} (25) As the estimation of the frequency offset by the EKF scheme is pretty efficient and accurate, it is expected that the performance will be mainly influenced by the variation of the AWGN.

4.3 Algorithm 1.

Initialize the estimate εˆ(0) and corresponding state error P(0) 2. Compute the H(n), the derivative of y(n) with respect to ε(n) at εˆ(n-1) the estimate obtained in the previous iteration. 3. Compute the time-varying Kalman gain K(n) using the error variance p (n-1), H(n), and σ2 4. Compute the estimate yˆ(n) using x(n) and εˆ(n-1) i.e. based on the observations up to time n-1, compute the error between the true observation y(n) and yˆ(n) 5. Update the estimate εˆ(n) by adding the K(n)-weighted error between the observation y(n) and yˆ(n) to the previous estimate εˆ(n-1) 6. Compute the state error P(n) with the Kalman gain K(n), H(n), and the previous error P(n-1). 7. If n is less than Np, increment n by 1 and go to step 2; otherwise stop. It is observed that the actual errors of the estimation εˆ(n) from the ideal value ε(n) are computed in each step and are used for adjustment of estimation in the next step.

30

4. SIMULATIONS AND RESULTS In order to compare the ICI cancellation schemes, BER curves were used to evaluate the performance of each scheme. For the simulations in this project, MATLAB was employed. The simulations were performed using an AWGN channel. Table 1: Simulation Parameters PARAMETERS

VALUES

Number of carriers (N)

1705

Modulation (M)

BPSK

Frequency offset ε

[0.25,0.5,0.75]

No. of OFDM symbols

100

Bits per OFDM symbol

N*log2(M)

Eb-No

1:20

IFFT size

2048

Figure 5: BER performance with ICI Cancellation for ε=0.75 the effect of this residual ICI increases for larger offset values. However, ML method has an increased BER performance and proves to be efficient than SC method.

5. CONCLUSION

Figure 3: BER performance with ICI Cancellation for ε=0.25 Figure 3 shows that for small frequency offset values, ML and SC methods have a similar performance. However, ML method has a lower bit error rate for increasing values of Eb/No.

It is observed from the figures that Extended Kalman filter method indicates that for very small frequency offset, it does not perform very well, as it hardly improves BER. However, for high frequency offset the Kalman filter does perform extremely well. Important advantage of EKF method is that it does not reduce bandwidth efficiency as in self cancellation method because the frequency offset can be estimated from the preamble of the data sequence in each OFDM frame. Self cancellation does not require very complex hardware or software for implementation. However, it is not bandwidth efficient as there is a redundancy of 2 for each carrier. The ML method also introduces the same level of redundancy but provides better BER performance, since it accurately estimates the frequency offset. EKF implementation is more complex than the ML method but provides better BER performance. Further work can be done by extending the concept of self-ICI cancellation and by performing simulations to investigate the performance of these ICI cancellation schemes in multipath fading channels.

6. REFERENCEs

Figure 4: BER performance with ICI Cancellation for ε=0.5 Figure 4 illustrates that for frequency offset value of 0.5, BER increases for both the methods but ML method maintains a lower bit error rate than SC.EKF is better than SC method. In figure 5, for frequency offset value of 0.75, selfcancellation method has a BER similar to standard OFDM system since the self-cancellation technique does not completely cancel the ICI from adjacent sub-carriers and

UbiCC Journal - Volume 3

[1] Ramjee Prasad, “OFDM for wireless communication system”,Artech House,2004. [2]S.Weinstein and P.Ebert, ‘Data transmission by frequency-division multiplexing using the discrete Fourier transform,’ IEEE Trans. Commun.,vol.19, pp. 628-634, Oct. 1971. [3] L.J. Cimini, “Analysis and Simulation of a Digital Mobile Channel Using Orthogonal Frequency Division Multiplexing”, IEEE Transactions on Communication. no.7 July 1985. [4] Russell, M.; Stuber, G.L.; “Interchannel interference analysis of OFDM in a mobile environment”, Vehicular Technology Conference, 1995 IEEE 45th, vol. 2, pp. 820 – 824,.Jul. 1995 [5] X.Cai, G.B.Giannakis,”Bounding performance and suppressing intercarrier interference in wireless mobile OFDM”, IEEE Transaction on communications, vol.51, pp. 2047-2056, no.12, Dec.2003.

31

[6] J. Armstrong, “Analysis of new and existing methods of reducing intercarrier interference due to carrier frequency offset in OFDM,” IEEE Transactions on Communications, vol. 47, no. 3, pp. 365 – 369, March 1999. [7] Y. Fu, S. G. Kang, and C. C. KO, “A new scheme for PAPR reduction in OFDM systems with ICI selfcancellation,” in Proc. VTC 2002- Fall, 2002 IEEE 56th Vehicular Technology Conf., vol. 3, pp 1418–1421, Sep. 2002. [8] Y.Zhao and S. Häggman, “Intercarrier interference selfcancellation scheme for OFDM mobile communication systems,” IEEE Transactions on Communications, vol. 49, no. 7, pp. 1185 – 1191, July 2001. [9] J.-J. van de Beek, M. Sandell, and P.O. Borjesson, “ML estimation of time and frequency offset in OFDM systems,” IEEE Trans. Signal Process., 45, pp.1800–1805, July 1997. [10] Tiejun (Ronald) Wang, John G. Proakis, and James R. Zeidler “Techniques for suppression of intercarrier interference in ofdm systems”. Wireless Communications and Networking Conference, 2005 IEEE Volume 1, Issue, 13-17 pp: 39 - 44 Vol. 1, March 2005. [11] William H.Tranter, K.Sam Shanmugam, Theodore S.Rappaport, Kurt L.Kosbar, “Principles of Communication system simulation with wireless application”, Pearson Education, 2004. [12] P.H. Moose, “A technique for orthogonal frequency division multiplexing frequency offset Correction,” IEEE Trans. Commun., 42, pp.2908–2914, October 1994

UbiCC Journal - Volume 3

32

A NEW SIGNALLING PROTOCOL FOR SEAMLESS ROAMING IN HETEROGENEOUS WIRELESS SYSTEMS Azita Laily Yusof, Mahamod Ismail, Norbahiah Misran Dept of Electrical, Electronic & System Engineering, Universiti Kebangsaan Malaysia, 43600 UKM Bangi, Selangor, Malaysia. Tel.: +60389216122, Fax : +60389216146 Email: [email protected], {mahamod, bahiah}@eng.ukm.

ABSTRACT The world is undergoing a major telecommunications revolution that will provide ubiquitous communication access to citizens, wherever they are. Seamless roaming across different wireless networks which has different types of services and quality of service guarantees has becomes a major topic for the past several years in the research area. With the integration of different technologies, the signaling protocol of mobility management must be designed to support seamless roaming for both intra and interdomain system. In this paper, we designed a simplified system architecture, called enhanced system architecture evolution (eSAE) to support mobility between multiple heterogeneous wireless system. eSAE contains fewer network nodes and is reduced to only the enhanced node B (eNB) and access gateway (aGW) that comprise Mobility Management Entity (MME) and User Plane Entity (UPE). We designed a signaling protocol for the location registration due for intersystem roaming in next generation wireless systems. Performance analysis has been carried out and based on this proposed architecture, it is shown that this enhancement can reduce the signaling cost and latency of location registration. Keywords: Seamless roaming; Handoff latency; Intra and interdomain system; Heterogeneous wireless system

1

INTRODUCTION

In the next generation wireless systems, it is expected that the population of the mobile users will be increased with the development of various applications in the seamless global . Mobile users can have different services that suits their need and can move freely between different wireless systems. However, different wireless system will have different environments, interworking and integration. This scenario has becomes challenges for the researcher to support intra and intersystem mobility for providing continuous wireless services to mobile users in the next generation heterogeneous wireless networks. There has been many proposals to integrate different wireless systems. In [1], the mobility gateway location register (GLR) has been developed to support the intersystem roaming. The GLR converts signaling and data formats from one

UbiCC Journal - Volume 3

network to another. This protocol needs to request location registration after it receives signals from the new system and this cause high overhead of signaling cost and processing time. It also causes the triangular call routing problem because the call for roaming mobile in the same network need to route to the previous network before delivered to the new network. Boundary location register (BLR) [2] was designed in order to solve this problem. In this protocol, the home location register (HLR) is not involved in location registration unless the mobile goes through into another system. So the incoming calls of intersystem roaming mobiles are delivered to them directly. However, this approach is not scalable in the sense that one BLR gateway is needed for each pair of adjacent networks when integrating multiple networks. In [3], they proposed a distributed gateway foreign agent (GFA) where each foreign agent [FA] can function dynamically either as an FA or GFA.

33

There is no fixed regional network boundary and mobile decides to perform the home location update scheme increases the requirement of the processing capability on each mobility agent and mobile terminals. The hierarchical Intersystem Mobility Agent (HIMA) [4] was proposed where it acts as an anchor point to forward data as the user moves from one network to another. The HIMAs are placed at the gateway routers or anchor routers for mobile users with high roaming profiles. However, the scheme of address administrative issues and service level agreements across different wireless network and service providers is not analyzed in this paper. In [5], the author introduced an architecture called ubiquitous Mobile Communications (AMC) to integrate multiple heterogeneous systems. AMC eliminates the need for direct SLA among service providers by using a third party, Network Interoperating Agent (NIA). In this paper, they use distributed and hybrid scheme as a network selection. However, because the decision making is implemented in the mobiles, so the system information has to be broadcasted to the mobiles periodically by the handoff management module, resulting in a great update cost of the system. Moreover, the existing protocol does not consider the determination of the NIA’s number required for global integration. Low complexity, centralized network selection scheme [6] has been proposed to overcome the shortcomings of NIA. The proposed scheme eliminated the update cost whereby this scheme will only be invoked by changes in end users’ service requirements, beginning of a new application, or ending of an existing application. In this paper, we propose a simplified network architecture, eSAE to support the low latency system. The network is simplified and reduce to only the Base Station called enhanced Node B (eNB) and access gateway (aGW) that consists of Mobility Management Entity (MME) and User Plane Entity (UPE). The system uses all Internet Protocol (IP) network where all services are via packet switch domain only. In this proposed architecture, we design a signaling protocol for authentication and authorization. The rest of this paper is organized as follows. First we describe the existing system architecture and the signaling protocol called AMC. Then we present our proposed simplified architecture followed by the authentication and authorization information flow in eSAE. We discuss the simulation results and finally the conclusion.

2

according to its changing mobility and packet arrival pattern. However, this among different network operators. The architecture shown in figure 1, where the NIA functions as a trusted third party for authentication dialogs between the foreign agent and home network. The working principle of this third party architecture is as follows. When a mobile user requests services from an foreign network (FN) and the FN determines that it has no SLA with the user’s HN provider, it forwards the request to NIA to authenticate the user. Then, NIA talks to the user’s HN provider and mediates between the FN and HN for authentication message exchanges. Once the user is authenticated, NIA also creates security associations/keys required between different network entities. At the end of the proposed security procedures, the HN and FN will be mutually authenticated, and will have session keys for secured data transfer. They integrate the authentication and Mobile IP registration processes as defined in [5].

CURRENT AMC PROTOCOL

AMC integrates heterogeneous wireless systems using a third party, called Network Interoperating Agent (NIA) which eliminates the need for SLAs

UbiCC Journal - Volume 3

34

NIA FN

HN

AAAL HLR AU

AAAH

FA

HA

UE

Figure 1 : The architecture for AMC

3

THE PROPOSED ARCHITECTURE

Figure 2 shows our proposed architecture for the next generation wireless systems. eSAE will have two types of network elements supporting the user and control planes. • The first is the enhanced base station, so called enhanced node B (eNB). This enhanced base station provides air interface and performs radio resource management for the access system. • The second is the access gateway (aGW). The aGW provides termination of the bearer. It also acts as a mobility anchor point for the user plane. It implements key logical functions including Mobility Management Entity (MME) to manage/store user

UbiCC Journal - Volume 3

equipment (UE) context, generate temporary identities, UE authentication and authorization and mobility management and User Plane entity (UPE) to manage/store UE context and packet routing/forwarding, initiation of paging. Comparing the functional breakdown with existing architecture: • Radio Network elements functions, such as Radio Network Controller (RNC), are distributed between the aGW and the eNB. • Core Network elements functions, such as SGSN and GGSN or PDSN (Packet Data Serving Node) and routers are distributed mostly towards the aGW.

35

aGW (MME/UPE)

aGW (MME/UPE)

ME/UPE)

(E/UPE)

eNB

eNB

eNB

eNB

Figure 2 : The proposed mobility management architecture for next generation all-IP-based wireless systems

3.1

Authentication and Authorization

The working principle of this architecture is as follows. When a mobile user requests service from a FN and the FN determines that it has no SLA with user's home service subscriber (HSS), it forwards the request to aGW to authenticate the user. Then, aGW talks to user's HSS and mediates between FN and HSS for authentication message exchanges. Once the user is authenticated, aGW also creates security associations/keys required between different network entities. Finally the HSS and FN will be mutually authenticated, and will have session keys for secured data transfer. The authentication and Mobile IP registration processes are integrated in the proposed architecture using the procedures defined in [7]. IEEE 802.1x port access control standard [8] is used for end-to-end mutual authentication between a UE

UbiCC Journal - Volume 3

and its HSS. IEEE 802.1x uses a special frame format known as Extensible Authentication Protocol (EAP) over LAN (EAPOL) for transportation of authentication messages between a UE and an access point (AP). EAP [9] over RADIUS [10] or Diameter [11] is used for the transportation of authentication messages between other entities. When the UE roams into a FN, the authentication and MIP registration are carried out as described below. Here, EAP-SIM [12] is used to illustrate the authentication process. Note that any other authentication schemes, e.g. EAP-AKA [13], EAP-SKE [14], EAP-TLS [15] etc. can also be used. Figure 3 shows the location registration procedure.

36

eNB

UE

aGW (MME/UPE)

AuC

AAAH

Inter AS Anchor

HSS

1. Network Discovery and Access System Selection 2. Attach Request [c1 + c2 ]

[c1 + c2 ]

3. Authentication

[c3 + c4]

4. Attach Reply [c1 + c2 ] 5. Register MME [c3 + c4 + c5 + c6 ]

6. Confirm Registration [c3 + c4 + c5 + c6 ]

7. Selection of Intersystem Mobility Anchor GW [c1 + c2 ]

8. User Plane Route Configuration [c3 + c4 + c5] 9. Configure IP Bearer QoS [c7]

10. Attach Accept

[c1 + c2 ]

Figure 3 : The authentication and authorization signaling messages 1. The UE discovers new access system and performs access system and network selection. 2. The UE sends an attach request, MIP Registration Request including Mobile-AAA Authentication extension (as defined in [16]) to the aGW. The UE also includes a SIM Key Request extension [19] and a Network Access Identifier (NAI) [18], e.g. UE@relam, in its MIP Registration Request. The SIM Key Request extension contains a random number (NONCE_UE) picked up by the UE, which is used for new authentication key generation as discussed later in this section. 3. When the aGW receives the MIP Registration Request and finds the Mobile-AAA Authentication extension, it learns that the UE is a roaming user. Based on the NAI in the MIP Registration Request, the aGW recognizes that the operator does not have direct SLA with the UE's HN and forwards the MIP Registration

UbiCC Journal - Volume 3

Request to the Home AAAH server (AAAH). Once the AAAH receives the MIP Registration Request containing the SIM Key Request extension, first it verifies the Mobile-AAA authentication extension. If the authentication is successful, it contacts the home authentication center (AuC) of the UE and obtains n number of triplets (RAND, SRES, Kc), where RAND denotes a random number, SRES denotes the response and Kc is the key used for encryption. Then it forwards a copy of these triplets to aGW. When aGW receives n triplets it derives a UE_AAAH key (KUE_AAAH) and calculates message authentication code (MAC) for the RANDs (MAC_RAND) using [19] KUE_AAAH = h(n * Kc│NONCE_UE) and MAC_RAND = PRF(KUE_AAAH, α)

(1)

37

where α is n*RAND│key lifetime; and h() and PRF() denotes a one-way hash function and a keyed pseudo-random function, respectively. Then, aGW sends the RANDs, MAC_RAND and SIM Key Reply extension to UE. The UE derives the corresponding SRES and Kc values using its SIM card and the received RANDs. It also calculates (KUE__AAAH) and MAC_RAND using (20). It validates the authenticity of RANDs by comparing the calculated MAC_RAND with the received MAC_RAND. Thus, confirming that the RANDs are generated by its HN. If the MAC_RAND is valid, the UE calculates a MAC for its SRES values using [19] MAC_SRES = PRF(KUE (2)

_AAAH,

n * SRES)

The MAC_SRES is used by aGW to know if the SRES values are fresh and authentic. The UE also generates security association keys; (KUE_eNB) for the eNB and (KUE_HSS) for the HSS using [19] KUE_eNB = PRF(KUE _AAAH, AddeNB) and KUE_HSS = PRF(KUE_AAAH, AddHSS)

(3)

where AddeNB and AddHSS are the IP address of eNB and HSS, respectively. These keys are used to authenticate subsequent Mobile IP registrations until the key lifetime expires. 4. Now, the UE resends MIP Registration Request message to the eNB containing SRES extension [19] and Mobile-AAA Authentication extension. When eNB detects the presence of Mobile-AAA Authentication extension, it forwards the MIP Registration Request message to aGW. aGW calculates MAC_SRES and compares that with the received MAC_SRES. If valid, it forwards the MIP Registration Request message to the AAAH. After successful authentication AAAH forwards the MIP Registration Request containing KUE_HSS (calculated using (4)) to the HSS. KUE_HSS = PRF(KUE_AAAH, AddeNB, AddHSS ) (4) 5. The HSS confirms the registration of the new aGW. Subscription data authorising the Default IP Access Bearer are transferred. Information for policy and charging control of the Default IP Access Bearer is sent to the aGW. 6. An Inters AS Anchor is selected. The IP address configuration is determined by user preferences received from the UE, by subscription data, or by HPLMN or VPLMN policies. 7. The Inter AS Anchor configures the IP layer

UbiCC Journal - Volume 3

with the determined user IP address. The user plane is established and the default policy and charging rules are applied. The user plane establishment is initiated by the aGW. 8. The aGW provides the Evolved RAN with QoS configurations for the Default IP Access Bearer, e.g. the upper limits for transmission data rates. 9. The aGW accepts the UE's network attachment and allocates a temporary identity to the UE. Also the determined user IP address is transferred. aGW calculates UE-eNB security key, KUE_eNB, and forwards the MIP Registration Reply (containing KUE_eNB and the Kc keys) to eNB. eNB extracts KUE_eNB and the Kc keys and send a MIP Registration Reply to the UE. The Kc keys are used for secure data transfer between the UE and eNB providing confidentiality and integrity to the data traffic.

4

PERFORMANCE ANALYSIS of eSAE

In this section, we analyze the performance of signaling cost and latency of location registration due to intersystem roaming. The costs for location registration are associated with the traffic of messages between the entities and the accessing cost of databases. To compare the total of signaling cost between the proposed and existing architecture, we assume the following parameters : Table 1 : Simulation parameters p α β c1 c2 c3 c4 c5 c6 c7

transmission cost of messages the UE and the eNB transmission cost of messages the eNB and the aGW transmission cost of messages the aGW and the HSS transmission cost of messages the UE and the eNB transmission cost of messages the eNB and the aGW transmission cost of messages the aGW and the AAAH transmission cost of messages the AAAH and the AUC transmission cost of messages the AUC and the IASA transmission cost of messages the IASA and the HSS transmission cost of messages the eNB and the aGW

between between between between between between between between between between

38

16

Total signalling cost

14

12

10

8

1.4

4 NIA eSAE 0.15

0.2

0.25

0.3

0.35

0.4

0.45

0.5

Probability of intersystem roaming

Figure 4 : Total cost of location registration

UbiCC Journal - Volume 3

1.2

1

0.8

0.6

0.4 NIA eSAE 0.2 0.1

0.15

0.2

0.25

0.3

0.35

0.4

0.45

0.5

Probability of intersystem roaming

Figure 5: Latency of location registration

5

Conclusion

In this paper, we introduced a new signaling protocol for mobility management which is based on the enhancement of the SAE architecture. We proposed the detailed procedure of location registration for the eSAE protocol. This protocol is specifically developed to decrease the latency of the NIA protocol. To summarize the comparison of eSAE and NIA protocol, we measured the signaling cost of location registration. Moreover, we evaluated the latency of the location registration, which is composed of waiting time and processing time at a specific database. The results show that the eSAE protocol is able to reduce the signaling cost and latency of location registration for the mobile’s moving across different networks. 4

6

2 0.1

1.6

Latency of location registration

We assume that a mobile keeps the same mobility pattern when it moves into another system. Further, we assume that the updating, deletion and retrieval in the database have the same cost, a. We calculate the total signaling of location registration which is the sum of the transmission cost and the cost associated with database access. Then we calculate the latency of location registration where we assume the average processing time of each database access is 1/μ and the average waiting time is w. So the latency for location registration is the total time including waiting time in queue and the processing time. Figure 4 shows the comparison of total signaling cost as a function of intersystem roaming probability. As we can see from the graph, the total signaling cost increases as the intersystem roaming probability increases,. We can also observe that the total signaling cost of the eSAE protocol is much lower than the NIA protocol. It is seen that as compared to the NIA protocol, the eSAE protocol yields significantly improved because of the simplified architecture. The NIA protocol has to access more databases compared to the eSAE protocol. Similar to the case of total signaling cost, the latency of location registration increases with the increases of the intersystem roaming probability. Figure 5 shows the result obtained. Therefore, eSAE protocol reduces the total signaling cost and latency of location registration so that it is more suitable for an intersystem roaming environment.

REFERENCES

[1] ETSI TS 129 120 V3.0.0, “Universal mobile telecommunications systems (UMTS); mobile application part (MAP) specification for gateway location register (GLR)”, 3GPP/ETSI 2000, 2000-2003. [2] I.F. Akyildiz, W. Wang, “A new signaling protocol for intersystem roaming in next generation wireless systems”, IEEE Journal on Selected Area in Communications, vol.19, no. 10, Oct. 2001, pp. 2040-2052. [3] I.F. Akyildiz, W. Wang, “A novel distributed dynamic location management scheme for minimizing signaling costs in mobile IP”, IEEE Transactions on Mobile Computing, vol. 1, No 3, July 2002, pp. 163-175. [4] N. Shenoy, “A framework for seamless roaming across heterogeneous next generation wireless

39

networks”, Journal on ACM Wireless Networks. [5] I.F. Akyildiz, S. Mohanty, J. Xie, “A ubiquitous mobile communication architecture for nextgeneration heterogeneous wireless systems”, IEEE Communications Magazine, vol. 43, no. 6, pp. 29-36, June 2005. [6] H. Jia, Z. Zhang, P. Cheng, H. Chen, A. Li, “ Study on network selection for next generation heterogeneous wireless networks”, in Proc. IEEE International Symposium on Personal, Indoor and Mobile radio Communications”, 2006. [7] Glass, S., Hiller, T., Jacobs, S., and Perkins, C., “Mobile IP authentication, authorization, and accounting requirements,” RFC 2977, IETF, 2000. [8] “IEEE Standard for Local and metropolitan area networks - Port-Based Network Access Control.” IEEE Std 802.1X-2001. [9] Blunk, L. and Vollbrecht, J., “PPP Extensible Authentication Protocol (EAP),” RFC 2284, IETF, 1998. [10]Rigney, C. and et al, “Remote Authentication Dial In User Service (RADIUS),” RFC 2865, IETF, 2000. [11]Calhoun, P. R., “Diameter Mobile IPv4 application,” Internet Draft, draft-ietf-aaadiameter-mobile ip 16.txt, work in progress, 2004. [12]Haverinen, H. and Salowey, J., “EAP SIM authentication,” Internet Draft, draft-haverinenpppest-eap-sim-16.txt, work in progress, 2004. [13]Arkko, J. and Haverinen, H., “EAP AKA Authentication,” Internet Draft, draft-arkkopppest-eap-aka-09. txt, work in progress, 2003. [14] Salgarelli, L., “EAP SKE authentication and key exchange protocol,” Internet Draft, draftsalgarelli-pppext-eap-ske-03.txt, work in progress, May 2003. [15]Aboba, B. and Simon, D., “PPP EAP TLS Authentication Protocol,” RFC 2716, IETF, 1999 [16]Aboba, B. and Simon, D., “PPP EAP TLS Authentication Protocol,” RFC 2716, IETF, 1999 [17]Haverinen, H., Asokan, N., and Maattanen, T., “Authentication and key generation for Mobile IP using GSM authentication and roaming,” in Proc. IEEE ICC (ICC'01), pp. 2453{2457. [18]Calhoun, P. and Perkins, C., “Mobile IP network access identi¯er extension for IPv4,” RFC 2290, IETF, 2000. [19] Haverinen, H., Asokan, N., and Maattanen, T., “Authentication and key generation for Mobile IP using GSM authentication and roaming,” in Proc. IEEE ICC (ICC'01), pp. 2453{2457. [20]“3GPP System to WLAN Interworking: Functional and Architectural De¯ni-tion.” Tech. rep. 3GPP TR 23.934 v0.3.0. 3GPP.

UbiCC Journal - Volume 3

40

Artificial Neural Network application in Parameter Optimization of Rectangular Microstrip Patch Antenna 1

2

R.Malmathanraj , S.Thamarai Selvi Lecturer /ECE National Institute of Technology, Tiruchirapalli [email protected] 2 Professor and Head /Information Technology, Madras Institute of Technology, Anna University, Chennai. [email protected] 1

Abstract - Printed microstrip antennas and

arrays are known to have limitations in terms of bandwidth and efficiency, all imposed by the very presence of the dielectric substrate. The paper deals with the design of a probe fed and edge fed rectangular microstrip patch antenna with the basic parameters W,h,L,ε r,fo to achieve better bandwidth and directivity with efficient radiation pattern and Gain. The analytical results for various possible dimensions and different dielectric values were calculated for achieving bandwidth and directivity without any structural complexities .The analytical results were tested by simulating with basic design software PCAAD,MSTRIP40. To obtain an optimum value for the design parameters of the microstrip antenna Support Vector Machines (SVM), Generalised Regularisation Neural Network (GRNN) and Back Propagation Network (BPN) were implemented to train the network to attain optimized values to yield wide bandwidth and better directivity with high Gain. The application of artificial neural network ensures an optimum design methodology for microstrip antenna design which is revealed when comparing the results with analytical methods and the results of the simulation softwares. 1. Introduction Microstrip patch antennas have been attractive due to their conformal properties. Mathematical modeling of the basic microstrip radiator was initially carried out by the application of transmission-line analogies to

UbiCC Journal - Volume 3

simple rectangular patch fed at the center of radiating wall. A microstrip patch antenna is a radiating patch on one side of a dielectric substrate, which has a ground plane on the underside. The EM waves fringe off the top patch into the substrate, reflecting off the ground plane and radiates out into the air. Radiation occurs mostly due to the fringing field between the patch and ground. The radiation efficiency of the patch antenna depends largely on the substrate permittivity (εr) of the dielectric[2]. The basic geometry of the microstrip patch is shown in fig (1)

Figure 1.Microstrip Patch Antenna Geometry

Ideally, a thick dielectric is preferred for broadband purposes. Small values of width W of patch result in low antenna efficiencies

41

while large W values lead to higher order modes. Substrate thickness should be chosen as large as possible to maximize bandwidth and efficiency, but not so large as to risk surfacewave excitation. The patch length is determined by condition for resonance. This occurs when the input impedance is purely real. The bandwidth of the patch is defined as the frequency range over which it is matched with that feed line within specified limits. In other words, the frequency range over which the antenna will perform satisfactorily. This means the channels have larger usable frequency range and thus results in increased transmission. The bandwidth of an antenna is usually defined by the acceptable standing wave ratio (SWR) value over the concerned frequency range[3,4]. Dimensions of the top patch were calculated to get the required bandwidth and the impedance matching. The advantages of microstrip antenna is that they are low-cost, conformable, lightweight and low profile, while both linear and circular polarization easily achieved. Disadvantages of microstrip antenna include such as a narrow bandwidth, a low gain (~6 dB) and polarization purity is hard to achieve. Several methods were reported in literature to improve impedance bandwidth including employing wide band impedance matching, stacked patches and utilizing thicker substrates[9].

2.Design Methodology 2.1 Basic Rectangular Microstrip The major design task of this paper is optimization of the dimensions of the probe fed rectangular microstrip patch antenna. The simplest approach is adopted to demonstrate how effectively the Artificial neural network can be used to train and optimize the various parameters involved in the design of microstrip patch antenna. This work concentrates only on the basic geometry of the microstrip ignoring

UbiCC Journal - Volume 3

various complex structures adopted for the enhancement of bandwidth, Directivity and Gain. The size of the probe is selected as 0.2 mm and the various feeding positions were considered for the calculation[5]. The dimensions of the patch antenna along with the substrate permittivity and the probe position is varied for different operating frequencies and the numerical results were arrived using the basic design formulas of the microstrip patch listed below, The width of the Microstrip patch antenna is given by c

W=

 r 1

2 f0

-(1)

2 Effective dielectric constant is given by 1 

1 r 1  h 2 reff = r  1 12  -(2)  2 2  W Effective length ( Leff ) is given by C Leff  -(3) 2 f 0 reff Length extension ( L ) is given by



W   0.3  0.264 h  -(4) L 0.412h W  reff 0.258  0.8  h  reff

Actual length of patch ( L ) is given by L= Leff 2L

-(5)

Ground plane dimensions ( Lg and W g) is given by Lg = 6h +L -(6)

42

Wg = 6h + W

-(7)

Height of the substrate is given by

0.3C h 2 fo  r

-(8)

The bandwidth of a rectangular patch is given by





BW 3.77  r 1 / 2r  W / L h / 0  --(9) Where f 0 is the resonant frequency,  r is the relative substrate permittivity, C is the speed of light 3 108 m/s.

2.2 Artificial Neural Network Design for Rectangular Microstrip Patch Antenna . A microstrip rectangular patch antenna (Fig. 1) can be viewed as a matrix with X variables and four unknown the bs, such that AX=b The unknown are the resonant frequency (RF), bandwidth (BW), gain(G), and polarization (PL). The variables X, are the patch length (L1), patch width(L2), substrate height (H 1) , substrate relative permittivity (r) and the feeding positions(Xf) and (Yf)

[A][ L1 , L2 , r, H1 , X,Y, Xf Yf]T=[G,BW,S 11,PL]T G=f(L1 , L2 , r, H1 , X,Y, Xf , Yf); BW=f(L1 , L2 , r, H1 , X,Y, Xf , Yf); S11=f(L1 , L2 , r, H1 , X,Y, Xf , Yf); PL=f(L1 , L2 ,  r, H 1 , X,Y, Xf , Y f ); In the neural network design the inputs are L1 , L2 , r, H1 , X,Y, Xf , Yf and the output is G,BW,S 11,PL. An artificial neural network is an information processing paradigm. That is inspired by the way biological nervous systems, such as the brain, process information. The key element of this paradigm is the novel structure of the information processing system .It is composed of a large number of highly interconnected processing elements (neurons), working in union to solve specific problems. Artificial neural network is like a normal human, it learns by examples.[2,6] A neural network with feedback is an adequate representation of the information

UbiCC Journal - Volume 3

processing structure of rectangular microstrip antennas, where the input neuron units are (L ,W, ε r ,h ,P ,fo )and the output units are (BW ,D ,G ,RP ). The learning paradigm on the microstrip is supervised learning, where the mapping function between the inputs and outputs is the matrix A. The inputs are weighted and the effect that each input has at decision making is dependent on the weight of the particular input. The weight of an input is a number that when multiplied with the input gives the weighted input. Their calculation is based on the method of moments. These weighted inputs then generate the unknowns. Those unknowns are then compared to stored information that gives the desired bandwidth, directivity along with radiation pattern and gain. The gain is expected to be greater than 3 db and polarization should be linear or circular [1,2]. A good paradigm of supervised learning that is of interest to microstrip antenna designer is error correcting learning that is minimization of error between the desired and computed values. In this learning paradigm, the set of weights that minimizes the error between the teaching input and the

43

weighted inputs is obtained. Neural networks are general function approximators. One important characteristic is that they can learn any Input-Output (IO) mapping by using the information contained in a given input-output data set without needing a structure definition of the IO-mapping. The type of network, parameter settings, number of hidden neurons, and the connectivity of the neural network define the structure of the approximated IOmapping. The only additional information a neural network needs, besides the input-output data, is the definition of the input and output parameters, the relevant parameters which span the IO-mapping. The ideal use of neural networks is antenna model parameters optimization. Without knowing the IOmapping structure of the model, the neural network can learn to mimic the IO-mapping [7,8]. In the IO-mapping problem a feedforward neural network is used because the antenna model is a static mapping (except at the very moment the design failure occurs). Also the ’tangent sigmoidal’, e.g. ’tansig’, activation function will be used in the hidden layer of the network because of two reasons. One, in the antenna design, IO-mapping is a smooth mapping with little to none discontinuities. The ’tansig’ is also a smooth function with the capability of approximating a discontinuity (by squashing the shape with respect to the input axis). The second reason for choosing the ’tansig’ function is that this function is a very ’general’ function. After a failure, the antenna design IO-mapping has changed into an unknown form. Therefore an activation function, which can be used to mimic almost all shapes, is more preferable since in that case all (unknown) IO-mapping can be approximated by the neural network. Neural networks are based on the human brain and its enormous capability of learning and adapting. Over decades, people have been trying to model the human brain

UbiCC Journal - Volume 3

mathematically. The structure of the human brain and the learning process is known, but the main difference between both networks is the effciency. The human brain is capable of recognizing a familiar face in approximately 100-200 ms, where conventional computers can take hours or days to fulfill less complex tasks. The biological neural network is still much faster then the artificial neural network (ANN or NN), but the capabilities of the neural networks are promising. In this section, the general structure of a neural network will be explained. First the concept of neurons is treated with a special attention to the several activation functions. The possible combinations of several neurons in layers is handled in the discussion about the architecture of networks. Some example networks are used to illustrate the effect of the parameters of the networks. The primary element in a neural network is the neuron, an information-processing unit. A mathematical model of an artificial neuron in given in Fig. (2). The structure is similar to the human neuron; more information about the functioning of the human neural network. The elements xi on the left side are the input signals to the neuron k. These inputs are multiplied by the corresponding weights wki and summed together with the bias bk. The results of the summation vk is passed through an activation function (vk) producing the output yk. The mathematics of the neuron given in Fig. (2) can start with the weighted inputs p v k   wki x i -(10) i 1 The output can be written as

yk k (vk bk )

-(11)

A weighted bias can be included by adding an extra term in the first equation.

p v   w xi k ki i 0

-(12)

44

Where the bias is changed to a fixed input of 1 and with a weight of wk0 . The shape of the output is then only depending on the activation function of vk. y k k (v k ) -(13) The type of activation function has a large influence on the output of the neuron as can be seen from the equation (12). In a signal flow diagram, a neuron can be represented as shown in Fig. (3).

x0=1 x1

wk,0 wk,1 x2

wk,2

vk (k)

yk

. . . xn

wk,n

Figure 2. Mathematical definition of a neural network

X1

wk,

bk

1

wk, X2 UbiCC Journal - Volume 3

Σ

2

. . .

45

Fig 3. neuron

Signal

flow

diagram

A neural network is a directed graph consisting of nodes with interconnecting synaptic and activation links, and is characterized by four properties: 1. Each neuron is represented by a set of linear synaptic links, an externally applied bias, and a possibly nonlinear activation link. The bias is represented by a synaptic link connected to an input fixed at +1. 2. The synaptic links of a neuron weight their respective input signals. 3. The weighted sum of the input signals defines the induced local field of the neuron in question. 4. The activation link squashes the induced local field of the neuron to produce an output. 3. Architecture Selection A variety of Neural network architectures are available to process the data from the input data set files. A multi layer Backpropagation Network architecture, Generalized Regularization neural network (GRNN) and Support vector machines (SVM) were used for training because of its ability to generalize well when applied to a wide variety of applications and also for the ability to have better regression.

UbiCC Journal - Volume 3

of

a

3.1 Learning

As the neural network software read the training set, the network learns the data patterns in the training set. Learning subprograms differ depending on the architecture selected. As training progressed, statistical graphs furnished by the neural net software, provided a means to monitor training progress. Numerical historical data and repetitive examples in which the solution is already known are required to train a neural network. While the relationship between variables may not be known, network results can be improved by the addition of more variables. Data may need different representation, for example if data has a very large value range, logarithms or other data transformations or conversions may be necessary. 3.2 Generalised Regularisation Neural Network (GRNN) The GRNN is based on the Nadaraya – Watson Kernel regression. GRNN’s feature fast training times can model non linear functions and have been shown to perform well in noisy environments given enough data. The primary advantage of the GRNN is the speed at which the network can be trained. Training a GRNN is performed in one pass of the training data through the network, the

46

training data values are copied to become the weight vectors between layers. The architecture of the GRNN is shown in the figure 4, it has four layers input pattern, summation and output, with weighted connections Wij between the input and pattern layer and

Summation Layer

Pattern Units

Output Units

Input S S D

fo S

L

S

W

BW ε r

D

S- Summation Unit D- Division Unit

P

fo Figure 4.Architecture of GRNN Network

Input

UbiCC Journal - Volume 3

47

sets or between H1 and H2 we need to minimize ||w|| with the condition that there are no data points between H1 and H2 3.3 Support Vector Machines Traditionally neural networks have been used for classification, which is based on Empirical Risk Minimization (ERM). SVM was developed by Vapnik and had become popular tools for data mining. The formulation embodies the Structural Risk Minimization (SRM), which is superior to empirical risk minimization. SRM minimizes the upper bound on expected risk as supposed to ERM that minimizes the error on training data. So, SVM generalizes much better. There are many linear classifiers that can separate data, but SVM only maximizes the margin i.e. the distance between it and the nearest data point in each class. We have N training data {(x 1,y1), (x2,y2),….. (xN,yN)} Where xi Є Rd and yi Є {+1,-1}. It needs to be classified using a linear hyper plane classifier

w . x –b ≥+1 for yi = +1 w . x –b ≤-1 for yi = -1 Combining the above two equations, yi ( w . x – b ) >= 1

yi ( w . x – b ) ≥1

-(20)

This is a convex quadratic problem in w, b in a convex set. The solution is found by solving using lagrangian method by introducing lagrangian multipliers. It is easier to solve using lagrangian dual equation given by -(21)

-(14)

This hyper plane will have maximum distance between each class. This hyper plane H : y =w . x – b = 0 and two hyper planes parallel to it H1 : y =w . x – b= +1

-(15)

H2 : y =w . x – b = -1

-(16)

With no data points between H1 and H2, and distance between H1 and H2 maximized. Some training point will lie on the hyper plane H1 and H2, they are called support vector machines because they define the separating plane and the other training points can be removed or moved provided they don’t cross the planes H1 and H2. The distance between hyper plane H1 and H2 is 2/ || w||. To maximize the distance between the two data

UbiCC Journal - Volume 3

-(19)

So the problem of maximizing the distance between hyper plane H1 and H2 is formulated as min ½ wT w subject to

LD = Σi αi - Σi Σj αi αj yi yj xi · xj f(x) =sgn (w.x - b)

-(17) -(18)

The significance of the above equation is that the training input vectors appear only as dot product. So when the data is not linearly separable it is required to transform the data into a higher dimensional. This causes complex calculations in neural networks but in SVM as data appear only as a dot product all calculation can be carried explicitly in low dimension if a kernel function exists for LD = Σi αi - Σi Σj αi αj yi yj Φ(xi) · Φ(xj) -(22) as Φ(xi ) · Φ(xj ) = K(xi , xj ) Where K is the kernel function. This is equivalent as the dot product in high dimension is equal to kernel function in input space. The common kernel function used is Gaussian kernel, K (xi , xj) = e - || xi – xj || 2 / σ2

-(23)

48

Mercers condition determines whether a function g(x) can be used as a kernel or not, ∫g(x)2 dx should be finite. 4.Design Implementation and Result. The dimensions of the rectangular patch were selected in a trial and error basis considering the constraints of the design in selecting the values. The different geometrical parameters were designed analytically and the bandwidth given in equation (9) was used to calculate the value for the selected dimensions. The parameters were used to construct the structure using the simulation software. The bandwidth and directivity along with the gain and radiation pattern of the design were obtained. The parameters of the patch equations (1-9) with the feed position for a resonant frequency are fed as input to the networks. The impedance bandwidth and directivity was taken as the output of the network. The analytical data values are given as input to train the network to obtain an optimized geometry for the probe fed microstrip antenna .The wide range of parameters was used to provide the optimum result and the training steps were increased to obtain the accuracy. The validity of the network was tested by comparing the analytical results obtained from the basic formulas for a given set of input values. The same parameters were used to construct a probe fed rectangular patch using simulation software shown in figure (7) and the output radiation pattern was obtained as shown in figure (8). The current pattern of the designed antenna is also plotted which shows the even distribution due to proper impedance matching of the probe feed. The same values were trained using the three networks Backpropagation Network architecture (BPN), Generalized Regularization neural network

UbiCC Journal - Volume 3

(GRNN) and Support vector machines (SVM) whose results were in good agreement with the analytical as well as the designed structure output shown in Table (1). The input output relations were also checked for the experimental results. The Backpropagation Network architecture achieves the antenna parameter optimization with maximum time for convergence. The GRNN and the SVM neural network achieves optimization with quicker learning time as shown in Fig 11 and 12. In this research analysis for antenna parameter optimization the GRNN neural network produced the accurate result with comparatively minimum time for convergence. The computational time was very less in terms of seconds with high accuracy as shown in Fig 13. The optimized parameters obtained using the training neural networks achieved high impedance bandwidth of 7.8%, directivity 7.73db without side lobes and offered high gain 8.67 dbi and radiation efficiency 100% was attained. The results were comparatively better when compared with the results from analytical analysis and simulation analysis for Microstrip patch antenna using PCAAD and MSTRIP40. To train the SVM parameters [alpha,b] = trainlssvm ({X, Y, type, gam, sig2, kernel, preprocess})

Outputs alpha matrix with support values of SVM b

vector with bias term(s) of SVM

Inputs

49

Fig 5. Antenna output model using MATLAB Software .

Fig 6 Antenna output model using MATLAB Software .

UbiCC Journal - Volume 3

50

Model Trained object oriented representation

Gam

Regularization parameter

of the SVM model

sig2

Kernel parameter (bandwidth in the

Model Object oriented representation of the

case of the 'RBF_kernel')

SVM model

kernel Kernel type default 'RBF_kernel'

X

matrix with the inputs of the

training data Y

Xt

inputs of the test data

preprocess

preprocess

vector with the outputs of the

training data

Plotting the graph in SVM

Type function estimation

plotlssvm({X,Y,type,gam,sig2,'RBF_kernel','p

Gam

Regularization parameter

reprocess'},{alpha,b});

sig2

Kernel parameter (bandwidth in the

case of the 'RBF_kernel')

Inputs X

kernel Kernel type (by default 'RBF_kernel')

data

preprocess

Y

preprocess'(*) or 'original'

matrix with the inputs of the training

vector with the outputs of the training

data Simulating the SVM

Type function estimation

Yt=simlssvm({X,Y,type,gam,sig2,'RBF_kerne

Gam Regularization parameter

l','preprocess'},Xt);

sig2

Kernel parameter (bandwidth in

the case of the 'RBF_kernel') Outputs

kernel

Yt

'RBF_kernel')

matrix with predicted output of test

data

Inputs X

type

(by

default

preprocess

preprocess

alpha

support values obtained from

training matrix with the inputs of the training

data Y

Kernel

b

Bias

term

obtained

from

training vector with the outputs of the training

data Type function estimation

UbiCC Journal - Volume 3

5.Conclusion The radiation pattern of the designed antenna presented in this paper figure (7) clearly depicted that it is a wideband antenna with high directivity, gain with radiation

51

efficiency. The major attraction of this antenna is size reduction along with bandwidth and directivity made it most suitable for satellite communication, commercial applications. The size reduction and operating frequency make it suitable

Figure 7.Structure of Probe Fed Microstrip Rectangular Patch Antenna

for mobile communication. The parameter optimization using the networks is the major attraction of this paper, which highlights the simplicity, accuracy and reduction in computational time for the designers of interest.

Figure 8.Current Distribution Pattern

Figure 9. Radiation Pattern of the Optimized Patch Antenna Figure 10. Plot Showing the Learning Trial of Back Propagation Network

Fig 1 1. Plot to show the time for convergence for SVM neural network.

UbiCC Journal - Volume 3

52

Fig 13. Plot to show the weight surface of SVM Fig 12. Plot to show the time for convergence for GRNN neural network.

Figure 15. Output of the Optimized output of the rectangular patch antenna using MSTRIP

UbiCC Journal - Volume 3

Fig 14. Plot to show the radiation pattern using PCAAD

53

References 1.Dipak K.Neog, Shyam S.Pattnaik, C.Panda, Swapna Devi, Bonomali Khuntia, and Malaya Dutta, “Design of a Wideband Microstrip Antenna and the use of Artificial Neural Networks in Parameter Calculation”, IEEE Antennas and Propagation Magazine, Vol.47, No.3, June 2005,pp.60-65. 2. Inder J.Bahl, Prakash Bhartla and Stanislaw S. Stuchly, “ Design of Microstrip Antennas Covered with a Dielectric Layer”, IEEE Transactions on Antennas and Propagation, Vol. AP-30, No. 2, MARCH 1982, pp. 314-318. 3. Kin-Lu Wong and Yi-Fang Lin, “Small broadband rectangular microstrip antenna with chip-resistor loading”, ELECTRONICS LETTERS, 1 l September 1997,Vol. 33 No. 79,pp.1593, 1594. 4. S.Lebbar, Z.Guennoun, M.Drissi, and F.Riouch, “A Compact and Broadband Antenna Design Using a Geometrical- MethodologyBased Artificial Neural Network”, IEEE Antennas and Propagation Magazine, Vol.48, No.2, April 2006,pp.146-154. 5. C. L. Mak, K. M. Luk, Senior Member, IEEE, K. F. Lee, Fellow, IEEE, and Y. L. Chow, “Experimental Study of a Microstrip Patch Antenna with an L-Shaped Probe,” IEEE Transactions on Antennas and Propagation, VOL. 48, NO. 5, MAY 2000,pp.777-783. 6. R.K.Mishra and Patnaik, “Designing Rectangular Patch Antenna Using the Neurospectral Method”, IEEE Transactions on Antennas and Propagation,AP-51,8 August 2003,pp.1914-1921. 7. S.S.Pattnaik, D.C.Panda and S.Devi, “Input Impedance of Rectangular Microstrip Patch Antenna Using Artificial Neural Networks”, Microwave and Optical Technology Letters,32,5,5 March 2002,pp.381-383. 8. S.S.Pattnaik, D.C.Panda and S.Devi, “Radiation Resistance of Coax-Fed Rectangular Microstrip Patch Antenna Using Artificial Neural Networks”, Microwave and Optical Technology Letters, 34,1,5 July 2002,pp.51-53. 9. D.M.Pozar, “Microstrip Patch Antennas,” in L.C.Godara (ed), Handbook of Antennas in Wireless Communications, New York, CRC Press, 2001,Chapter 6. 10.Ye Bin Hu Gu Yu , “The analyze and improve TCP performance using a DSR route protocol based on signal strength”, IEEE Wireless Communications, Networking and Mobile Computing, pp. 846 – 849, 2005. 11. Dongkyun Kim , Hanseok Bae, Jeomki Song, “Analysis of the interaction between TCP variants and routing protocols in MANETs”, IEEE Parallel Processing, ICPP 2005 Workshops, pp 380-386, 2005. 12. Prabakaran, M. Mahasenan, A. , “Analysis and enhancement of TCP performance over an IEEE 802.11 multi-hop wireless network: single session case”, IEEE International Conference on Personal Wireless Communications, pp-29- 33, 2005 13. Caihong Kai Yuzhong Chen Nenghai Yu, “An Improvement Scheme Applied to TCP Protocol in Mobile Ad Hoc Networks”, IEEE International Conference on Mobile Technology, Applications and Systems, pp.1-6, 2005

UbiCC Journal - Volume 3

54

PERFORMANCE OF TRANSMISSION SPACED SELECTION DIVERSITY IN DS-CDMA SYSTEMS Mona Shokair*, Maher Aziz**, Mohamed Nasr** *Faculty of Electronic Engineering, Menoufia University, Egypt. ** Faculty of Engineering, Tanta, Tanta University, Egypt. [email protected]

ABSTRACT The performance of transmission spaced selection diversity (SD) placed at base station (BS) in DS-CDMA system remains insufficiently clear. This performance will be evaluated by considering the effect of space distance between antennas and the maximum Doppler frequency (fd) on bit error rate (BER) performance under optimum conditions which are not clarified until now. Moreover, analysis of this system is presented under the effect of Rayleigh fading. Keywords: DS-CDMA, transmission spaced selection diversity, Rayleigh fading.

1 INTRODUCTION The demand for many radio services is increasing. New techniques are required to improve spectrum utilization to satisfy that demand without increasing the radio frequency spectrum that is used. One technique in a digital cellular system is the use of spread spectrum Code Division Multiple Access (CDMA) technology [1]. Another technique is diversity system. Cooperation between a CDMA system and diversity system has also been studied in [6]. Actually the main purpose of diversity system is mitigating the multipath fading which has negative effect on the quality of transmission of mobile radio communication. There are classifications of diversity system. One view of classifications is transmission and reception diversity. Other classifications are frequency diversity, polarization diversity, spaced diversity, time diversity and angle diversity. All these classifications are presented in detail in [2]. To combine diversity branches, many combing techniques are explained in detail in [1] and [2], including space Selection Diversity SD. In SD one of the M antenna branches that provides the highest Signal-to-Noise Ratio (SNR) is selected for data recovery. The success of diversity techniques depends on the degree to which the signals on the different branches are uncorrelated. This requires that the spacing between the antenna elements in the

UbiCC Journal - Volume 3

receiving or transmitting array is greater than a certain minimum distance. In this paper, the performance of transmission selection spaced diversity under the effects of spacing distance between antennas and the maximum Doppler frequency will be studied under using optimum conditions. These effects are not clarified until now. The organization of this paper is made as follows: Sect. 2 introduces the analysis of the system under Rayleigh fading. Computer simulation conditions are done in Sect. 3. Results are presented in Sect. 4. Conclusions are achieved in Sect. 5. 2 ANALYSIS OF THE SYSTEM OVER RAYLEIGH FADING To get expressions for both SNR and BER values, we consider a two – branch diversity system at BS with correlated fading channels. The received signal from each branch of the system can be modeled as [3]

rk(t)=Rkejαkejψm(t) + nk(t)

k=1,2

(1)

Where ψm(t) is the transmitted signal, Rk is a Rayleigh – distributed amplitude factor, αk is a uniformly distributed phase factor, and nk(t) is zero – mean Additive White Gaussian Noise (AWGN). The received signal can be described by:

55

rk(t)=[ Xk + jYk ] ejψm(t) + nk(t)

k=1,2

(2)

Where X1, X2, Y1, and Y2 are all Gaussian random 2 variables with zero mean and variance σ . The expectation can be expressed as,

E [XiYk] = 0

i = 1, 2;

k=1, 2

E[X1X2] = E [Y1Y2] = ρσ2

(3) (4)

Where ρ is the correlation coefficient between the fading channels. Another assumption is that the channel statistics are independent of the AWGN variables, which are also uncorrelated with each other, therefore

E [n1n2] = E[niXk] = E[niYk] = 0 i=1,2 ; k = 1,2

Eq. (9) and Eq. (10), a new SNR is defined for each uncorrelated signal:

Γ3 = (1 + ρ ) Γ Γ4 = (1 - ρ ) Γ

(11) (12)

Where Γ is the SNR of the original correlated signals. Now the BER values for a two-branch selective diversity system can be calculated from the following expression [3],

BER=

⎤ ⎥ ⎦

1 ⎡ Γ3 Γ3Γ4 Γ4 − + ⎢1− 2 ⎣ Γ3 +1 Γ4 +1 Γ4Γ3 +Γ3 +Γ4

(13)

(5)

3 COMPUTER SIMULATION CONDITIONS

Ref. [3] introduces a transformation matrix T to transform the correlated received signals r1(t) and r2(t) into two new uncorrelated signals r3(t) and r4(t) therefore,

DS-CDMA system with three antennas at the BS and one antenna at the MS is assumed. Fig. 1 shows propagation model at the BS. Table 1 shows simulation parameters. Incident waves

⎡ r3 (t ) ⎤ ⎡ r1 (t ) ⎤ ⎢r (t )⎥ = T ⎢r (t )⎥ ⎣4 ⎦ ⎣2 ⎦

⎡ 2 ⎢ Where T = ⎢ 2 ⎢− 2 ⎢⎣ 2

(6)

φ

2⎤ ⎥ 2 ⎥ 2⎥ 2 ⎥⎦

θ (7)

#3

#2

#1 d/λ

The two new received signals can be expressed as,

Figure1: Linear array and propagation model at BS.

rk(t) = [ Xk + jYk ] ejψm(t) + nk(t)

Table 1: Simulation parameters

k=3 , 4

(8)

By writing out the expressions of X3, X4, Y3, and Y4, it can be seen that they are functions of Gaussian random variables, therefore they are also Gaussian random variables; in addition they are mutually independent. Thus,

E[X23]

2

(9)

E[X24] = E [Y24] = (1- ρ) σ2

(10)

=E

[Y23]

= (1+ρ) σ

Also, n3 and n4 are functions of AWGN random variables, and they have the same noise power, and are uncorrelated with the new channel statistics. If the noise power at each receiver for the original correlated signals is the same, then, from

UbiCC Journal - Volume 3

Modulation Demodulation Symbol rate Spreading code Spreading factor

QPSK Coherent detection 30 Ksps Walsh code 128

To model the Rayleigh fading, we consider a set of 8 plane waves that are transmitted in random direction within the range of φ degrees at the BS [4]. The value of φ will be determined in the next section. Each of the plane waves has constant amplitude and takes the random initial phase distributed from 0 to 2π. The Doppler frequency is uniformly distributed from +fd to –fd ( fd : is the maximum Doppler frequency). The 8 incident plane waves arrive in random direction from 0 to 2π at the MS. QPSK is assumed with coherent detection. A square root raised cosine filtering with a roll-off

56

4 COMPUTER SIMULATION RESULTS Fig. 2 shows the effect of arrival angle, θ, of the signal on BER performance at Eb/N0=10dB. From this figure, it can be concluded that changing the value of θ gives slightly small effect. Therefore, we use in our simulation the value 300 of θ.

1.0E-01

BER

M=2

1.0E-02

M=1

1.0E-03 80

100

120

140

160

180

200

fd (Hz)

Figure 4: Maximum Doppler frequencies (fd ) vs. BER for Eb/N0=10 dB rho=0

1.0E+00

rho=0.1 rho=0.5

1.0E-01

BER

factor α of 0.5 is employed. A symbol rate of 30 ksps is assumed. The spreading code is Walsh code with spreading factor of 128. The Rayleigh fading channels were disturbed by AWGN. The Performance of the diversity system depends on correlation between antenna elements. The correlation is determined by antenna elements spacing, angle spread of incident waves φ and direction of arrival θ [5]. Thus, we have to optimize these values to get better BER performance.

rho=0.9 rho=1.0

1.0E-02 1.0E-03 1.0E-04

1.0E-02

1.0E-05

BER

0

5

10

15

20

SNR dB

0

45

90

135

cita in degrees

Figure 2: Arrival angle of the signal θ vs. BER. The effect of angle spread of incident waves φ is presented in Fig. 3. From this figure, we select the value of 120 which gives better BER performance.

Figure 5: Effect of branch correlation on BER performance of SD with two –branch diversity (theoretical) 1.0E-02

1.0E-03

BER

1.0E-03

1.0E-04

1.0E-05

1.0E-02

0

2

4

d/λ

6

8

BER

1.0E-03

Figure 6: Normalized distance (d/λ) vs. BER

1.0E-04 1.0E-05 6

8

10

12 14 16 fay in degrees

18

20

Figure 3: Angle spread of incident waves φ vs. BER The effect of fd on BER performance is shown on Fig. 4. As fd increases, due to the increase in the speed of the Mobile, BER performance will degrade. This degrading is due to rapid changes in channel characteristics. The lowest value of fd that gives better BER is 90 Hz.

UbiCC Journal - Volume 3

We use these values on the following paragraphs. Fig. 5 shows the results of the theoretical BER in Eq.13 for two-branch diversity system with different values of the correlation coefficient ρ. From this Figure it can be concluded that as the correlation coefficient increases the BER performance decrease. Also, as the coefficient approaches 1, one of the diversity branches is effectively removed; this leads to lose the advantage gained from antenna diversity. On the other hand, reducing values ρ correspond to an increase in the spatial separation between antennas. For this reason we have to look for the optimum antenna separation that yields better BER performance. Simulations were performed where the ratio d / λ was varied between 0.1 and 8. The results are indicated in Fig. 6. It is clear that as the ratio is increased, the BER performance is better. When d / λ is 6, we already have optimal BER

57

results. Also, increasing d / λ beyond 6 does not have any noticeable benefits. 1.0E-01 1.0E-02

M=1

1.0E-03

BER

M=2/0.5λ M=3/0.5λ

1.0E-04

M=2/5.25λ M=3/5.25λ

1.0E-05 1.0E-06 0

5

10

15

Eb/N0

20

Figure 7: BER vs. Eb/N0 for fd =90 Hz d/λ=0.5, 5.25 M=1, 2, and 3 Fig. 7 displays the BER of transmit diversity for different numbers of antennas, M=1, 2, and 3, and different values of antenna separation d / λ at the base station. It is clear that increasing M and d / λ have a positive effect on the BER performance. Numerically, the amount of improvement in Eb/N0 for M = 2, and 3 is 4dB and 6dB with respect to M=1 at d / λ=0.5 and BER=10E-4, respectively. Also, more improvement has been got at d / λ=5.25. It is increased to be 12dB and 14dB for M=2 and 3 with respect to M=1 and BER of 10E-4. 1.0E+00

theoritical

1.0E-01

BER

due to diversity gain. This gain comes from uncorrelated diversity branches. Moreover increasing the maximum Doppler frequency degrades the BER performance due to the rapid changes of channel characteristics. Moreover the analysis of this system is explained under Rayleigh fading.

simulation

6

REFERENCES

[1] M. K. Simon and M. S. Alouini : Digital Communication over Fading Channels: A Unified Approach to Performance Analysis. Wiley Series in Telecommunications and Signal Processing. New York: Wiley- Interscience, 2000. [2] W. C. Jakes : Microwave Mobile Communications. New York: John Wiley & Sons, Inc.1974 [3] L. Fang, G. Bi, and A. C. Kot, "New Method of Performance Analysis for Diversity Reception with Correlated Rayleigh-fading Signals," IEEE Transactions on Vehicular Technology, September 2000, vol. 49, pp. 1807 – 1812. [4] C. X. Wang and M. Patzold: Methods of Generating Multiple Uncorrelated Rayleigh Fading Processes. IEEE Vech. Tech. Conf. 2003_spring. [5] S. Kosono, and S. Sakagumi, "Correlation Coefficient on Base Station Diversity for Land Mobile Communication Systems", IEICE Trans., Comm., Vol. J 70-B No.4 1987, April, pp. 476-482 [6] Qiang Zhao, New Results on Selection Diversity Over Fading Channels, Thesis, December 27, 2002

1.0E-02 1.0E-03 1.0E-04 0

2

4

6

8

10

Eb/N0 dB

Figure 8: Comparison of simulated and theoretical results Figure 8 compares the simulated and theoretical results of BER for M=2. It shows that the simulation results are a close match to the BER in Eq. (13). The small difference is due to optimizing the simulation parameters. 5 CONCLUSIONS In this paper, the performance of transmission selection spaced diversity in DS-CDMA system was studied. This performance is not clarified until now under the effect of changing the space distance between antennas at BS and the maximum Doppler frequency by using the optimum conditions. The results show that increasing the space distance between antennas gives better BER performance

UbiCC Journal - Volume 3

58

WEB-BASED DECISION SUPPORT SYSTEMS AS KNOWLEDGE REPOSITORIES FOR KNOWLEDGE MANAGEMENT SYSTEMS Yuri Boreisha Minnesota State University Moorhead, USA [email protected] Oksana Myronovych North Dakota State University, USA [email protected]

ABSTRACT Problem solving and learning processes conducted on the basis of contemporary Webbased DSS provide for development and enhancement of knowledge management systems. Knowledge objects form the foundation of the conceptual approach to the knowledge management based on the contemporary Internet technologies and knowledge accumulated in DSS. Keywords: knowledge management systems, decision support systems.

1

INTRODUCTION

Knowledge management (KM) has become an important theme as managers realize that much of their firm’s value depends on ability to create and manage knowledge. To transform information into knowledge a firm must use additional resources to discover patterns, rules, and context where the knowledge works [1-3]. Knowledge that is not shared and applied to the practical problems does not add business value. Today people can share their knowledge in three primary ways. Organizational information systems (IS) that store, manage, and deliver documents are called content management systems (CMS). With the arrival of modern communications technology, people can share their knowledge via collaborating knowledge management systems (KMS). In addition to content management and collaboration, the knowledge can be shared via expert systems. Comprehensive discussion of important dimensions of knowledge, the knowledge management value chain, and types of KMS can be found in [2, 3]. Web 2.0 companies use the Web as a platform to create collaborative, community-based sites (e.g., social networking sites, blogs, wikis, etc.). The Web has now become an application, development, delivery, and execution platform [4]. Software as a Service (SaaS) - application software that runs on a Web server rather than being installed on the client computer – has gained

UbiCC Journal - Volume 3

popularity, particularly with businesses. Collaborating on projects with co-workers across the world is easier, since information is stored on a Web server instead of on a single desktop. Rich Internet Applications (RIAs) are Web applications that offer the responsiveness, “rich” features and functionality approaching that of desktop applications. RIAs are the result of today’s more advanced technologies (such as Ajax) that allow greater responsiveness and advanced GUIs. Web services have emerged and, in the process, have inspired the creation of many Web 2.0 businesses. Web services allow you to incorporate functionality from existing applications and Web sites into your own applications quickly and easily. Web 2.0 companies use “data mining” to extract as much meaning as they can from XHTML-encoded pages. XHTML-encoded content does not explicitly convey meaning, but XML-encoded content does. So if we can encode in XML (and derivative technologies) much or all of the content on the Web, we’ll take a great leap forward towards realizing the Semantic Web. Many people consider the Semantic Web to be the next generation in Web development, one that helps to realize the full potential of the Web – the “Web of meaning”. Though Web 2.0 applications are finding meaning in the content, the Semantic Web (heavily depended on XML and XML-based technologies) will attempt to make those meaning clear to computers as well as humans [5].

59

These trends in the Web Science – the new science of decentralized information systems – provide for new opportunities in the KM. In this paper we consider contemporary Decision Support Systems (DSS) as knowledge repositories that can be expanded to KMS using the Web 2.0 software development technologies and tools. This paper is based on a series of previous authors’ publications [6-11]. 2

KNOWLEDGE MANAGEMENT DECISION SUPPORT SYSTEMS

AND

The AI representation principle states that once a problem is described using an appropriate representation, the problem is almost solved. Wellknown knowledge representation techniques include rule-based systems, semantic nets and frame systems [12]. KM refers to the set of business processes developed in an organization to create, store, transfer and apply knowledge. KM increases the ability of the organization to learn from its environment and to incorporate knowledge into business processes. There are three major categories of KMS: enterprise-wide KMS, knowledge work systems (KWS), and intelligent techniques [2, 3]. Enterprise-wide KMS are general purpose, integrated, firm-wide efforts to collect, store, disseminate, and use digital content and knowledge. Such systems provide databases and tools for organizing and storing structured and unstructured documents and other knowledge objects, directories and tools for locating employees with experience in a particular area, and increasingly, Web-based tools for collaboration and communication. KWS (such as computer-aided design, visualization, and virtual reality systems) are specialized systems built for engineers, scientists, and other knowledge workers charged with discovering and creating new knowledge for a company. Diverse group of intelligent techniques (such as data mining, neural networks, expert systems, casebased reasoning, fuzzy logic, genetic algorithms, and intelligent agents) have different objectives, from a focus on discovering knowledge (data mining and neural networks), to distilling knowledge in the form of rules for a computer program (expert systems and fuzzy logic), to discovering optimal solutions for problems (genetic algorithms). It is said that effective KM is 80% managerial and organizational, and 20% technology. One of the first challenges that firms face when building knowledge repositories of any kind is the problem of identifying the correct categories to use when classifying documents. Firms are increasingly using a

UbiCC Journal - Volume 3

combination of internally developed taxonomies and search engine techniques. Organizations acquire knowledge in a number of ways, depending on the type of knowledge they seek. Once the corresponding documents, patters, and expert rules are discovered they must be stored so they can be retrieved and used. Knowledge storage generally involves databases, document management systems, expert systems, etc. To provide a return on investment, knowledge should become a systematic part of the organizational problem solving process. Ultimately, new knowledge should be built into a firm’s business processes and key application systems. KMS and related knowledge repositories should facilitate the problem solving process (Figure 1). During the process of solving problems managers engage into decision making, the act of selecting from alternative problem solutions. The different levels in an organization (strategic, management, and operational) have different decision-making requirements. Decisions can be structured, semi-structured or unstructured. The structured decisions are clustered at the operational level of the organization, and unstructured decisions at the strategic level. Management information systems (MIS) provide information on firm performance to help managers monitor and control the business, often in the form of fixed regularly scheduled reports based on data summarized from the firm’s transaction processing systems (TPS). MIS support structured decisions and some semi-structured decisions. DSS combine data, sophisticated analytical models and tools, and user-friendly software into a single powerful system that can support semistructured and unstructured decision making [3, 13, 14]. The main components of the DSS are the DSS database, the user interface, and the DSS software system (Figure 2). The DSS database is a collection of current data from a number of applications and groups. Alternatively, the DSS database may be a data warehouse that integrates the enterprise data sources and maintains historical data. The DSS user interface permits easy interactions between users of the system and the DSS software tools. Many DSS today have Web interfaces to take advantages of graphics displays, interactivity, and ease of use. The DSS software system contains the software tools that are used for data analysis. It may contain various OLAP tools, data mining tools, or a collection of mathematical and analytical models that easily can be made accessible to the DSS users.

60

Problem

Standards (Desired state)

Problem solver (Manager)

Alternative solutions (DSS) Constraints

Information (Current state)

Solution Figure 1: Elements of the problem solving process. The dialog manager is also in charge for the information visualization. Finally, access to the Internet, networks, and other computer-based systems permits the DSS to tie into other powerful systems, including the TPS or function-specific subsystems. There are many kinds of DSS. The first generic type of DSS is a Data-Driven DSS. These systems include file drawer and management reporting systems, data warehousing and analysis systems, Executive Information Systems and Spatial DSS. Data-Driven DSS emphasize access to and manipulation of large databases of structured data

Internal Data

and especially a time-series of internal company data and sometimes external data. Relational databases accessed by query and retrieval tools provide an elementary level of functionality. Data warehouse systems that allow the manipulation of data by computerized tools tailored to a specific task and setting or by more general tools and operations provided additional functionality. Data-Driven DSS with Online Analytical Processing (OLAP) provide the highest level of functionality and decision support that is linked to analysis of large collections of historical data.

DSS Database/ Data Warehouse

External Data

DSS Software System Models OLAP Tools Data Mining Tools

User Interface (Dialog Manager)

Users Figure 2: Main components of the DSS.

UbiCC Journal - Volume 3

61

A second category, Model-Driven DSS, includes systems that use accounting and financial models, representational models, and optimization models, and optimization models. Model-Driven DSS emphasize access to and manipulation of a model. Simple statistical and analytical tools provide an elementary level of functionality. Some OLAP systems that allow complex analysis of data may be classified as hybrid DSS providing modeling, data retrieval, and data summarization functionality. Model-Driven DSS use data and parameters provided by decision-makers to aid them in analyzing a situation, but they are not usually data intensive. Very large databases are usually not needed for Model-driven DSS. Knowledge-Driven DSS or Expert Systems can suggest or recommend actions to managers. These DSS are human-computer systems with specialized problem-solving expertise. The expertise consists of knowledge about a particular domain, understanding of problems within that domain, and skills at solving some of these problems (AI algorithms and solutions can be used). A related concept is data mining. It refers to a class of analytical applications that search for hidden patterns in a database. Data mining is the process of sifting through large amounts of data to produce data content relationships. Tools used for building Knowledge-Driven DSS are sometimes called Intelligent Decision Support methods.

Relational Database

Knowledge Database

Relational DBMS

Report Writing Software

Periodic and special reports

Mathematical Models

Outputs from mathematical models

Document-Driven DSS are evolving to help mangers retrieve and manage unstructured documents and Web pages. A Document-Driven DSS integrates a variety of storage and processing technologies to provide complete document retrieval and analysis. WWW provides access to large document databases including databases of hypertext documents, images, sounds and video. Examples of documents that would be accessed by Document-Driven DSS are policies and procedures, product specifications, catalogs, and corporate historical documents, including minutes of meetings, corporate records, and important correspondence. Search engines are powerful decision-aiding tools associated with DocumentDriven DSS. Group DSS (GDSS) came first, but now a broader category of Communications-Driven DSS or groupware can be identified. These DSS includes communication, collaboration and related decision support technologies. These are hybrid DSS that emphasize both the use of communications and decision models to facilitate the solution of problems by decision-makers working together as a group. Groupware supports electronic communication, scheduling, document sharing, and other group productivity and decision support enhancing activities. A DSS model that incorporates Group Decision Support, OLAP, and AI is shown on Figure 3.

Inference Engine

Multidimensional Database

Multidimensional DBMS

Groupware

Outputs from groupware

Solutions and explanations

Outputs from OLAP

Figure 3: A DSS model that incorporates GDS, OLAP, and AI.

UbiCC Journal - Volume 3

62

DSS facilitate the decision-making. Decision making is an integrated part of the overall problem solving process. KMS should facilitate the problem solving process. In the next section we are going to discuss how Web-enabled DSS can be integrated into contemporary KMS. 3

WEB-ENABLED SYSTEMS

DECISION

Business logic (domain) layer that implements the rules and procedures of the business processing. View layer that accepts input and formats and displays processing results. RIA have two key attributes – performance and rich GUI. RIA performance comes from Ajax (Asynchronous JavaScript and XML), which uses client-side scripting to make Web applications more responsive by separating client-side user interaction and server communication, and running them in parallel. Various ways to develop Ajax applications are discussed in [5]. Web services promote software portability and reusability in applications that operate over the Internet. Web service is a transition to serviceoriented, component-based, distributed applications. Web services are applications implemented as Webbased components with well-defined interfaces, which offer certain functionality to clients via the Internet. Once deployed, Web services can be discovered, used/reused by consumers (clients, other services or applications) as building blocks via open industry-standard protocols. Web service architecture is built on open standards and vendor-neutral specifications. Services can be implemented in any programming language, deployed and then executed on any operating system or software platform.

SUPPORT

All types of DSS can be deployed using Web technologies and can become Web-based DSS. Managers increasingly have Web access to data warehouses and analytical tools. To discuss the recent trends in this area the latest achievements in the three-layer design, Rich Internet Applications (RIA), and Web services should be taken into account. Three-layer design is an effective approach to development robust and easy maintainable systems. The corresponding architecture is appropriate for systems that need to support multiple user interfaces. Contemporary Web applications are three-layer applications. The most common set of layers includes the following: Data layer that manages stored data, usually in one or more databases. Internal Data

DSS Database/ Data Warehouse

External Data

Web Services provide access to DSS Software System

Ajax-Enabled Applications implement Dialog Manager

Internet Users Figure 4: Web-enabled DSS. The service-oriented architecture (SOA) provides the theoretical model for all Web services. The model behind Web services is a loosely coupled architecture,

UbiCC Journal - Volume 3

consisting of different software components working together. Consuming Web services is based on open standards managed by broad consortia (e.g., World

63

Wide Web Consortium, Organization for the Advancement of Structured Information Standards, Web Services Interoperability Organization). What makes Web services different from ordinary Web sites is the type of interaction that they can provide. Most of the enthusiasm surrounding Web services is based on the promise of interoperability. Every software application in the world can potentially talk to every other software application. This communication can take place across the old boundaries of location, operating system, language, protocol, and so on. Three-layer architecture maps well on the structure of main components of the DSS (see Figure

2). RIA provide for efficient implementation of the Dialog Manager GUI for DSS. Web services allow incorporating functionality from existing applications and due to this providing for access to the DSS Software System through the SOA. The components of the Web-enabled DSS are shown on Figure 4. We can call a group of the following related components a knowledge object (Figure 5). Discussed techniques allow to create new Web services (based on the existing ones and contemporary DSS software systems), and Ajax-enabled application interacting with these Web services. So we can talk about creation and modification of the knowledge objects.

DSS Database/ Data Warehouse

Web Service

Ajax-Enabled Application Figure 5: Structure of a knowledge object. Web-enabled DSS provide for expandable collections of the knowledge objects that constitute the knowledge repository of the corresponding KMS. From this point of view the knowledge objects can be considered as a knowledge representation technique. 4

PROBLEM SOLVING AND LEARNING

AI distinguishes two general kinds of learning. The first kind is based on coupling new information to previously acquired knowledge. Typical examples include learning by analyzing differences, by managing multiple models, by explaining experience, and by correcting mistakes. The second kind is based on digging useful regularity out of data; a practice often refers as data mining. Typical examples include learning by recording cases, by building identification trees, by training neural nets, by training perceptrons, by training approximation nets, and by simulation evolution (e.g. genetic algorithms). Expert systems primarily capture the tacit knowledge of individual experts, but organizations also have collective knowledge and expertise that they have

UbiCC Journal - Volume 3

built up over the years. This organizational knowledge can be captured and stored using case-based reasoning (CBR). In CBR description of the past experiences of human specialists, represented as cases, are stored in a database for the later retrieval when the user encounters a new case with similar parameters. The system searches for stored cases with problem characteristic similar to the new one, finds the closest fit, and applies the solution of the old case to the new case. Successful solutions are tagged to the new case and both are stored together with the other cases in the knowledge base. Unsuccessful solutions are also appended to the case database along with explanations as why the solutions did not work. Problem-based learning (PBL) is (along with active learning and cooperative/collaborative learning) one of the most important developments in contemporary higher education. PBL is based on the assumption that human beings evolved as individuals who are motivated to solve problems, and that problem solvers will seek and learn whatever knowledge is needed for successful problem solving. PBL is a typical example of an

64

application of the first type of learning in higher education [11].

Combining the main ideas of CBR and PBL the following problem solving and learning process can be depicted as it’s shown on Figure 6.

User describes the problem

User learns about the knowledge objects that facilitate the problem solving

System searches Repository of knowledge objects for the suitable ones

Repository of knowledge objects (based on a Web-enabled DSS)

System asks user additional questions to narrow search

System stores the problem description and the knowledge object in the repository

System finds the closest fit and provides access to knowledge objects

New knowledge object is created to better fit the problem Figure 6: Problem solving and learning with knowledge objects. 5

CONCLUSIONS

Knowledge is a complex phenomenon, and there are many aspects to the process of managing knowledge. Knowledge-based core competencies of firms are key organizational assets. Knowing how to do things effectively and efficiently in ways that other organizations cannot duplicate is a primary source of profit and competitive advantage that cannot be purchased easily by competitors in the marketplace. This paper discusses Web-enabled DSS, related knowledge repositories, and KMS that facilitate the problem solving and learning. The knowledge objects approach to the knowledge representation allows considering contemporary DSS as integrated parts of the corresponding KMS.

UbiCC Journal - Volume 3

6

REFERENCES

[1] V.

[2] [3] [4] [5]

Supyuenyong, N. Islam: Knowledge Management Architecture: Building Blocks and Their Relationships, Technology Management for the Global Future, Vol. 3, pp. 1210-1219 (2006). K.C. Laudon, J.P. Laudon: Management Information Systems. Managing the Digital Farm, Prentice Hall, pp. 428-508 (2006). R. McLeod, G. Schell: Management Information Systems, 10th Edition, Prentice Hall, pp. 250-274 (2006). P.J. Deitel, H.M. Deitel: Internet and World Wide Web. How to Program, 4th Edition, Prentice Hall, pp. 50-117 (2008). T. Berners-Lee, et al: A Framework for Web Science, Foundations and Trends in Web Science, Vol. 1, No 1, pp. 1-130 (2006).

65

[6] Y. Boreisha, O. Myronovych: Web-Based Decision Support Systems in Knowledge Management and Education, Proceedings of the 2007 International Conference on Information and Knowledge Engineering, IKE’07, June 25-28, Las Vegas, USA, pp. 11-17 (2007). [7] Y. Boreisha, O. Myronovych: Web Services-Based Virtual Data Warehouse as an Integration and ETL Tool, Proceedings of the 2005 International Symposium on Web Services and Applications, ISWS’05, June 27-30, Las Vegas, USA, pp. 52-58 (2005). [8] Y. Boreisha, O. Myronovych: Data-Driven Web Sites, WSEAS Transactions on Computers, Vol. 2, No 1, pp. 79-83 (2003). [9] Y. Boreisha: Database Integration Over the Web, Proceedings of the International Conference on Internet Computing, IC’02, June 24-27, Las Vegas, USA, pp. 1088-1093 (2002). [10] Y. Boreisha: Internet-Based Data Warehousing, Proceedings of SPIE Internet-Based Enterprise Integration and Management, Vol. 4566, pp. 102-108 (2001).

UbiCC Journal - Volume 3

[11]

Y. Boreisha, O. Myronovych: Knowledge Navigation and Evolutionary Prototyping in ELearning Systems, Proceedings of the E-Learn 2005 World Conference on E-Learning in Corporate, Government, Healthcare, and Higher Education, October 24-28, Vancouver, Canada, pp. 552-559 (2005). [12] P.H. Winston: Artificial Intelligence, AddisonWesley, pp. 15-228 (1992). [13] S. French, M. Turoff: Decision Support Systems, Communications of the ACM, Vol. 50, No 3, pp. 39-40 (2007). [14] Chien-Chih Yu: A Web-Based ConsumerOriented Intelligent Decision Support System for Personalized E-Services, ACM International Conference Proceeding Series, Vol. 60, pp. 429437 (2004).

66

Simulating social interaction scenarios in an office. Michele Bezzi



Robin Groenevelt



Frederick Schlereth



October 15, 2007

Abstract

extremely hard even in well structured environments, such as an office. The main issue is the complexity of social human behavior due to its high variability, its dependency on external constraints such as temporal, spatial context (e.g., environment layout) and task context (e.g., personal list of activities and goals). A successful model should therefore incorporate all these aspects, and, to be realistic, parameters have to be set using experimental data.

Work team coordination is becoming a major challenge in the contemporary complex working environments. Coordination process takes place through direct interaction and explicit communication, but it takes also advantage of informal social network within team members. Consequently, in order to develop realistic model of team coordination, we need to measure and model such interactions in real world environments. We present an agent-based model for simulating people movement in a workspace, which may be used as tool for developing and testing social relationship models. We demonstrate the model by simulating office life in one of our laboratories and comparing the results to actual measurements obtained with a sensor network.

On the positive side, recent sensor technologies provide us an unprecedented recording of information from the physical world. In previous studies, we investigated the social patterns during some typical office activities [2], using data from a sensor network located in one of our laboratories [15, 14]. Collecting long-term and reliable data using this pervasive environment is a long process and may raise privacy issues. Consequently, working with a real life environment does not allow us to efficiently test the impact 1 Introduction of changes in the environment (e.g., impact of some Large corporations are often organized in functional space rearrangements on group dynamics). The aim of the paper is to introduce an agent-based teams. The objective of team work is to achieve a common goal by integrating and coordinating indi- model for simulating a workspace with movements of vidual capabilities. In this framework, social interac- people and face-to-face contact between individuals. tions play a major role, and—although many commu- This model can be used as tool for investigating the nication media are nowadays available—-face-to-face dynamics of social interactions, for which the results interactions are still highly important [1, 5]. Accord- can be fed by and/or validated against actual meaingly, theoretical models of how people interact in a surements obtained with a sensor network. In parcertain environment can be useful to shed some light ticular, this allows assessment of these measures unon the mechanisms underlying the collective behavior der different conditions, such as assessing the impact of teams and business units. However, finding realis- of a physical change in the environment, the effects tic mathematical descriptions of social interactions is of team building exercises, the arrival of a new employee, or changes in layout of the teams. ∗ Accenture Technology Labs, 449, route des Cretes, Sophia Antipolis, France † Chalmers University, Goteborg, Sweden

The important reason for being able to simulate social encounters is that it allows us to study the effect 1

UbiCC Journal - Volume 3

67

of (changes in) the environment on the social behavior of people. There are many questions for which, to the best of our knowledge, little or few quantitative studies exist. For example, how well and quickly does a new employee get integrated into the working society under a variety of scenarios? These scenarios could include: having people working in open space offices instead of in cubicles, having team meeting in various locations, the location of a coffee machine, the effect of being at the far end of building. Do people get more social connections when teams are mixed so that it forces people to walk around more? We see our simulator as a step towards quantitatively answering these kinds of questions, in the lack of realworld measurements. The sketch of the paper is the following: in Section 2 we briefly summarize the main features to model and the actual sensor network. In Section 3 we describe the probabilistic model underlying our model. Social behavior, derived through numerical simulations, are presented in Section 4. Finally, conclusions are drawn in the last section.

2

vided into 50 locations, each of them the size of approximately a room. This allowed us to remove the variability of paths inside a room while still maintaining enough information about the movements of people. Each sensor detects signals of people in its sensory field. For each person and location the signals were merged together to build the current probabilistic evidence of finding a certain person in a specific location, after which this information was integrated with the current belief of the system (derived from previous observations). The result was a sequence of matrices, one for each time step, where the probability of finding a person in each location is reported. In the second step, starting from these matrices, we derived the most likely paths for each tracked individual; these data were then analyzed to find frequent patterns and appropriate statistical quantities to describe long term activities. Extracted recurrent patterns were identified later exploiting local semantics (e.g., meetings usually take place in the meeting room) as well as context-based knowledge (e.g., matching movement patterns with the information available from electronic calendars). The data acquisition system is currently still under development, so we had too little data available to find meaningful long-term recurrent patterns. Nonetheless, to give a glimpse of the kind of statistical analysis we are interested in, we analyzed a limited data set showing, for example, that functional teams, such as research and development groups, tend to be strongly interconnected inside the group, but loosely connected across different groups. Results of this analysis are reported in Ref. [2].

Modeling Office Activities

We chose an office environment as a test setting for two reasons. First of all, quantitative evaluations of various office activities have important practical applications (e.g., assessing the quality of space organization in the office, estimating connections amongst different people/departments, safety and security). Secondly, a video-camera infrastructure which collects data on peoples movements and presence was readily available in one of our offices and the data thus collected is accessible to us [15, 14]. This last experimental environment is composed of an office floor at Accenture Technology Labs in Chicago. The floor is equipped with a network consisting of 30 video cameras, 90 infrared tag readers, and a biometric station for fingerprint reading. The first step was the fusion of this raw-sensor data into a higher-level description of peoples movements inside the office. Identification and tracking of the people was performed using a Bayesian network. In short (see [14] for details), the office space was di-

3

Numerical Simulations

In this section we present a model for simulating movements of people in an office setting analogous to the workspace described above. In fact, data collection in a real environment is a long process and it may generate privacy concern. Therefore, to freely test our algorithms and hypothesis, we built an agentbased simulator of movements of people an office. As in the real-life setting, the office map was divided into 50 locations, each of them the size of a room (see 2

UbiCC Journal - Volume 3

68

a snapshot of the simulation (at time 11 a.m.). The output of the agent-based system consisted of a temporal sequence of matrices, which report the location of each agent for each time step, with the same format as for the sensor network. This allowed us to use the same analysis tools for both the agent-based model and for the real-life data collected. Despite its simplicity, this model showed a visual agreement with the trajectories observed in the real environment. We used this model to study the evolution of social interactions.

4

Social Network Analysis

Social network analysis provides a powerful tool for assessing patterns of relationships in informal networks [5, 3]. The nodes in the network represent the people and the links represent the interactions between the nodes. Social network theory has a long history [11], but has only recently been able to take full advantage of the large use of digital communications; the properties of such networks have been extensively studied using data from emails [6, 9] and instant-messaging [16]. In the first study an individual’s emailing history is analyzed and his connections are automatically generated and displayed as a graph. Typical analysis include: the number of connections and frequency of contacts, the diameter and cliqueness (i.e., degree of local clusters) of the network, the time evolution of the network, and identifying the most-connected nodes. The distribution of connections in social networks has often been shown to follow a power law, i.e., the number of nodes with connectivity k falls as:

Figure 1: Snapshot of the simulation (at time 11 a.m.). Numbers indicate locations. Note that a meeting is taking place in the North-East corner room. Fig. 1). In total there were 30 people (agents). In its simplest version, each agent had a set of possible destinations in the office floor, with different probabilities derived from the collected data and from our knowledge of their office life. At each time step, each agent decides to stay in the current location with a certain probability (usually large if it is in his or her own office) or to move to a destination sampled from a distribution of destinations. In this last case, the agent starts moving according to a specific path, usually the shortest one, with possible random fluctuations. An agent also has a personal schedule in which specific tasks are listed (e.g., meetings, lunch, coffee) with a corresponding time and probability of performing that action. This schedule was derived from samples of employees electronic calendars and then integrated with context knowledge, such as typical arrival, lunch, and departure times. Furthermore, in case two or more agents cross paths in the same location, the probability of staying was increased by a quantity, ∆p, specific for each agent. This probability mimics the fact that random encounters may result in short conversations. Its numerical value was derived from real data whenever available and using context knowledge in the other cases. Fig. 1 shows

n(k) ∝ k α where a is a negative constant, usually somewhere between 1 and 4. This leads to a scale free network in which there are many nodes with few connections as well as the existence of highly connected hubs, which foster network cohesion and connections between distant nodes, even in very large networks. Emails or instant-messaging logfiles provide a large source of data about social relationships, and they give interesting results and potential applications [17, 7], but 3

UbiCC Journal - Volume 3

69

1

n(k)

10

0

10 0 10

1

10 k

Figure 3: Social network as extracted from movement Figure 2: The degree of connectivity k of a node plot- data from one day of simulation. Black and white cirted against the frequency of nodes with degree k on a cles indicate researchers and developers, respectively. log-log scale. Each point represents data points from one numerical simulations over a 7 period. ple from different teams are loosely connected. Results vary across different runs but the a two-clusters they do not consider physical interactions and face- structure was already present. Similar results were to-face communications that are at the basis of hu- obtained by analyzing the tracking data from the man behaviors. In this study, we focused on this last real-life sensor network (see Fig. 2 in Ref. [2]). feature, we estimate social relationships from patWe simulated one week of activity and measured terns of collocation in the workplace. This approach the properties of the resulting social network. Fig. 2 will be integrated with data collected from electronic shows the degree of connectivity k versus the frecommunications in future studies, to better specify quency of nodes with degree k for one simulation. the structure of the network and to investigate the In general, the observed distributions do no follow a (possible) different topologies of electronic and phys- power law (straight line in the log-log plots). This is ical social networks. probably due to the limited sampling size: there are We inferred the structure of the social network in few agents and a short duration of the simulation. the office by simulating the movement of a group of In fact, due to the small size of the environment, people for long periods and considering a simple prox- this frequency distribution converges to a delta afimity rule: two individuals share a link if they spend ter 6 − 8 weeks of simulations, at this time every enough time in the vicinity of one another. In addi- agent is directly connected to everybody else. Furtion, we added to the system some context specific ther investigations and more experimental data are rules, e.g., we excluded the entrance hall. This sim- clearly required to fully characterize the topology of ple rule can lead to a number of false positives, e.g., this network, and to assess whether the structure of two individuals may share the same location with- the social network in a real world physical space difout interacting. However, we expect that in the long fers from those measured with email or chat log files, run and with a large number of users it provides a where spatial extension and physical constraints are gross estimation of global structure of the network of not taken in account. interactions and of its evolution in time. Extending the period of simulation to 4 weeks, we Fig. 3 illustrates the social network amongst two observed the network becomes fully connected after departments (Research and Development) after one 9 working days (on average), even if the clusters corday of simulation; it shows, for example, that peo- responding to the different teams are still present at 4

UbiCC Journal - Volume 3

70

the end of the simulation. This suggests that in small environments people get connected in rather short amounts of time. To check this hypothesis, we simulated the arrival of a new employee in the office and measured over the time the number of hops needed to connect him to all the other people in the office (shortest average path length). Fig. 4 shows the average number of hops (links) needed for this new joiner to connect to any other employees in the office (triangles indicate the average over 50 simulations and bars correspond to the standard deviation). After one month the new employee had directly interacted with all the people in the office, i.e., Fig. 4 black triangles ≃ 1 at day 30. Excluding formal meetings from this dataset, we can estimate the contribution of random encounters (square dots in Fig. 4). Random encounters contribute largely to the increase the connectivity stressing the relevance of informal contacts to establish a personal social network. Indeed considering random encounters only, the network becomes connected after 13 days (on average) and after 30 the new joiner is, almost (1.3 hops on average, Fig. 4), connected to all the others. Our current experimental setup does not permit long recordings so we were not able to compare the simulation results to experimental data.

5

of our laboratory during normal office hours. In this paper we presented an agent-based model for modeling peoples movements and social interactions in the same setting. This simulator uses a set of simple rules which reproduces a persons trajectories inside the office, and provides a cheap and flexible tool to develop and test pervasive environment and human interaction models. In particular, we investigated the social interactions taking place during normal work days. The paper does not present a complete model for modeling the dynamics of interactions as we did not consider, for example, digital communication media, and—more importantly—we disregarded the content of the interactions. Still, the results of this preliminary study show that it is technically possible to analyze the spatial influence of the environment on the behavior of the people and relevant numbers concerning face-to-face interactions in real-environment can easily be generated. This allows for important input for collective human behavior modeling, as well as practical implications to evaluate the implementation of certain measures such as office design, team building efforts, efficient information transmission, and the correct integration of new joiners. The next step will be to validate these against real-life data from our experimental setup, and to possibly extend it to larger (and richer) environments. To this scope, privacy is clearly a major concern. Possible solutions include users controlling the personal data released, limiting the data a single party can access, data anonymization, and following accepted ethical guidelines. In applications where real-time is not a requirement (as in our case for identifying social networks), the users could have full control over the data released, e.g., receiving a weekly e-mail with the summary of events; and deciding which of them to disclose for the analysis. Even more important is finding a reasonable equilibrium point in the trade-off between privacy and benefits. In other words, users need to be provided a clear and tangible return for their privacy investment for gaining acceptance. Lastly, even an analysis in some specific cases (research laboratories, conferences, public events [8, 4, 13]) will hopefully increase our—at the moment very limited—quantitative knowledge on social interac-

Conclusions

Social interactions are highly important in collective activity, such as goal-oriented work teams. In particular, despite the fact that many communication media are accessible, face-to-face interactions still constitute one of the preferred media for information transmission [1] and contribute to increase the cohesion within groups. Furthermore, it has been shown [10, 12] that the actual physical context, such as the design of the environment and physical locations of agents, can considerably impact the human agent coordination. Accordingly, suitable measures of social interactions in real environments are needed to develop abstract model of team functioning. We previously developed a prototype pervasive environment allowing the measuring of face-to-face interactions inside one 5

UbiCC Journal - Volume 3

71

Histories in Smart Environments , also eprint arXiv: 0706.1926, 2006.

2.5

[3] K. Chan and J. Liebowitz. The synergy of social network analysis and knowledge mapping: a case study. International Journal of Management and Decision Making, 7(1):19–35, 2006.

2 Hops

1.5

[4] S. Counts and J. Geraci. Incorporating physical co-presence at events into digital social networking. In CHI ’05: CHI ’05 extended abstracts on Human factors in computing systems, pages 1308–1311, New York, NY, USA, 2005. ACM Press.

1

0.5 5

10

20

15

25

30

Days

[5] R. Cross and A. Parker. The Hidden Power of Social Networks: Understanding how Work Really Gets Done in Organizations. Harvard Business School Press, 2004.

Figure 4: Triangles indicate the average number of hops to connect one worker with all the workforce (average shortest path). In the x-axis the number of days are reported, starting from day 8. Previous days are not shown because the network is not fully connected. Vertical bars indicate standard deviations taken over 50 simulations. Square dots indicate the same quantity considering random encounters only. Standard deviations for random encounters are not shown for clarity.

[6] H. Ebel, L.-I. Mielsch, and S. Bornholdt. Scalefree topology of e-mail networks. Phys. Rev. E, 66(3):035103, Sep 2002. [7] R. Guimera, B. Uzzi, J. Spiro, and L. A. N. Amaral. Team Assembly Mechanisms Determine Collaboration Network Structure and Team Performance. Science, 308(5722):697–702, 2005.

tions and their effects on collective behaviors.

[8] Q. Jones and S. A. Grandhi. P3 systems: Putting the place back into social networks. IEEE Internet Computing, 9(5):38–46, 2005.

Acknowledgments

[9] Frederick Schlereth contributed to this study during his internship at Accenture Technology Labs in Sophia Antipolis. We thank Valery Petrushin and Gang Wei for providing tracking data obtained from [10] Accenture Technology Labs in Chicago.

References

H. Kautz, B. Selman, and M. Shah. Referral web: combining social networks and collaborative filtering. Commun. ACM, 40(3):63–65, 1997. D. Kirsh. Distributed Cognition, Coordination and Environment Design. Proceedings of the European conference on Cognitive Science, pages 1–11, 1999.

[11] S. Milgram. The small world problem. Psychol[1] T. Allen. Architecture and Communication ogy Today, 2(1):60–67, 1967. Among Product Development Engineers. MIT [12] A. Omicini, A. Ricci, M. Viroli, C. CastelPress, Cambridge, MA, 1997. franchi, and L. Tummolini. Coordination Artifacts: Environment-Based Coordination for In[2] M. Bezzi and R. Groenevelt. Towards undertelligent Agents. Proceedings of the Third Instanding and modeling office daily life. In 2nd ternational Joint Conference on Autonomous International Workshop on Exploiting Context 6

UbiCC Journal - Volume 3

72

Agents and Multiagent Systems-Volume 1, pages 286–293, 2004. [13] A. Pentland, T. Choudhury, N. Eagle, and P. Singh. Human dynamics: computation for organizations. Pattern Recogn. Lett., 26(4):503– 511, 2005. [14] V. Petrushin, R. Ghani, and A. Gershman. A Bayesian Framework for Robust Reasoning from Sensor Networks. AAAI Spring Symposium on AI Technologies for Homeland Security March, pages 21–23, 2005. [15] V. Petrushin, G. Wei, R. Ghani, and A. Gershman. Multiple sensor integration for indoor surveillance. Proceedings of the 6th international workshop on Multimedia data mining: mining integrated media and complex data, pages 53–60, 2005. [16] R. D. Smith. Instant messaging as a scalefree network. Arxiv preprint cond-mat/0206378, 2002. [17] F. Wu, B. Huberman, L. Adamic, and J. Tyler. Information flow in social groups. Physica A: Statistical Mechanics and its Applications, 337(1-2):327–335, 2004.

7

UbiCC Journal - Volume 3

73

NEW STOP & WAIT ARQ PROTOCOL Nitin Jain, Rishi Asthana & Manuj Darbari Uttar Pradesh Technical University, Lucknow [email protected], [email protected], [email protected]

ABSTRACT In all types of data communication systems, errors may occur. Therefore error control is necessary for reliable data communication. Error control involves both error detection and error correction. Previously error detection can be done by Cyclic Redundancy Check (CRC) codes and error correction can be performed by retransmitting the corrupted data block popularly known as Automatic Repeat Request (ARQ). But CRC codes can only detect errors after the entire block of data has been received and processed. In this work we use a new and “continuous” technique for error detection namely, Continuous Error Detection (CED). The “continuous” nature of error detection comes from using arithmetic coding. This CED technique improves the overall performance of communication systems because it can detect errors while the data block is being processed. We focus only on ARQ based transmission systems. We will show have the proposed CED technique can improve the throughput of ARQ systems by up to 15%. Keywords: Cyclic redundancy check codes, arithmetic coding, automatic repeat request.

1

INTRODUCTION

When we talk about any type of data communication system, we concern only on its reliability. In all types of data communication systems, errors may occur. Error control is the only way out for avoiding this problem. It comes by detecting the error in first step and then correctin g it in another step. For error detection we had CRC codes and for error correction we use to retransmit the corrupted data which is popularly known as ARQ. Although efficient, CRC’s can detect errors only after an entire block of data has been received and processed. An error detection scheme that is “continuous” can detect errors while the block is being processed. Thus, it can enhance the overall performance of the communication systems. In this paper, we use this type of new and continuous method for detecting the errors. The new method of error detection is based on the arithmetic coder, and allows for an efficient way to detect errors continuously in the bit-stream by investing a controlled amount of redundancy in the arithmetic coding operation. During our research, we became aware that the idea of continuous error detection based on arithmetic coding had been tackled earlier by Boyd et al. [1], albeit with little system performance analysis, or exposition of its utility in communication systems. In this paper, we not only undertake a more rigorous analysis of this paradigm, quantifying the underlying tradeoffs involved in the process, but also establish impressive gains in system performance attainable through sophisticated

UbiCC Journal - Volume 3

integration of this novel paradigm into popular, powerful transmission scenarios such as ARQ. Upon applying this method of error detection to stop-and-wait ARQ, gains in throughput were achieved over conventional ARQ schemes at all bit error probabilities. Result shows that the throughput of new stop -and-wait ARQ protocol i.e. with CED is approximately 15% enhanced than the throughput of the conventional stop -and-wait ARQ protocol i.e. with CRC. The rest of the paper is organized as follows. The basic idea behind the continuous error detection is introduced in Section 2. Section 3 presents an application of CED for ARQ transmission where it provides significant throughput gains over conventional CRC-based schemes. We conclude in Section 4. References are given in Section 5. 2

IDEA BEHIND CED

To understand the error detection scheme, an understanding of how the arithmetic coder works is necessary. We assume that the reader is familiar with arithmetic coding and refer readers that are unfamiliar with arithmetic coding to [2]. Arithmetic coding is a method of data compression in which data strings are mapped to code strings which represent the probabilities of the corresponding data strings. The method in which this mapping is achieved requires a model which specifies the assumptions that are made about the source data. A simple example of a model is the memoryless model where the current symbol being

74

encoded is independent of the previously encoded symbols. Another simple example is the first-order Markov model, where the current symbol being encoded is dependent only on the previously encoded symbol. For simplicity, we will examine the memoryless model, keeping in mind that the analysis generalizes to more sophisticated models. Thus, encoding and decoding via the arithmetic coder function by repetitively partitioning subint ervals within the unit interval [0, 1) according to the probabilities of the data symbols. The basic idea is simple and consisting of adding a “forbidden” symbol that is never encoded by the arithmetic coder, but nonetheless occupies a nonzero measure on the set [0, 1), then upon decoding, if an error occurs, this “forbidden” symbol is guaranteed to be decoded due to the loss of synchronization. The amount of time it takes to decode the “forbidden” symbol after the occurrence of an error is inversely related to the amount of redundancy added through introducing the “forbidden” symbol. T his allows for control of the number of bits we suspect need to b e retransmitted. In fact, we can guarantee to a specified accuracy, that errors will be localized to the previous m bits (where m is a function of the amount of redundancy added) upon decoding the “forbidden” symbol. This is useful in an ARQ setting, becau se as soon as the error is detected, we have a statistical assurance as to how many bits need to be retransmitted to ensure that the bit in error will be retransmitted. 3

APPLICATION OF CED

Simulations were run using a binary symmetric channel at various bit-error probabilities. Several ten kilobit packets were sent at each bit-error probability, and the resulting throughput was calculated. As a measure of performance, we compared our method of ARQ i.e. with CED to the conventional methods of ARQ i.e. wit h CRC. The conventional methods of ARQ function by dividing the data into packets and then attaching CRC’s [3] to each packet. Upon detection of an error in the conventional ARQ method, a retransmission of the entire block is requested. To simulate a fair comparison for our method versus conventional ARQ methods, we used the optimal packet size for each bit-error probability tested using conventional ARQ. The optimal packet size was calculated by differentiating the throughput equation for conventional ARQ (details can be found in [4]) with respect to the packet size, and solving for the packet size which maximizes throughput. The resulting throughputs are shown in Fig. 1. and we see that the new method of ARQ outperforms conventional ARQ methods at all bit-error probabilities.

UbiCC Journal - Volume 3

Figure 1: Throughput comparison curves for new stop-and-wait ARQ protocol i.e. with CED (upper curve in red color) versus conventional stop -and-wait ARQ protocol i.e. with CRC (lower curve in blue color ). 4

CONCLUSION

In this paper we have introduced a new method of error detection for common ARQ protocols. We analytically characterized the tradeoff of added redundancy versus error -detection capability and formulated a method for incorporating this new error detection “device” into an ARQ type scenario. We would also like to mention here that CED can be put to good use to improve throughput performance of transport protocols like TCP over heterogeneous networks, where early detection of an error can result in a potentially greater number of retransmits, thereby increasing the probability of successful reception over a fading channel. This is currently being verified. The goal of this work is to present the benefits that communication systems can derive from using CED for throughput enhancement. 5

REFERENCES

[1] C. Boyd, J. Cleary, S. Irvine, I. RinsmaMelchert, and I. Witten, “Integrating error detection into arithmetic coding,” IEEE Trans. Commun., vol. 45, pp. 1–3, Jan. 1997. [2] G. Langdon, “An introduction to arithmetic coding,” IBM J. Res. Develop. , vol. 28, pp. 135– 149, Mar. 1984. [3] T. Ramabadran and S. Gaitonde, “A tutorial on crc computations,” IEEE Micro, vol. 45, pp. 62– 74, Aug. 1988. [4] M. Schwartz, Telecommunication Networks: Protocols, Modeling and Analysis. Reading, MA: Addison-Wesley, 1987.

75

Cohesive Modeling, Analysis and Simulation for Spoken Urdu Language Numbers with Fourier Descriptors and Neural Networks Engr. S K Hasnain, Member IET, Assistant Professor, Pakistan Navy Engineering College, Karachi National University of Engineering & Technology, Pakistan [email protected] Dr. Azam Beg, Member IEEE, Assistant Professor College of Information Technology, UAE University Al-Ain, United Arab Emirate [email protected]

ABSTRACT This unified research describes spoken Urdu numbers investigative analysis from ‘siffar’ (zero) to ‘nau’ (nine) for making a concrete foundation in recognition of Urdu language. Sound samples from multiple speakers were utilized to extract different features using Fourier descriptor and Neural networks. Initial processing of data, i.e. normalizing and time-slicing was done using a combination of Simulink in MATLAB. Afterwards, the MATLAB tool box commands were used for calculation of Fourier descriptions and correlations. The correlation allowed comparison of the same words spoken by the same and different speakers. The analysis presented in this paper laid a foundation step in exploring Urdu language in developing an Urdu speech recognition system. In this paper the speech recognition feed-forward neural network models in Matlab were developed. The models and algorithm exhibited high training and testing accuracies. Our major goal work involves in the future use of TI TMS320C6000 DSK series or linear predictive coding. Such a system can be potentially utilized in implementation of a voice-driven help setup in different systems. Such as multi media, voice controlled tele-customer services. Keywords: Spoken number, Fourier, Correlation, Feature extraction, Feed-forward neural networks, Learning rate

1.

INTRODUCTION

Automatic speech recognition has been an active research topic for more than four decades. With the advent of digital computing and signal processing, the problem of speech recognition was clearly posed and thoroughly studied. These developments were complemented with an increased awareness of the advantages of conversational systems. The range of the possible applications is wide and includes: voice-controlled appliances, fully featured speechto-text software, automation of operator-assisted services, and voice recognition aids for the handicapped [1]. The speech recognition problem has sometimes been treated as a speech-to-text conversion problem.

UbiCC Journal - Volume 3

Many researchers have worked in this regard. Some commercial software is also available in the market for speech recognition, but mainly in English and other European languages. While the humans are speaking, the formats vary depending on the position of jaws, tongue and other parts of the vocal tract. Two related key factors are: bandwidth of each format, and format membership in a known bandwidth. The vowel for all human beings tends to be similar [2]. Sounds are mainly categorized into these groups: voiced sounds (e.g., vowels and nasals), unvoiced sounds (e.g., fricatives), and stop-consonants (e.g., plosives). The speech starts in lungs but is actually formed when the air passes through larynx and vocal tracts. Depending on the status of vocal fold in larynx the sound can be grouped into: voiced sound that is

76

time-periodic in nature and harmonic in frequency; the unvoiced is more noise-like. Speech modeling can be divided into two types of coding: waveform and source. In the beginning, the researchers tried to mimic the sounds as it is, meaning, waveform coding. This method tries to retain the original waveform using quantization and redundancy. An alternative approach makes use of breaking the sound up into individual components which are later modeled, separately. This method of utilizing parameters is referred to as source coding. Different characteristics of speech can be used to identify the spoken words, the gender of the speaker, and/or the identity of the speaker. Two important features of speech are pitch and formant frequencies: (a) Pitch is a significant distinguishing factor among male and female speakers. The frequency of vibration of vocal folds determines the pitch, for example, 300 times per second oscillation of the folds results in 300 Hz pitch. Harmonics (integer multiples of fundamental frequency) are also created while the air passes through the folds. The age also affects the pitch. Just before puberty, the pitch is around 250 Hz. For adult males, the average pitch is 60 to 120 Hz, and for females, it is 120 to 200 Hz. (b) The vocal tract, consisting of oral cavity, nasal cavity, velum, epiglottis, and tongue, modulates the pitch harmonics created by the pitch generator. The modulations depend on the diameter and length of the cavities. These reverberations are called formant frequencies (or resonances). The harmonics closer to the formant frequencies get larger while others are attenuated. While the humans are speaking, the formants vary depending on the positions of the tongue, jaw, velum and other parts of the vocal tract. Two related key factors are: bandwidth of each formant, and formant’s membership in a known bandwidth. The vowels for all human beings tend to be similar. Each vowel uttered by a person generates different formants. So we can say that the vocal tract is a variable filter, whose inputs are (1) the pitch, and (2) the pitch harmonics. The output of the filter is the gain or the attenuation of the harmonics falling in different formant frequencies. The filter is called variable filter model. The transfer function for the filter is determined by the formant frequencies [3]. Correlation exists between objects, phenomena, or signals and occurs in such a way that it cannot be by chance alone. Unconsciously, the correlation is used in everyday life. When one looks at a person, car or house, one’s brain tries to match the incoming image with hundreds (or thousands) of images that are

UbiCC Journal - Volume 3

already stored in memory [4]. We based our current work on the premise that same word spoken by different speakers is correlated in frequency domain. In the speech recognition research literature, no work has been reported on Urdu speech processing. So we consider our work to be the first such attempt in this direction. The analysis has been limited to number recognition. The process involves extraction of some distinct characteristics of individual words by utilizing discrete (Fourier) transforms and their correlations. The system is speaker-independent and is moderately tolerant to background noise. 2.

REVIEW OF DISCRETE TRANSFORMATION & ITS MATLAB IMPLEMENTATION

Discrete Fourier transform (DFT) is itself a sequence rather than a function of continuous variable and it corresponds to equally spaced frequency samples of discrete time Fourier transform of a signal. Fourier series representation of the periodic sequence corresponds to discrete Fourier transform of finite length sequence. So we can say that DFT is used for transforming discrete time sequence x(n) of finite length into discrete frequency sequence X[k] of finite length. This means that by using DFT, the discrete time sequence x(n) is transformed into corresponding discrete frequency sequence X[k][4]-[5].. DFT is a function of complex frequency. Usually the data sequence being transformed is real. A waveform is sampled at regular time intervals T to produce the sample sequence of N sample values, where n is the sample number from n=0 to n=N-1.

{x(nT)} = x(0), x(T ),...,x[(N − 1)T ] The data values x(nT) will be real only when representing the values of a time series such as a voltage waveform. The DFT of x(nT) is then defined as the sequence {X [kω]} = X(0), X(ϖ),........ X[(N − 1)ω] in the frequency domain, where ω is the first harmonic frequency given by ω = 2π / NT. Thus X [ k ω ] has real and imaginary components in general, so that for the kth harmonic X ( k ) = R ( k ) + j I( k ) X (k ) =

[R

2

(k ) + I 2 (k )

]

1/ 2

(2.1)

77

and X [k] has the associated phase angle

φ ( k ) = tan −1 [I ( k ) / R( k )]

with N=4 can be easily verified using definition in (2.6). (2.2)

where X[k] is understood to represent X[k]. These equations are therefore analogous to those for the Fourier transform. Note that N real data values (in time domain) transform to N complex DFT values (in frequency domain). The DFT values, X[k], are given by: N−1

FD [x (nT)] = ∑x(nT)e− jkωnT, k = 0,1,...,N − 1

(2.3)

n=0

where ω= 2π / NT and FD denotes the DFT. X [k ] =

N −1

∑ x (n T ) e

− jk 2 π n / N

The results of the following commands with N=4 can be for example easily verified with [5]: x=[1 0 0 1]; X=fft(x) In this example, the DFT components [Xm] =[2, 1-j, 0, 1+j] are found from (2.4). The second expression in (2.6) specifies the value of N in (2.4), which effectively overrides any previous specification of the length of the vector x, thus the following commands produce the same result: x=[1 0 0 1 3]; X=fft(x,4)

The Fast Fourier transform (FFT) eliminates most of the repeated complex products in DFT. In C version of signal processing algorithm, there are several different routines for real and complex versions of the DFT and FFT. When these routines are coded into the MATLAB language, they are very slow compared with the MATLAB fft routine, which are coded much more efficiently. Furthermore, the MATLAB routines are flexible and may be used to transform real or complex vector of arbitrary length. They meet the requirements of nearly all signal processing applications; consequently, in this paper, the fft routines are preferred over all discrete transform operations. MATLAB’s fft routine produces a one-dimensional DFT using the FFT algorithm; that is when [XK] is a real sequence, fft produces the complex DFT sequence [Xm]. In MATLAB, the length N of the vector [XK] may not be given. Thus both of the following are legal expressions: (2.5)

The first expression in (2.5) produces a DFT with same number of elements as in [XK], regardless of whether [XK] is real or complex. In the usual case where [XK] is real and length N, the last N/2 complex elements of the DFT are conjugates of the first N/2 elements in the reverse order, in accordance with (2.4). In the unusual case where [XK] is complex, the DFT consists of N independent complex elements. For example, the results of the following commands

UbiCC Journal - Volume 3

(2.6)

(2.4)

n =0

X=fft(x) X=fft(x, N)

FFT computing time 1 = log 2 N DFT computing time 2 N

The DFT, x= [Xm] has length = 4 is the same as in previous example. x=[1 1]; X=fft(x, 4) [Xm] =[ 2, 1-j, 0, 1+j] The result here is same because, when N is greater than the length of x; X is the DFT of a vector consisting of x extended with zeros on the right, from the length of x to N. (The length of vector x itself is not increased in this process). The MATLAB library also includes a two dimensional fft routine called fft2. The routine computes two-dimensional FFT of any matrix, whose element may be, for example, samples (pixel values) of a two dimensional image. Usually, some recognition occurs when the incoming images bear a strong correlation with an image in memory that “best” corresponds to fit or is most similar to it. A similar approach is used in this investigation, to measure the similarity between two signals. This process is known as autocorrelation if the two signals are exactly same and as crosscorrelation if the two signals are different. Since correlation measures the similarity between two signals, it is quite useful in identifying a signal by comparing it with a set of known reference signals. The reference signal that results in the lowest value of the correlation with the unknown signals is most likely the identity of the unknown object [9].

78

Correlation involves shifting, multiplication and addition (accumulation). The cross-correlation function (CCF) is a measure of the similarities or shared properties between two signals. Application of CCF includes cross spectral density, detection and recovery of signals buried in noise, for example the detection return signals, pattern, and delay measurement. The general formula for crosscorrelation rxY(n) between two data sequences x(n) and y(n) each containing N data might therefore be written as: rxy =

N −1

∑x

(n) y (n)

(2.7)

n =0

The autocorrelation function (ACF) involves only one signal and provides information about the structure of the signal or its behaviour in the time domain. It is special form of CCF and is used in similar applications. It is particularly useful in identifying hidden properties.

organized into layers, and only unidirectional connections are permitted between adjacent layers. This is known as a feed-forward multi layer Perceptron (MLP) architecture. This architecture is shown in Fig 1. 4.

DATA ACQUISITION AND PROCESSING

The system presented in this model was limited to processing of individual Urdu numerals (0 to 9). The data was acquired by speaking the numerals into a microphone connected to MS-Windows-XP based PC. We recorded the data for fifteen speakers who uttered the same number set (0 to 9), specifically, siffar, aik, do, teen, chaar, paanch, chay, saat, aath, and nau. Each sound sample was curtailed to 0.9 minute.

From Wave File s1_w1.wav Out (22050Hz/1Ch/16b) One

S-Function

rxx = 3.

N −1

∑ x ( n) x ( n)

(2.8)

FDATool

Digital Filter Design1

To Wave Device yout Signal To Workspace

n =0

FEEDFORWARD VS RECURRENT NETWORKS

Neural networks have proven to be a power tool for solving problem of prediction, classification and pattern recognition. [12]-[17].

|FFT| 2

yout1

Magnitude FFT

Signal To Workspace1

Fig.2 Simulink model for data of numbers extraction for analysis

One of the obvious methods of speech data acquisition is to have a person speak into an audio device such as microphone or telephone. This act of speaking produces a sound pressure wave that forms an acoustic signal. The microphone or telephone receives the acoustic signal and converts it into an analog signal that can be understood by an electronic system. Finally, in order to store the analog signal on a computer, it must be converted to a digital signal [6],[8].

Fig.1 Neural network model for feedforward multilayer Perceptron for analysis

Neural network architecture can be divided into two principal types: recurrent and non-recurrent networks. An important sub-class of non-recurrent NN consists of architectures in which cells are

UbiCC Journal - Volume 3

The data in this paper is acquired by speaking Urdu numbers into a microphone connected to MSWindows-XP based PC. The data is saved into ‘.wav’ format files. The sound files are processed after passing through a (Simulink) filter, and are saved for further analysis. We recorded the data for fifteen speakers who spoke the same number set, i.e. zero to nine.

79

FDATool

yout2

From Wave File s10_w1.wav Out (22050Hz/1Ch/16b)

Signal To Workspace2

Digital Filter Design1

One S-Function

From Wave File s10_w2.wav Out (22050Hz/1Ch/16b)

Two

|FFT|2 Magnitude Freq FFT2 Vector Scope2

Autocorr A LPC

Three

From Wave File s10_w4.wav Out (22050Hz/1Ch/16b)

Vector Scope

yout1

0.0000000 Display

Mean

To Frame

Waterfall Scope

U( : )

4.1 Pre-Emphasis

Frame Conversion1 Convert 2-D to 1-D Waterfall2

Five From Wave File s10_w6.wav Out (22050Hz/1Ch/16b)

Time

Signal To Magnitude Buffer Frame Conversion Matrix FFT1 Concatenation Workspace1

Four

From Wave File s10_w5.wav Out (22050Hz/1Ch/16b)

Display5

Horiz Cat

To Frame

|FFT|2

0.00000000000000

Autocorrelation LPC

S-Function

From Wave File s10_w3.wav Out (22050Hz/1Ch/16b)

To Wave Device2

By pre-emphasis [7], we imply the application of a normalization technique, which is performed by dividing the speech data vector by its highest magnitude.

0.000000000000000e+000 Display1

Median

Matrix Viewer

Six From Wave File s10_w7.wav Out (22050Hz/1Ch/16b)

0.0000000

RMS

Seven

From Wave File s10_w8.wav Out (22050Hz/1Ch/16b)

Eight From Wave File s10_w9.wav Out (22050Hz/1Ch/16b)

Column Sum

0.0000000 Standard Deviation

CONV Convolution

4.2 Data Length Adjustment

Matrix Viewer

Display2

RMS

Matrix Sum

0.0000000 Display6

Display3

0.0000000 Display4

From Wave File s10_w10.wav Out (22050Hz/1Ch/16b)

Ten

Fig.3 Extended Simulink model for analysis of Urdu

UbiCC Journal - Volume 3

FFT execution time depends on exact number of the samples (N) in the data sequence [XK], and the execution time is minimal and proportional to N*log2(N), where N is a power of two. Therefore, it is often useful to choose the data length equal to a power of two [10]. 4.3 Endpoint Detection

Nine

spoken numbers

In general, the digitized speech waveform has a high dynamic range, and can suffer from additive noise. So first, a Simulink model was used to extract and analyze the acquired data as shown in Fig.3. Another Simulink model was developed for performing analysis such as standard deviation, mean, median, autocorrelation, magnitude of FFT and data matrix correlation. We also tried a few other statistical techniques, however, most of them failed to provide us any useful insight into the data characteristics. We would also like to mention that we had started our experiments by using Simulink, but found this GUI-based tool to be somewhat limited because we did not find it easy to create multiple models containing variations among them. This iterative and variable-nature of models eventually led us to MATLAB’s (text-based) .m files. We created these files semi-automatically by using a PERL-language script; the script was developed specifically for this purpose. Three main data pre-processing steps were required before the data could be used for analysis:

The goal of endpoint detection is to isolate the word to be detected from the background noise. It is necessary to trim the word utterance to its tightest limits, in order to avoid errors in the modeling of subsequent utterances of the same word. As we can see from the upper part of Fig. 4, a threshold has been applied at both ends of the waveform. The front threshold is normalized to a value that all the spoken numbers trim to a maximum value. These values were obtained after observing the behavior of the waveform and noise in a particular environment. We can see the difference in frequency characteristics of the words aik (one) to paanch

80

(five)

in

Fig

4–8

respectively.

4.4 Windowing Speech signal analysis also involves application of a window with a time less than the complete signal. The window first starts with beginning of the signal and then shifted until it reaches the end. Each application of the window to the part of the speech signal results in a spectral vector. 4.5 Frame Blocking

Fig.4 Correlation of the spoken Urdu number aik (one)

Since the vocal tract moves mechanically slow, speech can be assumed to be a random process with slowly varying properties. Hence the speech is divided into overlapping frames of 100 ms. The speech signal is assumed to be stationary over each frame and this property will prove useful in further operations [18]. 4.6

Fourier Transform

The MATLAB algorithm for the two dimensional FFT routine is as follows [9]:

Fig.5 Correlation of the spoken Urdu number dau (two)

fft2(x) =fft(fft(x), ‘); Thus the two dimensional FFT is computed by first computing the FFT of x, that is, the FFT of each column of x, and then computing the FFT of each row of the result. Note that as the application of fft2 command produced even symmetric data, we only show lower half of the frequency spectrum in our graphs. 4.7 Correlation Calculations for correlation coefficients of different speakers were performed [2]-[3]. As expected, the cross-correlation of the same speaker for the same word did come out to be 1. The correlation matrix of a spoken number was generated in a threedimensional form for generating different simulations and graphs. Figures 4 - 8 show the combined correlation extracted for fifteen different speakers for the Urdu number siffar (zero) to paanch (five), that differentiate the number spoken by different speaker for detailed analysis.

UbiCC Journal - Volume 3

Fig.6 Correlation of the spoken Urdu number teen (three)

Fig.7 Correlation of the spoken Urdu number char (four)

81

SPEAKER: s1 w2

700

600

500

400

300

200

100

0

Fig.8 Correlation of the spoken Urdu number paanch (five) Figures 9 - 14 show the FFT magnitude spectrum extracted for fifteen different speakers for the Urdu number siffar (zero) to paanch (five), that again differentiate the number spoken by different speaker for detailed analysis.

0

50

100

150

200

250

300

Fig 11. FFT magnitude spectrum for spoken Urdu number dau (two) SPEAKER: s1 w3

700

600

500

400

300

SPEAKER: s1 w0

800

200

700

100

600

0

0

50

100

150

200

250

300

500

Fig.12 FFT magnitude spectrum for spoken Urdu number teen (three)

400 300 200

SPEAKER: s1 w4

100 400

0

0

50

100

150

200

250

350

300

300

Fig.9 FFT magnitude spectrum for the spoken Urdu number siffar (zero)

250 200 150 100

SPEAKER: s1 w1

500

50 0

450

0

50

100

150

200

250

300

Fig.13 FFT magnitude spectrum for spoken Urdu number char (four)

400 350 300 250 200

SPEAKER: s1 w5

500

150

450

100

400

50

350

0

300

0

50

100

150

200

250

Fig.10 FFT magnitude spectrum for spoken Urdu number aik (one)

300

250 200 150 100 50 0

0

50

100

150

200

250

300

Fig.14 FFT magnitude spectrum for spoken Urdu number paanch (five)

UbiCC Journal - Volume 3

82

Figures 15-19 show the combined magnitude spectrum of correlation extracted for fifteen different speakers for the Urdu number siffar (zero) to paanch (five), shown here for brevity which differentiate the number spoken by different speaker for detailed analysis, while Fig 20 shows surface graph of the number aik (one).

SPEAKER ONE: s1 w4

900 800 700 600 500 400 300 200 100 0

SPEAKER ONE: s1 w0

1800

0

2000

4000

6000

8000

10000

12000

Fig.18 Magnitude spectrum of the correlation of the spoken Urdu number spoken chaar(four) by speaker1

1600 1400 1200 1000 800

SPEAKER ONE: s1 w5

1200

600 1000

400 200 0

800

0

2000

4000

6000

8000

10000

12000

600

Fig.15 The waveform of the correlation of the spoken Urdu number siffar (zero) by speaker-1

400

200

0

SPEAKER TWO: s2 w0

600

0

2000

4000

6000

8000

10000

12000

Fig.19 Magnitude spectrum of the correlation of the spoken Urdu number spoken paanch (five) by speaker1

500

400

300

SPEAKERS 15 WORD 1 surface 200 1

100

0.8 0.6

0

0

2000

4000

6000

8000

10000

0.4

12000

0.2

Fig.16 The waveform of the correlation of the spoken Urdu number siffar (zero) by speaker-2

0 15 15

10 10

5

5 0

SPEAKER THREE:s3 w0

1800

0

Fig.20 The surface plot of the correlation of the spoken Urdu numbers spoken aik (one) by speaker-15

1600 1400 1200 1000

4.8

Creating a Network

800 600 400 200 0

0

2000

4000

6000

8000

10000

12000

Fig.17 Magnitude spectrum of the spoken Urdu number siffar (zero) by speaker-3

To create a feedforward neural network (MLP) object in an interactive way to use the command nntool to open the Neural Network Manager. We can then import or define data and train our network in the GUI or export our network to the command line for training [5]. net = newff(MxMx, [S1 S2 . . . SK], {TF1 TF2 . . . TFK}, BTF, BLF, PF)

UbiCC Journal - Volume 3

83

%where MnMx (p − 1 × 2) matrix of min and max values for x input vectors. %Si Size of ith layer, for K layers. %TFi Transfer function of ith layer, default = tansig. %BTF Backpropagation network training function, default = traingdx. %BLF Backpropagation weight/bias learning function, default = learngdm. %PF Performance function, default = mse. 4.9 Network Structure and Parameters %Neural Network object: ... %subobject structures: inputs: {1x1 cell} of inputs layers: {2x1 cell} of layers outputs: {1x2 cell} containing 1 output targets: {1x2 cell} containing 1 target %biases: {2x1 cell} containing 2 biases inputWeights: {2x1 cell} containing 1 input weight layerWeights: {2x2 cell} containing 1 layer weight ... %weight and bias values: IW: {2x1 cell} containing 1 input weight matrix LW: {2x2 cell} containing 1 layer weight matrix b: {2x1 cell} containing 2 bias vectors 4.10 Network Training / Testing In incremental training the weights and biases of the network are updated each time an input is presented to the network. The cell array format is most convenient to be used for networks with multiple inputs and outputs, and allows sequences of inputs to be presented. The matrix format can be used if only one time step is to be simulated, so that the sequences of inputs cannot be described. There are two basic functions available for the incremental training: learngd and learngdm. Refer to the manual for details. % learning rate net.trainParam.lr = 0.01; net.trainParam.epochs = 6000; % tolerance net.trainParam.goal = 0.01; net=train(net,p,t); %test the network with y_start = sim(net,p); figure(200),plot(t,'g'); 4.11 System Workflow

UbiCC Journal - Volume 3

This section describes the system workflow for creating, initializing, training and running the network on test data. The following sub sections demonstrate the Word Network to demonstrate the system workflow of the demo program. The first step is to create and initialize the Neural Network and set up the network configuration. net=newff(minmax_of_p,[64, 32, 32, {'tansig','tansig','tansig','purelin'},'traingd'); // this is input layer. Layer = 64 // this is hidden layer no 1. Layer = 35 // this is hidden layer no. 2 Layer = 35 // this is output layer. Layer = 1

1;],

4.12 Data Structure Training Data training is an array list of a structure which is presented. Each of the initial four string values are part of the input vector provided to the network. The last string variable represents the target output for the network in the code fragmented below: net.trainParam.show = 50; % x-axis of graph is 50 values apart net.trainParam.lr = 0.01; % learning rate net.trainParam.epochs = 6000; net.trainParam.goal = 0.01; % tolerance net=train(net,p,t); 4.13 Setting and Loading Training Data Following code loads up the default training data into the training data list. The default training data includes hard-coded input and output patterns for a number of different homonyms word scenarios along with their corresponding suggested output. % Minimum array size has been kept 1e10 min_arr_size = 1e10; % Read data from .wav files and calculate magnitudes from FFTs % 15 speakers have spoken the same word pronounced as "siffar" s1_w0 = wavread('s1_w0.wav');

84

s1_w0_norm = s1_w0/max(s1_w0); s1_w0_fft0 = abs(fft2(s1_w0_norm)); min_arr_size =min(min_arr_size, size(s1_w0_norm)); % Minimum array size has been kept 1e10 min_arr_size = 1e10;

In the above code, notice that the first line in the function is the training of the word structure network, which the Word Network uses to first detect the problem in the network and then try to identify the alternate word for the incorrect word in word engine. 4.15 Pattern Detecting

% Read data from .wav files and calculate magnitudes from FFTs % 15 speakers have spoken the same word pronounced as "siffar" s1_w0 = wavread('s1_w0.wav'); s1_w0_norm = s1_w0/max(s1_w0); s1_w0_fft0 = abs(fft2(s1_w0_norm)); min_arr_size= min(min_arr_size, size(s1_w0_norm)); …………..

Once the network is trained, it can be used to detect the correct output for the given input vector. The following functions take the input pattern as an array and use the function to retrieve the output from the network. % Creating new array 'alldat (:,:)' for NN use ...

s15_w0 = wavread('s15_w9.wav'); s15_w0_norm = s2_w0/max(s15_w9); s15_w0_fft0 = abs(fft2(s15_w9_norm)); min_arr_size=min(min_arr_size, size(s15_w9_norm))

alldat(2,1) = 1; alldat(2,2) = 2; alldat(2,3:258) = s1_w2_fft0(1:256);

4.14 Network Training Before using the Neural Networks in the demo program to detect and classify correct word structure as well as able to find out the alternate word for the incorrectly recognized word (by the Speech Recognition engine), we need to train the Word Network with the training data. The train network uses the object to train the word frequency and network 6000 and 1500 times respectively. The following code snippet shows how the word network gets trained.

alldat(1,1) = 1; alldat(1,2) = 1; alldat(1,3:258) = s1_w1_fft0(1:256);

alldat(3,1) = 1; alldat(3,2) = 3; alldat(3,3:258) = s1_w3_fft0(1:256); 4.16 Setting and Loading Test Data The demo program is capable of loading the required input for the neural network from the wave file. The wave file that contains the input sentences to be detected by the neural network, however should be in a specific format so that the program can correctly load and feed the input data to the neural network and generate the correct output. load 'neural_network.mat'

cutoff = 64; for i=1:150; for j=3:cutoff+2; ptrans(i,j-2) = log(alldat(i,j)); end end p = transpose(ptrans); % inputs t = transpose(alldat(:,2)); % target array for i = 1:cutoff minmax_of_p(i,1) = min(p(i,:)); end for i = 1:cutoff minmax_of_p(i,2) = max(p(i,:)); end

UbiCC Journal - Volume 3

% The data of 15 speakers spoken in Urdu from siffar (zero) to nau (nine). % Minimum array size has been kept 1e10 min_arr_size = 1e10; % Read data from .wav files and calculate magnitudes from FFTs % 15 speakers have spoken the same word pronounced as "siffar" s1_w0 = wavread('s1_w0.wav'); s1_w0_norm = s1_w0/max(s1_w0); s1_w0_fft0 = abs(fft2(s1_w0_norm)); min_arr_size= min(min_arr_size, size(s1_w0_norm));

85

% Creating new array 'alldat(:,:)' for NN use ...

alldat(3,1) = 1; alldat(3,2) = 3; alldat(3,3:258) = s1_w3_fft0(1:256); % min_arr_size = min_arr_size/2; min_arr_size = 11000; % 15 speakers have spoken the same word pronounced as "siffar". % To find min the array length so all arrays are clipped to it. s1_w0_fft(1:min_arr_size,1)=s1_w0_fft0(1:min_arr_ size, 1); s2_w0_fft(1:min_arr_size,1)=s2_w0_fft0(1:min_arr_ size, 1); s3_w0_fft(1:min_arr_size,1)=s3_w0_fft0(1:min_arr_ size, 1); s4_w0_fft(1:min_arr_size,1)=s4_w0_fft0(1:min_arr_ size, 1);

The effect of learning rate (α) on training is shown in Fig.25-38. In our experiments, a larger value of α took lesser time to train, although convergence to the steady state value was noisy. It shows that there exists a trade off between learning rate (α), network parameter setting and the epoch for which the algorithm was used. We observed that Fourier descriptor feature was independent of the spoken numbers, with the combination of Fourier transform and correlation technique commands used in MATLAB, a high accuracy recognition system can be realized. Recorded data was used in Simulink model for introductory analysis. The feedforward neural network was trained for different learning rates, goal and epochs. It was found that there is a trade off between learning rate, goal and epochs. The network was trained and tested as shown in Fig 21-30. Performance is 0.0089228, Goal is 0.01

2

10

1

10 Training-Blue Goal-Black

alldat(1,1) = 1; alldat(1,2) = 1; alldat(1,3:258) = s1_w1_fft0(1:256); alldat(2,1) = 1; alldat(2,2) = 2; alldat(2,3:258) = s1_w2_fft0(1:256);

0

10

-1

10

-2

10

For the above model we can examine structure of the created network nnet1 and its all initial parameters. Typing in nnet1 on the command line gives the structure of the top level neural network object. The extract of the structure is as follows: 5.

ANALYSIS & RESULTS

When we compared the frequency content of the same word by different speakers, we found striking similarities among them. This helped us to get more confidence in our initial hypothesis that a single word uttered by a diverse set of speakers would exhibit similar characteristics. Additionally, Fig.23 shows surface graph for the correlation of frequency content among different speakers, for words aik (one).

UbiCC Journal - Volume 3

-3

10

0

1

2

3

4 8 Epochs

5

6

7

8

Fig.21 Training on Class 5with 12 speakers Layers (64,35, 35,1) Learning rate 0.01 Goal 0.01 Performance is 0.00942263, Goal is 0.01

3

10

2

10

Training-Blue Goal-Black

% testing the network y_start = sim(net,p); figure(200),plot(t,'g'); hold on; plot(y_start,'b'); legend('actual','predicted'); hold off

1

10

0

10

-1

10

-2

10

-3

10

0

1

2

3

4

5 6 10 Epochs

7

8

9

10

Fig.22 Training on number 6 with 12 speakers Layers (64,35, 35,1) Learning rate 0.01

86

Goal

0.01

Performance is 0.00987841, Goal is 0.01

2

10

Performance is 0.00979687, Goal is 0.01

2

1

10

Training-Blue Goal-Black

10

1

Training-Blue Goal-Black

10

0

10

-1

10

-2

-3

10

10

-3

0

10

20

30 65 Epochs

40

50

60

Fig.23 Training on number 7 with 12 speakers Layers (64,35, 35,1) Learning rate 0.01 Goal 0.01

2000

Performance is 0.00994686, Goal is 0.01

2

1

Training-Blue Goal-Black

1

Training-Blue Goal-Black

1000 1500 2459 Epochs

10

10

0

10

-1

0

10

-1

10

-2

10

10

-2

-3

10

10

-3

0

1

2

3

4

5 6 11 Epochs

7

8

9

10

11

Fig.24 Training on number 8 with 12 speakers Layers (64,35, 35,1) Learning rate 0.01 Goal 0.01

500

1000 2285 Epochs

1500

2000

Performance is 0.00955099, Goal is 0.01

2

10

Performance is 0.00967774, Goal is 0.01

2

0

Fig.27 Training on 12 speakers Layers (64,35, 35,1) Learning rate 0.01 Goal 0.01

1

10

Training-Blue Goal-Black

10

1

10 Training-Blue Goal-Black

500

10

Performance is 0.00912022, Goal is 0.01

2

0

Fig.26 Training on 15 speakers Layers (64,35, 35,1) Learning rate 0.01 Goal 0.01

10

10

-1

10

-2

10

10

0

10

0

10

-1

0

10

-1

10

-2

10

10

-2

-3

10

10

0

500

1000

1500

1569 Epochs -3

10

0

2

4

6 12 Epochs

8

10

12

Fig.25 Training on number 9 with 12 speakers Layers (64,35, 35,1)

UbiCC Journal - Volume 3

Fig.28 Training on 10 speakers Layers (64,35, 35,1) Learning rate 0.01 Goal 0.01

87

7 64, 10, 10, 10 64.44% 8 64, 15, 15, 10 69.63% 9 64, 20, 20, 10 58.52% 10 64, 25, 20, 10 71.85% 11 64, 25, 25, 10 71.85% 12 64, 30, 30, 1 72.59% 13 64, 35, 35, 1 94% Table 1. Training of Neural Network with different number of speakers

Performance is 0.0098974, Goal is 0.01

2

10

1

Training-Blue Goal-Black

10

0

10

-1

10

-2

10

-3

10

0

200

400

600 1246 Epochs

800

1000

1200

Fig.29 Training on 7 speakers Layers (64,35, 35,1) Learning rate 0.01 Goal 0.01

S No

No of Number of Accurac Training Testing y Rate % speakers Speaker 1 15 5 94 2 12 4 95 3 10 3 96.7 4 7 2 95 5 5 2 100 Table 2. Training of Neural Network with different number of speakers showing accuracy rate.

Performance is 0.00996304, Goal is 0.01

2

10

1

10 Training-Blue Goal-Black

The feedforword neural network was trained for number of speakers with same layers, learning rate, goal and epochs. It was found that there is trade off between neural network, learning rate, goal and epochs as shown in table-2.

0

10

-1

10

-2

10

-3

10

0

200

400

600 1243 Epochs

800

1000

1200

Fig.30 Training on 5 speakers Layers (64,35, 35,1) Learning rate 0.01 Goal 0.01 We created and tested the networks in different configurations, especially the hidden layer size. The following table shows the learning accuracy with some of the networks. A maximum accuracy of 94% was achieved with double hidden layers network (64, 35, 35 1). Therefore double-hidden layers should suffice for application as shown in table 1. S No 1 2 3 4 5 6

Neuron count 64, 10, 10 64, 20, 10 64, 30, 10 64, 40, 10 64, 60, 10 64, 80, 10

UbiCC Journal - Volume 3

Learning accuracy 81.48% 81.48% 72.59% 83.70% 62.96% 45.93%

Relationship between network layers and training data can be formulated from the effect of learning rate (α) on training is shown in Fig 21-30. In experiments, a large value of α took lesser time to train, although convergence to the steady state value was noisy. It shows that there exists a trade off between layers of the network (network parameter setting), learning rate (α), and the epoch for which the algorithm was used. 6.

CONCLUSION

In this paper, we presented recognition analysis of Urdu numbers siffar to nau (one to nine), which is totally an unique idea. The data was acquired in moderate noisy environment by word utterances of 15 different speakers. FFT algorithm was used in MATLAB to analyze the data. As expected, we found high correlation among frequency contents of the same word, when spoken by many different speakers. We have investigated creation of neural network models for automatically recognizing individual Urdu numbers to be specific. The feedforward neural network was trained for different learning rates; combined and individually for different goals and

88

epochs. It was found that there is a trade off between learning rate, goal and epochs. It means all values to be adjusted at a level of learning rate, parameter goals and epochs. We are still in further development stage of the system using TI TMS 320C6713. This recognition system could be of many potential applications, for example, voice-driven menu selection in a telephonebased customer service in Urdu speaking countries such as Pakistan/India. 7.

REFERENCES

[1] S K Hasnain, Azam Beg and Samiullah Awan, “Frequency Analysis of Urdu Spoken Numbers Using MATLAB and Simulink” Journal of Science & Technology PAF KIET ISSN 1994862x, Karachi, Dec. 2007. [2] D. O’Shaughnessy. Speech Communication: Human and Machine. Addison Wesley Publishing Co., 1987. [3] T. Parsons. Voice and Speech Processing. McGraw-Hill College Div., Inc, 1986. [4] S K Hasnain, Pervez Akhter, “Digital Signal Processing, Theory and Worked Examples,” 2007. [5] MATLAB User’s Guide. Mathworks Inc., 2006. [6] DSP Blockset (For use with Simulink) User’s Guide Mathworks Inc., 2007. [7] Samuel D Stearns, Ruth A David, “Signal Processing Algorithms in MATLAB,” Prentice Hall, 1996. [8] “TMS320C6713 DSK User’s Guide,” Texas Instruments Inc., 2005.

on Engineering Education, Coimbra, Portugal, September 3-7, 2007. [13] A. Beg. Predicting Processor Performance with a Machine Learnt Model. IEEE International Midwest Symposium on Circuits and Systems, MWSCAS/NEWCAS 2007, Montreal, Canada, August 5-8, 2007, pp. 1098-1101. [14] A. Beg, P.W.C. Prasad, S.M.N.A. Senanayake. Learning Monte Carlo Data for Circuit Path Length. In Proc. International Conference on Computers, Communications & Control Technologies, CCCT 2007, Orlando, Florida, July 12-14, 2007. [15]A. Beg, P.W.C. Prasad, M. Arshad, S K Hasnain. Using Recurrent Neural Networks for Circuit Complexity Modeling", In Proc. IEEE INMIC Conference, Islamabad, Pakistan, December 2324, 2006, pp. 194-197. [16] S K Hasnain, Nighat Jamil, “Implementation of Digital Signal Processing real time Concepts Using Code Composer Studio 3.1, TI DSK TMS 320C6713 and DSP Simulink Blocksets,” IC-4 conference, Pakistan Navy Engineering College, Karachi, Nov. 2007 [17] M M El Choubassi, H E El Khoury, C E Jabra Alagha, J A Skaf, M A AL Alaoui, “Arabic Speech Recognition Using Recurrent Neural Networks,” Symp. Signal Processing & Info. Tech., 2003, ISSPIT 2003, Dec. 2003, pp. 543547. [18] M A Al-Alaoui, R Mouci, M M Mansour, R Ferzli, “A Cloning Approach to Classifier Training,” IEEE Trans. Systems, Man and Cybernetics – Part A: Systems and Humans, vol. 32, no. 6, pp. 746-752, 2002.

[9] G. C. M. Fant. Acoustic Theory of Speech Production. Mouton, Gravenhage, 1960. [10] S K Hasnain, Aisha Tahir, “Digital Signal Processing Laboratory Workbook”, 2006. [11] S. Varho. New Linear Predictive Systems for Digital Speech Processing. PhD dissertation, Helsinki University of Technology, Finland, 2001. [12] A. Beg, W. Ibrahim. An Online Tool for Teaching Design Trade-offs in Computer Architecture. In Proc. International Conference

UbiCC Journal - Volume 3

89

Related Documents

Final - Volume 3 No 2
December 2019 9
Final-volume 3 No 5
December 2019 7
Final - Volume 3 No 6
December 2019 5
Final - Volume 3 No 4
December 2019 25
Volume 110, No. 3
June 2020 7
Volume 3, Issue 2
June 2020 5