信号与系统课程设计报告

  • Uploaded by: Fan Wang
  • 0
  • 0
  • April 2020
  • PDF

This document was uploaded by user and they confirmed that they have the permission to share it. If you are author or own the copyright of this book, please report to us by using this DMCA report form. Report DMCA


Overview

Download & View 信号与系统课程设计报告 as PDF for free.

More details

  • Words: 3,736
  • Pages: 17
信号与系统

课程设计报告

设计题目:快速傅立叶变换(FFT)的计算机实现 完 成 人:王 璠 班号:电气 0612 班 姓名:王 璠 学号:012006019801

二 OO 八年 十一 月

Fast Fourier Transform Computation Using a C8051 MCU Fan Wang (Huazhong University of Science & Technology, Wuhan, 430074)

Abstract: A method of calculating FFT using a C8051 MCU is introduced in this paper. The software is aimed at computational efficiency and accuracy. By three key techniques: (1)avoidance of multiplication whenever possible; (2)16-bit integer storage; (3)on-chip PLL, a real-time audio spectrum is achieved. The results are displayed on a LCD screen and carefully analyzed. Basic theories of FFT and common problems in practice are discussed in detail.

Ⅰ INTRODUCTION The Fast Fourier Transform(FFT) is an efficient algorithm for digital N-point DFT computation. The algorithm was first used by Gauss in 1805, and rediscovered in the 1960s by Cooley and Tukey. The most widely used method of Cooley-Tukey Algorithm is called Radix-2 Algorithm, which is applied in this paper. FFT is of great importance in current digital processing systems, and there are specially designed processors for this kind of use, for example, DSPs. Unfortunately, the processor introduced in this system---C8051 MCU---is mostly used in simple control systems and is not specially designed for multiplications and additions. Therefore, the purpose of this paper is not a practical industrial solution, but a deep understanding of the basic theories of signal processing systems and the way of putting theories into practice.

Ⅱ BACKGROUNDS A. Discrete Fourier Transform and Fast Fourier Transform For a length N complex sequence x(n), n=0,1,2,…N-1, the discrete Fourier transform(DFT) is defined by N −1

X (k ) = ∑ x(n)WNnk , k = 0,1,2,...N − 1 where WN = e − j 2π / N . n =0

According to this equation, an N point DFT calculation requires N 2 times of multiplication and N(N-1) times of addition. As N increases, the computational -1-

complexity becomes larger. Fast Fourier Transform is a fast structure to compute the DFT. Note that WNnk has two important properties: periodicity and symmetry, which are the fundamentals of FFT algorithm. Using the symbol ((nk )) N to represent the remainder from a division of nk by N, it is easy to prove that W nk = W (( nk )) N . For example, when N=4, we have

W 6 = W 2 ,W 9 = W 1 . Another property can be explained by the equation

W ( nk + N / 2) = −W nk . Again taking N=4 as an example, we have W 3 = −W 1 , W 2 = −W 0 . Hence, the number of multiplications required can be reduced to some extent. When N is a power of 2 , say N = 2 M ,where M is a positive integer, the DFT summation can be split into sums over the odd and even indexes of the input signal: N −1

X (k ) = ∑ x(n)WNnk n =0

=

∑ x(n)W

nk N

+

n ~ even

=

N −1 2

∑ x(2n)W n =0

∑ x(n)W

nk N

n ~ odd

kn N /2

+W

k N

N −1 2

∑ x(2n + 1)W n =0

nk N /2

= G (k ) + WNk H (k ) where G (k ) =

N −1 2

∑ x(2n)W n =0

kn N /2

, H (k ) =

N −1 2

∑ x(2n + 1)W n =0

nk N /2

. Now that the previous N-

point DFT becomes two N/2-point DFTs, the complexity is reduced. According to the N symmetry property, WNk + N / 2 = −WNk , so we have X (k + ) = G (k ) − WNk H (k ) . In 2 conclusion, X (k ) = G (k ) + WNk H (k ) k = 0,1,2,...N / 2 − 1 X (k + N / 2) = G (k ) − WNk H (k ) k = 0,1,2,...N / 2 − 1 Suppose that N = 4 , we have the FFT process as the following: X (0) = G (0) + W40 H (0) X (1) = G (1) + W41 H (1) X (2) = G (0) − W40 H (0) X (3) = G (1) − W41 H (1)

-2-

In the decimation in time FFT, the computation flow graph of the time domain sequence can be briefly illustrated by a butterfly diagram. A diagram of 8-point sequences is shown in Figure 1. The diagram immediately suggests that three loops should be included in the software: the stage loop, the group loop, and the butterfly loop. As the diagram shows, there are v = log 2 N stages, each stage m, has 2 v −m groups and each group has 2 m−1 butterflies.

Fig.1. Butterfly diagram of a 8-point FFT algorithm.

B. FFT Bit-Reversal Algorithm

Since the transformed complex numbers are not in normal order, more storage space is needed by the intermediate number, in order to obtain the right results. To minimize the storage space required for intermediate data, either the input data is required to be scrambled or the output data needs to be unscrambled. The butterfly diagram illustrates that if the input data is in bit-reversed order, the results will be in the normal order; if the input data is in normal order, the output data can be unscrambled to normal by a process of bit-reversal.

-3-

C. Anti-aliasing Filtering and Windowing

If the input signal has an high frequency component which is higher than half of the sampling frequency, it may be mis-interpreted to have lower frequencies which are not exist. This effect is known as frequency aliasing. To avoid this, the input signal should be low-pass filtered. When dealing with practical sytems, the sampling signal is time limited. In time domain, the signal is multiplied by a square wave function (a rectangular window) of the time. But the software makes the assumption that the data is periodic and repeated infinitely. Taking DFT transform of a sampled sinusoidal signal, we obtain a spectrum which has the shape of a sinc function. In this way, the energy of the sinusoidal signal leaks to a higher spectrum. In practice, the main lobe of the sinc function is desired to be narrow and high enough, while the side lobe level should be low. Several sampling windows has been invented to make a practical trade-off of the two. Triangular, Hanning, Hamming and Blackman windows are all common windows in practical, which will be discussed in advance in Section Ⅲ. D. Function Overview of the C8051 MCU

Fig.2. C8051F120 architecture overview.

The processor of the computational system is a high performance C8051 MCU by Silicon Laboratories™—C8051F120. The chip is specially designed for high speed mixed signal processing, meeting the requirements of low level customers. It’s rich analog on-chip sources make it a ideal tool for many mixed signal systems. For example, a 12-bit, 8-channel, 100kHz sample rate ADC, two 12-bit DACs, two analog comparators, a voltage reference, etc. Its pipelined instruction architecture 8051 core -4-

enables it to execute most of instruction set in only 1 or 2 system clocks. Under the help of the on-chip PLL, the peak speed of the chip can reach as high as 100MIPS. At the same time, a 2-cycle 16×16 MAC engine accelerates the computation efficiently.

Ⅲ SOFTWARE DESIGN The computation flow of N point FFT is in the following five steps: 1. Data collection. 2. Windowing the input data. 3. Changing the data into bit-reversed order. 4. FFT computation. 5. Magnitude modification and LCD display. There are five important functions correspond to the five steps. The explanations are as follows: A. ADC0_ISR( )

The 12-bit on-chip ADC0 collects data at a designed a sample rate. When a Timer3 overflow happens, a data-collecting conversation is initiated. The data is then stored in the ADC0H :ADC0L registers in a left-justified form, that is, the higher 12 bits are the data bits, while the last 4 bits are not used. So the range of the input data is from 0x0000 to 0xFFF0. The overflow interval of Timer3 is determined by a configurable macro variable SAMPLE_RATE, so that the sample rate of the system is determined. After the N-point data samples have been collected, ADC0 interrupt is disabled. Then comes the next function. B. WindowCalc( )

Four types of windows are available in this software. Each of them has its own coefficients, which will be multiplied with the time domain data. The coefficient equations are listed below. Type

Equation

1: Triangular

W(n)=1 W(n)=n/(N/2) (0≤n≤N/2)

2: Hanning

W(n)=2-n/(N/2) (N/2
3: Hamming

W(n)=0.54-0.46cos(2πn/N)

4: Blackman

W(n) = 0.42 - 0.5cos(2πn / N) +0.08cos(4πn / N)

0: None(Rectangular)

-5-

The window coefficients are pre-calculated and stored in the header file FFT_Code_Tables.h. The type define variable WINDOW_TYPE can be selected from 0 to 4. The function also converts the single-ended data collected by ADC0 into differential, in order to remove the DC component of the input signal. The process is carried out by XOR the input data with 0x8000, which has the same effect with subtracting 32768 from the input data, so that the data ranges from 32752 to -32768. C. Bit_Reverse( )

From the butterfly diagram, we know that either the input data or the output FFT array requires a bit reverse operation. In this software, the input data is bit-reversed. Take a 8-point FFT as an example: Previous Data

Previous Index

New Index

Desired Data

x(0)

000

000

x(0)

x(1)

001

100

x(4)

x(2)

010

010

x(2)

x(3)

011

110

x(6)

x(4)

100

001

x(1)

x(5)

101

101

x(5)

x(6)

110

011

x(3)

x(7)

111

111

x(7)

To save data storage space, only half of the bit-reverse array is stored, the other half is useless for the data will be exchanged by the first half of operation. For example, the 8-point FFT listed above may use the following bit-reverse array: BRTable[ ]={0,2,1,3};

And the sentence NewIndex=BRTable[PreviousIndex]*2 will help us location the nth reversed index. D. Int_FFT( )

In this algorithm, complex numbers are divided into real parts and imaginary parts. The modified input data has only real parts and is stored in the array Real[]. The later stages will produce imaginary parts as well, which will be stored in Imag[]. Again check the butterfly equation, when WN is divided into real part and imaginary part, the single set of butterfly equation should be:

-6-

Re1 = Re1 + (cos(x) × Re2 + sin(x) × Im2) Re2 = Re1 - (cos(x) × Re2 + sin(x) × Im2) Im1 = Im1 + (cos(x) × Im2 - sin(x) × Re2) Im2 = Im1 - (cos(x) × Im2 - sin(x) × Re2) where

Re1 = ReArray[indexA], Re2 = ReArray[indexB] Im1 = ImArray[indexA], Im2 = ImArray[indexB] x = the angle: 2×pi×(sin_index/NUM_FFT), in radians.

According to Figure.1, the flow chart of the FFT is drawn as the following:

Fig.3. Software flow pattern of Int_FFT().

-7-

To improve the speed of the FFT computation, integer numbers are used instead of float numbers. All of them are set to be 16-bit int. Multiplications are avoided whenever possible, and all divisions are implemented using right shifts. But this method may cause some asymmetrical effects when the number is negative and the deleted bits are non-zero. For example, -3/2=-1, but -3>>1=-2. A one should be added to the result. In general, if m is a signed int or long int, to obtain m / 2 k , the following sentences will be helpful: if((m<0)&&(the lower k bits of m is non-zero)) result=m>>k+1; else result=m>>k;

To minimize the calculation error and minimize the data storage requirement at the same time, the data is multiplied in a 32-bit form and stored in a 16-bit form. A union type of variable is defined to meet the needs. typedef union IBALONG // Integer or Byte-addressable LONG { long l; // long: var.l unsigned int i[2]; // u int: var.i[0]:var.i[1] unsigned char b[4]; // u char: var.b[0]:var.b[1]:var.b[2]:var.b[3] } IBALONG;

The computation is carried out by the following sentences: TempL.l = (long)TempImA - TempReB; //Convert to long. if ((TempL.l < 0)&&(0x01 & TempL.b[3])) //Divided by 2. TempImA = (TempL.l >> 1) + 1; else TempImA = TempL.l >> 1;

Another example is: TempL.i[1] = 0; TempL.i[0] = TempReA; TempL.l = TempL.l >> 1; ReTwid.l += TempL.l; if ((ReTwid.l < 0)&&(ReTwid.i[1])) TempReA2 = ReTwid.i[0] + 1; else TempReA2 = ReTwid.i[0];

//Divided by 65536

Note that TempL and ReTwid in the above two examples both have the type IBALONG. Why does this software have to calculate these divisions? There are two reasons: SinTable[], in which the needed sinusoidal values are stored, has the type of int; the FFT algorithm has an inherent processing gain, which will result in the overflow of data if not handled. Therefore, when the SinTable[] is used, it has to be divided by 65536, and at the end of each butterfly equation, the intermediate variable has to be divided by 2. More details are shown in the appendix.

-8-

E. Spectrum_Display( )

The FFT result is in complex form, of which the real part is stored in Real[], and the imaginary part is stored in Imag[]. Based on the two arrays, the magnitude spectrum and phase spectrum can be easily obtained. This software only calculates power spectrum of the input signal. The spectrum is displayed on a 128×64 point LCD screen.

Ⅳ EXPERIMENT RESULTS AND DISCUSSION An EE1643C Function Signal Generator serves as the system’s testing signal source. The signal is first filtered by an anti-aliasing filter with 32kHz cut-off frequency (1 order passive low-pass), and then sampled by ADC0 at a certain sampling rate. Four configurable macro variables determine the system’s performance. The chart shows what they are. Name of variable

Function

NUM_FFT

Number of points of FFT algorithm

SAMPLE_RATE

Sampling rate of ADC0 in Hz

WINDOW_TYPE

Type of windows

RUN_ONCE

Program runs once when the value is 0,many times when the value is non-zero.

Further discussions are started by changing the values of the above 4 variables or the wave form of the function generator. A. Function Change

(i) Magnitude and Frequency Change of a Sinusoidal Signal The power spectrum displays the energy distribution on every frequency. When the magnitude of a sinusoidal signal changes, the height of the “energy bars” will change.

Fig.TR.1. A 2.5kHz 1.3V sine.

Fig.TR.2 A 2.5kHz 2.0V sine.

-9-

When the frequency of the sinusoidal changes, the “energy bar” which has the same height will move to another place.

Fig.TR.3 A 2.5kHz 2.0V sine.

Fig.TR.4 A 5kHz 2.0V sine.

(ii) Wave Form Change Sinusoidal signals have only one frequency component, just as the above four pictures illustrate. But triangular and rectangular have abundant frequency components.

Fig.TR.5. Triangular signal.

Fig.TR.6. Rectangular signal.

B. Software Configuration Change

(i) NUM_FFT Change More points of FFT transform will result in a more accurate spectrum with lower distortion and energy leakage.

Fig.TR.7. A 256-point square wave

Fig.TR.8. A 128-point square wave

(ii) WINDOW_TYPE Change As mentioned in the background section, window type has some effects on the spectrum leakage. The leaked energy becomes smaller after the process of windowing the input data. Only Blackman window is checked in this experiment, for other windows have similar effects which are hardly distinguished from each other.

- 10 -

Fig.TR.9. Rectangular without windowing.

Fig.TR.10.Rectangular with Blackman window.

(iii) SAMPLE_RATE Change All the above examples are signal samples of a sampling rate of 40kHz. According to the Nyquist sampling theory, frequencies lower than 20kHz can be easily located. So the full range of the screen is set to be 20kHz. But when the sampling rate is divided by 2, a range of 10kHz spectrum is obtained.

Fig.TR.11. 5k sine at 40kHz sampling rate.

Fig.TR.12. 5k sine at 20kHz sampling rate.

(iv) RUN_ONCE Change To check the speed of the system, RUN_ONCE is set to zero. For all audio signals at a sampling rate of 40kHz, the LCD screen never blinks and a steady frequency spectrum of the input time domain signals can be read without any obstacles. This proves the real-time property of the system. C. Property of Symmetry

The real part of FFT output exhibits even symmetry and imaginary part of FFT output exhibits odd symmetry. This results in the symmetry of frequency spectrum. That is, for N-point FFT products, point N/2+1 is the same as point N/2-1, point N/2+2 is the same as point N/2-2, etc. To illustrate this property, set the full range of the screen to 40kHz. Here are the results.

Fig.TR.13. 2kHz square wave at full range.

Fig.TR.14. 13kHz square wave at full range.

- 11 -

Ⅴ CONCLUSION In this paper, a method of Fast Fourier Transform computation based on C8051 MCU is introduced. Computational efficiency and accuracy are improved by the use of three techniques: (1) avoidance of multiplication whenever possible; (2) 16-bit integer storage; (3) on-chip PLL. Basic concepts of DFT and FFT are introduced in detail. Practical problems such as bit-reversal, frequency aliasing and spectrum leakage are carefully discussed and well solved by software or hardware solutions. Some typical wave forms’ spectrums and properties of the FFT spectrums are checked by a LCD screen.

Ⅵ REFERENCES [1] Edward W. Kanmen and Bonnie S. Heck, Fundamentals of Signals and Systems: Using the Web and MATLAB, 2nd ed. Pearson Education, 2002. [2] Julius O. Smith Ⅲ, Mathematics of the Discrete Fourier Transform (DFT) with

Audio Applications, 2nd ed. http://ccrma.stanford.edu/~jos/mdft/ [3] Junli Zheng, Signals and systems, 2nd ed. Higher Education Press, 2000. [4] AN142 “FFT Routines for the C8051F12X Family”, Rev.1.1. Inc. ©Silicon Laboratories, 2003. [5] “FFT Library-Module user’s Guide C28x Foundation Software”. Inc. ©Texas Instruments, 2002.

- 12 -

APPENDIX A. Important Codes //----------------------------------------------------------------------------// Int_FFT //----------------------------------------------------------------------------void Int_FFT(int ReArray[], int ImArray[]) { #if (NUM_FFT >= 512) unsigned int sin_index, g_cnt, s_cnt; // Keeps track of the proper index unsigned int indexA, indexB; // locations for each calculation #endif #if (NUM_FFT <= 256) unsigned char sin_index, g_cnt, s_cnt; // Keeps track of the proper index unsigned char indexA, indexB; // locations for each calculation #endif unsigned int group = NUM_FFT/4, stage = 2; long CosVal, SinVal; long TempImA, TempImB, TempReA, TempReB, TempReA2, TempReB2; IBALONG ReTwid, ImTwid, TempL;

// FIRST STAGE indexA = 0; for (g_cnt = 0; g_cnt < NUM_FFT/2; g_cnt++) { indexB = indexA + 1; TempReA = ReArray[indexA]; TempReB = ReArray[indexB]; // Calculate new value for ReArray[indexA] TempL.l = (long)TempReA + TempReB; if ((TempL.l < 0)&&(0x01 & TempL.b[3])) TempReA2 = (TempL.l >> 1) + 1; else TempReA2 = TempL.l >> 1; // Calculate new value for ReArray[indexB] TempL.l = (long)TempReA - TempReB; if ((TempL.l < 0)&&(0x01 & TempL.b[3])) ReArray[indexB] = (TempL.l >> 1) + 1; else ReArray[indexB] = TempL.l >> 1; ReArray[indexA] = TempReA2; ImArray[indexA] = 0; // set Imaginary locations to '0' ImArray[indexB] = 0; indexA = indexB + 1; }// END OF FIRST STAGE while (stage <= NUM_FFT/2) { indexA = 0; sin_index = 0; for (g_cnt = 0; g_cnt < group; g_cnt++) { for (s_cnt = 0; s_cnt < stage; s_cnt++) { indexB = indexA + stage; TempReA = ReArray[indexA]; TempReB = ReArray[indexB];

- 13 -

TempImA = ImArray[indexA]; TempImB = ImArray[indexB]; // The following first checks for the special cases when the angle "x" is // equal to either 0 or pi/2 radians. In these cases, unnecessary // multiplications have been removed to improve the processing speed. if (sin_index == 0) // corresponds to "x" = 0 radians { TempL.l = (long)TempReA + TempReB; if ((TempL.l < 0)&&(0x01 & TempL.b[3])) TempReA2 = (TempL.l >> 1) + 1; else TempReA2 = TempL.l >> 1; // Calculate new value for ReArray[indexB] TempL.l = (long)TempReA - TempReB; if ((TempL.l < 0)&&(0x01 & TempL.b[3])) TempReB2 = (TempL.l >> 1) + 1; else TempReB2 = TempL.l >> 1; // Calculate new value for ImArray[indexB] TempL.l = (long)TempImA - TempImB; if ((TempL.l < 0)&&(0x01 & TempL.b[3])) TempImB = (TempL.l >> 1) + 1; else TempImB = TempL.l >> 1; // Calculate new value for ImArray[indexA] TempL.l = (long)TempImA + TempImB; if ((TempL.l < 0)&&(0x01 & TempL.b[3])) TempImA = (TempL.l >> 1) + 1; else TempImA = TempL.l >> 1; } else if (sin_index == NUM_FFT/4) // corresponds to "x" = pi/2 radians { // Calculate new value for ReArray[indexB] TempL.l = (long)TempReA - TempImB; if ((TempL.l < 0)&&(0x01 & TempL.b[3])) TempReB2 = (TempL.l >> 1) + 1; else TempReB2 = TempL.l >> 1; // Calculate new value for ReArray[indexA] TempL.l = (long)TempReA + TempImB; if ((TempL.l < 0)&&(0x01 & TempL.b[3])) TempReA2 = (TempL.l >> 1) + 1; else TempReA2 = TempL.l >> 1; // Calculate new value for ImArray[indexB] TempL.l = (long)TempImA + TempReB; if ((TempL.l < 0)&&(0x01 & TempL.b[3])) TempImB = (TempL.l >> 1) + 1; else TempImB = TempL.l >> 1; // Calculate new value for ImArray[indexA] TempL.l = (long)TempImA - TempReB; if ((TempL.l < 0)&&(0x01 & TempL.b[3])) TempImA = (TempL.l >> 1) + 1; else TempImA = TempL.l >> 1; } else { if (sin_index > NUM_FFT/4) - 14 -

{ SinVal = SinTable[(NUM_FFT/2) - sin_index]; CosVal = -SinTable[sin_index - (NUM_FFT/4)]; } else { SinVal = SinTable[sin_index]; CosVal = SinTable[(NUM_FFT/4) - sin_index]; } // The SIN and COS values are used here to calculate part of the // Butterfly equation ReTwid.l = ((long)TempReB * CosVal) + ((long)TempImB * SinVal); ImTwid.l = ((long)TempImB * CosVal) ((long)TempReB * SinVal);

TempL.i[1] = 0; TempL.i[0] = TempReA; TempL.l = TempL.l >> 1; ReTwid.l += TempL.l; if ((ReTwid.l < 0)&&(ReTwid.i[1])) TempReA2 = ReTwid.i[0] + 1; else TempReA2 = ReTwid.i[0]; // Calculate new value for ReArray[indexB] TempL.l = TempL.l << 1; TempL.l -= ReTwid.l; if ((TempL.l < 0)&&(TempL.i[1])) TempReB2 = TempL.i[0] + 1; else TempReB2 = TempL.i[0]; // Calculate new value for ImArray[indexA] TempL.i[1] = 0; TempL.i[0] = TempImA; TempL.l = TempL.l >> 1; ImTwid.l += TempL.l; if ((ImTwid.l < 0)&&(ImTwid.i[1])) TempImA = ImTwid.i[0] + 1; else TempImA = ImTwid.i[0]; // Calculate new value for ImArray[indexB] TempL.l = TempL.l << 1; TempL.l -= ImTwid.l; if ((TempL.l < 0)&&(TempL.i[1])) TempImB = TempL.i[0] + 1; else TempImB = TempL.i[0]; } ReArray[indexA] = TempReA2; ReArray[indexB] = TempReB2; ImArray[indexA] = TempImA; ImArray[indexB] = TempImB; indexA++; sin_index += group; } // END of stage FOR loop (s_cnt) indexA = indexB + 1; sin_index = 0; } // END of group FOR loop (g_cnt)

} }

group /= 2; stage *= 2; // END of While loop // END Int_FFT - 15 -

B. Photo of the System

- 16 -

More Documents from "Fan Wang"

April 2020 9
Fft
April 2020 10
Davini_2.pdf
November 2019 29
November 2019 20
Untitled
April 2020 21