Digital Signal Processing: An Introduction and Some Examples of its Everyday Use
Dr D. H. Crawford
EPSON Scotland Design Centre
Contents
• What is DSP?
• What is DSP used for?
– Speech & Audio processing
– Image & Video processing
– Adaptive filtering
• DSP Devices and Architectures
• DSP at EPSON Scotland Design Centre
• Summary & Conclusions
Slide 2
What is DSP?
• Digital Signal Processing – the processing or manipulation of signals using digital techniques
Input Signal → ADC (Analogue to Digital Converter) → Digital Signal Processor → DAC (Digital to Analogue Converter) → Output Signal
Slide 3
What is DSP Used For?
…And much more! Slide 4
Speech Processing
• Speech coding/compression • Speech synthesis • Speech recognition Slide 5
Some Properties of Speech
Figure: waveform of the utterance “The blue spot is on the key again”, with labelled segments “oo” in “blue”, “s” in “spot”, “o” in “spot”, “k” in “key”, “ee” in “key”, and “e” in “again”.
Slide 6
Some Properties of Speech
• Vowels: “oo” in “blue”, “o” in “spot”, “ee” in “key”, “e” in “again”
– Quasi-periodic
– Relatively high signal power
• Consonants: “s” in “spot”, “k” in “key”
– Non-periodic (random)
– Relatively low signal power
Slide 7
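The two properties above (signal power and periodicity) can be measured frame by frame. The following is a minimal sketch in C, not taken from the slides: `frame_power` computes the average power of a frame, and `zero_crossings` counts sign changes, which is used here only as an assumed, crude indicator of noise-like (non-periodic) content. Both function names are hypothetical.

```c
#include <stddef.h>

/* Average power of one frame: (1/N) * sum of x[i]^2. */
double frame_power(const double *x, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += x[i] * x[i];
    return n > 0 ? sum / (double)n : 0.0;
}

/* Number of sign changes between neighbouring samples in the frame.
 * Assumption: a high count suggests a noise-like (consonant-like) frame. */
size_t zero_crossings(const double *x, size_t n)
{
    size_t count = 0;
    for (size_t i = 1; i < n; i++)
        if ((x[i - 1] >= 0.0) != (x[i] >= 0.0))
            count++;
    return count;
}
```

A vowel-like frame would be expected to give a comparatively high power and few zero crossings, and a consonant such as “s” the opposite. Any decision thresholds would have to be tuned on real speech.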
Speech Coding
Figure: speech path through a mobile network (MSC, TRAU, BSC, BTS), labelled with bit rates of 64 kbits/s, 13 kbits/s, and 22.8 kbits/s.
Slide 8
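Speech Coding – Linear Prediction
• Try to predict the current sample value;
• Transmit the prediction error.
Encoder: the predictor A(z) forms an estimate se(n) of s(n), and the prediction error d(n) = s(n) – se(n) is transmitted.
Decoder: the same predictor A(z), fed from the reconstructed signal, forms se(n), giving sr(n) = d(n) + se(n).
Slide 9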
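The encoder and decoder relations d(n) = s(n) – se(n) and sr(n) = d(n) + se(n) can be sketched in C as follows. This is an illustration under stated assumptions, not the coder from the slides: the predictor order, the fixed coefficients in `lp_coeff`, and all function names are hypothetical, and the prediction error is not quantised, so the decoder reproduces the input exactly.

```c
#include <stddef.h>

#define LP_ORDER 2

/* Hypothetical fixed predictor coefficients for A(z); a real coder
 * would estimate them from the speech signal. */
static const double lp_coeff[LP_ORDER] = {1.0, -0.5};

/* Predict the current sample from past samples; hist[0] is the newest. */
static double lp_predict(const double hist[LP_ORDER])
{
    double se = 0.0;
    for (size_t i = 0; i < LP_ORDER; i++)
        se += lp_coeff[i] * hist[i];
    return se;
}

/* Shift the history and store the newest sample at hist[0]. */
static void lp_push(double hist[LP_ORDER], double sample)
{
    for (size_t i = LP_ORDER - 1; i > 0; i--)
        hist[i] = hist[i - 1];
    hist[0] = sample;
}

/* Encoder: d(n) = s(n) - se(n). */
void lp_encode(const double *s, double *d, size_t n)
{
    double hist[LP_ORDER] = {0.0};
    for (size_t k = 0; k < n; k++) {
        d[k] = s[k] - lp_predict(hist);
        lp_push(hist, s[k]);
    }
}

/* Decoder: sr(n) = d(n) + se(n), predicting from its own past output. */
void lp_decode(const double *d, double *sr, size_t n)
{
    double hist[LP_ORDER] = {0.0};
    for (size_t k = 0; k < n; k++) {
        sr[k] = d[k] + lp_predict(hist);
        lp_push(hist, sr[k]);
    }
}
```

For a smoothly varying input the values of d(n) are smaller than those of s(n), which is what makes the error cheaper to transmit than the signal itself.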
Speech Coding – Vocoder
Encoder: Original Speech → Analysis:
• Voiced/Unvoiced decision
• Pitch Period (voiced only)
• Signal power (Gain)
Decoder: a Pulse Train (set by the Pitch Period) or Random Noise, selected by the V/U decision, is scaled by the Signal Power gain G and passed through the Vocal Tract Model to give Synthesized Speech.
Example: LPC-10
Slide 10
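The decoder side of the vocoder can be sketched in C as below. This is a simplified illustration, not LPC-10 itself: the `vocoder_frame` structure, the second-order all-pole vocal tract model, its coefficients, and the small noise generator are all assumptions made for the example, and filter state is not carried from one frame to the next.

```c
#include <stddef.h>
#include <stdint.h>

#define VT_ORDER 2

typedef struct {
    int voiced;            /* V/U decision: 1 = voiced, 0 = unvoiced */
    size_t pitch_period;   /* in samples, used only when voiced */
    double gain;           /* G */
    double a[VT_ORDER];    /* hypothetical all-pole vocal tract coefficients */
} vocoder_frame;

/* Small linear congruential generator: uniform noise in [-1, 1). */
static double noise_sample(uint32_t *state)
{
    *state = *state * 1664525u + 1013904223u;
    return (double)(*state >> 8) / 8388608.0 - 1.0;
}

/* Excitation: a unit pulse every pitch period (voiced) or noise (unvoiced). */
void vocoder_excitation(const vocoder_frame *f, double *e, size_t n,
                        uint32_t *seed)
{
    for (size_t k = 0; k < n; k++) {
        if (f->voiced)
            e[k] = (f->pitch_period > 0 && k % f->pitch_period == 0) ? 1.0 : 0.0;
        else
            e[k] = noise_sample(seed);
    }
}

/* Vocal tract model: out(k) = G*e(k) + sum_i a[i]*out(k-1-i).
 * Output samples before the start of the frame are taken as zero. */
void vocoder_synthesize(const vocoder_frame *f, const double *e,
                        double *out, size_t n)
{
    for (size_t k = 0; k < n; k++) {
        double acc = f->gain * e[k];
        for (size_t i = 0; i < VT_ORDER; i++)
            if (k > i)
                acc += f->a[i] * out[k - 1 - i];
        out[k] = acc;
    }
}
```

Only the V/U flag, pitch period, gain, and model coefficients need to be transmitted per frame, which is why a vocoder can reach much lower bit rates than waveform coding.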
Text-to-Speech Synthesis
Input text → Text normalization → Parsing → Pronunciation → Prosody rules → Waveform generation → Synthesized speech
• Text normalization: expands abbreviations, dates, times, money, etc.
• Parsing: semantic & syntactic ‘parts of speech’ analysis of text
• Pronunciation: phonetic description of each word; dictionary with letter-to-sound rules as a back-up
• Prosody rules: apply word stress, duration and pitch
• Waveform generation: phonetic-to-acoustic transformation
Example: input text “To be or not to be that is the question” → phonetic form “Tu bee awr nawt tu bee dhat iz dhe kwestchun”
Text-to-speech synthesis sounds very natural these days.
Slide 11
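The pronunciation stage (a dictionary, with letter-to-sound rules as a back-up) can be sketched in C as a table lookup with a fallback. The entries below are taken from the slide's example sentence. The function names are hypothetical, and `letter_to_sound` is only a placeholder that returns the spelling unchanged, since the slides do not give the actual rules.

```c
#include <stddef.h>
#include <string.h>

typedef struct {
    const char *word;      /* lower-case spelling */
    const char *phonetic;  /* phonetic form */
} lexicon_entry;

/* Dictionary built from the example sentence on the slide. */
static const lexicon_entry lexicon[] = {
    {"to", "tu"},     {"be", "bee"}, {"or", "awr"},  {"not", "nawt"},
    {"that", "dhat"}, {"is", "iz"},  {"the", "dhe"}, {"question", "kwestchun"},
};

/* Placeholder for letter-to-sound rules: returns the spelling unchanged. */
static const char *letter_to_sound(const char *word)
{
    return word;
}

/* Look the word up in the dictionary; fall back to letter-to-sound rules. */
const char *pronounce(const char *word)
{
    for (size_t i = 0; i < sizeof lexicon / sizeof lexicon[0]; i++)
        if (strcmp(lexicon[i].word, word) == 0)
            return lexicon[i].phonetic;
    return letter_to_sound(word);
}
```

With this table, `pronounce("question")` returns `"kwestchun"`, and a word that is not in the dictionary falls through to the placeholder rule.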
Speech Synthesis Applications
• Speaking clocks
• Spoken (variable) announcements
• Talking emails + talking heads for mobile
• Synthesis of location-based information (e.g. traffic information)
• Interactive systems (e.g. catalogue ordering, Yellow Pages, ...)
Slide 12
Speech/Speaker Recognition
• Speech Recognition – What has been spoken?
– Speaker dependent – Recognition system trained for a particular person’s voice.
– Speaker independent – Recognition system expected to deal with a wide variety of speakers.
• Speaker Recognition – Who has spoken?
• Not easy…
Sometimestherearenogapsbetweenwords.
Sometim esthereareg aps inthe mid dleofwords.
Accents, dialects and Stress eggsist.
Slide 13
Speech Recognition System
speech → Feature extraction → Phoneme recognition → Word recognition → Sentence recognition → decision
Knowledge sources: Phoneme models; Word pronunciation; Syntactic knowledge; Semantic knowledge; Dialogue knowledge
Slide 14
Digital Audio
• Standard music CD:
– Sampling rate: 44.1 kHz
– 16-bit samples
– 2-channel stereo
– Data transfer rate = 2 × 16 × 44,100 ≈ 1.41 Mbits/s
– 1 hour of music = 1.41 Mbits/s × 3,600 s ÷ 8 bits/byte ≈ 635 MB
Slide 15
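The CD arithmetic can be checked with a few lines of C. This is a small sketch with hypothetical function names. It counts raw PCM audio data only and ignores any error-correction or formatting overhead on the disc.

```c
#include <stdint.h>

/* Raw PCM bit rate in bits per second. */
uint64_t pcm_bit_rate(uint32_t channels, uint32_t bits_per_sample,
                      uint32_t sample_rate_hz)
{
    return (uint64_t)channels * bits_per_sample * sample_rate_hz;
}

/* Storage in bytes for `seconds` of audio at `bit_rate` bits per second. */
uint64_t pcm_storage_bytes(uint64_t bit_rate, uint64_t seconds)
{
    return bit_rate * seconds / 8;
}
```

For the CD parameters, `pcm_bit_rate(2, 16, 44100)` gives 1,411,200 bits/s, and one hour at that rate is 635,040,000 bytes, about 635 MB.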
Audio Coding (Cont’d)
• Key standards:
– MPEG: Layers I, II, and III (MP3); AAC (used in DAB, DVD).
– Dolby AC3, Dolby Digital, Dolby Surround.
• Typical bit rates for 2-channel stereo:
– 64 kbits/s to 384 kbits/s.
• Subband- or transform-based, making use of perceptual masking properties.
Slide 16
Audio Coding (Cont’d)
• Typical 3/2 multichannel stereo configuration:
Figure: loudspeaker layout showing Left, Centre, Right, Surround Left, and Surround Right.
• 5.1 channels (3/2) with LFE channel:
– Left, Right, Centre,
– Left Surround, Right Surround,
– Low Frequency Effects (LFE) (Reduced Bandwidth).
• LFE loudspeaker can, in general, be placed anywhere in the listening room.
Slide 17
Audio Coding – Masking
• Auditory Masking:
– Spectral: Strong frequency components mask weaker neighbouring frequency components.
– Temporal: Strong temporal events mask recent and future events.
Figure: Spectral Masking, SPL/dB against freq/kHz (axis marked at 1 kHz).
Figure: Temporal Masking, SPL/dB against time (axis marked at 10 ms and 160 ms).
Slide 18
Masking Example
Figure: level in dB (10 to 60) against frequency (200 to 800 Hz).
Slide 19
Image/Video
• Still Image Coding:
– JPEG (Joint Photographic Experts Group): Discrete Cosine Transform (DCT) based
– JPEG2000: Wavelet Transform based
• Video Coding:
– MPEG (Moving Picture Experts Group): DCT-based; interframe and intraframe prediction; motion estimation.
– Applications: Digital TV, DVD, etc.
Slide 20
JPEG Example
Figure: original image compared with JPEG-compressed versions at 4:1 and 100:1.
Slide 21
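Adaptive Filtering
• Self-learning: Filter coefficients adapt in response to training signal.
The filter W(z) maps the input x(n) to the output y(n); the error e(n) = d(n) – y(n) is the difference between the training signal d(n) and the filter output.
• Filter update: Least Mean Squares (LMS) algorithm
w(n+1) = w(n) + 2µe(n)x(n)
Slide 22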
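The LMS update w(n+1) = w(n) + 2µe(n)x(n) can be sketched in C as follows. `lms_step` performs one iteration of the update. `lms_identify_demo` is a hypothetical test harness, not part of the slides: it drives a two-tap filter with pseudo-random input and uses an assumed "unknown" system `h` to generate the training signal d(n), so that the adapted weights can be compared with `h`.

```c
#include <stddef.h>
#include <stdint.h>

#define LMS_TAPS 2

/* One LMS iteration. x[0] is the newest input x(n), x[1] is x(n-1), ...
 * Updates w in place and returns the error e(n) = d(n) - y(n). */
double lms_step(double w[LMS_TAPS], const double x[LMS_TAPS],
                double d, double mu)
{
    double y = 0.0;
    for (size_t i = 0; i < LMS_TAPS; i++)
        y += w[i] * x[i];                 /* filter output y(n) */

    double e = d - y;
    for (size_t i = 0; i < LMS_TAPS; i++)
        w[i] += 2.0 * mu * e * x[i];      /* w(n+1) = w(n) + 2*mu*e(n)*x(n) */
    return e;
}

/* Demo harness (assumption): identify a hypothetical system h from
 * pseudo-random input in [-1, 1), with no measurement noise. */
void lms_identify_demo(double w[LMS_TAPS], size_t iterations, double mu)
{
    static const double h[LMS_TAPS] = {0.5, -0.3};
    double x[LMS_TAPS] = {0.0};
    uint32_t state = 1u;

    for (size_t n = 0; n < iterations; n++) {
        /* Shift the delay line and insert a new input sample. */
        for (size_t i = LMS_TAPS - 1; i > 0; i--)
            x[i] = x[i - 1];
        state = state * 1664525u + 1013904223u;
        x[0] = (double)(state >> 8) / 8388608.0 - 1.0;

        /* Training signal d(n): output of the system being identified. */
        double d = 0.0;
        for (size_t i = 0; i < LMS_TAPS; i++)
            d += h[i] * x[i];

        lms_step(w, x, d, mu);
    }
}
```

Starting from zero weights and a small step size such as µ = 0.05, the weights in this noise-free setup approach the coefficients of `h`. In an echo canceller the same loop would run with the far-end signal as x(n) and the microphone or line signal as d(n).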
Adaptive Filtering Applications
• Echo cancellation (telephone lines)
– Used in modems (making Internet access possible!!)
• Acoustic echo cancellation
– Hands-free telephony
• Adaptive equalization
• Active noise control
• Medical signal processing
– e.g. foetal heart beat monitoring
Slide 23
Some Other Application Areas
• Image analysis, e.g.:
– Face recognition,
– Optical Character Recognition (OCR);
• Restoration of old image, video, and audio signals;
• Analysis of RADAR data;
• Analysis of SONAR data;
• Data transmission (modems, radio, echo cancellation, channel equalization, etc.);
• Storage and archiving;
• Control of electric motors.
Slide 24
DSP Devices & Architectures
• Selecting a DSP – several choices:
– Fixed-point;
– Floating-point;
– Application-specific devices (e.g. FFT processors, speech recognizers, etc.).
• Main DSP Manufacturers:
– Texas Instruments (http://www.ti.com)
– Motorola (http://www.motorola.com)
– Analog Devices (http://www.analog.com)
Slide 25
Typical DSP Operations
• Filtering
• Energy of Signal
• Frequency transforms
$y(n) = \sum_{i=0}^{L-1} a_i \, x(n-i)$
Pseudo C code: for (n=0; n
Slide 26
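The filtering sum y(n) = Σ a_i x(n−i), i = 0 … L−1, maps directly onto a multiply/accumulate loop. The following C sketch is one way to write it. The function name and argument order are assumptions, and input samples before the start of the buffer are treated as zero.

```c
#include <stddef.h>

/* FIR filter: y[n] = sum over i = 0..num_taps-1 of a[i] * x[n - i],
 * for n = 0..num_samples-1. Samples before x[0] are taken as zero. */
void fir_filter(const double *a, size_t num_taps,
                const double *x, double *y, size_t num_samples)
{
    for (size_t n = 0; n < num_samples; n++) {
        double acc = 0.0;                        /* accumulator */
        for (size_t i = 0; i < num_taps && i <= n; i++)
            acc += a[i] * x[n - i];              /* multiply/accumulate */
        y[n] = acc;
    }
}
```

With coefficients {0.5, 0.5} this acts as a two-point moving average: the input {1, 2, 3, 4} produces {0.5, 1.5, 2.5, 3.5}.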
Traditional DSP Architecture
X RAM (coefficients ai) and Y RAM (samples x(n-i)) → Multiply/Accumulate → Accumulator → y(n)
N.B. Most modern DSPs have more advanced features.
Slide 27
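On a fixed-point device the multiply/accumulate path typically works on integer data. The sketch below shows the idea in C using the Q15 format (a 16-bit integer v represents v/32768) with a wide accumulator and saturation on output. The format choice and the function name `mac_q15` are assumptions for illustration and do not describe any particular DSP.

```c
#include <stddef.h>
#include <stdint.h>

/* Multiply/accumulate over two Q15 arrays (e.g. a[i] and x(n-i)).
 * Products are Q30 and are summed in a wide accumulator, then scaled
 * back to Q15 (truncating toward zero) and saturated to 16 bits. */
int16_t mac_q15(const int16_t *a, const int16_t *x, size_t len)
{
    int64_t acc = 0;
    for (size_t i = 0; i < len; i++)
        acc += (int32_t)a[i] * (int32_t)x[i];

    acc /= 32768;                       /* Q30 -> Q15 */
    if (acc > INT16_MAX)
        acc = INT16_MAX;                /* saturate instead of wrapping */
    if (acc < INT16_MIN)
        acc = INT16_MIN;
    return (int16_t)acc;
}
```

For example, coefficients {0.5, 0.5} and samples {0.5, −0.25} are {16384, 16384} and {16384, −8192} in Q15, and the result is 4096, which represents 0.125.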
DSP at EPSON
“Energy-saving Firmware” EPSON Scotland Design Centre develops a broad range of technologies to minimize power consumption and maximize cost effectiveness in mobile DSP applications. Slide 28
SDC Core Skills
• DSP: System modelling; Firmware design; System Integration; CPU (Oak, ARM); H/w & S/w Co-design; System on Chip (SoC)
• Speech: Speech compression; Speech Recognition; Speech synthesis; Speech enhancement; Speech Testing
• Audio: MP3; Other digital audio; Performance Assessment
• Mobile: Baseband processing; Channel coding; AMR Coding
• Services: Administration; CAD Tools; Computer & Networking
Slide 29
SDC Firmware Development
Figure: development flow with the stages Algorithm Definition, Floating-point and Fixed-point, Co-Simulation, Co-Design, Implementation, Co-Verification, and Product Development. Annotations on the figure: COSSAP, Matlab, ...; Behavioural, RTL, Logic, ...; MCU, DSP, ...
With Barcelona and Tokyo Design Centres
Slide 30
Summary & Conclusions
• DSP used in a wide range of everyday applications
• Looked at:
– Speech coding; Speech synthesis & recognition;
– Image/Video;
– Adaptive filtering.
• Other areas include:
– Image analysis (e.g. face recognition, OCR, etc.);
– RADAR/SONAR;
– Data transmission and reception;
– And many more…
Slide 31