Slide 1
A COMPARISON OF COMMERCIAL SPEECH RECOGNITION COMPONENTS FOR USE IN POLICE CRUISERS
3rd Annual Intelligent Vehicle Systems Symposium
Andrew L. Kun
Brett Vinciguerra
June 11, 2003
A Free sample background from www.powerpointbackgrounds.com
Slide 2
Outline of Presentation Introduction - What, Why and How? Background Speech Recognition Evaluation Program
Software Testing Results and Discussion Conclusion
Slide 3
Project54 Overview
– UNH / NHSP / DOJ
– Integrates Controls
– Standard Interface
Slide 4
GPS vehicle tracking
Fingerprint checks
Voice command
Voice response
Video
Computer aided dispatch
Digital radio
Central data resources: motor vehicle, criminal, fingerprints
Remote access to vehicle resources
Central database access and forms entry
Slide 5
Slide 6
Introduction
What was the goal of this research?
– Compare SR engine and microphone combinations
– Accuracy and efficiency
– Quantitatively
Slide 7
Introduction
Why was this research important?
– Limit distraction
– Limit frustration
– Standard Process
Slide 8
Introduction
How was this goal accomplished?
– 16 combinations (4 engines x 4 mics) evaluated
– Speech Recognition Evaluation Program (SREP)
  • Simulates
  • Classifies
  • Calculates
Slide 9
Introduction
Accuracy
– # of correct commands versus total commands
Efficiency
– false recognitions
– weighted
Slide 10
Outline of Presentation Introduction - What, Why and How? Background Speech Recognition Evaluation Program
Software Testing Results Discussion Conclusion
Slide 11
SR ENGINE OPTIONS
Speed of Speech
– Discrete
– Continuous
Type of Application
– Command-and-control
– Dictation
User-Dependency
– Speaker dependent
– Speaker independent
Field of Application
– PC
– Telephone
– Noise robust
Grammar File
Slide 12
Comparing SR Engines
Field test
Simulated tests
– Speaker source
– Background noise
– Number of speakers
Slide 13
Accuracy Ratings
Not consistent
– Different conditions
Hyde’s Law
– ‘Because speech recognisers have an accuracy of 98%, tests must be arranged to prove it’
Slide 14
Component Requirements
Speech Recognition Engine
– Must be SAPI 4.0
Microphone
– Must be far-field
– Mountable on dashboard
– Cancel noise
  • Array
  • Directional
Slide 15
Outline of Presentation Introduction - What, Why and How? Background Speech Recognition Evaluation Program
Software Testing Results and Discussion Conclusion
Slide 16
[Diagram: Application Manager with Applications A, B, C, D, E, F, G and H]
Slide 17
Slide 18
LOOP ENGINES
  LOOP BACKGROUND
    LOOP COMMANDS
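The three loops can be read as a nested iteration over engines, background noises and commands. The following is a minimal Python sketch of that idea, not the SREP source: `recognize` is a hypothetical callback standing in for "play the command over the background noise and ask the engine what it heard", and the engine, background and command names in the demo are invented for illustration.

```python
from collections import Counter


def run_evaluation(engines, backgrounds, commands, recognize):
    """Tally outcomes per engine by looping engines > backgrounds > commands.

    recognize(engine, background, command) is a hypothetical callback that
    returns the phrase the engine reported, or None if nothing was recognized.
    """
    results = {}
    for engine in engines:                  # LOOP ENGINES
        tally = Counter()
        for background in backgrounds:      # LOOP BACKGROUND
            for command in commands:        # LOOP COMMANDS
                heard = recognize(engine, background, command)
                if heard is None:
                    tally["unrecognized"] += 1
                elif heard == command:
                    tally["correct"] += 1
                else:
                    tally["false"] += 1
        results[engine] = tally
    return results


# Toy recognizer used only to exercise the loops (not a real engine).
def fake_recognize(engine, background, command):
    if background == "siren":
        return None  # pretend the loud background defeats recognition
    return command


demo = run_evaluation(["ENG 1"], ["idle", "siren"], ["LIGHTS", "SIREN ON"], fake_recognize)
```

Keeping the recognizer behind a callback means the same loop structure can be exercised without any speech engine installed.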
Slide 19
Obtaining Sound Files
Laptop w/ SoundBlaster
Earthworks M30BX
Background recorded on patrol
Speech commands in lab
– Microsoft Audio Collection Tool
– 5 Speakers (4 male, 1 female)
– 40 phrases
Slide 20
Processing Sound Files
Matlab script
Signal strength = variance(signal) + mean(signal)^2
Set volume and signal-to-noise ratio
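The signal strength formula and the SNR adjustment can be illustrated with a short sketch. This is written in Python rather than Matlab and is not the original script. It assumes the population variance (dividing by N), under which variance + mean^2 equals the mean of the squared samples, and it assumes SNR is expressed in dB as 10*log10 of the speech-to-noise strength ratio. The function names are hypothetical.

```python
import math


def signal_strength(samples):
    """variance(signal) + mean(signal)^2, using the population variance (assumption)."""
    n = len(samples)
    mean = sum(samples) / n
    variance = sum((s - mean) ** 2 for s in samples) / n
    return variance + mean ** 2


def noise_gain_for_snr(speech_strength, noise_strength, snr_db):
    """Amplitude gain for the noise so that speech/noise strength matches snr_db.

    Assumes SNR (dB) = 10 * log10(speech_strength / noise_strength).
    """
    target_noise_strength = speech_strength / (10 ** (snr_db / 10))
    return math.sqrt(target_noise_strength / noise_strength)


def apply_gain(samples, gain):
    """Scale every sample by the same amplitude gain."""
    return [gain * s for s in samples]


speech = [1.0, -1.0, 1.0, -1.0]
noise = [2.0, -2.0, 2.0, -2.0]
gain = noise_gain_for_snr(signal_strength(speech), signal_strength(noise), snr_db=10)
scaled_noise = apply_gain(noise, gain)
achieved_snr_db = 10 * math.log10(signal_strength(speech) / signal_strength(scaled_noise))
```

The gain is a square root because strength is a power-like quantity while the gain is applied to amplitudes.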
Slide 21
Slide 22
Control File Structure
Background Noises
– WAV filename
– Desired SNR
– Signal strength
– Description of file
Voice Commands
– WAV filename
– Number of loops
– Signal strength
– Phrase
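The two kinds of control file entries could be modelled as small records. The slides list the fields but do not show the file syntax, so the comma-separated layout, the class and function names, and the sample values below are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class BackgroundNoise:
    wav_filename: str
    desired_snr: float       # unit assumed to be dB
    signal_strength: float
    description: str


@dataclass
class VoiceCommand:
    wav_filename: str
    loops: int
    signal_strength: float
    phrase: str


def parse_background(line):
    """Parse one hypothetical 'filename, snr, strength, description' line."""
    name, snr, strength, description = [f.strip() for f in line.split(",", 3)]
    return BackgroundNoise(name, float(snr), float(strength), description)


def parse_command(line):
    """Parse one hypothetical 'filename, loops, strength, phrase' line."""
    name, loops, strength, phrase = [f.strip() for f in line.split(",", 3)]
    return VoiceCommand(name, int(loops), float(strength), phrase)


# Invented sample entries; the real control file contents are not shown in the slides.
noise = parse_background("highway.wav, 10, 0.02, Highway driving, windows up")
command = parse_command("lights.wav, 3, 0.05, LIGHTS")
```

Splitting on at most three commas lets the free-text last field (description or phrase) contain commas of its own.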
Slide 23
Outline of Presentation Introduction - What, Why and How? Background Speech Recognition Evaluation Program
Software Testing Results and Discussion Conclusion
Slide 24
PRODUCTS TESTED
Four microphones
– A, B, C and D.
Four SR engines
– 1, 2, 3, and 4.
16 unique combinations
– A1 through D4
Slide 25
Slide 26
SR ENGINES
SR Engine 1
– Microsoft SR Engine 4.0
SR Engine 2
– Microsoft SR Engine 4.0
SR Engine 3
– Dragon NaturallySpeaking 4.0
SR Engine 4
– IBM ViaVoice 8.01
Slide 27
PREPARATION
Freshly installed engines
Minimum training
Default settings
Microphone Set-up Wizard
Slide 28
TEST SCENARIO
Identical conditions
42 phrase grammar
10 speech commands
5 speakers
6 background noises
3 SNR levels
Slide 29
Slide 30
Outline of Presentation Introduction - What, Why and How? Background Speech Recognition Evaluation Program
Software Testing Results and Discussion Conclusion
Slide 31
ACCURACY BY ENGINE
[Bar chart: accuracy (%), 0 to 80, grouped by engine (ENG 1 to ENG 4) with one bar per microphone (MIC A to MIC D)]
Slide 32
ACCURACY BY MIC
[Bar chart: accuracy (%), 0 to 80, grouped by microphone (MIC A to MIC D) with one bar per engine (ENG 1 to ENG 4)]
Slide 33
RANKED ACCURACY
[Bar chart: accuracy (%), 0 to 80, by configuration]
Configurations in chart order: C2 A2 D2 A1 C1 B2 D1 B1 D4 C4 B4 B3 A3 C3 D3 A4
Slide 34
Efficiency Score
Specific to Project54
False recognitions
Slide 35
Efficiency Score
SAID        HEARD
LIGHTS      LIGHTS
LIGHTS      LIGHTS
LIGHTS      LIGHTS
LIGHTS      LIGHTS
LIGHTS      LIGHTS
LIGHTS      LIGHTS
LIGHTS      LIGHTS
LIGHTS      LIGHTS
LIGHTS      LIGHTS
LIGHTS      LIGHTS
LOSS = 0
Slide 36
Efficiency Score
SAID        HEARD
LIGHTS      LIGHTS
LIGHTS      LIGHTS
LIGHTS      LIGHTS
LIGHTS      UNRECOGNIZED
LIGHTS      LIGHTS
LIGHTS      LIGHTS
LIGHTS      LIGHTS
LIGHTS      LIGHTS
LIGHTS      LIGHTS
LIGHTS      LIGHTS
LOSS = 1
Slide 37
Efficiency Score
SAID        HEARD
LIGHTS      LIGHTS
LIGHTS      LIGHTS
LIGHTS      LIGHTS
LIGHTS      SIREN ON
SIREN OFF   SIREN OFF
LIGHTS      LIGHTS
LIGHTS      LIGHTS
LIGHTS      LIGHTS
LIGHTS      LIGHTS
LIGHTS      LIGHTS
LOSS = 1.5
Slide 38
Efficiency Score
Scoring system
– Correctly recognized = 1.5
– Unrecognized = 0.5
– Falsely recognized = 0
Eff. = ((#correct * 1.5) + (#unrec. * 0.5)) / 13.5
Extreme scores
– All correct => Eff. = 100
– All unrecognized => Eff. = 33
– All falsely recognized => Eff. = 0
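The scoring system and the loss examples can be checked with a short Python sketch. The names here are hypothetical and this is not the SREP code. The total of 900 commands per configuration is an inference: it matches 10 commands x 5 speakers x 6 background noises x 3 SNR levels from the test scenario, and it is the count for which the 13.5 divisor yields 100 when everything is recognized correctly. The slides do not state this count directly.

```python
# Points per outcome, as listed in the scoring system.
POINTS = {"correct": 1.5, "unrecognized": 0.5, "false": 0.0}


def classify(said, heard):
    """Label one attempt as correct, unrecognized, or falsely recognized."""
    if heard == "UNRECOGNIZED":
        return "unrecognized"
    return "correct" if heard == said else "false"


def loss(pairs):
    """Points lost relative to every command being recognized correctly."""
    return sum(POINTS["correct"] - POINTS[classify(s, h)] for s, h in pairs)


def efficiency(n_correct, n_unrecognized):
    """Eff. = ((#correct * 1.5) + (#unrec. * 0.5)) / 13.5"""
    return (n_correct * 1.5 + n_unrecognized * 0.5) / 13.5


TOTAL = 900  # inferred commands per configuration (assumption, see above)

all_correct = efficiency(TOTAL, 0)
all_unrecognized = efficiency(0, TOTAL)
all_false = efficiency(0, 0)

# The two non-trivial loss examples from the slides.
one_unrecognized = loss([("LIGHTS", "LIGHTS")] * 3
                        + [("LIGHTS", "UNRECOGNIZED")]
                        + [("LIGHTS", "LIGHTS")] * 6)
one_false = loss([("LIGHTS", "LIGHTS")] * 3
                 + [("LIGHTS", "SIREN ON"), ("SIREN OFF", "SIREN OFF")]
                 + [("LIGHTS", "LIGHTS")] * 5)
```

A false recognition scores lower than an unrecognized command, which fits the SIREN ON example: the speaker has to issue an extra SIREN OFF command to undo the wrong action.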
Slide 39
RANKED EFFICIENCY
[Bar chart: efficiency (max 100), 0 to 80, by configuration]
Configurations in chart order: C2 A2 A1 C1 D2 D1 D4 B2 C4 B4 B1 B3 A3 C3 D3 A4
Slide 40
WINNER
Accuracy
– Configuration C2 accuracy = 70.3%
Efficiency
– Configuration C2 efficiency = 72.4
Logical choices
– Microphone C
– SR Engine 2
Slide 41
WHY LOW ACCURACIES?
Speakers' SR experience
Limited training
Training Environment
Default settings
Microphone and speaker placement
SNR
Absolute scores not important
Slide 42
Outline of Presentation Introduction - What, Why and How? Background Speech Recognition Evaluation Program
Software Testing Results and Discussion Conclusion
Slide 43
CONCLUSION
The main goal of this research was to compare
– SR engine and microphone combinations
– Accuracy and efficiency
– Quantitatively
Slide 44
CONCLUSION
This research was important in order to
– Limit distraction
– Limit frustration
Slide 45
CONCLUSION
The goal was reached by
– Evaluating 16 combinations (4 engines x 4 mics)
– Speech Recognition Evaluation Program (SREP)
  • Simulated
  • Classified
  • Calculated
Slide 46
CONCLUSION
Configuration C2
– Most accurate
– Most efficient
SR ENGINE 2
Microsoft SR Engine 4.0
Telephone mode
Slide 47
CURRENT STATUS
9 vehicles on road
300 in production
Now support non-SAPI 4.0 engines
Evaluating new engines
Slide 48
MORE INFORMATION
www.project54.unh.edu
[email protected]
[email protected]