Dr. Richard Stern
Professor of Electrical Engineering
Most current speech recognition systems do not yet perform well in difficult acoustical environments, or in environments different from the ones in which they were trained. This research is concerned with improving the robustness of SPHINX, Carnegie Mellon's large-vocabulary continuous-speech recognition system, with respect to acoustical distortion resulting from sources such as background noise, competing talkers, change of microphone, and room reverberation. Several different strategies are being used to address these problems. These include improved noise cancellation and speech normalization methods, the use of representations of the speech waveform based on the processing of sounds by the human auditory system, and the use of array-processing techniques to improve the signal-to-noise ratio of the speech that is input to the system.
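As a rough illustration of the array-processing idea mentioned above, the sketch below implements delay-and-sum beamforming, a basic technique for combining microphone channels (the specific methods used in the SPHINX work are not detailed here; all signals, delays, and parameters below are illustrative assumptions). Aligning each channel by its arrival-time offset and averaging reinforces the speech while partially canceling uncorrelated noise.

```python
import numpy as np

def delay_and_sum(channels, delays_samples):
    """Align each channel by its integer sample delay and average.

    Averaging M channels with independent noise reduces the noise
    power by a factor of M, improving the signal-to-noise ratio.
    """
    n = min(len(ch) - d for ch, d in zip(channels, delays_samples))
    aligned = [ch[d:d + n] for ch, d in zip(channels, delays_samples)]
    return np.mean(aligned, axis=0)

rng = np.random.default_rng(0)
t = np.arange(8000) / 8000.0
speech = np.sin(2 * np.pi * 200 * t)   # stand-in for a speech signal
delay = 5                              # assumed arrival-time offset (samples)

# Each microphone sees the (possibly delayed) speech plus independent noise.
mic1 = speech + 0.5 * rng.standard_normal(len(speech))
mic2 = (np.concatenate([np.zeros(delay), speech])[:len(speech)]
        + 0.5 * rng.standard_normal(len(speech)))

out = delay_and_sum([mic1, mic2], [0, delay])
```

With two microphones, the averaged output has roughly half the noise power of either single channel, a 3 dB improvement; real arrays with more elements and adaptive weighting do considerably better.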
Signal Processing in the Auditory System
This research includes both psychoacoustical measurements to determine how we hear complex sounds, and the development of mathematical models that use optimal communication theory to relate the results of these experiments to the neural coding of sounds by the auditory system. Much of this work has been concerned with the localization of sound and other aspects of binaural perception.
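One classical cue in sound localization is the interaural time difference (ITD), the arrival-time offset between the two ears. As a minimal sketch of how such a cue can be extracted (not a description of the models developed in this research; the signals and delay below are assumptions for illustration), cross-correlating the two ear signals and locating the peak recovers the delay:

```python
import numpy as np

def estimate_itd(left, right):
    """Return the delay (in samples) of `right` relative to `left`.

    The peak of the cross-correlation marks the lag at which the
    two signals align best; a positive value means `right` lags.
    """
    corr = np.correlate(left, right, mode="full")
    return (len(right) - 1) - np.argmax(corr)

rng = np.random.default_rng(1)
source = rng.standard_normal(2000)  # broadband source signal
lag = 12                            # assumed interaural delay in samples

left = source
right = np.concatenate([np.zeros(lag), source[:-lag]])  # delayed copy

print(estimate_itd(left, right))
```

For a broadband source the correlation peak is sharp and the estimator returns the true 12-sample delay; narrowband sounds yield periodic ambiguities, which is one reason models of binaural perception must account for frequency-dependent processing.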