"Speech Separation in Noisy Environments Using Deep Neural Networks"

Donald S. Williamson, Faculty Candidate, Ohio State University

February 22nd (Monday), 2:00pm
Harold Frank Hall (HFH), Rm 4164 (ECE Conf. Rm.)

Speech is an essential form of human communication, and speech processing has a variety of real-world applications. Hearing aids help individuals with hearing impairment understand speech better, and voice commands are used to interface with consumer electronic devices. In realistic environments, background noise is always present. The performance of speech processing algorithms degrades substantially in noisy environments, as noise may overlap with and mask the speech signal across time and frequency. Many computational techniques have been proposed to address speech separation in noisy environments, but producing speech estimates that are both intelligible and high quality remains elusive, especially at low signal-to-noise ratios.

Traditional speech separation systems operate on the magnitude response of the short-time Fourier transform and leave the phase response unchanged. Recent studies, however, show that the phase response is important for quality. In this talk, I will present a novel approach that jointly enhances the magnitude and phase of noisy speech by performing time-frequency masking in the complex domain. The joint estimation of real and imaginary components has led to compelling improvements in speech quality. In addition, I will describe how two common methods for performing separation, time-frequency masking and model-based separation, can be combined to further improve separation performance. Lastly, I will show the benefit of estimating the activations of speech models using deep neural networks.

About Donald S. Williamson:

Donald S. Williamson is a Ph.D. candidate in the Department of Computer Science and Engineering (CSE) at Ohio State University. Previously, he was a Member of the Engineering Staff at Lockheed Martin, where he was a software and systems engineer. He received his M.S. degree in electrical engineering from Drexel University and his B.E.E. degree in electrical engineering from the University of Delaware. His main research interests include speech processing, machine learning, and music information retrieval. His doctoral research was recognized with the 2016 CSE Graduate Research Award.

Hosted by: Professor B.S. Manjunath