Mar 17 (Fri) @ 11:30am: “Recursively Adaptive Randomized Multi-Tree Coding (RAR MTC) of Speech with VAD/CNG,” Hoontaek Oh, ECE PhD Defense

Date and Time
Mar 17 (Fri) @ 11:30am
Location
Harold Frank Hall (HFH), Rm 4164 (ECE Conf. Rm.)

Abstract

A new tree codec for narrowband speech, “Recursively Adaptive Randomized Multi-Tree Coding (RAR MTC) with VAD/CNG,” is developed based on a sample-by-sample analysis-and-synthesis linear predictive model by benchmarking and upgrading the tree coding models proposed by J. D. Gibson, W. Chang, and H. C. Woo in the 1990s. A simple Voice Activity Detection/Comfort Noise Generation (VAD/CNG) algorithm is newly applied to the prior speech tree coder to lower the average bitrate by increasing encoding efficiency. A backward adaptive all-pole short-term predictor, which was cascaded with a pitch-based long-term predictor, is replaced with a backward adaptive pole-zero predictor for better input waveform tracking and more accurate prediction. The RAR MTC encodes the initial samples of each voiced region by spanning a 5-level Pitch Compensating Quantizer (PCQ) tree; our randomly interleaved 4-level and 2-level multi-tree (4-2 MTC) then encodes the rest of the voiced samples, using a set of prediction parameters initialized by the 5-level tree coding. A newly developed gain control algorithm for the 2-level tree, based on the polarity pattern of the past 5 excitation values, improves its gain tracking performance.

In our simulations, the RAR MTC codec achieves the best performance to date among backward adaptive predictive coders at similar rates. Compared to the widely used AMR-NB standard, which is built on a CELP structure with a block-based predictive model, it shows very competitive performance with lower delay and more natural tone recovery.

Bio

Hoontaek Oh is a Ph.D. candidate in the Department of Electrical & Computer Engineering at the University of California, Santa Barbara (UCSB). He received a B.S. degree in Electronic and Electrical Engineering from Hong-Ik University, Seoul, South Korea, and an M.S. degree in Electrical and Computer Engineering from UCSB.

Hosted by: Professor Jerry Gibson

Submitted by: hoontaek <hoontaek@ucsb.edu>