Dec 1 (Mon) @ 1:00pm: "Adaptive Methods for Prediction and Frame Reconstruction in Video Coding," Zeyu Deng, ECE PhD Defense
Location: Henley Hall, Room 1010 (Lecture Room)
Zoom Meeting: https://ucsb.zoom.us/j/6856035342
Abstract
Efficient video compression relies heavily on accurate block prediction and effective frame reconstruction, both of which depend on statistical models that capture the characteristics of the encoded content. Modern block-based video codecs employ a combination of offline-trained models and online adaptation mechanisms. However, offline models often fail to generalize to diverse video content, while online adaptation incurs bitrate overhead when additional statistics must be transmitted to the decoder. These limitations motivate the development of adaptive methods that capture content statistics effectively, balance the associated bitrate overhead, and operate within the block-based architecture of contemporary standards.
This dissertation investigates adaptive methods that enhance inter prediction of blocks and frame reconstruction. First, an adaptive transform-domain temporal prediction (TDTP) framework is proposed, replacing pixel-domain averaging of reference blocks in temporal interpolated prediction (TIP) with coefficient-wise temporal modeling in the transform domain using backward-adapted linear predictors. By updating correlation statistics from reconstructed blocks without signaling overhead, TDTP captures frequency-dependent temporal correlations and improves prediction accuracy. Second, a frame-level adaptive upscaling scheme for reference picture resampling (RPR) is proposed, in which Wiener filters are trained online and applied as frame-adaptive upscaling filters. The bitrate overhead is controlled by comparing the rate–distortion (RD) costs of the predefined filter options against that of the Wiener filters during the encoding process. The compression of the Wiener coefficients is incorporated into the training process, ensuring that the selection of the adaptive filter remains bitrate-efficient. Finally, for temporal prediction of reference frames under complex motion, a superpixel-based object segmentation method is incorporated. Using efficient simple linear iterative clustering (SLIC) followed by entropy-efficient superpixel merging, the approach extracts object-level boundaries. Combined with existing motion vectors, the segmentation information helps identify complex motion patterns such as object occlusion and exposure, thus producing more reliable temporally interpolated reference frames. Across AV2 and VVC test environments, the proposed methods consistently yield RD gains, demonstrating the effectiveness of fine-tuned, adaptive prediction and reconstruction strategies in modern video coding.
Bio
Zeyu Deng is a Ph.D. candidate in Electrical and Computer Engineering at the University of California, Santa Barbara, advised by Professor Kenneth Rose. He joined the Signal Compression Laboratory in 2019, where his research focuses on video compression algorithm design and the application of machine learning techniques to next-generation video coding. His academic work has been complemented by internships at Google in 2022, where he contributed to research related to the AV2 video coding standard, and at ByteDance in 2024, where he worked on topics associated with the VVC standard. He received his B.E. in Microelectronic Science and Engineering from Tsinghua University in 2017.
Hosted By: ECE Professor Kenneth Rose
Submitted By: Zeyu Deng <zdeng@ucsb.edu>