Primary supervisor
KokSheik Wong
Multimedia content such as audio, image, and video is stored and transported in compressed form. Various standards are designed to compress the content as much as possible while minimizing distortion. Commonly used compression standards include MP3 for audio, JPEG for still images, and H.264/AVC for video. Despite the vast differences in signal characteristics, most compression standards have two things in common: transformed-quantized coefficients and a scale factor (the quantization table in JPEG and AVC). The coefficients are usually coded as a product of sign_bit and magnitude. However, the sign_bit information takes up about 10% of the total file size, which is a significant portion.
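To make the sign_bit and magnitude split concrete, here is a minimal sketch in Python. It is an illustration only and does not follow the bitstream syntax of any particular standard: the function names and the sample coefficient values are hypothetical, and the convention that 0 means positive and 1 means negative is an assumption.

```python
def split_sign_magnitude(coeffs):
    """Split quantized coefficients into magnitudes and sign bits.

    Assumed convention: 0 = positive, 1 = negative. Zero coefficients
    carry no sign, so they contribute no sign bit.
    """
    magnitudes = [abs(c) for c in coeffs]
    sign_bits = [1 if c < 0 else 0 for c in coeffs if c != 0]
    return magnitudes, sign_bits


def merge_sign_magnitude(magnitudes, sign_bits):
    """Rebuild the signed coefficients from magnitudes and sign bits."""
    signs = iter(sign_bits)
    coeffs = []
    for m in magnitudes:
        if m == 0:
            coeffs.append(0)
        else:
            coeffs.append(-m if next(signs) else m)
    return coeffs


# Hypothetical quantized coefficients from one transform block.
coeffs = [12, -5, 0, 3, -1, 0, 0, 1]
magnitudes, sign_bits = split_sign_magnitude(coeffs)
restored = merge_sign_magnitude(magnitudes, sign_bits)
```

In this sketch every nonzero coefficient costs exactly one sign bit, which is why the sign information adds up to a noticeable share of the file when it is stored as-is.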
This project aims to precisely predict the sign_bit information in compressed content: why encode the information when we can predict it correctly? The honours student can work on MP3 for audio, JPEG for still images, or H.264/AVC for video (motion vectors or coefficients). Alternatively, the student is welcome to explore the latest coding standards such as JPEG XL for images, HE-AAC v2 for audio, and VVC for video.
The student may choose to use handcrafted techniques (not very efficient) or deep-learning approaches.
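One way to see why an accurate predictor helps, whether handcrafted or learned, is to look at the residual between the actual and predicted sign bits. The sketch below is an assumption about how such a scheme could be evaluated, not the project's prescribed method: the bit sequences are made up, and the cost estimate is the ideal binary entropy of the miss rate rather than the output of a real entropy coder.

```python
import math


def binary_entropy(p):
    """Ideal cost in bits per symbol of a binary source with P(1) = p."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def residual_bits(actual, predicted):
    """XOR of actual and predicted sign bits: 1 marks a misprediction."""
    return [a ^ p for a, p in zip(actual, predicted)]


# Hypothetical sign bits and a hypothetical predictor output.
actual = [0, 1, 0, 1, 0, 0, 1, 1]
predicted = [0, 1, 0, 1, 0, 1, 1, 1]

residual = residual_bits(actual, predicted)
miss_rate = sum(residual) / len(residual)
bits_per_sign = binary_entropy(miss_rate)
```

With no prediction, sign bits behave roughly like fair coin flips and cost about one bit each. If the decoder can run the same predictor, only the residual needs to be coded, and the more often the prediction is right, the more skewed and cheaper to code that residual becomes.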
Student cohort
Required knowledge
- Compression standard of interest (to be acquired during the project)
- Knowledge of deep learning (a big plus)