How to represent and understand the visual world remains a fundamental problem in computer vision. In this thesis, we mainly study neural representations for videos. Given a frame index or a tiny frame embedding as input, a neural network is overfit to a single video for reconstruction, so that all or most of the video content is stored implicitly in the network weights after training. With little extra information (the frame index or embedding), the network can reconstruct the original video with high fidelity; we refer to such a representation as a neural representation.
Compared to the original video, a neural representation is much more compact, and it converts the complex video compression problem into model compression with a simple pipeline. Decoding is easy and fast, requiring only a forward pass of the network, which also makes neural representations a potentially efficient video codec. In addition, the conversion from original videos to neural representations is robust to irregular visual patterns and achieves good denoising and inpainting results. Finally, we also explore video editing with neural representations, which can save up to 90% of the editing burden compared to editing the original videos.
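As a rough illustration of the idea in the abstract above (a model overfit to one video and queried by frame index), the following is a minimal toy sketch. It is not the thesis architecture. It assumes a single linear layer on sinusoidal features of the frame index, a random 8-frame, 4-pixel "video", and plain full-batch gradient descent. The names `encode`, `decode`, and `fit` are hypothetical.

```python
import math
import random

def encode(t, num_freqs=4):
    """Sinusoidal embedding of a normalized frame index t in [0, 1)."""
    feats = [1.0]
    for k in range(1, num_freqs + 1):
        feats.append(math.sin(2 * math.pi * k * t))
        feats.append(math.cos(2 * math.pi * k * t))
    return feats

def decode(weights, t):
    """Reconstruct one frame (a flat list of pixels) from its index alone."""
    feats = encode(t)
    return [sum(w * f for w, f in zip(row, feats)) for row in weights]

def fit(video, steps=200, lr=0.5):
    """Overfit a linear readout to a single video (assumed toy setup)."""
    num_frames, num_pixels = len(video), len(video[0])
    feats = [encode(i / num_frames) for i in range(num_frames)]
    dim = len(feats[0])
    weights = [[0.0] * dim for _ in range(num_pixels)]
    for _ in range(steps):
        # Gradient of the mean (over frames) squared reconstruction error.
        grads = [[0.0] * dim for _ in range(num_pixels)]
        for f, frame in zip(feats, video):
            for p in range(num_pixels):
                err = sum(w * x for w, x in zip(weights[p], f)) - frame[p]
                for j, x in enumerate(f):
                    grads[p][j] += 2 * err * x / num_frames
        for p in range(num_pixels):
            for j in range(dim):
                weights[p][j] -= lr * grads[p][j]
    return weights

random.seed(0)
# Toy "video": 8 frames of 4 pixel intensities each (random, for illustration).
video = [[random.random() for _ in range(4)] for _ in range(8)]
weights = fit(video)

# After training, every frame is recovered from its index via one forward pass.
max_err = max(
    abs(a - b)
    for i, frame in enumerate(video)
    for a, b in zip(decode(weights, i / len(video)), frame)
)
```

After fitting, the video is stored only in `weights`, and decoding a frame is one call to `decode`. This toy model is not smaller than the data it stores, so it illustrates only the index-to-frame mapping, not the compactness discussed in the abstract.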
Chair: Dr. Abhinav Shrivastava
Department Representative: Dr. Ramani Duraiswami
Members: Dr. Saining Xie (Facebook AI Research)
Dr. Christopher Allan Metzler
Hao is a fourth-year PhD student advised by Prof. Abhinav Shrivastava, working mainly on neural video representation. He received his bachelor's and master's degrees from Huazhong University of Science & Technology (HUST).