An Empirical Analysis of Recurrent Learning Algorithms in Neural Lossy Image Compression Systems

Prior work on image compression has focused on optimizing models to achieve better reconstruction at lower bit rates. These approaches typically rely on sophisticated architectures that enhance encoder or decoder performance, and some jointly optimize both along with a designed form of entropy encoding. This can lead to many redundant components, which may or may not be useful. Because backpropagation (BP)-based learning is plagued by a variety of credit assignment issues, adding more components could make performance even worse. Adding components also adds trainable parameters, which complicates the training process and can result in sub-optimal performance. Here, we propose using learning algorithms that side-step some of the issues of backprop and then build an effective compression system on top of them. According to our experiments, the SAB and UORO learning algorithms seem to be the most promising alternatives. Though SAB is still essentially based on backpropagation through time (BPTT), it combines memory replay with a sparse memory retrieval scheme that reduces some of the computational burden while maintaining state memory over longer time spans. Another observation of our work is that memorization (or model statefulness) is necessary for capturing the longer-term dependencies implicit in the act of compressing iteratively, even though the inputs themselves are non-temporal. A more powerful model with memory can also help reach a better PSNR when trained across a global set of patches. Note that algorithms like RTRL and UORO (desirably) do not require unrolling the estimator over K steps in time, which reduces sequence data storage requirements. However, the noisy rank-one approximation trick used in UORO affects the memorization pattern and seems to prevent it from reaching the absolute best performance. Despite such drawbacks, UORO generates good reconstructions (w.r.t. perceptual quality).
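To illustrate why the rank-one approximation in UORO is described as noisy, the following minimal sketch shows the general random-sign idea behind such estimators: a sum of outer products is collapsed into a single outer product that is unbiased on average but inexact for any one draw. This is an illustrative simplification, not the implementation used in this work. The function name `rank_one_estimate`, the dimensions, and the random data are assumptions made for the example, and the variance-reducing scaling factors used in UORO are omitted.

```python
import itertools

import numpy as np


def rank_one_estimate(vs, ws, signs):
    """Collapse sum_i outer(vs[i], ws[i]) into one outer product.

    vs has shape (k, m), ws has shape (k, n), signs has shape (k,)
    with entries in {-1, +1}. (Hypothetical helper for illustration.)
    """
    v_tilde = signs @ vs  # signed combination of the v vectors, shape (m,)
    w_tilde = signs @ ws  # signed combination of the w vectors, shape (n,)
    return np.outer(v_tilde, w_tilde)


rng = np.random.default_rng(0)
k, m, n = 3, 4, 5  # assumed toy sizes
vs = rng.standard_normal((k, m))
ws = rng.standard_normal((k, n))

# Exact quantity: sum_i outer(vs[i], ws[i]), generally of rank k.
exact = vs.T @ ws

# Averaging over every possible sign vector recovers the exact sum,
# because the cross terms cancel (E[s_i * s_j] = 0 for i != j).
all_signs = np.array(list(itertools.product([-1.0, 1.0], repeat=k)))
mean_estimate = np.mean(
    [rank_one_estimate(vs, ws, s) for s in all_signs], axis=0
)

# A single random draw is rank one, so it cannot match the exact sum.
single = rank_one_estimate(vs, ws, rng.choice([-1.0, 1.0], size=k))
single_error = np.linalg.norm(single - exact)
mean_error = np.linalg.norm(mean_estimate - exact)

print(f"error of one draw:        {single_error:.3f}")
print(f"error of the exact mean:  {mean_error:.3e}")
```

The single-draw error is the noise referred to above: each update is cheap to store and correct in expectation, but any individual estimate deviates from the full gradient information it stands in for.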