Introduction
"Comprehension is compression. You compress things into computer programs, into concise algorithmic descriptions. The simpler the theory, the better you understand something."
- Gregory Chaitin, "The Limits of Reason".
Imagine you are asked to draw a picture of a cat. You first form an "idea" in your mind of what a cat looks like, then guide your hand to follow that "idea" and fill in the details. The "idea" is a high-level, compressed representation of a cat image, and your hand acts like an unzip tool that decodes the compressed representation into an actual data instance. More interestingly, your "idea" of a cat is not used only for drawing: you call on the same "idea" to tell a cat apart from dogs in images, to find a cat hidden in a crowd, or to recognise a cat in a blurry old video.
For us humans, a good understanding of data greatly improves how efficiently we process information related to that data. The same holds for AI. Traditional AI systems that build a model for each specific task learn a limited understanding of the data that applies only to that task. What they learn cannot be reused to solve other tasks arising from the same data; instead, a new model has to be trained from scratch each time. A more efficient AI system should be able to form a deeper "idea" of the data it learns from and reuse that "idea" whenever possible, just as humans do.
The question that remains is how to let AI acquire an understanding of data as good as the one humans have. Recent research in AI suggests that deep generative models have great potential to become a generic tool for building such understanding. Mimicking human behaviour, a deep generative model is asked to explicitly form an efficient, compact representation (the "idea") of the data, from which it should be able to create synthetic data instances indistinguishable from real ones. The higher the quality of the synthetic data, the deeper the understanding of the data the model has obtained. Powered by deep learning, deep generative models can now build an efficient understanding of complex real-world data, including images, audio and text, and can enhance the AI systems built on them.
Mimicking humans, a deep generative model creates a high-quality cat image by passing its understanding of "cat" (here, a short code) through a deep neural network.
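
To make the "short code through a neural network" picture concrete, here is a minimal sketch of the decoding step only, written in plain Python. Everything in it is an illustrative assumption rather than a model from the literature: the sizes (an 8-number code, a 16x16 image), the two-layer network, and the names `init_layer`, `dense` and `decode` are all hypothetical. The weights are random and untrained, so the output is noise, not a cat; a real generative model would learn these weights from data.

```python
import math
import random

LATENT_DIM = 8    # length of the compact code (the "idea"); hypothetical size
HIDDEN_DIM = 32   # hypothetical hidden-layer width
IMAGE_SIDE = 16   # output is a 16x16 grayscale image; hypothetical size


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) for a dense layer with small random weights."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, layer, activation):
    """Apply one fully connected layer followed by an activation function."""
    weights, biases = layer
    return [activation(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def decode(code, layers):
    """Map a short code to an image: the 'unzip' step of a generative model."""
    hidden = dense(code, layers[0], math.tanh)
    pixels = dense(hidden, layers[1], sigmoid)  # pixel intensities in (0, 1)
    return [pixels[r * IMAGE_SIDE:(r + 1) * IMAGE_SIDE] for r in range(IMAGE_SIDE)]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
layers = [init_layer(LATENT_DIM, HIDDEN_DIM, rng),
          init_layer(HIDDEN_DIM, IMAGE_SIDE * IMAGE_SIDE, rng)]  # untrained weights
code = [rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)]          # the compact "idea"
image = decode(code, layers)

print(f"{len(code)} numbers decoded into {len(image) * len(image[0])} pixels")
```

The point of the sketch is the shape of the computation: a handful of numbers goes in, a full image comes out, and all the knowledge about what images look like would live in the network weights.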