ML Tea: Unsupervised Discovery of Interpretable Structure in Complex Systems
Speaker: Mark Hamilton
Abstract: How does the human mind make sense of raw information without being taught how to see or hear? In this talk we will explore how to build algorithms that can uncover interpretable structure from large collections of unsupervised data such as images and video. First, I will describe how to classify every pixel of a collection of images without any human annotations (unsupervised semantic segmentation) by distilling self-supervised vision models. Second, we'll see how this basic idea leads to a new unifying theory of representation learning, and I will show how 20 common machine learning methods, including dimensionality reduction, clustering, contrastive learning, and spectral methods, emerge from a single unified equation. Finally, we'll use this unified theory to create algorithms that can decode natural language just by watching unlabeled videos of people talking, without any knowledge of text. This work is the first step in our broader effort to translate animal communication using large-scale, unsupervised, and interpretable learners, and the talk will conclude with some of our most recent efforts to analyze the complex vocalizations of Atlantic spotted dolphins.
Bio: Mark Hamilton is a PhD student in William T. Freeman's lab at the MIT Computer Science & Artificial Intelligence Laboratory. He is also a Senior Engineering Manager at Microsoft, where he leads a team building large-scale distributed ML products for Microsoft's largest databases. Mark is interested in how we can use unsupervised machine learning to discover scientific "structure" in complex systems. Mark values working on projects for social, cultural, and environmental good and aims to use his algorithms to help humans solve challenges they cannot solve alone.