MLTea: Score-of-Mixture Training: One-Step Generative Model Training via Score Estimation of Mixture Distributions

Abstract: We propose Score-of-Mixture Training (SMT), a novel framework for training one-step generative models by minimizing a class of divergences known as α-skew Jensen–Shannon divergences. At its core, SMT estimates the score of mixture distributions between real and fake samples across multiple noise levels. Like consistency models, our approach supports both training from scratch (SMT) and distillation using a pretrained diffusion model, which we call Score-of-Mixture Distillation (SMD). It is simple to implement, requires minimal hyperparameter tuning, and ensures stable training. Experiments on CIFAR-10 and ImageNet 64×64 show that SMT and SMD are competitive with, and can even outperform, existing methods.
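For readers who want the abstract's core idea made concrete, the sketch below is a hypothetical illustration, not the authors' implementation. It leans on two points: in one common parameterization, the α-skew Jensen–Shannon divergence is JS_α(p‖q) = (1−α) KL(p‖m_α) + α KL(q‖m_α) with mixture m_α = (1−α)p + αq (the paper's exact convention may differ); and a batch whose samples come from the real data with probability 1−α and from the generator with probability α is distributed as m_α, so ordinary denoising score matching on such mixed samples estimates the score of the noise-smoothed mixture. The names score_net and generator, and their call signatures, are assumptions.

import torch

def mixture_score_loss(score_net, generator, real_batch, alpha, sigma):
    # Hypothetical sketch: estimate the score of the noised mixture
    # m_alpha = (1 - alpha) * p_data + alpha * p_fake via denoising
    # score matching. real_batch is an (n, c, h, w) image tensor.
    with torch.no_grad():
        fake_batch = generator(torch.randn_like(real_batch))  # assumed generator interface
    n = real_batch.shape[0]
    # Bernoulli(alpha) mask: each sample is fake w.p. alpha and real
    # otherwise, so the mixed batch is distributed according to m_alpha.
    use_fake = (torch.rand(n, 1, 1, 1, device=real_batch.device) < alpha).float()
    x = use_fake * fake_batch + (1.0 - use_fake) * real_batch
    # Perturb with Gaussian noise at level sigma; the DSM regression target
    # for the score of the sigma-smoothed mixture is -eps / sigma.
    eps = torch.randn_like(x)
    pred = score_net(x + sigma * eps, sigma, alpha)  # assumed conditioning on sigma, alpha
    return ((pred + eps / sigma) ** 2).mean()

The sketch fixes a single noise level sigma and a single alpha for brevity; per the abstract, SMT estimates these mixture scores across multiple noise levels.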

Bio: Tejas is a final-year PhD student in the Signals, Information and Algorithms Lab, advised by Professor Gregory Wornell. His research interests center on statistical inference, information theory, and generative modeling, with a recent focus on fundamental and applied aspects of score estimation and diffusion-based generative models. During his PhD, Tejas has interned at Meta AI, Google Research, Adobe Research, and Mitsubishi Electric Research Labs. He currently holds the MIT Claude E. Shannon Fellowship.