Step-by-Step Distillation: New Method for Small Models to Rival Large Models

Machine Heart Report (editor: Rome). Large language models have astonishing capabilities, but their size often makes them very expensive to deploy. Researchers from the University of Washington, in collaboration with the Google Cloud AI Research Institute and Google Research, have proposed a solution to this problem, introducing the Distilling Step-by-Step paradigm to …
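The core idea behind Distilling Step-by-Step, per the underlying paper, is to prompt a large teacher LLM for chain-of-thought rationales along with labels, then fine-tune a small model on both in a multi-task fashion. Below is a minimal sketch of that combined objective, assuming a PyTorch setup; the function name step_by_step_loss, the tensor shapes, and rationale_weight are illustrative assumptions rather than the authors' code.

import torch.nn.functional as F

def step_by_step_loss(label_logits, label_targets,
                      rationale_logits, rationale_targets,
                      rationale_weight: float = 1.0,
                      ignore_index: int = -100):
    # Logits have shape (batch, seq_len, vocab); targets have shape
    # (batch, seq_len), with ignore_index marking padded positions.
    # Task 1: predict the task label from the input.
    label_loss = F.cross_entropy(
        label_logits.flatten(0, 1), label_targets.flatten(),
        ignore_index=ignore_index)
    # Task 2: reproduce the teacher LLM's chain-of-thought rationale.
    rationale_loss = F.cross_entropy(
        rationale_logits.flatten(0, 1), rationale_targets.flatten(),
        ignore_index=ignore_index)
    # The small model trains on both signals; the rationale task serves as
    # extra supervision and is not needed at inference time.
    return label_loss + rationale_weight * rationale_loss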

Knowledge Distillation in Neural Networks – Hinton 2015

Distilling the Knowledge in a Neural Network. Geoffrey Hinton, Oriol Vinyals, Jeff Dean (Google Inc., Mountain View). Abstract: A simple way to improve the performance of almost any machine learning algorithm is to train many different models on the same data and then average their predictions [3]. Unfortunately, …
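The recipe the abstract alludes to, compressing an ensemble (or one cumbersome model) into a smaller "distilled" model, trains the student to match temperature-softened teacher outputs in addition to the ground-truth labels. A minimal PyTorch-style sketch follows; the mixing weight alpha and the temperature value are assumed hyperparameters, not prescriptions from the paper.

import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, targets,
                      temperature: float = 2.0, alpha: float = 0.5):
    # student_logits, teacher_logits: (batch, num_classes); targets: (batch,).
    # Soft targets: KL divergence between temperature-softened distributions.
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)  # rescaling suggested in the paper so gradients stay comparable
    # Hard targets: ordinary cross-entropy against the true labels.
    hard = F.cross_entropy(student_logits, targets)
    return alpha * soft + (1.0 - alpha) * hard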