BERT-of-Theseus: A Model Compression Method Based on Module Replacement
©PaperWeekly Original · Author | Su Jianlin · Affiliation | Zhuiyi Technology · Research Direction | NLP, Neural Networks

Recently, I learned about a BERT model compression method called “BERT-of-Theseus”, introduced in the paper “BERT-of-Theseus: Compressing BERT by Progressive Module Replacing”. It is a model compression scheme built on the concept of “replaceability”. Compared to conventional methods like pruning and distillation, it appears …