Further Improvements to GPT and BERT: Language Models Using Transformers
Selected from arXiv. Authors: Chenguang Wang, Mu Li, Alexander J. Smola. Compiled by Machine Heart. Participation: Panda.

BERT and GPT-2 are currently two of the most advanced models in NLP, and both adopt a Transformer-based architecture. A recent paper from Amazon Web Services proposes several new improvements to Transformer-based language models, including architectural enhancements and the use of prior knowledge.