NVIDIA’s 50-Minute BERT Training: Beyond Just GPUs
Selected from arXiv. Author: Mohammad Shoeybi et al. Translated by Machine Heart. Contributors: Mo Wang

Previously, Machine Heart introduced a study by NVIDIA that broke three records in the NLP field: reducing BERT’s training time to 53 minutes; reducing BERT’s inference time to 2.2 milliseconds; and increasing the parameter count of GPT-2 to 8 billion (previously, GPT-2 …