Exploring OpenAI’s Sora Video Model: A Technical Report

New Intelligence Report. Editor: Editorial Department. [New Intelligence Guide] OpenAI's first AI video model, Sora, has arrived, making history once again. Its technical report, which describes the model as a "world model", was also released today, though specific training details have not yet been made public. Yesterday during the day, "Reality No Longer Exists" began trending across the … Read more

Overview of Latest Transformer Pre-training Models

Reported by Machine Heart. In today's NLP field, Transformer-based pre-trained language models (T-PTLMs) succeed at almost every task. These models originated with GPT and BERT, and their technical foundations are the Transformer architecture, self-supervised learning, and transfer learning. T-PTLMs can learn universal language representations from large-scale text data using self-supervised … Read more
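
To make the self-supervised objective behind these models concrete, here is a minimal sketch of the masked-language-modeling task that BERT-style T-PTLMs are pre-trained on: the model recovers a hidden word from raw, unlabeled text. The Hugging Face transformers library and the bert-base-uncased checkpoint are assumptions of this sketch, not part of the article.

```python
# Minimal sketch of the masked-language-modeling (MLM) self-supervised
# objective that T-PTLMs such as BERT are pre-trained with.
# Assumes: pip install transformers torch
from transformers import pipeline

# A fill-mask pipeline loads a pre-trained MLM head; no labels are needed,
# because the "supervision" comes from the text itself (self-supervision).
fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# The model must recover the masked token from its bidirectional context.
for pred in fill_mask("Paris is the [MASK] of France.")[:3]:
    print(f"{pred['token_str']:>10s}  p={pred['score']:.3f}")
```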

Comprehensive Summary of Word Embedding Models

Source: DeepHub IMBA. This article is approximately 1,000 words long and takes about 5 minutes to read. It provides a complete summary of word embedding models: TF-IDF, Word2Vec, GloVe, FastText, ELMo, CoVe, BERT, and RoBERTa. In deep models, word embeddings provide input features for downstream tasks (such … Read more
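
As a concrete illustration of embeddings as input features, the sketch below trains a tiny Word2Vec model on a toy corpus with gensim; the library, corpus, and hyperparameters are illustrative assumptions, not taken from the article.

```python
# Sketch: training Word2Vec embeddings on a toy corpus and reading out a
# vector that a downstream model could consume as an input feature.
# Assumes: pip install gensim  (gensim >= 4.x API)
from gensim.models import Word2Vec

corpus = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "lay", "on", "the", "rug"],
    ["cats", "and", "dogs", "are", "pets"],
]

# Skip-gram (sg=1) with small dimensions, just to show the mechanics.
model = Word2Vec(corpus, vector_size=16, window=2, min_count=1, sg=1, epochs=50)

vec = model.wv["cat"]             # a 16-d feature vector for downstream use
print(vec.shape)                  # (16,)
print(model.wv.most_similar("cat", topn=2))
```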

BART: Seq2Seq Pre-training Model

Follow the public account "ML_NLP" and set it as "Starred" to receive the best content first! Recently, I have started using Transformers for some tasks, and I am recording the related knowledge points to build a complete and connected knowledge system. This is the sixteenth article in the series: Transformer: The … Read more

XLNet Pre-training Model: Everything You Need to Know

Author | mantch. Reprinted from the WeChat official account AI Technology Review. 1. What is XLNet? XLNet is a model similar to BERT rather than a completely different one. In short, XLNet is a generalized autoregressive pre-training method. It was released by the CMU and Google Brain teams in June 2019, and ultimately XLNet outperformed … Read more
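
For a sense of what autoregressive pre-training looks like in practice, here is a hedged sketch using the XLNet classes in Hugging Face transformers (an assumed dependency, not part of the original article): perm_mask hides the target position from every token, and the model predicts it autoregressively from the permuted context.

```python
# Sketch of XLNet's permutation-based autoregressive prediction via the
# Hugging Face transformers API. The sentence and checkpoint are
# illustrative; this mirrors the library's documented usage pattern.
import torch
from transformers import XLNetTokenizer, XLNetLMHeadModel

tokenizer = XLNetTokenizer.from_pretrained("xlnet-base-cased")
model = XLNetLMHeadModel.from_pretrained("xlnet-base-cased")

input_ids = torch.tensor(
    tokenizer.encode("Hello, my dog is very <mask>", add_special_tokens=False)
).unsqueeze(0)
seq_len = input_ids.shape[1]

# perm_mask[b, j, k] = 1 means token j cannot attend to token k:
# hide the last position (the target) from every token.
perm_mask = torch.zeros((1, seq_len, seq_len))
perm_mask[:, :, -1] = 1.0

# Predict only the last position.
target_mapping = torch.zeros((1, 1, seq_len))
target_mapping[0, 0, -1] = 1.0

out = model(input_ids, perm_mask=perm_mask, target_mapping=target_mapping)
pred_id = out.logits[0, 0].argmax(-1).item()
print(tokenizer.decode([pred_id]))
```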

Understanding ALBERT in Interviews

Follow the WeChat public account "ML_NLP" and set it as "Starred" to receive the best content first! Source | Zhihu. Address | https://zhuanlan.zhihu.com/p/268130746. Author | Mr.robot. Editor | the Machine Learning Algorithms and Natural Language Processing WeChat public account. This article is published with the author's authorization; reproduction without permission is prohibited. Interviewer: Do you understand … Read more

Implementing DistilBERT: A Distilled BERT Model Code

Source: DeepHub IMBA. This article is about 2,700 words long and takes roughly 9 minutes to read. It provides a detailed introduction to DistilBERT and a complete code implementation. Machine learning models have become increasingly large, and … Read more
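
To show the core mechanic such an implementation revolves around, here is a minimal sketch of the soft-target distillation loss used to train DistilBERT (temperature-softened KL divergence, following Hinton et al.). The logits below are random placeholders; the full DistilBERT objective also adds masked-language-modeling and cosine-alignment terms.

```python
# Sketch of the soft-target distillation loss at the heart of DistilBERT:
# the student matches the teacher's temperature-softened output distribution.
# The teacher/student logits here are random placeholders for illustration.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, T=2.0):
    # KL(teacher || student) on temperature-softened distributions,
    # scaled by T^2 to keep gradient magnitudes comparable (Hinton et al.).
    log_p_student = F.log_softmax(student_logits / T, dim=-1)
    p_teacher = F.softmax(teacher_logits / T, dim=-1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (T ** 2)

student_logits = torch.randn(8, 30522)  # (batch, vocab) placeholder values
teacher_logits = torch.randn(8, 30522)
print(distillation_loss(student_logits, teacher_logits))
```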

Exploring the Transformer Model: Understanding GPT-3, BERT, and T5

Author: Dale Markowitz. Translation: Wang Kehan. Proofreading: He Zhonghua. This article is approximately 3,800 words long and takes about 5 minutes to read. It introduces the most popular model architecture in natural language processing today: the Transformer. Tags: Natural Language Processing. Do you know the saying: when all you have is a hammer, everything … Read more
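
Since the article explains what makes Transformers tick, here is a short sketch of scaled dot-product attention, the operation that GPT-3, BERT, and T5 all share; the tensor shapes are toy values chosen purely for illustration.

```python
# Sketch of scaled dot-product attention, the core Transformer operation:
# softmax(Q K^T / sqrt(d)) V, a context-weighted sum over value vectors.
import math
import torch

def scaled_dot_product_attention(q, k, v):
    d = q.shape[-1]
    scores = q @ k.transpose(-2, -1) / math.sqrt(d)  # (..., seq_q, seq_k)
    weights = torch.softmax(scores, dim=-1)          # attention distribution
    return weights @ v                               # weighted sum of values

q = torch.randn(1, 5, 64)  # (batch, seq, head_dim) toy shapes
k = torch.randn(1, 5, 64)
v = torch.randn(1, 5, 64)
print(scaled_dot_product_attention(q, k, v).shape)  # torch.Size([1, 5, 64])
```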

Do You Really Need GPT-3? BERT’s MLM Model Also Enables Few-Shot Learning

Follow the official account "ML_NLP" and set it as "Starred" to receive the best content first! Source | PaperWeekly. ©PaperWeekly Original · Author | Su Jianlin. Unit | Zhuiyi Technology. Research Direction | NLP, Neural Networks. As we all know, GPT-3 is currently very popular and promoted everywhere, but do readers remember the title of the GPT-3 paper? In fact, the paper is … Read more
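
The idea the article discusses can be sketched in a few lines: wrap the input in a cloze template and let BERT's MLM head score verbalizer words for each class. The Hugging Face transformers library, the bert-base-uncased checkpoint, and the template and label words below are illustrative assumptions, not the article's exact setup.

```python
# Sketch of few-shot classification with BERT's MLM head (PET-style):
# the label whose verbalizer word best fills the [MASK] wins.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

review = "the plot was predictable and the acting was flat"
template = f"{review}. It was [MASK]."

# Restrict scoring to the two verbalizer tokens via `targets`.
for p in fill_mask(template, targets=["great", "terrible"]):
    print(f"{p['token_str']:>10s}  p={p['score']:.3f}")
```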

Exclusive: BERT Model Compression Based on Knowledge Distillation

Authors: Siqi Sun, Yu Cheng, Zhe Gan, Jingjing Liu. This article is about 1,800 words and takes roughly 5 minutes to read. It introduces the "Patient Knowledge Distillation" model. Reply "191010" to the Data Department THU backend to get the paper link. In the past year, there have been many groundbreaking advances in language model research, such as GPT generating sentences that … Read more
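
A simplified sketch of the "patient" part of Patient Knowledge Distillation follows: besides matching the teacher's soft labels, the student matches L2-normalized [CLS] hidden states from selected intermediate teacher layers. The tensors and layer mapping below are illustrative placeholders, not the authors' code.

```python
# Sketch of the "patient" loss in Patient Knowledge Distillation (PKD):
# MSE between L2-normalized [CLS] hidden states of paired student and
# teacher layers. All tensors and the layer mapping are placeholders.
import torch
import torch.nn.functional as F

def patient_loss(student_hidden, teacher_hidden, layer_map):
    """layer_map: student layer index -> teacher layer index."""
    loss = 0.0
    for s_idx, t_idx in layer_map.items():
        s = F.normalize(student_hidden[s_idx][:, 0], dim=-1)  # [CLS] state
        t = F.normalize(teacher_hidden[t_idx][:, 0], dim=-1)
        loss = loss + F.mse_loss(s, t)
    return loss / len(layer_map)

# Toy setup: a 6-layer student distilled from a 12-layer teacher, paired
# with every other teacher layer (in the spirit of the "PKD-skip" scheme).
student_hidden = [torch.randn(8, 128, 768) for _ in range(6)]
teacher_hidden = [torch.randn(8, 128, 768) for _ in range(12)]
print(patient_loss(student_hidden, teacher_hidden, {1: 3, 3: 7, 5: 11}))
```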