In-Depth Analysis of ChatGPT’s Development, Principles, Architecture, and Future

Source: Dolphin Data Science Laboratory. This article is approximately 6000 words and recommended as a 12-minute read. It is an in-depth technical explainer and interpretation, written without excessive jargon. [Introduction] The author, Dr. Chen Wei, previously served as chief scientist of a Huawei-affiliated natural … Read more

In-Depth Analysis of Five Major LLM Visualization Tools: Langflow, Flowise, Dify, AutoGPT UI, and AgentGPT

In recent years, the rapid development of large language model (LLM) technology has driven the widespread application of intelligent agents. From task automation to intelligent dialogue systems, LLM agents can greatly simplify the execution of complex tasks. To help developers build and deploy these intelligent agents more quickly, several open-source tools have emerged, especially those … Read more

MiniCPM-2B Series Lightweight Model Surpasses Mistral-7B

Source: Shizhi AI. This article has 1838 words and a suggested reading time of 5 minutes. The Tsinghua NLP Laboratory and Mianbi Intelligent have released the MiniCPM-2B series of lightweight models on the wisemodel.cn open-source community. The series is considered a performance powerhouse, surpassing Mistral-7B and even outperforming many larger 13B and 33B models, and it is capable of running directly … Read more

Overview of Latest Transformer Pre-training Models

Reported by Machine Heart. In today's NLP field, we see the success of Transformer-based Pre-trained Language Models (T-PTLMs) in almost every task. These models originated with GPT and BERT, and their technical foundations include the Transformer architecture, self-supervised learning, and transfer learning. T-PTLMs can learn universal language representations from large-scale text data using self-supervised … Read more
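As a minimal sketch of the pre-train-then-transfer idea the survey describes (not code from the article itself): load a Transformer pre-trained with self-supervised learning and reuse its contextual representations as generic features for a downstream task. The model name and pooling choice below are illustrative assumptions.

```python
# Minimal sketch: extract universal representations from a pre-trained
# Transformer (here BERT via Hugging Face Transformers; model name is
# only an example) for reuse in a downstream task.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("Pre-trained language models learn universal representations.",
                   return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Per-token contextual embeddings, plus a simple mean-pooled sentence vector
# that a downstream classifier could consume via transfer learning.
token_embeddings = outputs.last_hidden_state        # (1, seq_len, hidden)
sentence_embedding = token_embeddings.mean(dim=1)   # (1, hidden)
print(token_embeddings.shape, sentence_embedding.shape)
```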

Comprehensive Summary of Word Embedding Models

Source: DeepHub IMBA. This article is approximately 1000 words long and is recommended as a 5-minute read. It provides a complete summary of word embedding models: TF-IDF, Word2Vec, GloVe, FastText, ELMo, CoVe, BERT, and RoBERTa. The role of word embeddings in deep models is to provide input features for downstream tasks (such … Read more
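To make the "input features for downstream tasks" point concrete, here is a minimal sketch using gensim's Word2Vec on a toy corpus; the corpus, vector size, and sentence are illustrative assumptions, not material from the article.

```python
# Minimal sketch: train a tiny Word2Vec model and turn a sentence into a
# feature matrix that a downstream model could consume.
import numpy as np
from gensim.models import Word2Vec

corpus = [
    ["word", "embeddings", "map", "tokens", "to", "vectors"],
    ["downstream", "models", "consume", "these", "vectors", "as", "features"],
]

# Skip-gram Word2Vec with toy hyperparameters.
model = Word2Vec(sentences=corpus, vector_size=50, window=2,
                 min_count=1, sg=1, epochs=50)

# Look up the embedding for one word ...
vec = model.wv["embeddings"]                            # shape: (50,)

# ... and stack per-word embeddings into input features for a sentence.
sentence = ["word", "embeddings", "as", "features"]
features = np.stack([model.wv[w] for w in sentence])    # shape: (4, 50)
print(vec.shape, features.shape)
```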

BART: Seq2Seq Pre-training Model

Follow the public account "ML_NLP" and set it as "Starred" to receive important content first! Recently, I have started using Transformers for some tasks and am recording the related knowledge points to build a complete knowledge structure. The article below is the next one in this plan; it is the sixteenth in this series: Transformer: The … Read more

Must-See! Princeton’s Chen Danqi Latest Course on Understanding Large Language Models 2022!

The MLNLP community is a well-known machine learning and natural language processing community in China and abroad, whose members include NLP graduate students, university faculty, and industry researchers. The community's vision is to promote communication and progress between academia and industry in natural language processing and machine learning, especially for the progress … Read more

XLNet Pre-training Model: Everything You Need to Know

Author | mantch. Reprinted from WeChat Official Account | AI Technology Review. 1. What is XLNet? XLNet is a model similar to BERT rather than a completely different one. In short, XLNet is a generalized autoregressive pre-training method. It was released by the CMU and Google Brain teams in June 2019, and ultimately XLNet outperformed … Read more

Understanding ALBERT in Interviews

Follow the WeChat public account "ML_NLP" and set it as "Starred" to receive important content first! Source | Zhihu. Address | https://zhuanlan.zhihu.com/p/268130746. Author | Mr.robot. Editor | Machine Learning Algorithms and Natural Language Processing WeChat Public Account. This article is published with the author's authorization; reproduction without permission is prohibited. Interviewer: Do you understand … Read more

Few-Shot NER with Dual-Tower BERT Model

Delivering NLP technical insights to you every day! Author | SinGaln. Source | PaperWeekly. This is a paper from ACL 2022. The overall idea is to use a dual-tower BERT model, in a meta-learning setting, to encode text tokens and their corresponding labels separately, and then classify each token using the output of the dot product … Read more
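Below is a minimal sketch of the dual-tower idea as described in this teaser (not the authors' code and without the meta-learning training loop): one BERT tower encodes the sentence tokens, a second tower encodes the label names, and each token is scored against every label by a dot product. The model names and label set are illustrative assumptions.

```python
# Minimal sketch of dual-tower token/label encoding for few-shot NER.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
token_encoder = AutoModel.from_pretrained("bert-base-uncased")   # tower 1: text tokens
label_encoder = AutoModel.from_pretrained("bert-base-uncased")   # tower 2: label names

sentence = "Barack Obama visited Paris"
labels = ["other", "person", "location"]   # hypothetical label set

with torch.no_grad():
    # Encode the sentence into per-token representations.
    tok = tokenizer(sentence, return_tensors="pt")
    token_reprs = token_encoder(**tok).last_hidden_state        # (1, seq_len, hidden)

    # Encode each label name and pool it into one vector ([CLS] position).
    lab = tokenizer(labels, return_tensors="pt", padding=True)
    label_reprs = label_encoder(**lab).last_hidden_state[:, 0]  # (num_labels, hidden)

# Dot product between every token and every label -> per-token label scores.
scores = token_reprs @ label_reprs.T        # (1, seq_len, num_labels)
pred = scores.argmax(dim=-1)                # predicted label index for each token
print(pred)
```

In the paper's few-shot setting these scores would be trained with a token-level classification loss across episodes; the sketch only shows the forward pass that the teaser describes.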