Detailed Explanation of GLM-130B: An Open Bilingual Pre-trained Model

Source: Contribution | Author: Mao Huaqing | Editor: Xuejie. Table of Contents: Related Knowledge (GPT, BERT, T5, Summary); Background Introduction; Main Contributions and Innovations (GLM, 6B, Custom Mask, Model Quantization, 1TB Bilingual, Instruction Fine-tuning, RLHF, PEFT); Training Strategy; Model Parameters; Six Metrics; Other Evaluation Results; Environment Preparation; Running Invocation (Code Invocation, Web Service, Command Line Invocation); Model … Read more

Easily Build a Knowledge Base with LangChain, LlamaIndex, and OpenAI

In today’s information age, effectively managing and utilizing vast amounts of data has become a key challenge. For Python developers, building an intelligent knowledge base system can not only improve work efficiency but also provide strong support for decision-making. Today, I will teach … Read more
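The article walks through wiring LlamaIndex to an OpenAI model over local documents. As a rough illustration of the idea (not the article’s exact code), a minimal LlamaIndex sketch might look like the following; the `data/` directory, the query string, and the reliance on the `OPENAI_API_KEY` environment variable are assumptions here.

```python
# Minimal sketch: index local documents with LlamaIndex and query them via OpenAI.
# Assumes `pip install llama-index` and OPENAI_API_KEY set in the environment;
# the data/ folder and the question are placeholders. Newer releases expose the
# llama_index.core namespace; older ones import from llama_index directly.
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

documents = SimpleDirectoryReader("data").load_data()   # load every file under data/
index = VectorStoreIndex.from_documents(documents)      # embed and build the vector index
query_engine = index.as_query_engine()                  # defaults to an OpenAI chat model

response = query_engine.query("What does the onboarding document say about VPN access?")
print(response)
```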

Building a Q&A Bot with Local Knowledge Base Using LlamaIndex and Qwen1.5

01 Introduction: What is RAG? LLMs can produce misleading “hallucinations”, rely on information that may be outdated, handle domain-specific knowledge inefficiently, lack deep insight into specialized fields, and show some deficiencies in reasoning. It is against this backdrop that Retrieval-Augmented Generation (RAG) has emerged, becoming a significant trend in … Read more
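The article goes on to build this with LlamaIndex and Qwen1.5. As a library-agnostic illustration of the retrieve-then-generate loop that RAG describes, here is a toy sketch in plain Python; the tiny document list and the word-overlap scoring are placeholders for illustration, not the article’s implementation.

```python
# Toy retrieve-then-generate loop to illustrate RAG; not the article's LlamaIndex/Qwen1.5 code.
# Documents and the scoring function are deliberately simplistic placeholders.
DOCS = [
    "Qwen1.5 is a family of open-weight chat models released by Alibaba.",
    "RAG retrieves relevant passages and feeds them to the LLM as extra context.",
    "LlamaIndex provides loaders, indices, and query engines for RAG pipelines.",
]

def retrieve(question: str, k: int = 2) -> list[str]:
    """Rank documents by naive word overlap with the question (stand-in for vector search)."""
    q_words = set(question.lower().split())
    scored = sorted(DOCS, key=lambda d: len(q_words & set(d.lower().split())), reverse=True)
    return scored[:k]

def build_prompt(question: str, passages: list[str]) -> str:
    """Ground the model: ask it to answer only from the retrieved passages."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only the context below.\nContext:\n{context}\nQuestion: {question}\nAnswer:"

question = "What does RAG do?"
prompt = build_prompt(question, retrieve(question))
print(prompt)  # in a real pipeline this prompt would be sent to Qwen1.5 or another LLM
```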

How Mianbi Intelligent Surpasses Large Models with MiniCPM

Cost is the invisible competitive advantage of large models. Author | Liu Yangnan, Editor | Zhao Jian. Today, the Tsinghua University-affiliated large model company “Mianbi Intelligent” released its first flagship large model, “Mianbi MiniCPM”, aptly nicknamed the “Little Cannon”. According to Mianbi Intelligent’s co-founder and CEO Li Dahai, Mianbi MiniCPM has a parameter scale of 2B, using … Read more

Understanding Google’s Powerful NLP Model BERT

Written by | AI Technology Review, reporting from Leiphone (leiphone-sz). Leiphone AI Technology Review notes: this article is an interpretation of Google’s paper written by Pan Shengfeng of Zhuiyi Technology for AI Technology Review. Recently, Google researchers achieved state-of-the-art results on 11 NLP tasks with the … Read more

BERT Paper Notes

Author: Prince Changqin (NLP Algorithm Engineer). Notes on the paper “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding”. Paper: https://arxiv.org/pdf/1810.04805.pdf Code: https://github.com/google-research/bert The core idea of BERT: masked language modelling (MaskLM) exploits bidirectional context, combined with multi-task pre-training. Abstract: BERT obtains deep bidirectional representations of text by jointly conditioning on both left and right context across all layers. Introduction: two methods to apply pre-trained models … Read more
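The MaskLM objective the note summarizes can be illustrated with the masking recipe described in the BERT paper: mask 15% of input tokens, and of those replace 80% with [MASK], 10% with a random token, and leave 10% unchanged. The sketch below is an illustrative re-implementation of that recipe with a toy vocabulary, not code from google-research/bert.

```python
# Illustrative MaskLM input construction following the 80/10/10 recipe from the BERT paper.
# The toy vocabulary and tokenized sentence are placeholders, not the repo's actual code.
import random

VOCAB = ["the", "cat", "sat", "on", "mat", "dog", "ran"]

def mask_tokens(tokens: list[str], mask_prob: float = 0.15) -> tuple[list[str], dict[int, str]]:
    """Return corrupted tokens plus the positions/labels the model must predict."""
    corrupted, labels = list(tokens), {}
    for i, tok in enumerate(tokens):
        if random.random() >= mask_prob:
            continue                             # 85% of tokens are left alone entirely
        labels[i] = tok                          # the original token is the prediction target
        r = random.random()
        if r < 0.8:
            corrupted[i] = "[MASK]"              # 80% of masked positions: replace with [MASK]
        elif r < 0.9:
            corrupted[i] = random.choice(VOCAB)  # 10%: replace with a random token
        # remaining 10%: keep the original token unchanged
    return corrupted, labels

print(mask_tokens("the cat sat on the mat".split()))
```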

Reviewing Progress and Insights on BERT Models

Authorized reprint from Microsoft Research AI Headlines. Since BERT appeared on arXiv, it has gained significant success and attention, opening Pandora’s box for the two-stage (pre-train, then fine-tune) paradigm in NLP. Subsequently, a large number of BERT-like pre-trained models have emerged, including XLNet, a generalized autoregressive model that brings BERT’s bidirectional context information into autoregressive pre-training, as well … Read more

Training CT-BERT on COVID-19 Data from Twitter

Big Data Digest, authorized repost from Data Party THU. Author: Chen Zhiyan. Twitter has always been an important source of news, and during the COVID-19 pandemic the public has been able to express their anxieties on Twitter. However, manually classifying, filtering, and summarizing the massive amount of COVID-19 information on Twitter is nearly impossible. This … Read more

The Evolution of Pre-trained Large Models from BERT to ChatGPT

Report by Machine Heart. Editor: Zhang Qian. This nearly one-hundred-page review outlines the evolution of pre-trained foundation models, showing how ChatGPT gradually achieved its success. Every success has a traceable path, and ChatGPT is no exception. Recently, Turing Award winner Yann LeCun was trending due to his overly harsh evaluation of ChatGPT. … Read more

Choosing Between BERT, RoBERTa, DistilBERT, and XLNet

Planning | Liu Yan, Author | Suleiman Khan, Translation | Nuclear Cola, Editor | Linda. AI Frontline overview: Google’s BERT and other Transformer-based models have recently swept the entire NLP field, significantly surpassing previous state-of-the-art solutions on a range of tasks. Recently, Google has made several improvements to BERT, leading to a series of impressive enhancements. In … Read more