Overview of Transformer Pre-trained Models in NLP

The revolution the Transformer has brought to natural language processing (NLP) is hard to overstate. Recently, researchers from the Indian Institute of Technology and the biomedical AI startup Nference.ai conducted a comprehensive survey of Transformer-based pre-trained models in NLP and compiled the results into a review paper. This article will roughly translate and introduce … Read more

What Does ‘GPT’ Mean in ChatGPT?

While we were still dreaming, the AI chatbot ChatGPT emerged, capable of answering questions, generating summaries, translating documents, classifying information, writing code, creating scripts, doing homework, and writing papers. ChatGPT handles almost everything with ease. In just two months, its monthly active users surpassed 100 million, making it the fastest-growing consumer application in history, but … Read more

What Does ‘GPT’ Mean in ChatGPT?

Writing scripts, creating novels, coding, answering questions… the almost omnipotent ChatGPT has become a regular on the trending list in recent months. ChatGPT quickly went viral on social media after its launch at the end of November last year. In just five days, the number of registered users exceeded 1 million; within two months, the … Read more

What Does ‘GPT’ Mean in ChatGPT?

01 What Do G, P, and T Stand For? G stands for generative, meaning capable of producing. P stands for pre-trained, where the prefix pre- means prior to…, and trained means having been trained; pre-trained thus means trained beforehand. T stands for transformer, indicating a transformation … Read more

Detailed Explanation of GLM-130B: An Open Bilingual Pre-trained Model

Source: Contribution. Author: Mao Huaqing. Editor: Xuejie. Table of Contents: Related Knowledge, GPT, BERT, T5, Summary, Background Introduction, Main Contributions and Innovations, GLM, 6B, Custom Mask, Model Quantization, 1TB, Bilingual, Instruction Fine-tuning, RLHF, PEFT, Training Strategy, Model Parameters, Six Metrics, Other Evaluation Results, Environment Preparation, Running, Invocation, Code Invocation, Web Service, Command Line Invocation, Model … Read more

The Evolution of Pre-trained Large Models from BERT to ChatGPT

Report by Machine Heart. Editor: Zhang Qian. This nearly one-hundred-page review outlines the evolution of pre-trained foundation models, showing us how ChatGPT gradually achieved its success. All successes have a traceable path, and ChatGPT is no exception. Recently, Turing Award winner Yann LeCun was trending due to his overly harsh evaluation of ChatGPT. … Read more

Introduction and Testing of Ollama

1. Introduction to Ollama Ollama is an open-source tool designed for the convenient deployment and execution of large language models (LLMs) on local machines. It provides a simple and efficient interface that allows users to easily create, execute, and manage these complex models. Additionally, Ollama comes equipped with a rich library of pre-built models, enabling … Read more
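The local workflow the excerpt describes can be sketched against Ollama's HTTP API, which a local `ollama serve` instance exposes at `localhost:11434` by default. This is a minimal sketch, not an official client: it assumes the server is running and that a model has already been pulled; the model name used in the usage note is illustrative.

```python
import json
from urllib import request

# Default local endpoint for Ollama's generation API (assumes `ollama serve` is running)
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_payload(model: str, prompt: str) -> dict:
    """Build the JSON body for Ollama's /api/generate endpoint."""
    # stream=False asks the server for a single JSON reply instead of a token stream
    return {"model": model, "prompt": prompt, "stream": False}

def generate(model: str, prompt: str) -> str:
    """Send a generation request to a locally running Ollama server and return its text."""
    body = json.dumps(build_payload(model, prompt)).encode("utf-8")
    req = request.Request(
        OLLAMA_URL,
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

With a model pulled locally (for example via `ollama pull llama3`), calling `generate("llama3", "Why is the sky blue?")` returns the model's completion as plain text.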

Comprehensive Collection of NLP Pre-trained Models

Selected from GitHub. Author: Sepehr Sameni. Compiled by Machine Heart. Contributor: Lu. Word and sentence embeddings have become essential components of any deep learning-based natural language processing system. They encode words and sentences into dense fixed-length vectors, significantly enhancing the ability of neural networks to process textual data. Recently, Separius listed a series of recent papers … Read more