Essential Technologies Behind Large Models

Approximately 3,500 words, recommended reading time 10 minutes. Today, we will explore the core technologies behind large models! 1. Transformer: The Transformer model is undoubtedly the solid foundation of large language models, ushering in a new era in deep learning. In the early stages, Recurrent Neural Networks (RNNs) were the core means of handling sequential … Read more
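
The excerpt contrasts Transformers with RNNs for sequence modeling. Below is a minimal sketch of scaled dot-product attention, the mechanism at the heart of the Transformer; the shapes and variable names are illustrative assumptions, not taken from the article itself.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Minimal single-head attention: every query attends to all keys at once,
    which is what lets a Transformer process a whole sequence in parallel
    rather than step by step like an RNN."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                           # (seq_len, seq_len) similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)   # softmax over keys
    return weights @ V                                        # weighted sum of values

# Toy usage: 4 tokens with 8-dimensional representations, self-attention (Q = K = V).
x = np.random.randn(4, 8)
out = scaled_dot_product_attention(x, x, x)
print(out.shape)  # (4, 8)
```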

The Sycophantic Behavior of RLHF Models from Claude to GPT-4

The MLNLP community is a well-known machine learning and natural language processing community in China and abroad, covering NLP master's and doctoral students, university faculty, and industry researchers. The community's vision is to promote communication and progress between academia and industry in natural language processing and machine learning, especially for beginners. Reprinted … Read more

In-Depth Analysis of RL Strategies in Mainstream Open-Source LLMs

The author, an internet practitioner at Meta, focuses on LLM4Code and LLM infra. The original text is from Zhihu: https://zhuanlan.zhihu.com/p/16270225772. This article is for academic/technical sharing only; if there is any infringement, please contact us for removal. RLHF is an important part of LLM training. With the development of open-source models, we observe that some … Read more
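
The excerpt notes that RLHF is a core part of LLM training. As a rough orientation, here is a minimal sketch of the objective most RLHF recipes share: maximize a learned reward while penalizing divergence from the reference (SFT) policy with a KL term. The function and coefficient names are assumptions for illustration, not taken from any specific open-source model's recipe.

```python
import numpy as np

def rlhf_shaped_reward(reward, logprob_policy, logprob_ref, kl_coef=0.1):
    """Common RLHF reward shaping for one generated response: the reward-model
    score minus a KL penalty that keeps the policy close to the reference (SFT)
    model. Specific open-source recipes (PPO, GRPO, DPO, ...) differ mainly in
    how they optimize this quantity."""
    kl = np.sum(logprob_policy - logprob_ref)   # per-token log-ratio, summed over the response
    return reward - kl_coef * kl

# Toy numbers: a 5-token response scored 0.8 by the reward model.
lp_policy = np.array([-1.2, -0.7, -2.1, -0.9, -1.5])
lp_ref    = np.array([-1.4, -0.8, -1.9, -1.0, -1.6])
print(rlhf_shaped_reward(0.8, lp_policy, lp_ref))
```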

In-Depth Study of Qwen 2.5 Paper

Introduction: I must say, Qwen is really impressive. Its foundational capabilities seem to have firmly established it as the leader among open-source models, and it is not at all inferior to most closed-source ones. Many companies' foundation-model teams are likely already being challenged on whether building their own foundation models is still meaningful. Qwen's open-source momentum is … Read more

Vector Embeddings: Solving AutoGPT’s Hallucination Problem?

Source | Eye on AI. OneFlow Compilation and Translation | Jia Chuan, Yang Ting, Xu Jiayu. “The hallucination problem of ‘serious-sounding nonsense’ is a common issue that large language models (LLMs) like ChatGPT urgently need to address. Although reinforcement learning from human feedback (RLHF) can correct errors in the model’s output, it is not efficient or … Read more
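
The excerpt argues that RLHF alone is not an efficient fix for hallucination and points toward vector embeddings. Below is a minimal sketch of the usual embedding-retrieval pattern: embed documents, retrieve the nearest ones by cosine similarity, and feed them to the model as grounding context. The embedding function and the document store here are placeholders, not the pipeline described in the article.

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Placeholder embedding function; in practice this would call an actual
    embedding model (e.g. a sentence encoder or an embeddings API)."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(384)
    return v / np.linalg.norm(v)

# Index a small document store as unit-norm vectors.
documents = [
    "AutoGPT chains LLM calls to pursue a goal autonomously.",
    "RLHF fine-tunes a model from human preference comparisons.",
    "Vector databases store embeddings for nearest-neighbour search.",
]
doc_vectors = np.stack([embed(d) for d in documents])

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k documents most similar to the query by cosine similarity;
    these would be placed in the prompt so the model answers from retrieved
    facts rather than from its parametric memory alone."""
    scores = doc_vectors @ embed(query)
    return [documents[i] for i in np.argsort(-scores)[:k]]

print(retrieve("How can embeddings reduce hallucination?"))
```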