Anything LLM Archives - Page 5 of 11

RAG Evaluation Guide: Comprehensive Analysis of LLM Performance Assessment Methods

2025-04-22 by AI Agent

Introduction: This article compares RAG evaluation methods from a timeline perspective. These methods are not limited to the RAG process, and the LLM-based evaluation methods apply more broadly across industries. Common Evaluation Methods for RAG: In the previous section, we discussed how to use the ROUGE method to …

Comprehensive Introduction to Large Models and RAG

2025-04-22 by AI Agent

This article is about 11,000 words long, with a suggested reading time of 6 minutes. It introduces large models combined with RAG. 1 Introduction: Large Language Models (LLMs) have limitations when handling domain-specific or highly specialized queries, such as generating inaccurate information or “hallucinations.” A promising approach to mitigating these limitations is Retrieval-Augmented Generation …

Understanding Retrieval Augmented Generation (RAG)

2025-04-22 by AI Agent

In the era of large models, many new terms have emerged, and RAG is one of them. This article from the tech community, “Understanding RAG (Retrieval Augmented Generation) in One Article,” explains what RAG is, what it does, and the challenges associated with it. Related earlier articles …

2023 Annual Review: Must-Read Books on AIGC, AGI, ChatGPT, and AI Large Models

2025-04-21 by AI Agent

2023 was a year of explosive growth for large language models in artificial intelligence. Several concepts and English abbreviations emerged during the year, causing confusion and even bewilderment. LLM: Large Language Model, a model designed to understand and generate human language. LLMs are characterized by their massive scale, containing hundreds of billions of …

In-Depth Analysis of the Connections Between Transformer, RNN, and Mamba!

2025-04-19 by AI Agent

Source: Algorithm Advancement. This article is about 4,000 words long, with a suggested reading time of 8 minutes. It explores in depth the potential connections between Transformers, Recurrent Neural Networks (RNNs), and State Space Models (SSMs). By exploring the potential connections between seemingly unrelated Large Language Model (LLM) architectures, we may open up new avenues for …

Understanding Transformer: 8 Questions and Answers

2025-04-18 by AI Agent

Originally from AI有道. Seven years ago, the paper “Attention Is All You Need” introduced the Transformer architecture, revolutionizing the entire field of deep learning. Today, all major models are based on the Transformer architecture, yet its internal workings remain a mystery. Last year, one of the authors of the Transformer paper, Llion …

Overview of Large Multimodal Agents

2025-04-16 by AI Agent

The MLNLP community is a machine learning and natural language processing community well known both domestically and internationally, whose audience includes NLP graduate students, university professors, and corporate researchers. The community’s vision is to promote communication and progress among academia, industry, and enthusiasts in natural language processing and machine learning, especially for beginners. Reprinted …

Reflections on the New Generation of Intelligent Agents in the Post-LLM Era

2025-04-15 by AI Agent

The MLNLP community is a machine learning and natural language processing community well known both domestically and internationally, whose audience includes NLP graduate students, university teachers, and corporate researchers. The community’s vision is to promote communication and progress between the academic and industrial sectors of natural language processing and machine learning, especially for beginners. …

Understanding AutoGPT and LLM Agents

2025-04-13 by AI Agent

In the past two weeks, projects like AutoGPT and BabyAGI have gained immense popularity. Over the weekend, I spent some time reviewing the code of these AI agent projects and decided to write an article summarizing my technical insights and thoughts on the current advancements in this field for everyone to discuss. From Language Understanding …

The Evolution of DeepSeek’s Janus Series Multimodal Models

2025-04-13 by AI Agent

Introduction: In many people’s view, DeepSeek’s rapid-fire release of open-source multimodal models before the Spring Festival was meant to capitalize on the momentum and take on “ClosedAI”. However, when I checked GitHub, I found that the earlier Janus Flow was already several months old, and that this Pro version is merely an “ordinary” upgrade for them. It …