Query Optimization Techniques in RAG

A Survey of Query Optimization in Large Language Models

Paper Link: https://arxiv.org/pdf/2412.17558

Published by: Tencent

Large Language Models (LLMs) are increasingly popular, but they still face challenges such as hallucination on domain-specific tasks or tasks that require specialized knowledge. Retrieval-Augmented Generation (RAG) has emerged as a key method for enhancing model performance in these settings, and Query Optimization (QO) plays a central role in making RAG work. This survey provides a detailed overview and taxonomy of query optimization techniques.

The following summarizes all the query optimization methods discussed in the survey. Each method is backed by its own paper, and more details can be found in the original text.

1. Query Expansion – Internal Expansion

GENREAD: Utilizes designed instructions to prompt LLMs to generate contextual documents based on the initial query, which are then read by the LLM to produce the final response.

QUERY2DOC: Generates pseudo-documents through few-shot prompting of LLMs to expand the original query, enhancing the performance of sparse and dense retrieval systems.

REFEED: First generates initial outputs, then uses a retrieval model to fetch relevant information from a large document set, incorporating it into contextual examples to improve outputs.

INTER: The retriever expands the query with knowledge generated by the LLM, while the LLM refines its prompts using retrieved documents, forming a collaborative retrieval-generation loop.

HYDE: Uses zero-shot prompting of language models to generate hypothetical documents, which are then encoded by an unsupervised contrastive encoder to identify similar real documents, filtering out inaccurate content (see the first sketch after this list).

FLARE: Predicts upcoming content from the original query and retrieves relevant information accordingly. If the tentatively generated next sentence contains low-confidence tokens, it is treated as a new query for further retrieval (second sketch after this list).

MILL: Leverages the zero-shot reasoning ability of LLMs to generate diverse sub-queries and corresponding documents, achieving optimal expansion and comprehensive retrieval through mutual verification.

GENQRENSEMBLE: Employs ensemble-based prompting techniques to generate multiple sets of keywords using zero-shot instructions, enhancing retrieval effectiveness.

ERRR: Extracts parametric knowledge from LLMs and refines queries using a dedicated query optimizer.
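
To make internal expansion concrete, here is a minimal sketch of a HYDE-style retrieval step. The `llm` and `embed` callables are placeholders for whatever generator and contrastive encoder you use, and the prompt wording is illustrative, not taken from the paper.

```python
import numpy as np
from typing import Callable, List

def hyde_retrieve(
    query: str,
    llm: Callable[[str], str],            # placeholder: any text-generation call
    embed: Callable[[str], np.ndarray],   # placeholder: contrastive encoder
    corpus_texts: List[str],
    corpus_vecs: np.ndarray,              # precomputed embeddings, shape (N, d)
    k: int = 3,
) -> List[str]:
    """HYDE-style retrieval: embed a generated hypothetical document,
    not the raw query, and rank real documents by cosine similarity."""
    # 1. Zero-shot prompt the LLM for a hypothetical answer document.
    hypo_doc = llm(f"Write a short passage that answers the question:\n{query}")
    # 2. Encode the hypothetical document.
    q_vec = embed(hypo_doc)
    # 3. Cosine similarity against the real corpus; hallucinated specifics
    #    in the hypothetical document find no neighbors and drop out.
    sims = corpus_vecs @ q_vec / (
        np.linalg.norm(corpus_vecs, axis=1) * np.linalg.norm(q_vec) + 1e-9
    )
    top = np.argsort(-sims)[:k]
    return [corpus_texts[i] for i in top]
```

The design choice worth noting is that the dense index itself acts as the filter: only content the hypothetical document shares with real documents can influence the ranking.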
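
FLARE's confidence-triggered loop can be sketched the same way. Here `draft_sentence`, `retrieve`, and `regenerate` are hypothetical callables (a generator that also reports its minimum token probability, any retriever, and a grounded rewriter), and the 0.4 threshold is an illustrative choice, not the paper's value.

```python
from typing import Callable, List, Tuple

def flare_generate(
    question: str,
    draft_sentence: Callable[[str], Tuple[str, float]],  # placeholder: (sentence, min token prob)
    retrieve: Callable[[str], List[str]],                # placeholder retriever
    regenerate: Callable[[str, List[str]], str],         # placeholder: rewrite with evidence
    max_sentences: int = 8,
    threshold: float = 0.4,                              # illustrative confidence cutoff
) -> str:
    """FLARE-style active retrieval: draft a tentative next sentence,
    and retrieve only when the draft looks unreliable."""
    answer = ""
    for _ in range(max_sentences):
        # Draft the next sentence and its lowest token probability.
        sent, min_prob = draft_sentence(question + "\n" + answer)
        if not sent:
            break
        if min_prob < threshold:
            # Low confidence: treat the tentative sentence as a new query,
            # fetch evidence, and regenerate the sentence grounded in it.
            docs = retrieve(sent)
            sent = regenerate(question + "\n" + answer, docs)
        answer += sent + " "
    return answer.strip()
```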

2. Query Expansion – External Expansion

LameR: Combines the query with potential answers to prompt LLMs, where the potential answers are obtained through a standard retrieval pass over the target collection (see the sketch after this list).

GuideCQR: Reformulates conversational queries using key information drawn from the initially retrieved documents.

CSQE: Facilitates the incorporation of knowledge from the corpus into queries, using the relevance assessment capabilities of LLMs to identify key sentences that expand queries.

MUGI: Utilizes LLMs to generate multiple pseudo-references, combining them with the query to enhance the performance of sparse and dense retrievers.
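
As a concrete example of external expansion, below is a minimal LameR-flavored sketch: passages from a first-pass retrieval are folded into the prompt as potential answers, and the resulting answer-augmented query drives the final retrieval round. `bm25_search` and `llm` are placeholder callables, and the prompt is an assumption for illustration.

```python
from typing import Callable, List

def lamer_expand(
    query: str,
    bm25_search: Callable[[str, int], List[str]],  # placeholder first-pass retriever
    llm: Callable[[str], str],                     # placeholder generator
    k: int = 5,
) -> str:
    """LameR-style expansion: prompt the LLM with the query plus potential
    answers drawn from a standard retrieval pass over the target corpus."""
    # 1. First-pass retrieval supplies external evidence (possibly noisy).
    candidates = bm25_search(query, k)
    # 2. Prompt the LLM with query + candidates to write a likely answer.
    prompt = (
        f"Question: {query}\n"
        "Possibly relevant passages:\n"
        + "\n".join(f"- {c}" for c in candidates)
        + "\nWrite a passage that answers the question."
    )
    pseudo_answer = llm(prompt)
    # 3. Concatenate query and generated answer as the expanded query
    #    for the final retrieval round.
    return query + " " + pseudo_answer
```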

3. Problem Decomposition

DSP: Passes natural language text between LLMs and retrieval models through complex pipelines, breaking down problems into smaller transformations for easier handling.

LEAST-TO-MOST: Uses few-shot prompting to decompose a complex problem into a series of simpler sub-problems, then solves them sequentially (first sketch after this list).

PLAN-AND-SOLVE: Creates a plan that divides the entire task into smaller sub-tasks, then executes them according to the plan.

SELF-ASK: Introduces the concept of the compositionality gap, emphasizing the difficulty models have in composing answers to sub-queries into a final answer.

EAR: Applies query expansion models to generate diverse queries, using query re-rankers to select queries that may yield better retrieval results.

COK: Prepares preliminary rationales and answers, identifies the relevant knowledge domains, and, when the answers do not reach a consensus, progressively corrects the reasoning with retrieved knowledge.

ICAT: Induces LLMs to decompose complex queries or generate step-by-step reasoning from relevant task data sources, without fine-tuning or manual labeling.

REACT: Prompts LLMs to interleave the generation of verbal reasoning traces and actions, enabling dynamic reasoning and interaction with external environments (second sketch after this list).

AUTOPRM: Breaks down complex problems into manageable sub-queries and then applies reinforcement learning to iteratively improve the sub-query solvers.

RA-ISF: Iteratively processes sub-queries, combining text relevance and self-knowledge answering capabilities to mitigate the impact of irrelevant prompts.

LPKG: Extracts instances based on predefined patterns in open-domain knowledge graphs, transforming complex queries and sub-queries into natural language, enhancing query planning capabilities.

ALTER: Generates multiple sub-queries using question enhancers, examining the original question from different angles to handle complex table reasoning tasks.

IM-RAG: Introduces a refiner to improve retrieval outputs, bridging the gap between the reasoning and information retrieval modules and facilitating multi-round communication.

REAPER: Uses a single smaller LLM to generate plans containing tool calls, call sequences, and parameters for efficient retrieval of complex queries.

HIRAG: Decomposes the original query into multi-hop queries, answering sub-queries based on retrieval knowledge, and then integrates answers using chain-of-thought methods.

MQA-KEAL: Stores knowledge edits as structured knowledge units in external memory, iteratively querying external memory and/or target LLMs after decomposing multi-hop queries to generate final responses.

RICHRAG: Introduces a sub-aspect explorer to analyze input queries, combining multi-faceted retrieval to fetch relevant documents for answering queries.

CONTREGEN: Proposes a context-driven tree-based retrieval method to enhance the depth and relevance of retrieved content.

PLAN×RAG: Constructs reasoning plans represented as directed acyclic graphs, decomposing the main query into relevant atomic sub-queries for information sharing.

RAG-STAR: Integrates retrieval information to guide a tree-based cautious reasoning process, utilizing Monte Carlo tree search to plan intermediate sub-queries and generate answers.
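
To ground the decomposition pattern, here is a minimal least-to-most style loop: the model first lists simpler sub-questions, then answers them in order, feeding earlier answers into later ones. The prompt wording and the `llm` callable are assumptions for illustration, not the paper's exact prompts.

```python
from typing import Callable, List

def least_to_most(question: str, llm: Callable[[str], str]) -> str:
    """Least-to-most sketch: decompose, then solve sub-questions in order."""
    # Stage 1: decomposition. One sub-question per line is assumed here.
    plan = llm(
        "Break this question into simpler sub-questions, one per line, "
        f"ordered from easiest to hardest:\n{question}"
    )
    sub_questions: List[str] = [s.strip() for s in plan.splitlines() if s.strip()]

    # Stage 2: sequential solving; each answer is appended as context
    # so later sub-questions can build on earlier ones.
    context = ""
    answer = ""
    for sq in sub_questions:
        answer = llm(f"{context}Q: {sq}\nA:")
        context += f"Q: {sq}\nA: {answer}\n"

    # The answer to the final (hardest) sub-question answers the original.
    return answer
```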
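
REACT's interleaving of reasoning and actions can likewise be sketched as a simple loop. The Thought/Action/Observation structure follows the paper's convention, but the `llm` and `search` callables and the bracket parsing are simplified assumptions.

```python
import re
from typing import Callable

def react_loop(
    question: str,
    llm: Callable[[str], str],     # placeholder: continues the trace by one step
    search: Callable[[str], str],  # placeholder: external retrieval tool
    max_steps: int = 6,
) -> str:
    """ReAct sketch: alternate Thought/Action steps, executing Search
    actions against an external tool and appending Observations."""
    trace = f"Question: {question}\n"
    for _ in range(max_steps):
        step = llm(trace + "Thought:")
        trace += "Thought:" + step + "\n"
        # Finish[...] terminates the loop with the final answer.
        done = re.search(r"Finish\[(.*?)\]", step)
        if done:
            return done.group(1)
        # Search[...] triggers the retrieval tool; its result is fed
        # back into the trace as an Observation for the next step.
        act = re.search(r"Search\[(.*?)\]", step)
        if act:
            trace += f"Observation: {search(act.group(1))}\n"
    return trace  # fall back to the raw trace if no Finish action appears
```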

4. Query Disambiguation

ECHOPROMPT: Introduces a query rephrasing sub-task that encourages the model to rephrase the query in its own words before reasoning.

TOC: Recursively constructs a disambiguation tree for ambiguous queries using few-shot prompting and external knowledge, generating comprehensive answers.

INFOCQR: Adopts a "rewrite-then-edit" framework, first rewriting the original query and then editing the rewrite to eliminate ambiguity (see the sketch after this list).

ADAQR: Proposes a preference optimization method that trains the rewriter based on retriever preferences to optimize query rewriting.

MAFERW: Integrates multi-faceted feedback from retrieval documents and generated responses as rewards to explore optimal query rewriting strategies.

CHIQ: Utilizes the NLP capabilities of LLMs (e.g., resolving co-reference relations and expanding context) to reduce ambiguity in dialogue history, improving the relevance of search queries.
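
A minimal sketch of the rewrite-then-edit disambiguation pattern (as in INFOCQR): one LLM call rewrites the conversational query into a self-contained question, and a second call edits the rewrite to drop anything the dialogue does not support. The prompts and the `llm` callable are illustrative assumptions.

```python
from typing import Callable, List

def rewrite_then_edit(
    history: List[str],
    query: str,
    llm: Callable[[str], str],  # placeholder generator
) -> str:
    """Two-stage conversational query disambiguation."""
    dialogue = "\n".join(history)
    # Stage 1: rewrite the query so it is understandable without the
    # dialogue (resolve pronouns, ellipses, and other references).
    rewrite = llm(
        f"Dialogue:\n{dialogue}\n"
        f"Rewrite the final question so it stands alone: {query}"
    )
    # Stage 2: edit the rewrite, removing content that is hallucinated
    # or redundant with respect to the dialogue.
    edited = llm(
        f"Dialogue:\n{dialogue}\n"
        f"Original question: {query}\nRewrite: {rewrite}\n"
        "Edit the rewrite: keep it faithful to the dialogue and drop any "
        "added or redundant information."
    )
    return edited
```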

5. Query Abstraction

STEP-BACK: Uses carefully designed prompts to abstract the initial query into a higher-level "step-back" question, guiding the LLM's reasoning so that its outputs align more closely with the original query intent (see the sketch after this list).

COA: Abstracts general chain-of-thought reasoning into reasoning chains with abstract variables, enabling LLMs to utilize domain-specific tools for solving.

AOT: Constructs the entire reasoning process using an abstract framework, integrating different levels of abstraction, clarifying the objectives and functions of each step.

Baek et al.: Generates higher-level abstract information as context for existing queries, enriching direct information about the query subject.

MA-RIR: Defines aspects of queries to facilitate more focused and effective reasoning on different aspects of complex queries.

META-REASONING: Deconstructs the entity and operation semantics in queries into general symbolic representations, learning universal reasoning patterns.

RULERAG: Recalls documents supporting queries based on logical rules, generating final responses.

SIMGRAG: Resolves the alignment issue between query text and knowledge graph structures through a two-stage process, first converting the query into graph patterns, then quantifying the alignment degree.
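
Finally, the abstraction idea can be grounded with a step-back prompting sketch: derive a more generic step-back question, retrieve evidence for both the abstract and the original question, and answer from the combined context. The prompts and the `llm` and `retrieve` callables are assumptions for illustration.

```python
from typing import Callable, List

def step_back_answer(
    query: str,
    llm: Callable[[str], str],             # placeholder generator
    retrieve: Callable[[str], List[str]],  # placeholder retriever
) -> str:
    """Step-back sketch: abstract first, then retrieve at both levels."""
    # 1. Ask for a higher-level question whose answer supplies the
    #    general principles behind the original query.
    step_back_q = llm(
        "Write a more general question whose answer provides the background "
        f"needed to answer: {query}"
    )
    # 2. Retrieve evidence for both the abstract and the concrete question.
    evidence = retrieve(step_back_q) + retrieve(query)
    # 3. Answer the original query grounded in the combined evidence.
    return llm(
        "Context:\n" + "\n".join(evidence) + f"\n\nQuestion: {query}\nAnswer:"
    )
```

Retrieving at both levels is the point of the abstraction: documents matching the step-back question carry the background principles that a retrieval on the narrow query alone would miss.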
