To implement RAG effectively, many components need refinement, and today we will look at one of them: the Reject Module.
Official Explanation
In the RAG (Retrieval-Augmented Generation) model, the Reject Module is an important component designed to enhance the robustness of the generation model when facing irrelevant queries or information.
Plain Explanation
A simple case: if you build a customer-service AI and it receives provocative questions or gender-discrimination remarks, it should reject them.
Main Functions and Features
- Filtering Irrelevant Information: One of the primary functions of the Reject Module is to determine whether the retrieved information is relevant. When a query is received, the model first retrieves external documents. If the retrieved information is irrelevant or inappropriate, the Reject Module excludes it, ensuring that only relevant information feeds the subsequent generation steps (see the sketch after this list).
- Improving Generation Quality: By excluding irrelevant or low-quality information, the Reject Module helps the generation model focus on precise, relevant knowledge, improving the accuracy and usefulness of the generated content.
- Reducing Generation Errors: The Reject Module reduces hallucination, i.e., the model producing output based on inaccurate or irrelevant information, by filtering that information out before generation.
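As a minimal sketch of the filtering idea, the snippet below scores each retrieved document against the query and drops anything under a hand-picked threshold. The lexical-overlap score and the 0.2 threshold are placeholders I chose for illustration; a real system would use embedding cosine similarity or a cross-encoder reranker instead.

```python
def relevance_score(query: str, doc: str) -> float:
    """Toy lexical-overlap (Jaccard) score; a real system would use
    embedding cosine similarity or a cross-encoder reranker."""
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q | d) if q | d else 0.0

def reject_filter(query: str, docs: list[str], threshold: float = 0.2) -> list[str]:
    """Keep only retrieved documents scoring above the threshold.
    If nothing survives, the caller should refuse to answer rather
    than let the generator hallucinate from irrelevant context."""
    return [d for d in docs if relevance_score(query, d) >= threshold]

docs = [
    "Tieguanyin is an oolong tea from Anxi, Fujian.",
    "Our return policy allows refunds within 7 days.",
    "The weather in Paris is mild in spring.",
]
print(reject_filter("tieguanyin oolong tea price", docs))  # keeps only the first doc
```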
Simple Implementation
The following prompt lets the large model reason about the request on its own. Although it may not look like mathematical reasoning, it is still step-by-step logical reasoning, which should count as a CoT (chain-of-thought) pattern. (Still learning; if this is incorrect, please feel free to point it out in the comments.)
Assume you are the retrieval assistant for 【xx tea】, and you need to help users answer pre-sales and after-sales questions about tea. Extract the user's search query from the dialogue. If the user's question is unrelated to 【tea】, do not extract a query; if it is related to 【tea】, set search to true. If the user's message is a greeting or a tea-related inquiry, set chat to true; if it is small talk unrelated to tea, set chat to false. Judge only the user's last sentence. Here is an example:
user: Hello
assistant: Hello, how can I help you?
user: How much is Tieguanyin per pound?
Extraction result is {"query":"Tieguanyin price","search":true,"chat":true}
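A rough sketch of wiring this prompt to a model, assuming the OpenAI Python SDK purely as an example; the client setup, model name, and the reject-by-default fallback are my assumptions, not part of the original:

```python
import json
from openai import OpenAI  # any chat-completion client works; this is just an example

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# The extraction prompt shown above goes here, verbatim.
SYSTEM_PROMPT = "Assume you are the retrieval assistant for 【xx tea】..."

def extract(dialogue: list[dict]) -> dict:
    """Run the extraction prompt over the dialogue and parse the JSON verdict."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name; use whatever you have
        messages=[{"role": "system", "content": SYSTEM_PROMPT}] + dialogue,
    )
    try:
        return json.loads(resp.choices[0].message.content)
    except (json.JSONDecodeError, TypeError):
        # The model drifted off-format; default to rejecting, the safe choice.
        return {"query": "", "search": False, "chat": False}

dialogue = [
    {"role": "user", "content": "Hello"},
    {"role": "assistant", "content": "Hello, how can I help you?"},
    {"role": "user", "content": "How much is Tieguanyin per pound?"},
]
print(extract(dialogue))  # expected: {"query": "Tieguanyin price", "search": True, "chat": True}
```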
Then, based on the extraction result, use function calling to run the search and make the reject-or-answer decision.
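The sketch below skips the actual tool-schema plumbing and shows only the dispatch logic; `search_kb` and `generate_answer` are illustrative stubs I made up, standing in for real retrieval and generation calls:

```python
def search_kb(query: str) -> list[str]:
    # Stub standing in for real retrieval (vector or keyword search over tea docs).
    kb = ["Tieguanyin is a premium oolong tea; pricing varies by grade."]
    return [doc for doc in kb if any(w in doc.lower() for w in query.lower().split())]

def generate_answer(query: str, docs: list[str]) -> str:
    # Stub; in a real system this is the RAG generation call with docs as context.
    return f"Based on our records: {docs[0]}"

def route(verdict: dict) -> str:
    """Act on the extraction result: search, small talk, or outright reject."""
    if verdict.get("search"):
        docs = search_kb(verdict["query"])
        if not docs:
            # Retrieval found nothing relevant: reject rather than hallucinate.
            return "Sorry, I couldn't find that in our tea catalogue."
        return generate_answer(verdict["query"], docs)
    if verdict.get("chat"):
        return "Hello! Feel free to ask me anything about our teas."
    # Neither tea-related nor a harmless greeting: reject outright.
    return "Sorry, I can only answer questions about our tea products."

print(route({"query": "Tieguanyin price", "search": True, "chat": True}))
print(route({"query": "", "search": False, "chat": False}))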
For a deeper implementation, you would fine-tune a model, for example by generating negative examples (queries that should be rejected) with a large model and fine-tuning on them, possibly pairing this with a smaller model so the reject judgment runs faster. It sounds simple, but I haven't tried it in practice yet.
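As for the smaller, faster judgment model: one cheap baseline (my own illustration, not the author's method) is a classical classifier trained on labeled answerable-vs-reject queries. The toy training set below stands in for the negative examples a large model would generate at scale:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy labeled data: 1 = answer, 0 = reject. In practice, generate a much
# larger negative set with a large model and fine-tune a small model on it.
texts = [
    "How much is Tieguanyin per pound?", "Do you ship green tea overseas?",
    "What's the shelf life of black tea?", "Can I return the tea I bought?",
    "Tell me a dirty joke.", "Women can't do math, right?",
    "Write my homework essay for me.", "What's the weather tomorrow?",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(texts, labels)

print(clf.predict(["Is oolong tea good for health?"]))  # likely [1]: answer
print(clf.predict(["Insult my coworker for me."]))      # likely [0]: reject
```

A classifier this small runs in microseconds, so it can sit in front of the large model as a first-pass gate.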