Mastering RAG Series 3: Routing

LLMs (Large Language Models) are a powerful new platform, but they are not always trained on data relevant to our tasks or on the latest data.

RAG (Retrieval Augmented Generation) is a method that connects LLMs with external data sources, such as private data or the latest data. It allows LLMs to use that external data when generating their outputs.

To truly master RAG, we need to learn the techniques shown in the figure below:

[Figure: overview of the RAG techniques covered in this series]

We learned about Query Translation in “Mastering RAG Series 2: Query Translation”. In this article, we will move to the next node: Routing.

Routing means selecting the most suitable retrieval path or reasoning logic based on the content of the user’s query. This dynamic selection matters most when a system has multiple data sources, multiple retrievers, or multiple tasks, because sending each query to the right one improves both efficiency and accuracy.

Routing types implemented in RAG:

  • Logical Routing

  • Semantic Routing

Logical Routing

  • Directs queries based on explicit rules or a classification model. For example, sending each query to the appropriate database, knowledge base, or API.

  • Suited to structured problems or scenarios where the candidate data sources are explicitly known.
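As a point of comparison for the "explicit rules" case, here is a minimal sketch of a purely rule-based router. It is not the approach used in the rest of this article (which relies on an LLM classifier); the keyword sets and the function name route_by_rules are hypothetical and only serve to illustrate the idea.

```python
import re
from typing import Optional

# Hypothetical keyword rules, one set per data source (illustration only).
RULES = {
    "python_docs": {"python", "pip", "pandas"},
    "js_docs": {"javascript", "js", "npm"},
    "golang_docs": {"golang", "goroutine", "goroutines"},
}


def route_by_rules(question: str) -> Optional[str]:
    """Return the data source whose keywords match the question best, or None."""
    # Lowercase the question and split it into alphanumeric tokens.
    tokens = set(re.findall(r"[a-z0-9]+", question.lower()))
    # Score each data source by the number of keywords found in the question.
    scores = {source: len(tokens & keywords) for source, keywords in RULES.items()}
    best = max(scores, key=scores.get)
    # No keyword matched at all: report that no rule applies.
    return best if scores[best] > 0 else None


if __name__ == "__main__":
    print(route_by_rules("How do I install a package with pip in Python?"))
```

Rules like these are cheap and predictable, but they break down on questions that do not mention a known keyword, which is where a classification model becomes useful.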

First, we define three data sources: python_docs, js_docs, and golang_docs.

By using with_structured_output, you can constrain the model’s output to a given schema, so the result comes back as structured data (such as a JSON object or a dictionary) rather than free text.
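To make the idea of structured output concrete without calling a model, the sketch below validates a raw JSON string against a small schema using only the standard library. The names RouteQuery and parse_route are assumptions for illustration; in the notebook the schema is handed to with_structured_output, which takes care of this step.

```python
import json
from dataclasses import dataclass

# The three data sources named in this article.
DATASOURCES = ("python_docs", "js_docs", "golang_docs")


@dataclass(frozen=True)
class RouteQuery:
    """Hypothetical schema: the routing decision for one user question."""

    datasource: str

    def __post_init__(self):
        # Reject anything that is not one of the known data sources.
        if self.datasource not in DATASOURCES:
            raise ValueError(f"unknown datasource: {self.datasource!r}")


def parse_route(raw: str) -> RouteQuery:
    """Turn a raw JSON string (standing in for model output) into a RouteQuery."""
    data = json.loads(raw)
    return RouteQuery(datasource=data["datasource"])


if __name__ == "__main__":
    print(parse_route('{"datasource": "python_docs"}'))
```

The benefit of a typed result is that downstream code can branch on result.datasource directly instead of parsing free text.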

Then we define a prompt that selects the data source based on the programming language in the user’s question.

Finally, we combine the prompt and the structured-output model into a router chain.

The router can automatically select the data source based on the programming language in the user’s question.

We can write a simple routing function to select the data source:
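A minimal sketch of such a routing function is shown here, assuming the router returns an object with a datasource attribute. The handler functions and the HANDLERS table are hypothetical placeholders for the per-source retrieval chains; the notebook's own version may differ.

```python
from types import SimpleNamespace
from typing import Callable, Dict


# Hypothetical placeholder handlers; real ones would query each data source.
def search_python_docs(question: str) -> str:
    return f"[python_docs] {question}"


def search_js_docs(question: str) -> str:
    return f"[js_docs] {question}"


def search_golang_docs(question: str) -> str:
    return f"[golang_docs] {question}"


HANDLERS: Dict[str, Callable[[str], str]] = {
    "python_docs": search_python_docs,
    "js_docs": search_js_docs,
    "golang_docs": search_golang_docs,
}


def choose_route(result) -> Callable[[str], str]:
    """Map the router's decision to the handler for that data source."""
    key = result.datasource.lower()
    if key not in HANDLERS:
        raise ValueError(f"no handler for datasource: {key!r}")
    return HANDLERS[key]


if __name__ == "__main__":
    # SimpleNamespace stands in for the structured result returned by the router.
    decision = SimpleNamespace(datasource="golang_docs")
    print(choose_route(decision)("How do channels work?"))
```

Using a lookup table instead of an if/elif ladder means a new data source only needs one new entry.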

Semantic Routing

  • Dynamically selects the best path based on the semantic similarity of embedding vectors. For example, matching a query to the most suitable Prompt or retriever via embeddings.

  • Applicable to vague queries or scenarios that require flexible responses.

For example, we can write two Prompts for answering questions in different fields, one for Physics and one for Mathematics.

Then our Prompt routing function, prompt_router, selects the appropriate Prompt by comparing the embedding of the user’s question with the embedding of each Prompt and picking the most similar one (here using cosine similarity).
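The selection logic can be sketched with the standard library alone. Note the loud assumption: the embed function below is a toy bag-of-words counter standing in for a real embedding model, and the two template strings are hypothetical, so this shows only the shape of prompt_router, not the notebook's actual code.

```python
import math
import re
from collections import Counter

# Hypothetical prompt templates (illustration only).
PHYSICS_TEMPLATE = (
    "You are a physics professor. You explain concepts such as "
    "force, energy, gravity, and black holes."
)
MATH_TEMPLATE = (
    "You are a mathematician. You solve problems in algebra, "
    "calculus, geometry, and equations."
)
TEMPLATES = [PHYSICS_TEMPLATE, MATH_TEMPLATE]


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def prompt_router(question: str) -> str:
    """Return the template most similar to the question."""
    query_vector = embed(question)
    similarities = [cosine_similarity(query_vector, embed(t)) for t in TEMPLATES]
    return TEMPLATES[similarities.index(max(similarities))]


if __name__ == "__main__":
    print(prompt_router("How does gravity affect energy?"))
```

With a real embedding model, the router would also match questions that share no literal words with a template, which is the main advantage of semantic routing over keyword rules.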

GitHub Link:

https://github.com/realyinchen/RAG/blob/main/03_Routing.ipynb
