Practical LLM RAG: Key Steps to Unlock Custom LlamaIndex

Introduction to LlamaIndex: LlamaIndex is a Python library created by Jerry Liu that enables efficient text search and summarization over large document collections using language models. With LlamaIndex, developers can quickly add private or custom data to enhance existing LLMs, producing personalized, data-driven responses without retraining large models. Due to the limited … Read more
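
The pattern the article walks through boils down to "load documents, build an index, query it". A minimal sketch, assuming a recent llama_index release, an LLM configured in the environment (for example an OpenAI key), and a placeholder data/ folder of private documents:

```python
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

# Load your private/custom documents and index them.
documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)

# Query the index; retrieved chunks ground the LLM's answer.
query_engine = index.as_query_engine()
print(query_engine.query("What does this document collection cover?"))
```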

LlamaIndex: A New Document Summary Index for QA Systems

In this blog post, we introduce a brand-new data structure in LlamaIndex: the Document Summary Index. We describe how it provides better retrieval performance than traditional semantic search, along with an example (https://github.com/jerryjliu/llama_index). Background: One of the core scenarios for large language models (LLMs) is question answering over user data. To … Read more
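
A minimal sketch of the idea, assuming llama_index >= 0.10 imports, an LLM configured in the environment, and an already loaded `documents` list (none of which are spelled out in the excerpt):

```python
from llama_index.core import DocumentSummaryIndex, get_response_synthesizer

# Summaries are generated per document and indexed; retrieval matches the
# query against the summaries before drilling into the underlying chunks.
response_synthesizer = get_response_synthesizer(response_mode="tree_summarize")
doc_summary_index = DocumentSummaryIndex.from_documents(
    documents,
    response_synthesizer=response_synthesizer,
)

query_engine = doc_summary_index.as_query_engine(response_mode="tree_summarize")
response = query_engine.query("Which documents discuss deployment costs?")
```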

Advanced RAG – Composable Retrieval with LlamaIndex

LlamaIndex is a simple and flexible data framework that connects custom data sources with large language models and provides comprehensive support for RAG. Advanced RAG (Retrieval-Augmented Generation) techniques can be modeled with a composable hierarchical abstraction, in which retrieved text can be linked to the following elements: Retriever, Text, Pipeline, Query Engine. The retrieval of RAG … Read more
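
As a rough illustration of this composable abstraction, a sketch in which IndexNode objects point at other query engines through their `obj` field; `vector_query_engine` and `summary_query_engine` are placeholders for engines you have already built, and the exact API may differ between llama_index versions:

```python
from llama_index.core import SummaryIndex
from llama_index.core.schema import IndexNode

# Wrap existing engines as retrievable objects.
vector_obj = IndexNode(
    index_id="vector", obj=vector_query_engine, text="Chunk-level vector search"
)
summary_obj = IndexNode(
    index_id="summary", obj=summary_query_engine, text="Whole-document summaries"
)

# The top-level index retrieves IndexNodes; retrieval then recurses into the
# linked object (a retriever, query engine, or query pipeline).
top_index = SummaryIndex(objects=[vector_obj, summary_obj])
query_engine = top_index.as_query_engine(response_mode="tree_summarize")
```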

LlamaIndex Practical Application – ChatEngine ReAct Agent Mode

Overview: ReAct is an agent-based chat mode built on top of a data query engine. For each chat interaction, the agent enters a ReAct loop: first, decide whether to use the query engine tool and propose appropriate input; (optionally) use the query engine tool and observe its output; then decide whether to repeat or give a … Read more
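
The loop described above is exposed as a chat mode on any index. A minimal sketch, assuming an existing `index` built from your data and an LLM configured in the environment:

```python
# Each .chat() call runs the ReAct loop: decide whether to call the
# query-engine tool, observe its output, then repeat or answer directly.
chat_engine = index.as_chat_engine(chat_mode="react", verbose=True)
response = chat_engine.chat("Summarize what the indexed documents say about pricing.")
print(response)
```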

RestAI: An Open Source AIaaS Platform Using LlamaIndex, Ollama, and HF Pipelines

Project Introduction. Projects: there are various types of agents (projects), each with its own functionality (rag, ragsql, inference, vision). Users: users represent the system's users and are used for authentication and authorization (basic authentication); each user can access multiple projects. LLMs: any public or local LLM supported by LlamaIndex or Ollama. VRAM: automatic VRAM … Read more

Implementing Agent Applications with LlamaIndex’s Query Pipeline

In the previous article, "The Future of Application Orchestration is Pipeline, LlamaIndex Releases Query Pipeline in Preview to Enhance Application Development Flexibility", we mentioned that LlamaIndex released a new experimental feature that supports defining a Query Pipeline declaratively to create personalized application workflows, along with a case study for RAG applications. … Read more
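
A minimal sketch of the declarative style, assuming llama_index >= 0.10 (where the query_pipeline module lives) and a configured `llm` object; the prompt and topic are placeholders:

```python
from llama_index.core import PromptTemplate
from llama_index.core.query_pipeline import QueryPipeline

# Chain a prompt template into an LLM; QueryPipeline wires the modules
# together and runs them in order.
prompt_tmpl = PromptTemplate("Write a one-paragraph overview of {topic}.")
pipeline = QueryPipeline(chain=[prompt_tmpl, llm], verbose=True)

output = pipeline.run(topic="Retrieval-Augmented Generation")
print(output)
```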

Using LlamaIndex to Create Custom Agent Functions

Overview: This article shows how to use LlamaIndex to write your own Agent handling functions. Note that it uses a locally deployed LLM served by Ollama rather than remote calls to the OpenAI API. The goal of this article is to have the agent save its output content to a PDF file and then stop … Read more
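
A minimal sketch of the setup, assuming the llama-index-llms-ollama integration is installed and an Ollama server is running locally; the model name and the `save_text` tool are placeholders (the article itself writes a PDF rather than a plain text file):

```python
from llama_index.core.agent import ReActAgent
from llama_index.core.tools import FunctionTool
from llama_index.llms.ollama import Ollama

def save_text(content: str, path: str = "output.txt") -> str:
    """Save generated content to a local file and report where it went."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(content)
    return f"Saved {len(content)} characters to {path}"

# The function's signature and docstring become the tool's metadata.
llm = Ollama(model="llama3", request_timeout=120.0)
agent = ReActAgent.from_tools(
    [FunctionTool.from_defaults(fn=save_text)], llm=llm, verbose=True
)
agent.chat("Write a short note about LlamaIndex and save it to a file.")
```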

LlamaIndex Practical Guide – Overview of Query Engine Usage

Overview: The query engine is a generic interface that lets you query your data. It accepts natural-language queries and returns rich responses. It is typically (but not always) built on one or more indexes via retrievers, and you can combine multiple query engines for more advanced functionality. Note: if you want to have … Read more
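
A sketch of the two points made above, configuring a single query engine and combining several behind a router; `vector_index` and `summary_index` are assumed to exist already, and the router API may vary slightly across llama_index versions:

```python
from llama_index.core.query_engine import RouterQueryEngine
from llama_index.core.tools import QueryEngineTool

# Per-engine configuration happens in as_query_engine().
vector_qe = vector_index.as_query_engine(similarity_top_k=3)
summary_qe = summary_index.as_query_engine(response_mode="tree_summarize")

# A router picks the engine whose description best matches the query.
router = RouterQueryEngine.from_defaults(
    query_engine_tools=[
        QueryEngineTool.from_defaults(
            query_engine=vector_qe, description="Answer specific factual questions."
        ),
        QueryEngineTool.from_defaults(
            query_engine=summary_qe, description="Produce high-level summaries."
        ),
    ]
)
response = router.query("Give me an overview of the collection.")
```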

How LlamaIndex Performs Retrieval Augmented Generation (RAG)

RAG stands for Retrieval-Augmented Generation. LLMs are trained on vast amounts of data, but that training data does not include your data. RAG solves this problem by adding your data to the data the LLM already has access to. In RAG, your data is … Read more
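
To make the retrieval step concrete, a small sketch that pulls the chunks a query engine would feed to the LLM; it assumes an existing `index` over your data, and the query string is a placeholder:

```python
# Retrieval: find the chunks of your data most relevant to the question.
retriever = index.as_retriever(similarity_top_k=2)
nodes = retriever.retrieve("What did we ship last quarter?")

# Each retrieved node carries source text plus a similarity score; a query
# engine would insert this text into the prompt before generation.
for node in nodes:
    print(node.score, node.node.get_content()[:80])
```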

LlamaIndex Practical Implementation: Agent Database Interaction

Overview: This article implements a simple intelligent agent that first queries data from a database and then processes it with utility functions, a very common scenario that extends to many practical situations. As in earlier articles, all experiments are run on a local 16-core/32 GB Linux machine (CPU only). Data Preparation … Read more
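
A sketch of the "query the database, then post-process with a utility function" pattern, assuming a local SQLite file, the llama_index SQL query engine, and an LLM configured via Settings.llm; the table, column, and tool names are placeholders:

```python
from sqlalchemy import create_engine
from llama_index.core import SQLDatabase
from llama_index.core.agent import ReActAgent
from llama_index.core.query_engine import NLSQLTableQueryEngine
from llama_index.core.tools import FunctionTool, QueryEngineTool

# Tool 1: translate natural-language questions into SQL over the orders table.
engine = create_engine("sqlite:///sales.db")
sql_database = SQLDatabase(engine, include_tables=["orders"])
sql_tool = QueryEngineTool.from_defaults(
    query_engine=NLSQLTableQueryEngine(sql_database=sql_database, tables=["orders"]),
    description="Answer questions by querying the orders table with SQL.",
)

# Tool 2: a plain utility function the agent can call on the query result.
def to_percent(value: float) -> str:
    """Format a ratio between 0 and 1 as a percentage string."""
    return f"{value * 100:.1f}%"

# Uses the LLM configured on Settings.llm (or pass llm=... explicitly).
agent = ReActAgent.from_tools(
    [sql_tool, FunctionTool.from_defaults(fn=to_percent)], verbose=True
)
agent.chat("What share of orders were refunded? Answer as a percentage.")
```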