LlamaIndex: A New Document Summary Index for QA Systems

In this blog post, we introduce a brand new data structure in LlamaIndex: the Document Summary Index. We describe how it can provide better retrieval performance than traditional semantic search, and walk through an example. https://github.com/jerryjliu/llama_index Background: One of the core use cases for large language models (LLMs) is question answering over user data. To …

Advanced RAG – Composable Retrieval with LlamaIndex

LlamaIndex is a simple and flexible data framework that connects custom data sources with large language models, and it provides comprehensive support for Retrieval-Augmented Generation (RAG). Advanced RAG techniques can be modeled using a composable hierarchical abstraction. The retrieved text can be linked to the following elements: Retriever, Text Pipeline, Query Engine. The retrieval of RAG …

Practical Implementation of Context Mode in ChatEngine

Overview: The ContextChatEngine class is a context-aware chat engine. It aims to provide a smooth chat experience by retrieving contextual information for the chat and using a large language model (LLM) to generate responses based on the system prompt. It is a simple chat mode built on top of a data retriever. For each chat interaction: First, retrieve …

LlamaIndex Practical Application – ChatEngine ReAct Agent Mode

Overview: ReAct is an agent-based chat mode built on top of a data query engine. For each chat interaction, the agent enters a ReAct loop: (1) decide whether to use the query engine tool and propose appropriate input; (2) (optional) use the query engine tool and observe its output; (3) decide whether to repeat or give a …

RAG: From Theory to LlamaIndex Practice (Detailed Version)

Abstract: Large language models (LLMs) have demonstrated impressive capabilities. However, this does not mean they are error-free; anyone who has experienced ChatGPT’s “hallucinations” can attest to that. Retrieval-Augmented Generation (RAG) is a framework designed to make LLMs more reliable by retrieving relevant, up-to-date data directly related to the user's query. In this article, I analyze …

LlamaIndex Practical Guide – Overview of Query Engine Usage

Overview: The Query Engine is a generic interface that allows you to query data. It accepts natural language queries and returns rich responses. It is typically (but not always) built on one or more indexes through a retriever. You can combine multiple query engines to achieve more advanced functionality. Note: If you want to have …

Using LlamaIndex to Create Custom Agent Functions

Overview: This article shows how to use LlamaIndex to write your own agent handling functions. Note that it uses a locally deployed LLM served through Ollama, rather than remote calls to the OpenAI API. The goal is to save the output content to a PDF file and then stop …

Implementing Agent Applications with LlamaIndex’s Query Pipeline

In the previous article, “The Future of Application Orchestration is Pipeline, LlamaIndex Releases Query Pipeline in Preview to Enhance Application Development Flexibility”, we mentioned that LlamaIndex has released a new experimental feature that lets you define a Query Pipeline declaratively to build customized application workflows, and we presented a case study for RAG applications. …

RestAI: An Open Source AIaaS Platform Using LlamaIndex, Ollama, and HF Pipelines

Project Introduction. Projects: There are several types of agents (projects), each with its own functionality (rag, ragsql, inference, vision). Users: A user represents a user of the system and is used for authentication and authorization (basic authentication). Each user can access multiple projects. LLMs: Supports any public or local LLM supported by LlamaIndex or Ollama. VRAM: Automatic VRAM …

LlamaIndex Practical Implementation: Agent Database Interaction

Overview: This article implements a simple agent that first queries data from a database and then processes that data with utility functions. This is a very common scenario that extends to many practical situations. As in earlier articles, all experiments are run on a local Linux machine with 16 cores and 32 GB of RAM (16C32G, CPU only). Data Preparation …