What is the hottest topic in the era of large models?
In addition to ChatGPT, tools like LangChain and LlamaIndex, designed for building large model applications, have also been gaining significant attention. To help everyone get started easily, we launched the 【Decoding LangChain】 tutorial series, and now we present the 【Unveiling LlamaIndex】 series, which you can navigate through as needed.
Returning to LlamaIndex, with the advent of the AGI era, more and more developers are beginning to think about how to effectively utilize large models. However, when building LLM applications, developers generally face three major challenges:
- The high cost of using LLMs
- LLMs cannot provide the latest information in a timely manner
- LLMs lack knowledge in specific professional fields
To address these issues, the mainstream approach in the industry is to adopt two main frameworks: fine-tuning and caching + injection.
Fine-tuning mainly addresses the latter two challenges (the lack of up-to-date and domain-specific information), while caching + injection targets the high cost of usage. The caching + injection approach is also referred to as the CVP architecture (i.e., ChatGPT + Vector Database + Prompt-as-Code).
In this context, LlamaIndex was born. As a new tool specifically designed for building LLM applications, it abstracts the content from the aforementioned frameworks for users.
This article is part of the 【Unveiling LlamaIndex Series】. Previously, we invited the co-founder of LlamaIndex to explain how to enhance LLM capabilities using private data and provided a detailed introduction to various indexes in LlamaIndex, as well as a brief tutorial on querying LlamaIndex vector storage indexes. In this article, we will focus on how to create and store vector indexes in LlamaIndex, along with two methods for persistently storing vector indexes.
01.
Introduction to LlamaIndex
LlamaIndex can be seen as a tool for managing interactions between user data and LLMs. It receives input data and constructs indexes for it, subsequently using those indexes to answer questions related to the input data. LlamaIndex can build many types of indexes based on the task at hand, such as: vector indexes, tree indexes, list indexes, or keyword indexes.
Each index has its advantages and applicable scenarios. For example, list indexes are suitable for scenarios that require processing a large number of documents; vector indexes are suitable for semantic search systems; tree indexes are suitable for handling sparse information; and keyword indexes are suitable for finding specific keywords. (For a detailed introduction, please refer to Cracking the Black Box to Enhance LLM Performance – LlamaIndex)
When using LlamaIndex, we can store and load the above indexes for session management. Generally, the index context can be stored locally. If you want to use a persistent storage engine to store the index for use in subsequent application building processes, please refer to the tutorial below.
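For example, persisting an index locally and reloading it later might look like the following minimal sketch. The API names follow the llama_index 0.6.x releases, and `index` is assumed to be an index you have already built:

```python
# A minimal sketch of local persistence, assuming `index` was built with
# llama_index (module paths follow the 0.6.x releases).
from llama_index import StorageContext, load_index_from_storage

# Save the index's storage context to a local directory
index.storage_context.persist(persist_dir="./storage")

# Later: rebuild the storage context and reload the index from disk
storage_context = StorageContext.from_defaults(persist_dir="./storage")
index = load_index_from_storage(storage_context)
```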
02.
Creating and Saving LlamaIndex Vector Indexes
The following tutorial uses data from the examples folder of the LlamaIndex repository (https://github.com/jerryjliu/llama_index/tree/main/examples/paul_graham_essay). Please clone the repository locally and create a notebook in the paul_graham_essay folder, or download the data from that folder for local use.
- Using a Local Vector Database
In this tutorial, we will use the Milvus Lite version of the open-source vector database Milvus. With the Milvus Lite version, you can run the code directly in the notebook without any additional work.
1. Install the required software and set up the environment. Note that you need an OpenAI API key to use the GPT models. If you store the OpenAI API key in a .env file, make sure to also install the python-dotenv library.
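As a sketch of the setup, the package list below is our assumption based on the imports used later in this tutorial (llama-index, the milvus package that ships Milvus Lite, python-dotenv, and openai):

```shell
pip install llama-index milvus python-dotenv openai
```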
2. Import the required modules (see the sketch after this list):
- Import GPTVectorStoreIndex, StorageContext, and MilvusVectorStore from llama_index
- Import default_server from milvus
- Import os and load_dotenv to load the API key
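A sketch of these imports is shown below. Note that the module paths follow the llama_index 0.6.x releases, where MilvusVectorStore lives in the llama_index.vector_stores submodule:

```python
import os
from dotenv import load_dotenv

from llama_index import GPTVectorStoreIndex, StorageContext
from llama_index.vector_stores import MilvusVectorStore
from milvus import default_server

# Load OPENAI_API_KEY from the local .env file into the environment;
# llama_index reads it from there automatically.
load_dotenv()
assert os.getenv("OPENAI_API_KEY"), "Set OPENAI_API_KEY in your .env file"
```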
3. Start the vector database (see the sketch after this list):
- Call start() on default_server to start the local Milvus Lite instance.
- Use MilvusVectorStore to connect to the vector store, passing in the host and port parameters.
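Putting both steps together might look like this. The host value is the local default, and the exact MilvusVectorStore constructor arguments depend on your llama_index version:

```python
# Start the embedded Milvus Lite server
default_server.start()

# Connect the LlamaIndex vector store to the running instance
vector_store = MilvusVectorStore(
    host="127.0.0.1",
    port=default_server.listen_port,  # port assigned by Milvus Lite
)
```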
4. Configure the storage context so that LlamaIndex knows where to store the index, then use GPTVectorStoreIndex to create the index, passing in the documents and the storage context. We can then query the index as usual.
In this example, we query with the question “What did the author do growing up?” The system embeds the question, capturing the semantics of terms like “author” and “growing up,” and retrieves the most relevant context from the vector index.
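A sketch of step 4 and the query is below. SimpleDirectoryReader and the ./data path are our assumptions for loading the paul_graham_essay documents; adjust the path to wherever you placed the data:

```python
from llama_index import SimpleDirectoryReader

# Load the example essay from the local data folder
documents = SimpleDirectoryReader("./data").load_data()

# Tell LlamaIndex to store the index in our Milvus vector store
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = GPTVectorStoreIndex.from_documents(
    documents, storage_context=storage_context
)

# Query the index as usual
query_engine = index.as_query_engine()
response = query_engine.query("What did the author do growing up?")
print(response)
```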
After querying, the response received is as follows:
“Growing up, the author wrote short stories, programmed on an IBM 1401, and nagged his father to buy him a TRS-80 microcomputer. …”
- Using a Cloud Vector Database
Note that when dealing with massive amounts of data, we recommend using a cloud vector database to store LlamaIndex vector indexes.
The following tutorial uses the Zilliz Cloud vector database, which provides fully managed Milvus services. Before using Zilliz Cloud, please register an account and create a collection.
Unlike the local Milvus Lite instance, Zilliz Cloud requires you to provide HOST, PORT, USER, and PASSWORD. You can find the host, port, username, and password in the Zilliz Cloud interface.
✅ The following is a correct code example:
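(The original listing was not preserved; the sketch below is a reconstruction assuming the pre-0.7 MilvusVectorStore signature, with HOST, PORT, USER, and PASSWORD standing in for the values from your Zilliz Cloud console.)

```python
# Reconstruction (not the original listing): connect to Zilliz Cloud with
# full credentials and a secure connection. HOST/PORT/USER/PASSWORD are
# placeholders for the values shown in the Zilliz Cloud console.
vector_store = MilvusVectorStore(
    host=HOST,
    port=PORT,
    user=USER,
    password=PASSWORD,
    use_secure=True,  # Zilliz Cloud requires a TLS connection
)
```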
❌ The following is an incorrect code example:
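(Again a reconstruction; one plausible mistake is reusing the local-Milvus call and omitting the credentials and secure flag, which fails against Zilliz Cloud.)

```python
# Reconstruction of a likely mistake: omitting user, password, and
# use_secure=True works for a local Milvus instance but fails against
# Zilliz Cloud.
vector_store = MilvusVectorStore(
    host=HOST,
    port=PORT,
)
```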
To explore Zilliz Cloud services, click 【Read Original】 to register.
This article was originally published on Toward AI, and has been reproduced with permission.
🌟The “Searching for CVP Practice Stars in the AIGC Era” thematic event is about to start!
Zilliz will collaborate with leading large-model vendors in China to select application scenarios, pairing users with vector databases and top large-model technical experts to refine applications together, improve implementation results, and empower the business itself.
If your application also fits the CVP framework and you are struggling with implementation or real-world results, apply to participate in the event to receive professional help and guidance! Contact: business@zilliz.com.

Yujian Tang, Developer Evangelist at Zilliz