Milvus Query Merge Mechanism

| Query Request Queue
The connection layer of Milvus uses gRPC to provide RPC services externally, and the oatpp framework to provide RESTful services. The maximum number of connections in the server’s gRPC connection pool is set to 20, and multiple query requests from clients are received asynchronously. However, since each query request requires a large amount of computational resources, multiple queries executing simultaneously will compete for resources. Therefore, the connection layer places query requests into a queue, allowing the background query scheduler (Query Scheduler) to take query requests from the end of the queue and execute them one by one.
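The queue-plus-scheduler arrangement described above can be sketched roughly as follows. This is only an illustration and not the Milvus implementation: `run_scheduler`, `fake_search` and `run_demo` are hypothetical names, and the sketch assumes a plain first-in-first-out `queue.Queue`, which may differ from the order the real scheduler uses.

```python
import queue
import threading


def run_scheduler(request_queue, handle, results):
    """Take requests off the queue and execute them one at a time."""
    while True:
        request = request_queue.get()
        if request is None:  # sentinel: no more requests will arrive
            break
        results.append(handle(request))


def fake_search(request):
    """Stand-in for an expensive vector search (hypothetical)."""
    return request * 2


def run_demo(num_requests=5):
    request_queue = queue.Queue()
    results = []
    scheduler = threading.Thread(
        target=run_scheduler, args=(request_queue, fake_search, results)
    )
    scheduler.start()

    # Several "connections" enqueue requests concurrently.
    producers = [
        threading.Thread(target=request_queue.put, args=(i,))
        for i in range(num_requests)
    ]
    for p in producers:
        p.start()
    for p in producers:
        p.join()

    request_queue.put(None)  # tell the scheduler to stop
    scheduler.join()
    return results
```

Because only the scheduler thread executes requests, the expensive work never runs concurrently, no matter how many connections enqueue requests at the same time.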
| Query Merging
To improve QPS (Queries Per Second), starting from version 0.8.0, Milvus attempts to merge query requests upon receiving them.
Merging improves query efficiency for the following reason: a query with a small nq (number of target vectors) achieves low CPU/GPU parallelism and leaves part of the computational resources idle; computing the target vectors of several queries together raises the utilization of those resources.
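As a rough illustration of why combining target vectors is possible at all, the sketch below runs a brute-force L2 search with NumPy, once per query and once on the stacked query matrix, and obtains the same neighbors. It is a toy stand-in for the ANNS computation, not Milvus code; `l2_topk` and the data sizes are assumptions chosen for the example.

```python
import numpy as np


def l2_topk(base, queries, topk):
    """Brute-force L2 search: ids of the topk nearest base vectors per query."""
    # Squared distance expanded as |q|^2 - 2 q.b + |b|^2, so all queries
    # are handled by a single matrix product.
    dists = (
        (queries ** 2).sum(axis=1, keepdims=True)
        - 2.0 * queries @ base.T
        + (base ** 2).sum(axis=1)
    )
    return np.argsort(dists, axis=1)[:, :topk]


rng = np.random.RandomState(0)
base = rng.rand(1000, 16)   # 1000 stored vectors, 16 dimensions
q1 = rng.rand(1, 16)        # first request, nq = 1
q2 = rng.rand(3, 16)        # second request, nq = 3

# Two separate searches versus one search on the merged query matrix.
separate = np.vstack([l2_topk(base, q1, 5), l2_topk(base, q2, 5)])
merged = l2_topk(base, np.vstack([q1, q2]), 5)
```

The merged call does the same arithmetic in one larger matrix product, which is the kind of workload that keeps more CPU/GPU lanes busy than several tiny ones.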
To support this, a request scheduling step was added before client requests enter the queue; it preprocesses requests according to different strategies.
For a query request, preprocessing first checks whether the queue still holds unprocessed query requests. If it does, the previously queued query request is compared with the new one; if the merging conditions are met, the two are merged into a single request, which is placed in the queue, and the previous query request is removed from the queue.
A request can be merged more than once; how many requests end up merged into one depends on the runtime state of Milvus. Queries can be merged only if all of the following conditions are met:
  • The queries target the same collection and the same partition
  • Their topk parameters differ by no more than 200
  • The number of target vectors after merging does not exceed 200
  • Other index-related query parameters, such as nprobe, are identical
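The conditions above, together with the preprocessing step, can be sketched as a small predicate and an enqueue function. This is a minimal sketch under stated assumptions, not the Milvus source: `SearchRequest`, `can_merge` and `enqueue` are hypothetical names, the new request is compared only with the most recently queued one, and the merged request is assumed to carry the larger of the two topk values.

```python
from dataclasses import dataclass
from typing import Dict, List

MAX_TOPK_DIFF = 200   # limit on the topk difference, from the list above
MAX_MERGED_NQ = 200   # limit on the merged number of target vectors


@dataclass
class SearchRequest:
    collection: str
    partition: str
    topk: int
    vectors: List[List[float]]
    params: Dict[str, int]  # index-related parameters, e.g. {"nprobe": 16}


def can_merge(a: SearchRequest, b: SearchRequest) -> bool:
    """Check the four merging conditions."""
    return (
        a.collection == b.collection
        and a.partition == b.partition
        and abs(a.topk - b.topk) <= MAX_TOPK_DIFF
        and len(a.vectors) + len(b.vectors) <= MAX_MERGED_NQ
        and a.params == b.params
    )


def enqueue(pending: List[SearchRequest], new: SearchRequest) -> None:
    """Merge `new` into the last pending request if allowed, else append it."""
    if pending and can_merge(pending[-1], new):
        last = pending.pop()  # the previous request leaves the queue
        new = SearchRequest(
            collection=last.collection,
            partition=last.partition,
            topk=max(last.topk, new.topk),  # assumption: keep the larger topk
            vectors=last.vectors + new.vectors,
            params=last.params,
        )
    pending.append(new)
```

For example, two requests on the same collection and partition with nprobe 16 and topk 10 and 50 collapse into one queued request with two target vectors, while a third request with nprobe 32 stays separate.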
[Figure: a set of merging examples]
If you understand the principles of vector search, it is not difficult to understand the reasons for setting these merging conditions:
  • Requiring the same collection and the same partition fixes the search scope: queries can share one computation without interfering with each other only when they search the same data.
  • Keeping the merged nq within 200 keeps the computation time of a merged request short, so that no individual request waits too long.
  • Keeping the topk difference within 200 makes the result set easier to process.
  • Index-related query parameters must be identical so that the merged request follows a single code path through the internal ANNS library.
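One way to picture the result-set handling behind the topk condition is sketched below. The article does not describe how Milvus splits merged results, so this is purely an assumption for illustration: the merged query is taken to run with the largest topk, and each original request then receives its own rows truncated to its own topk. `split_results` is a hypothetical helper.

```python
def split_results(merged_rows, requests):
    """Split the rows of a merged search back into per-request results.

    merged_rows: one list of (id, distance) pairs per target vector,
                 sorted best first and computed with the largest topk.
    requests:    (nq, topk) for each original request, in merge order.
    """
    per_request = []
    offset = 0
    for nq, topk in requests:
        rows = merged_rows[offset:offset + nq]
        # Drop the extra neighbors this request did not ask for.
        per_request.append([row[:topk] for row in rows])
        offset += nq
    return per_request


# A merged search with topk = 3 covering two requests:
# the first had nq = 1, topk = 2; the second had nq = 2, topk = 3.
merged_rows = [
    [(7, 0.1), (3, 0.2), (9, 0.4)],
    [(1, 0.0), (4, 0.3), (8, 0.5)],
    [(2, 0.2), (6, 0.6), (5, 0.9)],
]
per_request = split_results(merged_rows, [(1, 2), (2, 3)])
```

Under this assumption, a small topk difference means that only a few extra neighbors are computed and then discarded for the request with the smaller topk.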
| Merging Queries Improves Query Efficiency
Next, we will test the effect of merging queries using pymilvus.
Hardware Environment | Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz, 12 cores
Milvus Version | 0.9.1 (GPU version)
Test Dataset | 10 million randomly generated 128-dimensional vectors
Index | IVF_SQ8, nlist = 2048
Query Parameters | 1000 queries with nq = 1, topk = 10, nprobe = 16
Single-threaded client script that executes 1000 queries:
import time

import numpy as np
from milvus import Milvus, IndexType

SERVER_ADDR = "127.0.0.1"
SERVER_PORT = '19530'
COLLECTION_DIMENSION = 128
COLLECTION_NAME = "TEST"
# Index settings of the test collection (for reference; not used by the search calls)
INDEX_TYPE = IndexType.IVF_SQ8
INDEX_PARAM = {'nlist': 2048}
SEARCH_PARAM = {'nprobe': 16}
TOPK = 10
MILVUS = Milvus(host=SERVER_ADDR, port=SERVER_PORT)


def gen_vec_list(nb, seed=1234):
    # Generate nb random float32 vectors as plain Python lists
    rng = np.random.RandomState(seed)
    xb = rng.rand(nb, COLLECTION_DIMENSION).astype("float32")
    return xb.tolist()


def search(vec_list):
    # One query request; the result is discarded because only timing matters here
    status, result = MILVUS.search(collection_name=COLLECTION_NAME, top_k=TOPK,
                                   query_records=vec_list, params=SEARCH_PARAM)


def multi_search():
    SEARCH_COUNT = 1000
    vec_list = gen_vec_list(1)  # nq = 1
    time_start = time.time()
    # Send the queries one after another from a single thread
    for k in range(SEARCH_COUNT):
        search(vec_list=vec_list)
    time_end = time.time()
    total_cost = time_end - time_start
    print("search total cost", total_cost, 'sec')
    print('QPS = ', SEARCH_COUNT/total_cost)


if __name__ == "__main__":
    multi_search()
Average over 3 runs of the script:
  • Total time for 1000 queries: 7.18 seconds
  • QPS: 139.24

Multi-threaded client script that executes 1000 queries:
import time
import threading

import numpy as np
from milvus import Milvus, IndexType

SERVER_ADDR = "127.0.0.1"
SERVER_PORT = '19530'
COLLECTION_DIMENSION = 128
COLLECTION_NAME = "TEST"
# Index settings of the test collection (for reference; not used by the search calls)
INDEX_TYPE = IndexType.IVF_SQ8
INDEX_PARAM = {'nlist': 2048}
SEARCH_PARAM = {'nprobe': 16}
TOPK = 10
MILVUS = Milvus(host=SERVER_ADDR, port=SERVER_PORT)


def gen_vec_list(nb, seed=1234):
    # Generate nb random float32 vectors as plain Python lists
    rng = np.random.RandomState(seed)
    xb = rng.rand(nb, COLLECTION_DIMENSION).astype("float32")
    return xb.tolist()


def search(vec_list):
    # One query request; the result is discarded because only timing matters here
    status, result = MILVUS.search(collection_name=COLLECTION_NAME, top_k=TOPK,
                                   query_records=vec_list, params=SEARCH_PARAM)


def multi_search():
    SEARCH_COUNT = 1000
    threads = []
    vec_list = gen_vec_list(1)  # nq = 1
    time_start = time.time()
    # Start one thread per query so that requests reach the server concurrently
    for k in range(SEARCH_COUNT):
        x = threading.Thread(target=search, args=(vec_list,))
        threads.append(x)
        x.start()
    # Wait for all queries to finish
    for th in threads:
        th.join()
    time_end = time.time()
    total_cost = time_end - time_start
    print("search total cost", total_cost, 'sec')
    print('QPS = ', SEARCH_COUNT/total_cost)


if __name__ == "__main__":
    multi_search()
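Average over 3 runs of the script:
  • Total time for 1000 queries: 4.93 seconds
  • QPS: 202.79
Compared with the single-threaded run, QPS rises from 139.24 to 202.79, an improvement of about 46%. A single-threaded client has only one request outstanding at a time, so the server has nothing to merge; with concurrent requests, pending queries accumulate in the queue and can be merged.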

| Welcome to join the Milvus community
github.com/milvus-io/milvus | Source Code
milvus.io | Official Website
milvusio.slack.com | Slack Community
zhihu.com/org/zilliz-11/columns | Zhihu
zilliz.blog.csdn.net | CSDN Blog
space.bilibili.com/478166626 | Bilibili
