You have probably heard of BERT, the "game-changing" pre-trained NLP model released by Google, which set records across multiple NLP tasks and achieved state-of-the-art results.
However, many deep learning beginners find that BERT is not easy to set up: it can take days to get a model up and running at all.
No worries: the module we are introducing today, bert-as-service, lets you build a Q&A search engine on top of BERT in just 3 minutes. This open-source project allows you to quickly serve BERT on multi-GPU machines (fine-tuned models are supported) and lets multiple clients query it concurrently.
1. Preparation
Before starting, make sure Python and pip are installed on your computer. If not, see this article for instructions: Detailed Python Installation Guide.
(Optional 1) If you mainly use Python for data analysis, you can install Anaconda directly (see: A Great Helper for Python Data Analysis and Mining—Anaconda); it comes with Python and pip built in.
(Optional 2) Additionally, we recommend using the VSCode editor, which has many advantages: The Best Partner for Python Programming—VSCode Detailed Guide.
Please choose one of the following ways to open a terminal and install the dependencies:
1. On Windows, open Cmd (Start → Run → CMD).
2. On macOS, open Terminal (Command+Space, type Terminal).
3. If you use VSCode or PyCharm, you can use the terminal at the bottom of the editor.
pip install bert-serving-server # server
pip install bert-serving-client # client
Please note the server-side requirements: Python >= 3.5 and TensorFlow >= 1.10.
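A quick way to confirm your environment meets these requirements (a minimal sketch; it assumes TensorFlow is already installed):

import sys
import tensorflow as tf

# bert-serving-server requires Python >= 3.5 and TensorFlow >= 1.10
assert sys.version_info >= (3, 5), 'Python >= 3.5 is required'
print(tf.__version__)  # should print 1.10 or higher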
You also need to download the pre-trained BERT model, which can be downloaded from https://github.com/hanxiao/bert-as-service#install. If you cannot access that website, you can download it from https://pythondict.com/download/bert-serving-model/.
You can also reply "bert-as-service" to the Python Practical Handbook official account to receive these pre-trained models.
After downloading, unzip the archive to a folder, for example /tmp/uncased_L-24_H-1024_A-16/.
2. Basic Usage of bert-as-service
After installation, enter the following command to start the BERT service:
bert-serving-start -model_dir /tmp/uncased_L-24_H-1024_A-16/ -num_worker=4
-num_worker=4 starts a service with four workers, meaning it can handle up to four concurrent requests; any additional concurrent requests beyond four are queued by the load balancer.
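As a minimal sketch of what this concurrency means (assuming the server above is running on its default ports), four client processes can be served in parallel, while a fifth would wait in the queue:

from multiprocessing import Pool
from bert_serving.client import BertClient

def encode_batch(sentences):
    bc = BertClient()  # each process opens its own connection to the server
    return bc.encode(sentences)

if __name__ == '__main__':
    batches = [['hello world']] * 4  # four concurrent requests
    with Pool(4) as pool:
        print([vec.shape for vec in pool.map(encode_batch, batches)])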
When the server starts correctly, it prints its configuration and model-loading logs to the terminal.
Using the Client to Get Sentence Encodings
Now you can easily encode sentences as follows:
from bert_serving.client import BertClient
bc = BertClient()
bc.encode(['First do it', 'then do it right', 'then do it better'])
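encode returns a numpy array with one row per input sentence; with the BERT-Large model used above (hidden size 1024), for example:

vecs = bc.encode(['First do it', 'then do it right', 'then do it better'])
print(type(vecs), vecs.shape)  # <class 'numpy.ndarray'> (3, 1024)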
As a feature of BERT, you can obtain the encoding of a pair of sentences by joining them with ||| (with a space before and after), for example:
bc.encode(['First do it ||| then do it right'])
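Each pair produces a single vector, so the shape of the result matches the number of pairs, not the number of individual sentences. A quick check, assuming the same running server:

pairs = ['First do it ||| then do it right',
         'then do it right ||| then do it better']
vecs = bc.encode(pairs)
print(vecs.shape)  # (2, 1024): one vector per sentence pair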
Using BERT Service Remotely
You can also start the service on one (GPU) machine and call it from another (CPU) machine, as shown below:
# on another CPU machine
from bert_serving.client import BertClient
bc = BertClient(ip='xx.xx.xx.xx') # ip address of the GPU machine
bc.encode(['First do it', 'then do it right', 'then do it better'])
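Note that the CPU machine only needs the lightweight bert-serving-client package, which does not depend on TensorFlow. If the network between the two machines is slow, you can also pass a receive timeout in milliseconds; the values below are placeholders (5555/5556 are the project's default ports):

from bert_serving.client import BertClient

# ip is a placeholder; timeout is in milliseconds
bc = BertClient(ip='xx.xx.xx.xx', port=5555, port_out=5556, timeout=10000)
print(bc.encode(['First do it']).shape)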
3. Building a Q&A Search Engine
We will find the most similar question to the user’s input from the FAQ list using bert-as-service and return the corresponding answer.
The FAQ list is actually the README.md of the official documentation, which is also included in the download link provided above.
1. Load all questions and display statistics:
import numpy as np

prefix_q = '##### **Q:** '
with open('README.md') as fp:
    # keep only the lines that start with the FAQ question prefix
    questions = [v.replace(prefix_q, '').strip() for v in fp
                 if v.strip() and v.startswith(prefix_q)]
print('%d questions loaded, avg. len of %d' % (len(questions), np.mean([len(d.split()) for d in questions])))
# 33 questions loaded, avg. len of 9
A total of 33 questions were loaded, with an average length of 9.
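To make the parsing concrete, here is what a single FAQ line in the README looks like and what the prefix stripping leaves behind (the question text is illustrative):

line = '##### **Q:** Do I need to do segmentation for Chinese?\n'
prefix_q = '##### **Q:** '
print(line.replace(prefix_q, '').strip())
# -> Do I need to do segmentation for Chinese?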
2. Then start a BERT service with the pre-trained uncased_L-12_H-768_A-12 model. The client code in the next step connects on ports 4000/4001, so pass matching -port/-port_out flags (or drop the port arguments on both sides to use the defaults):
bert-serving-start -num_worker=1 -model_dir=/data/cips/data/lab/data/model/uncased_L-12_H-768_A-12 -port=4000 -port_out=4001
3. Next, encode our questions into vectors:
bc = BertClient(port=4000, port_out=4001)
doc_vecs = bc.encode(questions)
4. Finally, we are ready to receive user queries and perform a simple "fuzzy" search over the existing questions. Each time a new query arrives, we encode it into a vector, compute its dot product with doc_vecs, sort the scores in descending order, and return the top-k most similar questions:
topk = 5  # number of similar questions to return
while True:
    query = input('your question: ')
    query_vec = bc.encode([query])[0]
    # compute normalized dot product as score
    score = np.sum(query_vec * doc_vecs, axis=1) / np.linalg.norm(doc_vecs, axis=1)
    topk_idx = np.argsort(score)[::-1][:topk]
    for idx in topk_idx:
        print('> %s %s' % (score[idx], questions[idx]))
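One detail worth noting: the score divides by the document norms only. Since the query norm is identical for every candidate, the ranking is the same as with full cosine similarity; if you prefer scores bounded in [-1, 1], a fully normalized variant would be:

# full cosine similarity: same ranking, but scores bounded in [-1, 1]
score = (doc_vecs @ query_vec) / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(query_vec))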
Done! Now run the code and type a query to see how this search engine handles fuzzy matching.
The complete code is as follows, about twenty lines in all:
import numpy as np
from bert_serving.client import BertClient
from termcolor import colored

prefix_q = '##### **Q:** '
topk = 5  # number of similar questions to return

# load the FAQ questions from the README
with open('README.md') as fp:
    questions = [v.replace(prefix_q, '').strip() for v in fp
                 if v.strip() and v.startswith(prefix_q)]
print('%d questions loaded, avg. len of %d' % (len(questions), np.mean([len(d.split()) for d in questions])))

with BertClient(port=4000, port_out=4001) as bc:
    doc_vecs = bc.encode(questions)  # one vector per question
    while True:
        query = input(colored('your question: ', 'green'))
        query_vec = bc.encode([query])[0]
        # compute normalized dot product as score
        score = np.sum(query_vec * doc_vecs, axis=1) / np.linalg.norm(doc_vecs, axis=1)
        topk_idx = np.argsort(score)[::-1][:topk]
        print('top %d questions similar to "%s"' % (topk, colored(query, 'green')))
        for idx in topk_idx:
            print('> %s %s' % (colored('%.1f' % score[idx], 'cyan'), colored(questions[idx], 'yellow')))
Pretty simple, right? Of course, this is a simple QA search model built on a pre-trained BERT model.
You can also fine-tune the model to improve its overall performance: place your data in a directory and run run_classifier.py, as in this example:
https://github.com/google-research/bert#sentence-and-sentence-pair-classification-tasks
bert-as-service has many other uses that we won't cover here. You can learn more in the official documentation:
https://github.com/hanxiao/bert-as-service