RestGPT Framework: Controlling Real-World Applications via RESTful APIs

©PaperWeekly Original · Author | Yifan Song

Affiliation | PhD Student, Institute of Computational Linguistics, Peking University

Research Area | Natural Language Processing

Paper Title:
RestGPT: Connecting Large Language Models with Real-World RESTful APIs

Paper Link:

https://arxiv.org/abs/2306.06624

Code Link:

https://github.com/Yifan-Song793/RestGPT

Research Background

Large language models (LLMs) such as ChatGPT and GPT-4 have demonstrated powerful capabilities, including chain of thought, reasoning, and planning. To make LLMs more useful in real-world applications, a popular research direction is to connect them with external tools or APIs that extend their capabilities.

In this paper, we connect LLMs with RESTful APIs and build an agent named RestGPT that controls real applications, such as music players and movie databases, to fulfill complex user instructions.
RESTful APIs are a foundational standard for modern web applications: they expose web resources that clients access and control through HTTP methods such as GET and POST. Many internet applications, including Spotify, Gmail, Twitter, and Slack, provide public RESTful APIs. Given only an API's OpenAPI Specification (OAS) description file, RestGPT can be connected to any existing RESTful application in a plug-and-play fashion, which makes the framework highly scalable.
However, real APIs also pose challenges: 1) calls to real APIs may hit various errors and exceptions, so the framework needs strong planning capabilities and robustness; 2) the results returned by real APIs are often complex JSON documents, so a dedicated parsing mechanism is needed.
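
To make the role of the OAS file concrete, here is a minimal Python sketch of how an agent might read such a description and turn one operation into a request URL. The spec fragment, server URL, paths, and the helper names `list_operations` and `build_url` are all hypothetical illustrations, not taken from RestGPT or from any real service.

```python
from urllib.parse import quote, urlencode

# Hypothetical, heavily trimmed OAS fragment (assumption: not a real API).
SPEC = {
    "servers": [{"url": "https://api.example.com/v1"}],
    "paths": {
        "/search/movie": {
            "get": {
                "summary": "Search for movies by title",
                "parameters": [{"name": "query", "in": "query", "required": True}],
            }
        },
        "/playlists/{playlist_id}/tracks": {
            "post": {
                "summary": "Add tracks to a playlist",
                "parameters": [{"name": "playlist_id", "in": "path", "required": True}],
            }
        },
    },
}

HTTP_METHODS = {"get", "post", "put", "delete", "patch"}


def list_operations(spec):
    """Return (METHOD, path, summary) for every operation in the spec."""
    ops = []
    for path, item in spec["paths"].items():
        for method, op in item.items():
            if method in HTTP_METHODS:
                ops.append((method.upper(), path, op.get("summary", "")))
    return ops


def build_url(spec, path, path_params=None, query_params=None):
    """Fill the path template and append the query string."""
    filled = path
    for name, value in (path_params or {}).items():
        filled = filled.replace("{" + name + "}", quote(str(value), safe=""))
    url = spec["servers"][0]["url"] + filled
    if query_params:
        url += "?" + urlencode(query_params)
    return url


for method, path, summary in list_operations(SPEC):
    print(method, path, "-", summary)
print(build_url(SPEC, "/search/movie", query_params={"query": "Titanic"}))
```

The short `(method, path, summary)` listing is the kind of compact view that fits in a limited prompt, while the full parameter details are only needed once a specific operation has been chosen.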

Method Overview

[Figure: overview of the RestGPT framework]

To construct an agent with strong robustness and to address the real challenges posed by RESTful APIs, we propose the RestGPT framework. As illustrated, RestGPT mainly consists of three modules: Planner, API Selector, and Executor, with the Executor further including Caller and Parser. The core of each module is an LLM and a prompt describing its function.
To understand the functionality of the APIs within a limited context window, each RestGPT module reads only the part of the OpenAPI Specification document that it needs. RestGPT follows a “plan-execute” loop: it plans one API call at a time, executes it, and uses the feedback from the result to plan the next step.
During the planning phase, the Planner and API Selector work together to perform coarse-to-fine online planning. First, the Planner generates a coarse-grained subtask in natural language based on the user instruction and the result of the previous execution. Next, the API Selector chooses specific APIs for that subtask to form a fine-grained API calling plan. The Executor then carries out this plan and returns the result to the Planner for the next round of planning.
When the Planner judges that the current subtask has not been completed, it issues a “Continue” command, which tells the API Selector to select a different API and try again. When the Planner judges that the user instruction has been fulfilled, it issues an “End” command and returns the final execution result.
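
The control flow described above can be sketched as a small loop. This is a minimal illustration under stated assumptions, not RestGPT's implementation: the three modules are LLM-backed in the paper, whereas here `toy_planner`, `toy_api_selector`, and `toy_executor` are hard-coded stand-ins, and the endpoint strings are hypothetical.

```python
def run(instruction, planner, api_selector, executor, max_steps=10):
    """Plan one API call at a time until the planner says 'End'."""
    history = []  # list of (subtask, api_plan, result)
    subtask = None
    for _ in range(max_steps):
        decision = planner(instruction, history)
        if decision == "End":
            final = history[-1][2] if history else None
            return final, history
        if decision != "Continue":
            subtask = decision  # a new coarse-grained subtask
        # On "Continue" the subtask is kept and the selector picks another API.
        api_plan = api_selector(subtask, history)
        result = executor(api_plan)
        history.append((subtask, api_plan, result))
    raise RuntimeError("step budget exhausted")


# --- Hard-coded stand-ins for the LLM modules (assumptions) ---
def toy_planner(instruction, history):
    if not history:
        return "find the artist"
    last_subtask, _, last_result = history[-1]
    if last_result is None:  # previous call did not solve the subtask
        return "Continue"
    if last_subtask == "find the artist":
        return "create the playlist"
    return "End"


def toy_api_selector(subtask, history):
    tried = {plan for sub, plan, _ in history if sub == subtask}
    candidates = {
        "find the artist": ["GET /artists", "GET /search"],
        "create the playlist": ["POST /playlists"],
    }[subtask]
    return next(c for c in candidates if c not in tried)


def toy_executor(api_plan):
    fake_responses = {
        "GET /artists": None,  # simulated failure
        "GET /search": "artist_id=42",
        "POST /playlists": "playlist created",
    }
    return fake_responses[api_plan]


final, history = run("make a playlist", toy_planner, toy_api_selector, toy_executor)
print(final)
print([plan for _, plan, _ in history])
```

In this toy run the first API choice fails, the planner answers “Continue”, and the selector falls back to a second candidate before the plan moves on to the next subtask.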

In the execution phase, the Caller generates the call parameters from the API calling plan and the API documentation, and then calls the RESTful API. The Parser reads the response schema defined in the OAS and uses the LLM’s code generation capability to write parsing code that extracts the required information from the complex JSON response.
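
As a rough illustration of the Parser idea, the sketch below runs a piece of "generated" parsing code against a nested JSON response. Everything here is an assumption made for the example: the response shape, the field names, the `run_parser` helper, and the `GENERATED_CODE` string, which stands in for what an LLM might emit after reading a response schema.

```python
import json

# Hypothetical API response (assumption: invented fields and values).
RESPONSE_TEXT = json.dumps({
    "page": 1,
    "results": [
        {
            "id": 101,
            "title": "Example Movie",
            "cast": [
                {"name": "A. Actor", "order": 0},
                {"name": "B. Actor", "order": 1},
            ],
        },
        {"id": 102, "title": "Another Movie", "cast": []},
    ],
    "total_results": 2,
})

# Stand-in for parsing code written by an LLM from the response schema.
GENERATED_CODE = """
def parse(data):
    first = data["results"][0]
    return {"id": first["id"], "lead": first["cast"][0]["name"]}
"""


def run_parser(code, response_text):
    """Execute the generated code and apply its parse() to the response."""
    namespace = {}
    exec(code, namespace)  # a real system would sandbox this step
    return namespace["parse"](json.loads(response_text))


print(run_parser(GENERATED_CODE, RESPONSE_TEXT))
```

Generating code from the schema, rather than pasting the whole response into a prompt, keeps long JSON documents out of the model's context; only the small extracted result needs to flow back to the planning stage.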

New Evaluation Dataset

To better evaluate how well RestGPT handles real user instructions, we introduce RestBench, a high-quality, manually annotated test set. It covers two scenarios: an LLM accessing the TMDB movie database and an LLM controlling the Spotify music player. Each entry pairs a real user instruction with its gold-standard API calling path. In total, the TMDB scenario contains 100 instructions over 54 APIs, and the Spotify scenario contains 57 instructions over 40 APIs.
Sample data from RestBench is shown below:
[Figure: sample data from RestBench]
We evaluate effectiveness and execution efficiency with three metrics: success rate (Success), API calling path correctness rate (Correct Path, CP), and solution length (Solution Length).
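
The three metrics can be sketched in a few lines of Python. The article names the metrics without giving formulas, so the definitions below are assumptions: CP counts a run as correct when the gold path appears as a subsequence of the predicted calls, and solution length is reported as the average number of extra calls beyond the gold path on successful runs. The records are toy data, not RestBench results.

```python
def is_subsequence(gold, predicted):
    """True if every gold call appears in predicted, in order."""
    remaining = iter(predicted)
    return all(call in remaining for call in gold)


def evaluate(records):
    """Compute success rate, correct-path rate, and mean extra calls."""
    n = len(records)
    success = sum(r["success"] for r in records) / n
    correct_path = sum(is_subsequence(r["gold"], r["predicted"]) for r in records) / n
    solved = [r for r in records if r["success"]]
    if solved:
        extra = sum(len(r["predicted"]) - len(r["gold"]) for r in solved) / len(solved)
    else:
        extra = 0.0
    return {"success": success, "correct_path": correct_path, "extra_calls": extra}


# Toy records (assumption: made-up API names and outcomes).
RECORDS = [
    {"gold": ["A", "B"], "predicted": ["A", "B"], "success": True},
    {"gold": ["A", "B"], "predicted": ["A", "C", "B"], "success": True},
    {"gold": ["A"], "predicted": ["C"], "success": False},
    {"gold": ["A", "B"], "predicted": ["B", "A"], "success": False},
]

print(evaluate(RECORDS))
```

Under these assumptions, a run that takes a detour through an unnecessary API still counts toward CP but raises the extra-call average, which is how path correctness and efficiency are kept separate.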

Experimental Results

We implemented RestGPT and several baselines with text-davinci-003 as the base model, and also tested ChatGPT, Llama2, and Vicuna as base models. The results show that RestGPT achieves an execution success rate of over 70% in both scenarios and outperforms the baselines in execution efficiency, as measured by the length of the calling paths.

Error analysis shows that errors during the planning phase, specifically from the Planner and API Selector, dominate, indicating that existing LLMs still have room for improvement in planning and reasoning capabilities. Additionally, ChatGPT’s planning ability is significantly inferior to that of text-davinci-003.
We conducted scaling experiments on instruction complexity and the number of APIs, leading to the following conclusions: RestGPT’s strong planning capability enables it to solve more complex user instructions (longer standard API calling paths); RestGPT can scale to scenarios with a greater number of APIs.

More Examples

RestGPT can handle more complex user instructions, such as “Help me create a playlist with Jay Chou’s songs, and name it ‘Love Jay’” in the music player scenario and “Please give me a picture of the cast of Titanic” in the movie database scenario.

Conclusion

This paper introduces the RestGPT framework, which connects LLMs with RESTful APIs to build automated agents that control real applications such as TMDB and Spotify. Experiments on the high-quality RestBench test set demonstrate that RestGPT can solve complex user instructions in real scenarios. We believe RestGPT will promote the application of LLMs in practical scenarios and prove valuable in both academia and industry.
