Interpreting the JARVIS Project: Connecting ChatGPT and HuggingFace to Solve AI Issues

For its latest online sharing session, Machine Heart has invited Song Kaitao, a researcher at Microsoft Research Asia, to present his team's recent open-source project, JARVIS.

Recently, large language models (LLMs), represented by ChatGPT, have garnered significant attention in both industry and academia. However, LLMs, which primarily handle text, still face numerous bottlenecks when addressing many complex and challenging AI tasks:

1. Limited by the input and output forms of language models, current LLMs (such as ChatGPT) cannot process information in complex modalities such as images, speech, and video.

2. Some complex AI tasks require advance planning, decomposition into multiple subtasks, and the scheduling and coordinated execution of different models. These requirements exceed what LLMs can do on their own.

3. On certain specific tasks, although LLMs perform well in zero-shot or few-shot settings, they still lag behind some expert models (e.g., fine-tuned models).

Solving these issues is therefore a critical step for LLMs on the path toward artificial general intelligence. The JARVIS project team argues that, to get there, LLMs should be able to leverage the power of external models, and that the key lies in finding a suitable connector between large language models and other AI models.

The JARVIS project team noted that any AI model can be given a textual representation by summarizing what it does, and from this proposed a concept: language is the universal interface through which LLMs connect with AI models. Based on this idea, they launched JARVIS, a model collaboration system designed to connect LLMs (like ChatGPT) with machine learning communities (like Hugging Face). The system treats the LLM as the brain and uses language to manage the different models available in various AI communities. The process is divided into four steps: task planning, model selection, task execution, and response generation.
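The four steps named above can be illustrated with a toy sketch. This is not the JARVIS implementation: the registry, the model names (toy-captioner, toy-tts), and every function here are hypothetical, and simple keyword rules stand in for the LLM calls that the real system would make for planning and model selection. The point is only to show how textual model descriptions and a list of dependent subtasks fit together.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ExpertModel:
    """Hypothetical expert model, exposed to the controller through text."""
    name: str
    task: str           # task type the model serves
    description: str    # plain-language summary of what the model does
    run: Callable[[str], str]


@dataclass
class Subtask:
    task: str
    arg: str
    dep: Optional[int] = None  # index of an earlier subtask whose output is the input


# Toy registry (assumption): two fake models that only return strings.
REGISTRY = [
    ExpertModel("toy-captioner", "image-to-text", "writes a caption for an image",
                lambda path: f"a caption for {path}"),
    ExpertModel("toy-tts", "text-to-speech", "reads text aloud as audio",
                lambda text: f"audio({text})"),
]


def find_image(request: str) -> str:
    """Return the first word that looks like an image file name, or ''."""
    for word in request.split():
        cleaned = word.strip(".,!?")
        if cleaned.lower().endswith((".jpg", ".png")):
            return cleaned
    return ""


def plan_tasks(request: str) -> List[Subtask]:
    """Step 1, task planning. Keyword rules stand in for the LLM here."""
    text = request.lower()
    plan: List[Subtask] = []
    if "describe" in text:
        plan.append(Subtask("image-to-text", find_image(request)))
    if "aloud" in text:
        dep = len(plan) - 1 if plan else None
        plan.append(Subtask("text-to-speech", request, dep))
    return plan


def select_model(subtask: Subtask, registry: List[ExpertModel]) -> ExpertModel:
    """Step 2, model selection. Takes the first model registered for the task."""
    for model in registry:
        if model.task == subtask.task:
            return model
    raise LookupError(f"no model registered for task {subtask.task!r}")


def execute_plan(plan: List[Subtask], registry: List[ExpertModel]) -> List[str]:
    """Step 3, task execution. Runs subtasks in order, feeding in dependencies."""
    outputs: List[str] = []
    for subtask in plan:
        model = select_model(subtask, registry)
        arg = outputs[subtask.dep] if subtask.dep is not None else subtask.arg
        outputs.append(model.run(arg))
    return outputs


def summarize(request: str, plan: List[Subtask], outputs: List[str]) -> str:
    """Step 4, the final reply. A template stands in for the LLM summary."""
    steps = "; ".join(f"{s.task} -> {o}" for s, o in zip(plan, outputs))
    return f"Request: {request} | Results: {steps}"


def run_pipeline(request: str) -> str:
    plan = plan_tasks(request)
    outputs = execute_plan(plan, REGISTRY)
    return summarize(request, plan, outputs)


if __name__ == "__main__":
    print(run_pipeline("Describe photo.jpg and read it aloud"))
```

In this sketch the second subtask depends on the first, so the fake speech model receives the caption rather than the original request. In a real system, each stub would presumably be replaced by a prompt to the LLM or a call to a hosted model.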

Within this framework, the LLM can decompose and manage different tasks and can handle complex information across modalities, including text, speech, images, and video. The framework has also drawn a growing number of people to explore how LLMs can schedule and collaborate with external models or tools.

Sharing Topic: JARVIS – Connecting ChatGPT and HuggingFace to Solve AI Issues

Guest Speaker: Song Kaitao, researcher at Microsoft Research Asia. He received his Bachelor’s and Ph.D. degrees from Nanjing University of Science and Technology. His research interests include natural language processing, speech recognition, pre-trained language models, and content generation. He has published several papers at top international conferences such as ICML, NeurIPS, ICCV, KDD, ACL, IJCAI, and AAAI.

Sharing Summary: Despite the excellent performance of large language models, many bottlenecks remain when they are applied to complex AI tasks in practice. In this session, we will introduce how to connect large language models with the AI community, how multiple models can collaborate to tackle more challenging AI tasks, and what the future prospects are.

Related Links:

1) SOTA! Model Platform Project Homepage Link:

https://sota.jiqizhixin.com/project/hugginggpt

2) Paper Link:

https://arxiv.org/abs/2303.17580

3) Code Repository:

https://github.com/microsoft/JARVIS

Join the Group for Live Broadcast
Live Broadcast Room: Follow Machine Heart’s Mobile Group video account; the broadcast starts at 19:00 Beijing time on April 25.
Discussion Group: This live broadcast will include a Q&A session, and you are welcome to join the discussion group for it.

If the group has exceeded the number limit, please add the Machine Heart assistant: syncedai2, syncedai3, syncedai4, or syncedai5, with the note ‘JARVIS’ to join.
If you have new work you wish to share or content directions you are interested in, feel free to let us know: https://jiqizhixin.mikecrm.com/fFruVd3

Machine Heart · Mobile Group
The Mobile Group is an AI technology community initiated by Machine Heart. It focuses on academic research and technical practice, offering community users online technical open classes, academic sharing sessions, hands-on technical practice, and up-close access to top laboratories. The Mobile Group also holds offline academic exchange meetings from time to time and organizes talent services, industry-technology matchmaking, and other activities. All AI technology practitioners are welcome to join.