Prompt Sapper: The Soulmate of Foundation Models and an Innovation Hub for AI Services

Machine Heart Column

Authors: Xing Zhenchang (CSIRO’s Data61), Huang Qing (JXNU), Cheng Yu (JXNU)

The SE4AI team from CSIRO’s Data61 and the Intelligent Software Engineering Laboratory at Jiangxi Normal University jointly developed the world’s first no-code production platform for AI chains, Prompt Sapper, along with corresponding methodologies and an AI service marketplace. Foundation models have brought an unprecedented AI “operating system” effect and a new way of interacting with artificial intelligence, igniting a wave of innovation in AI service development and application. Prompt Sapper was born in response to this trend, aiming to reshape the software landscape and create a collaborative intelligence platform between humans and AI, unleashing everyone’s potential for AI innovation and creating a future where everyone can be an AI innovation master!

Project Link: https://github.com/AI4FutureSE
Main Website for AI Chain: https://www.aichain.online/
Sapper IDE: https://www.promptsapper.tech/
AI Service Marketplace: https://www.aichain.store/

AI Chain Applications Built on AI “Operating Systems”

AI services are what we call AI chain applications built on foundation models. An AI chain application is a new type of software product that assembles multiple calls to foundation models (and possibly to traditional machine learning models, external data, or APIs as well) into a specific workflow in order to provide an AI service.
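
To make the idea of "assembling multiple calls" concrete, here is a minimal sketch of a two-step chain. It is an illustration only: `call_model`, `make_step`, and `run_chain` are hypothetical names, not part of Prompt Sapper, and `call_model` merely echoes the prompt where a real chain would call a foundation model.

```python
from typing import Callable, List


# Hypothetical stand-in for a foundation model call; a real chain would
# send the prompt to a model API instead of echoing it back.
def call_model(prompt: str) -> str:
    return f"<answer to: {prompt}>"


def make_step(template: str) -> Callable[[str], str]:
    """Build one chain step: fill the template with the previous output, then call the model."""
    def step(previous_output: str) -> str:
        return call_model(template.format(input=previous_output))
    return step


def run_chain(steps: List[Callable[[str], str]], user_input: str) -> str:
    """Run the steps in workflow order, feeding each output into the next step."""
    result = user_input
    for step in steps:
        result = step(result)
    return result


chain = [
    make_step("Extract the key topic from: {input}"),
    make_step("Write three quiz questions about: {input}"),
]
final_output = run_chain(chain, "Photosynthesis converts light into chemical energy.")
print(final_output)
```

Each step here is a plain function, so a call to a traditional model or an external API could be slotted into the same list without changing `run_chain`.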

These foundation models include the currently popular generative pre-trained large language models such as GPT-4 and image generation models such as DALL-E. We can think of these foundation models as AI “operating systems”, just like Windows, Linux, and macOS in the personal computer era, and iOS and Android in the mobile application era. Just as we once called traditional operating system APIs to develop software applications and services, today we can develop AI applications and services by assembling multiple calls to foundation models.

Unlocking Software 3.0 with Foundation Models

However, as shown in the figure above, there is an essential difference between traditional operating system calls and foundation model calls. Even though natural language is the most natural way for us to express our needs, in the software 1.0/2.0 paradigm, people had to interact with computers using programming languages (like Java, Python, JavaScript, etc.) and focus on solving problems (algorithms, data, model architecture, features, etc.). Now, everything has changed dramatically! Foundation models have unlocked the software 3.0 paradigm, a brand new natural language programming approach—prompt programming—and AI chain engineering based on prompt programming. In the software 3.0 paradigm, people can describe the problems they need to solve in natural language, and foundation models can understand and execute these natural language instructions. This advancement allows people to interact with AI in a more intuitive and natural way, further broadening our possibilities in innovation and problem-solving, bringing the power of AI to a wider audience, and realizing the vision of democratizing AI.

Software 3.0 offers an unprecedented AI interaction experience and intelligent applications. However, we believe that software 3.0 will not completely replace software 1.0/2.0. Instead, these three paradigms will complement and coexist with each other in the future. In their respective areas of advantage, they will play their roles together to promote technological advancement and innovation. This collaborative development is expected to help us better solve complex problems and unleash more innovative potential. Our Sapper IDE supports workers from all three paradigms; for more details, see “Promptsmanship”.

AI 2.0 (Foundation Models) Empowering Software 3.0 (Prompt Programming + AI Chain Engineering)

The Evolution of Machine Learning and Software Engineering

As shown in the figure above, machine learning technology has evolved from early feature engineering and neural network architecture engineering, to the objective engineering of the pre-train and fine-tune paradigm, and most recently to prompt and AI chain engineering. Although neural networks and pre-training with fine-tuning have gradually lowered the threshold for using AI, people still face the island effect of many single-domain models. In other words, the AI 1.0 era lacked a model that could serve as an AI “operating system” to support the development, assembly, and ecosystem of AI applications.

Foundation models, however, possess cross-domain knowledge and can adapt to various complex tasks through in-context learning, bringing us the long-awaited AI “operating system” platform.

Kai-Fu Lee (Li Kaifu) refers to this platformized AI as AI 2.0. He believes that chat tools and text-to-image creation are just the tip of the iceberg of AI 2.0, and that we should not limit our imagination of its potential. Foundation models have enabled a natural way of human-AI interaction, opening a new chapter in software engineering: the era of Software 3.0. In this era, centered on prompt programming and AI chain applications, we will see a more open and democratic environment for AI service development and application. Andrej Karpathy believes that the emerging prompt programming has the potential to expand the number of “programmers” to 1.5 billion. In the job market, positions such as “AI Prompt Engineer” and “Legal Prompt Engineer” have emerged, indicating that this trend is becoming a reality.

AI Chain Engineering (SE4AIChain): Vision and Goals

Our vision is to reshape the software landscape through generative AI. We are in the construction phase of evolving the capabilities of foundation models into artificial intelligence assistants, which presents two very realistic opportunities: (1) enabling everyone to create personalized intelligent agents, and (2) allowing people to share and hire intelligent agent services. This enables a broader public to participate in the AI wave and benefit from it. Foundation models and prompt programming hold vast prospects; however, the current state of prompt programming resembles the individual heroism that characterized programming more than 60 years ago, before software engineering emerged. AI service development based on foundation models goes far beyond merely writing dazzling prompts, just as systematic software engineering goes far beyond ad-hoc programming.

Therefore, we are committed to building a systematic AI chain engineering infrastructure as the soulmate of foundation models, reshaping the software landscape to unleash the full potential of AI 2.0. Our AI chain engineering infrastructure is an AI4SE4AI framework (AI-powered software engineering infrastructure for AI). It includes a systematic AI chain engineering methodology, which we call “Promptsmanship”, along with the development and deployment tools that support this methodology.

To achieve this vision, we propose the following three goals:

First, summarize the best practices of generalized prompt engineering from the perspective of software engineering, placing prompt engineering within the overall framework of software engineering, and supplementing important software engineering methods that have been overlooked (such as software processes, system design, testing), thus forming a systematic AI chain methodology (the first version has been released).

Second, develop an AI chain integrated development environment that supports the entire process of AI chain development from idea to service, and provide “subtle support” for ordinary people to develop high-quality AI services by materializing the AI chain methodology in the tool (the first version has been launched).

Third, develop an AI service marketplace to promote the development of the AI service ecosystem and develop responsible AI chain engineering methods and technologies to enhance the transparency, accountability, and safety of AI services (the beta version of the AI service marketplace has been launched).

We are dedicated to establishing an open, collaborative, secure, and sustainable AI chain ecosystem to support the digital and intelligent transformation and upgrading of various industries, allowing them to unleash their creativity and wisdom in the new era of AI, benefiting from it, and achieving the beautiful vision of human-machine symbiosis.

Promptsmanship: A Tailored Software Engineering Methodology for AI Chain Applications

Software engineering has accumulated a wealth of effective methods and practices over decades of development, and many of its concepts and lessons can be applied to AI chain engineering. However, with the transformation of human-AI interaction brought about by foundation models, we need to examine and adjust traditional software engineering methods and practices to adapt to the emerging human-centered natural language programming paradigm.

On one hand, large language models (like GPT-4) encode vast amounts of knowledge and possess strong conversational abilities, which we can utilize to help AI chain engineers gain task knowledge, understand problems, and gain inspiration for problem-solving. At the same time, to mitigate the inevitable errors and hallucinations of large language models, we need to have a sufficiently deep understanding of model capabilities (so-called mechanical sympathy) and adopt effective prompt design.

On the other hand, large language models not only change who can develop AI services but also profoundly change what AI services can be developed. This requires us to shift from past code-centered development tools to human-centered tools, enabling ordinary people to focus on problem-solving, and more intuitively analyze, design, build, and evaluate AI chains.

Therefore, AI chain engineering needs to develop a set of development methods and tools that are more adapted to the human-centered natural language programming paradigm, based on traditional software engineering methods and practices, combined with the advantages of large language models, to improve the efficiency and quality of AI chain development. The above figure illustrates the AI chain engineering methodology we proposed based on extensive literature, community-shared experiences, and our own practice.

We define the functional units of AI chains as “workers”. An AI chain involves not only traditional software concepts (such as requirements, object composition and collaboration, and object roles) but also AI chain-specific concepts. For example, we distinguish three types of workers (corresponding to the three software paradigms: Software 1.0/2.0/3.0), four levels of workers with increasing reasoning capabilities, AI interaction modes, worker stereotypes, and prompt design patterns.

We believe that AI chain engineering is a rapid prototyping process that includes four iterative stages: exploration, design, build, and deploy. Each stage includes concurrent activities (inspired by the Unified Software Process), including task modeling, system design (requirements analysis, task decomposition, AI/non-AI concern separation, workflow rehearsal), AI chain implementation, and testing, but different stages have different focuses (as indicated by the corresponding bar heights). Different activities will generate or refine different AI chain concepts (shown in the blue area below the concepts). Accompanying all activities, we propose a “magic enhances magic” activity that leverages the knowledge and conversational abilities of large language models to help AI chain engineers gain task knowledge, conduct requirements gathering and analysis, and understand model capabilities (mechanical sympathy).

In the AI chain, the “brain” of a Software 3.0 worker is its natural language prompt. These prompts need to clearly define the roles and functions of the workers. Effective prompts require creativity and experimentation but, like traditional programs, can be improved through idioms and patterns. A large body of literature and blogs has proposed many prompt techniques. Based on these techniques and our own practical experience, we have summarized four aspects of prompt design patterns (see the table above): worker stereotypes, prompt considerations, prompt design aspects, and prompt decorations.

Worker stereotypes: We define nine worker meta-roles: input rewriter, splitter, reverse questioner, planner, information inquirer, executor, summarizer, status checker, and validator. To enhance the debuggability, reusability, and composability of AI chains and workers, we recommend that each worker adhere to the single-function principle and serve one role only. A worker can serve multiple roles simultaneously, but a multi-role worker may become an “epic” worker that not only performs poorly but is also difficult to optimize and control.

Prompt considerations: These include three general considerations that can affect prompt performance: Grice’s conversational principles, terminology explanations, and prompt committees.

Prompt design aspects: These need to consider context (including inputs, terminology explanations, personification, and other limitations), instructions, examples, output formats, and content forms (free text, semi-structured text, code-style text). For simple-function workers, they do not need to include all aspects; for example, a simple addition problem can place the input numbers in the question (e.g., “Please calculate 5+2” or “What is 5+2?”). However, for complex-function workers, it is best to distinguish different design aspects and information in semi-structured or even code form. In addition, instructions can include some control logic, but complex control logic is best expressed through explicit collaborative workers and workflows.

Prompt decorations: These are not core functions of the workers, but they can enhance the model’s reasoning ability through “thinking aloud” (self-questioning, reflection) or better customize model behavior and output (personification, context control).
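
The stereotypes and design aspects above can be sketched as plain data structures. This is an illustrative sketch under stated assumptions, not the Sapper IDE's internal representation: the class names `Stereotype`, `StructuredPrompt`, and `Worker`, and the section labels used by `render`, are all hypothetical. Only the nine role names and the list of design aspects come from the text.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import FrozenSet, List, Tuple


class Stereotype(Enum):
    """The nine worker meta-roles named in the text."""
    INPUT_REWRITER = "input rewriter"
    SPLITTER = "splitter"
    REVERSE_QUESTIONER = "reverse questioner"
    PLANNER = "planner"
    INFORMATION_INQUIRER = "information inquirer"
    EXECUTOR = "executor"
    SUMMARIZER = "summarizer"
    STATUS_CHECKER = "status checker"
    VALIDATOR = "validator"


@dataclass
class StructuredPrompt:
    """A prompt split into design aspects; empty aspects are left out when rendering."""
    instruction: str
    context: str = ""
    examples: List[Tuple[str, str]] = field(default_factory=list)
    output_format: str = ""

    def render(self) -> str:
        sections = []
        if self.context:
            sections.append(f"Context:\n{self.context}")
        sections.append(f"Instruction:\n{self.instruction}")
        if self.examples:
            shots = "\n".join(f"Input: {q}\nOutput: {a}" for q, a in self.examples)
            sections.append(f"Examples:\n{shots}")
        if self.output_format:
            sections.append(f"Output format:\n{self.output_format}")
        return "\n\n".join(sections)


@dataclass
class Worker:
    name: str
    roles: FrozenSet[Stereotype]
    prompt: StructuredPrompt

    def is_single_function(self) -> bool:
        """True when the worker serves exactly one role."""
        return len(self.roles) == 1


# A simple-function worker needs only the instruction aspect.
adder = Worker(
    name="adder",
    roles=frozenset({Stereotype.EXECUTOR}),
    prompt=StructuredPrompt(instruction="Please calculate 5+2"),
)

# A more complex worker separates the aspects in semi-structured form.
grader = Worker(
    name="grader",
    roles=frozenset({Stereotype.VALIDATOR}),
    prompt=StructuredPrompt(
        context="You are grading answers to primary school arithmetic questions.",
        instruction="Decide whether the answer is correct.",
        examples=[("5+2 = 7", "correct"), ("5+2 = 8", "incorrect")],
        output_format="One word: correct or incorrect.",
    ),
)

print(adder.prompt.render())
print(grader.prompt.render())
```

A check such as `is_single_function` is one cheap way to flag a potential “epic” worker before it becomes hard to debug.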

We want to remind everyone that these prompt patterns do not replace creativity and experimentation. Moreover, they are merely tactical-level optimizations. For more challenging tasks, strategic-level AI chain system design often needs to be considered.

AI Chain Production Platform (Sapper IDE)

Our AI chain production platform differs from existing code-centered development environments because many people without computing or programming backgrounds will develop AI chains. Therefore, our highest design principle is “human-centered”, reflected in three aspects:

First, we seamlessly materialize the “Promptsmanship” (AI chain engineering methodology) into Sapper IDE, enabling anyone to effectively apply the best AI chain practices and methodologies.

Second, we make full use of the knowledge and conversational abilities of large language models to develop intelligent co-pilots, providing full-process AI chain development support for non-technical personnel.

Third, we provide a no-code AI chain analysis, design, development, and deployment process, allowing anyone to easily turn ideas into AI services.

Our AI chain IDE can be regarded as an “incubator for AI services” because its main function is to build AI services on the shoulders of foundation models. These services not only directly meet people’s needs for AI chain development but also inspire them to explore more possibilities and help them create more excellent AI services. We believe this will be an era of unlimited AI innovation, and our Sapper IDE will become the “tool for unlocking this unlimited potential” across various industries!

We have developed or are developing various AI service demonstrations in Sapper IDE, covering education, vocational training, creative writing, gaming, software engineering, and more. We also welcome the community to create more creative AI services using Prompt Sapper and share them in our AI service marketplace.

Exploration View

The exploration view supports activities in the task exploration and preliminary design stages, allowing users to gain a rough task model, understand task challenges, and gain an initial understanding of task steps, workflows, input/output data, and prompt effectiveness.

As shown in the figure, the left side of the exploration view is a chatbot (currently wrapping the GPT-3.5 API), which works similarly to a regular chatbot (e.g., ChatGPT). The chatbot allows users to engage in any type of conversation with the large language model (LLM). Of course, we assume that users will converse around the AI services they need to develop. Unlike ordinary chatbots, the exploration view is equipped with a co-pilot based on LLM (currently GPT-3.5), which automatically collects and analyzes the conversations between users and the LLM to obtain task backgrounds that may be relevant to subsequent AI chain analysis, design, and development (e.g., required functionalities, user preferences, things to avoid, etc.). This co-pilot itself is an AI chain service built on LLM (currently GPT-3.5). It operates in a non-intrusive manner and dynamically records notes based on the conversations between users and the LLM, as shown in the Task Note panel on the right side of the figure.
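
The non-intrusive note-taking behaviour can be illustrated with a small observer that watches user messages and files them under task-note categories. This is only a sketch: the real co-pilot uses an LLM to analyze the conversation, whereas the keyword cues in `CUES` and the `TaskNotes` class below are hypothetical simplifications chosen to keep the example runnable.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical keyword cues; the real co-pilot relies on an LLM, not string matching.
CUES = {
    "required functionality": ("should", "need", "must"),
    "user preference": ("prefer", "like"),
    "thing to avoid": ("avoid", "don't", "never"),
}


@dataclass
class TaskNotes:
    notes: Dict[str, List[str]] = field(default_factory=dict)

    def observe(self, user_message: str) -> None:
        """Record the message under every category whose cue words it contains."""
        lowered = user_message.lower()
        for category, words in CUES.items():
            if any(word in lowered for word in words):
                self.notes.setdefault(category, []).append(user_message)


notes = TaskNotes()
conversation = [
    "The service should generate quiz questions for children.",
    "I prefer short answers.",
    "Avoid scary topics.",
    "How do plants grow?",  # ordinary chat, no note is taken
]
for message in conversation:
    notes.observe(message)

for category, items in notes.notes.items():
    print(category, "->", items)
```

The observer never interrupts the conversation; it only accumulates notes that later stages can read.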

Design View

The design view supports the main activities of the design phase and plays an important role in bridging the exploration and build phases. Therefore, it has two main functions: requirements analysis and AI chain framework generation, supported by two LLM-based co-pilots.

Requirements Analysis

The left side of the design view features an LLM-based requirements analysis chatbot (another AI chain service). Unlike the free-form chatbot in the exploration view, the requirements analysis chatbot acts as a continuous reverse questioner, working as follows:

1) Users enter a task description (usually a vague description of the desired content) in the inquiry box to start the conversation.

2) The requirements analysis chatbot guides users to clarify specific task requirements through a series of open-ended questions based on the initial task description and task notes collected in the exploration view (if available).

3) The requirements analysis chatbot gradually integrates users’ responses into the task description (displayed in the task requirements box in the upper right corner).

Of course, if users feel they already have a clear requirement and do not need the help of the requirements analysis chatbot, they can directly input their requirements in the task requirements box.
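
The three-step loop above can be sketched as follows. The sketch rests on assumptions: `next_question` is a hypothetical stub that walks a fixed question list, whereas the actual chatbot generates its questions with an LLM from the task description and task notes. The way answers are appended to the description is likewise just one possible integration strategy.

```python
from typing import Callable, List, Optional

# Hypothetical fixed questions standing in for LLM-generated open-ended questions.
QUESTIONS = [
    "Who is the target audience?",
    "What output format do you expect?",
]


def next_question(asked: List[str]) -> Optional[str]:
    """Return the next question that has not been asked yet, or None when done."""
    for question in QUESTIONS:
        if question not in asked:
            return question
    return None


def analyze_requirements(task_description: str, answer: Callable[[str], str]) -> str:
    """Ask questions one at a time and fold each answer into the task description."""
    asked: List[str] = []
    while True:
        question = next_question(asked)
        if question is None:
            return task_description
        asked.append(question)
        task_description += f"\n- {question} {answer(question)}"


# Scripted answers play the role of the user typing replies.
scripted_answers = {
    "Who is the target audience?": "Primary school pupils.",
    "What output format do you expect?": "A numbered list.",
}
requirements = analyze_requirements("Build a quiz generator.", scripted_answers.__getitem__)
print(requirements)
```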

AI Chain Framework Generation

When users believe that task requirements have been clearly defined, they can click the “Generate AI Chain Skeleton” button below the task requirements box to request the AI chain framework generation co-pilot to create the main steps needed to complete the task and three candidate prompts for each step. The “Generate AI Chain Skeleton” works as follows:

1) It converts the overall task description into main steps and provides names and descriptions for each step.

2) It recommends three candidate prompts for each step, which users can modify accordingly.

3) Users can manually add control flows, delete or reorder steps, etc.

4) Users can use structured forms to edit the generated prompts, setting inputs and execution engines for the steps.

Through this process, users can conveniently generate the framework of the AI chain and further modify and refine the generated framework. Finally, clicking the “Generate AI Chain” button at the bottom right of the design view, Sapper IDE will automatically create workers for each step based on the AI chain framework and assemble them into a block-based AI chain, which can be viewed, edited, and executed in the programming view.
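
A skeleton of this kind can be pictured as a list of steps, each carrying three candidate prompts, from which one worker per step is created. The following is a hypothetical data model for illustration; `Step`, `Worker`, and `generate_chain` are assumed names, and the candidate prompts are written by hand here rather than generated by a co-pilot.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Step:
    name: str
    description: str
    candidate_prompts: List[str]  # three candidates per step, as described in the text
    chosen: int = 0               # index of the candidate the user selected


@dataclass
class Worker:
    name: str
    prompt: str


def generate_chain(skeleton: List[Step]) -> List[Worker]:
    """Create one worker per step, in order, from the chosen candidate prompt."""
    return [Worker(step.name, step.candidate_prompts[step.chosen]) for step in skeleton]


skeleton = [
    Step(
        name="extract_topic",
        description="Find the main topic of the text.",
        candidate_prompts=[
            "Name the topic of: {text}",
            "What is this text about? {text}",
            "Summarize the subject of {text} in one phrase.",
        ],
    ),
    Step(
        name="write_questions",
        description="Write quiz questions on the topic.",
        candidate_prompts=[
            "Write three questions about {topic}.",
            "Create a short quiz on {topic}.",
            "Ask three things a pupil should know about {topic}.",
        ],
    ),
]

# User edits before generation: pick another candidate and refine its wording.
skeleton[1].chosen = 2
skeleton[1].candidate_prompts[2] += " Keep each question under 15 words."

workers = generate_chain(skeleton)
for worker in workers:
    print(worker.name, "->", worker.prompt)
```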

Programming View (Block View)

We use block-based visual programming to support the implementation, execution, and debugging of AI chains. The current implementation is based on the open-source Blockly project. In the left panel, users can access blocks from the Units, Code, Prompts, Variables, and Engines toolboxes to build the AI chain. For a description of the relevant blocks, please visit our documentation (https://www.aichain.online/public/content%20pages/sapperide/blockview.html).

To make it more intuitive for users to create and modify workers, all visual programming operations can be triggered directly on the worker/container blocks. Clicking the “+” icon on the right side of a slot lets users directly add or edit the corresponding block for that slot. Users can drag block templates from the toolbox into the AI chain editor and assemble blocks by dragging and dropping them in the editor. Users can zoom the editor or center the selected blocks by clicking the “+”, “-”, and “aim” buttons on the right side of the editor.

Users can run or debug the AI chain through the “AI Chain Execution” menu. When a worker is running, the “bug” signal light in the upper left corner of the worker block will light up. The actual prompts and engine outputs used during execution will be output to the block console. User inputs required for execution will be entered in the block console.

In debug mode, workers will execute one by one. When the current worker finishes running, execution will pause, allowing users to check whether the output in the block console meets expectations. If the results are as expected, users can continue executing the next worker. Alternatively, users can modify the prompt of the current worker in the prompt console and then rerun the current worker.
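
The step-and-rerun behaviour of debug mode can be sketched with a small session object. This is not the IDE's implementation: `DebugSession` is a hypothetical class, workers are represented as plain dictionaries, and `call_model` is a stub that wraps the prompt in brackets so the data flow is visible.

```python
from typing import Dict, List


def call_model(prompt: str) -> str:
    """Hypothetical stub for the engine; it only wraps the prompt in brackets."""
    return f"[{prompt}]"


class DebugSession:
    def __init__(self, workers: List[Dict[str, str]], user_input: str) -> None:
        self.workers = workers
        self.inputs = [user_input]  # inputs[i] is the input fed to worker i
        self.position = 0           # index of the next worker to run

    def _run(self, index: int) -> str:
        worker = self.workers[index]
        output = call_model(worker["prompt"].format(input=self.inputs[index]))
        # Discard any stale downstream input, then store the fresh output.
        del self.inputs[index + 1:]
        self.inputs.append(output)
        return output

    def step(self) -> str:
        """Run the next worker, then pause so its output can be inspected."""
        output = self._run(self.position)
        self.position += 1
        return output

    def rerun_current(self, new_prompt: str) -> str:
        """Replace the prompt of the worker that just ran and execute it again."""
        if self.position == 0:
            raise RuntimeError("no worker has run yet")
        current = self.position - 1
        self.workers[current]["prompt"] = new_prompt
        return self._run(current)


workers = [
    {"name": "topic", "prompt": "Topic of: {input}"},
    {"name": "quiz", "prompt": "Quiz on: {input}"},
]
session = DebugSession(workers, "cells")
first = session.step()                                     # inspect the first output
fixed = session.rerun_current("Main topic of: {input}")   # not as expected: edit and rerun
second = session.step()                                    # continue with the next worker
print(first, fixed, second, sep="\n")
```

Keeping the input of every worker makes a rerun cheap, because only the current worker is executed again rather than the whole chain.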

If a worker block is placed in an output block, its output will be displayed in the output window at the bottom right. This window will not display the output of workers not placed in output blocks, nor will it display prompts.

The block console helps AI chain engineers debug the AI chain, so it contains prompt information and intermediate execution results. The output window in the bottom right allows engineers to check the final output of the AI chain that end users will see.
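
The difference between the two panels comes down to a filter over the same execution records. The sketch below uses a hypothetical `Record` type and hand-written records purely to illustrate that split.

```python
from typing import List, NamedTuple


class Record(NamedTuple):
    worker: str
    prompt: str
    output: str
    in_output_block: bool  # True when the worker block sits inside an output block


def console_view(records: List[Record]) -> List[str]:
    """Engineer-facing view: every worker, with its prompt and output."""
    return [f"{r.worker} | prompt: {r.prompt} | output: {r.output}" for r in records]


def output_window(records: List[Record]) -> List[str]:
    """End-user view: outputs of workers in output blocks only, never prompts."""
    return [r.output for r in records if r.in_output_block]


records = [
    Record("topic", "Topic of: cells", "cell biology", False),
    Record("quiz", "Quiz on: cell biology", "1. What is a cell?", True),
]
print(console_view(records))
print(output_window(records))
```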

Prompt Hub

The Prompt Hub of Sapper IDE provides a centralized prompt management system, allowing users to easily share and reuse prompts across AI chain projects. Through the Prompt Builder and Prompt Base toolboxes, users can create, edit, import, and export prompts, making AI chain project development more efficient and convenient.

Users can create or edit prompts in a structured way through four aspects: context, instructions, examples, and output formats. This helps achieve more accurate functionalities in AI chain projects. Future design views will allow users to search the prompt library or receive automatic prompt recommendations, further enhancing development efficiency.

Prompt Hub also supports downloading prompts to local files or uploading prompts from local files to the IDE, facilitating synchronization of prompt information across different devices.
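
A download and upload round trip of this kind might look like the sketch below. The source does not say which file format the IDE uses, so JSON is an assumption here, as are the field names and the helper functions `export_prompts` and `import_prompts`.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, List

Prompt = Dict[str, object]


def export_prompts(prompts: List[Prompt], path: Path) -> None:
    """Write the prompts to a local file (JSON is an assumed format)."""
    path.write_text(json.dumps(prompts, indent=2, ensure_ascii=False), encoding="utf-8")


def import_prompts(path: Path) -> List[Prompt]:
    """Read prompts back from a local file written by export_prompts."""
    return json.loads(path.read_text(encoding="utf-8"))


# The four structured aspects follow the text; the field names are hypothetical.
prompts = [
    {
        "name": "quiz_writer",
        "context": "You write quizzes for primary school pupils.",
        "instruction": "Write three questions about the given topic.",
        "examples": [],
        "output_format": "A numbered list.",
    }
]

with tempfile.TemporaryDirectory() as folder:
    file_path = Path(folder) / "prompts.json"
    export_prompts(prompts, file_path)
    restored = import_prompts(file_path)

print(restored == prompts)
```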

Engine Management

The engine management function allows users to easily share and reuse various engines across AI chain projects, such as foundation models, traditional machine learning models (currently under development, stay tuned), and external APIs. The IDE comes with three foundation models (gpt-3.5-turbo, text-davinci-003, and DALL-E) and a standard Python REPL shell.

In the FM Engines toolbox, users can flexibly create and configure foundation model engines, adjusting parameters such as Temperature, Maximum length, Top P, Frequency penalty, and Presence penalty. Clicking “Save Engine to FM Engine” saves the engine for later editing or exporting to projects.
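
An engine configuration with these five parameters could be modelled as a small validated record. This is a sketch, not the IDE's data model: `FMEngineConfig` and its default values are hypothetical, and the value ranges are assumptions based on the ranges OpenAI documents for its text models, which may not match what the IDE enforces.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FMEngineConfig:
    model: str
    temperature: float = 0.7       # hypothetical default
    max_length: int = 256          # hypothetical default, in tokens
    top_p: float = 1.0
    frequency_penalty: float = 0.0
    presence_penalty: float = 0.0

    def __post_init__(self) -> None:
        # Assumed ranges; adjust them to whatever the target engine accepts.
        checks = {
            "temperature": 0.0 <= self.temperature <= 2.0,
            "max_length": self.max_length > 0,
            "top_p": 0.0 <= self.top_p <= 1.0,
            "frequency_penalty": -2.0 <= self.frequency_penalty <= 2.0,
            "presence_penalty": -2.0 <= self.presence_penalty <= 2.0,
        }
        invalid = [name for name, ok in checks.items() if not ok]
        if invalid:
            raise ValueError(f"out-of-range parameters: {', '.join(invalid)}")


creative = FMEngineConfig(model="gpt-3.5-turbo", temperature=1.2, max_length=512)
print(creative)

try:
    FMEngineConfig(model="gpt-3.5-turbo", top_p=1.5)
except ValueError as error:
    print(error)
```

Validating at construction time means a saved engine can be reused across projects without rechecking its parameters.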

Finally, users can download engine information to local files or upload it to the IDE from local files.

AI Chain Project Management

Through the “Project Management” menu, users can create new AI chain projects, download the current project to the local disk, or open projects on the local disk in the IDE. By clicking the “Download Code” button, users can download the backend code implementing the AI chain to the local disk for use in other software projects. Note that executing the downloaded AI chain code requires the sapperchain Python library (currently not open-sourced).

In the “Recent Project” menu, we pre-installed a demo project “Hui Xiao Shi” to facilitate users’ learning.

If users want to open-source their AI chain projects, they can share the projects in the AI chain marketplace (currently under development). The IDE provides a creative co-pilot to generate brief descriptions and images for projects based on task requirements and worker prompts.

Currently, the IDE supports deploying AI chains as local web services, which makes manual deployment to external cloud servers easier. We will later launch an automatic cloud service deployment feature.

IDE Function Demonstration Tutorial

The Unique Features of Prompt Sapper

Prompt Sapper is inspired by many outstanding projects and tools, such as ChatGPT, AutoGPT, LangChain, no-code AI, and a wealth of prompt engineering literature and tools. However, we are unique in the following three aspects:

The Spectrum of Human-AI Interaction

1. Emphasizing collaborative intelligence between artificial intelligence and human users. As shown in the figure above, it seamlessly integrates human intelligence with artificial intelligence through AI chains, effectively solving complex problems and achieving common goals. This collaborative intelligence promotes overall efficiency, reduces error rates, and empowers human users to fully leverage the potential of artificial intelligence. This unique approach distinguishes Prompt Sapper from existing human-driven chatbots (like ChatGPT) and AI-dominated agent frameworks (like AutoGPT), highlighting its innovative and unique value proposition.

2. Lower requirements for computing and programming skills. Prompt Sapper significantly lowers the threshold for creating complex AI services that meet user needs. It introduces a set of LLM-based virtual product managers, architects, and prompt engineers to help users acquire domain knowledge, analyze task requirements, and build AI chains. Additionally, Prompt Sapper provides an intuitive and user-friendly interface, allowing users to easily interact with artificial intelligence and prototype AI capabilities without requiring advanced computing or programming skills. This approach expands the population benefiting from advancements in artificial intelligence, highlighting Prompt Sapper’s unique position in the field of AI.

3. A systematic AI4SE4AI framework. Prompt Sapper places a high value on the close integration of software engineering and artificial intelligence, aiming to create a systematic AI4SE4AI framework. Within this framework, Prompt Sapper leverages AI technologies to significantly enhance the efficiency of software engineering processes, such as requirements analysis, AI chain design, building, and testing. At the same time, Prompt Sapper follows and extends the best practices of software engineering to adapt to the new software environment driven by AI 2.0 and Software 3.0. This AI4SE4AI framework not only greatly improves the efficiency and quality of AI service development but also supports flexible service reuse and assembly, as well as continuous improvement and optimization of AI services to meet evolving demands.

The table below summarizes the comparison of Prompt Sapper with important related technologies; for details, please refer to our documentation (https://www.aichain.online/public/content%20pages/sappervsothers.html)

AutoGPT vs. Prompt Sapper

LangChain vs. Prompt Sapper

No-code AI vs. Prompt Sapper

Prompt Engineering vs. Prompt Sapper

Looking Ahead

We are in an exciting era for AI and software engineering, witnessing how technological advancements are changing the world. Prompt Sapper, in conjunction with foundation models and software engineering, will continuously explore the best practices and methodologies of AI chain engineering, promoting its development and popularization. We plan to adopt a “go out” and “bring in” approach to bridge the last mile between AI chain development and end users, bringing AI chain engineering methodologies, tools, and practices to more developers and users, and promoting the development of the AI service marketplace and ecosystem. We believe that AI chain engineering will become one of the core technologies of the future, widely applied across various fields and industries, creating more value and benefits for humanity. We envision that AI chains will help us solve problems more quickly, improve work efficiency, provide more personalized services, and drive society and the economy toward a more intelligent future.
