This article is contributed by Agora.
Based on the open-source TEN Framework, Agora recently launched its latest conversational AI engine, a "plug-and-play" deployment solution for conversational AI. With just 2 lines of code and 15 minutes, even a text-only large model such as DeepSeek can be turned into a multimodal model capable of real-time voice conversation.
Through the official demo website, you can experience various use cases of conversational AI, including smart assistants, emotional companionship, speaking practice, and intelligent customer service.
Come and experience real-time dialogue with AI:
https://conversational-ai.shengwang.cn

The beta testing phase is temporarily free. For more product introductions and documentation: https://www.shengwang.cn/ConversationalAI/
Recently, DeepSeek has taken the world by storm, with its distinctive deep-thinking + web-search mode impressing users with its usability. If you want to move beyond text-only interactions with AI and hold more natural voice conversations, Agora's conversational AI engine lets you do so in just 15 minutes.
Today, Agora's conversational AI engine beta version is officially online. Developers can enable the service's access APIs in the Console backend, then adjust parameters, test, and generate code in the Playground. With just 2 lines of code, you can deploy a conversational AI agent based on a large model in 15 minutes.
Agora’s conversational AI engine official website is also live, allowing developers to learn about product features and apply for the latest demo applications through the website.
5 Key Features to Make Your Large Model Conversational
Agora's conversational AI engine provides developers with a natural conversational experience and an extremely simple development and deployment process: even a text-only large model such as DeepSeek can quickly be turned into a multimodal model capable of real-time voice conversation. The engine also supports the stable, full version of DeepSeek hosted on Alibaba Cloud and Tencent Cloud, eliminating the familiar "server busy, please try again later" errors.
How to Quickly Deploy the Conversational AI Engine
Developers can quickly call Agora's conversational AI engine RESTful API to enable voice interaction with AI. Following the process below, you can go from opening the Console backend to saying "Hello, Agent" in 15 minutes with just 2 lines of code, significantly lowering the development threshold. First, make sure the following prerequisites are met:
1. You have activated the service in the Agora Console and obtained the App ID, temporary Token, customer ID, and customer secret information.
2. You have contacted Agora technical support to enable the Agora conversational AI engine for your project.
3. Your app has implemented basic real-time audio and video functionality.
4. You have obtained the API key and callback URL from the large model provider.
5. You have obtained the API key from the text-to-speech (TTS) provider.
Once the above pre-requisites are met, you can interact with the agent via voice. The specific process is shown in the figure below:

Overall, the deployment process consists of three core steps:
Step 1: Join the RTC Channel: Call 'joinChannel' in your app to join an RTC channel.
Step 2: Create the Conversational Agent: Call 'Create Conversational Agent' to create an agent instance, passing in the channel name and Token from the previous step so that the agent joins the same RTC channel. After this step, Agora recommends going to 'Console – Conversational AI Engine – Playground' to try a conversation with the AI and verify that your parameters are configured correctly; when you are done, click 'View code' in the upper right corner to copy the automatically generated server-side API call sample code.
Step 3: Stop the Conversational Agent: After the conversation ends, call ‘Stop Conversational Agent’ to let the agent leave the RTC channel.
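To make the three steps concrete, here is a minimal server-side sketch of Steps 2 and 3 (Step 1, 'joinChannel', happens client-side in your app via the RTC SDK). The endpoint paths, payload field names, and the "join"/"leave" route names below are assumptions for illustration only and are not taken from Agora's official API reference; consult the documentation center linked above for the authoritative request format. What the article does state is assumed here: authentication uses the customer ID and customer secret, and the agent is given the same channel name and temporary Token your app used.

```python
# Hypothetical sketch of the server-side calls behind the "2 lines of code"
# deployment flow. Endpoint paths and field names are illustrative guesses,
# NOT verified Agora API routes -- check the official docs before use.
import base64
import json

AGORA_API_BASE = "https://api.agora.io"  # assumed REST base URL


def basic_auth_header(customer_id: str, customer_secret: str) -> str:
    """Per the prerequisites, REST calls authenticate with the customer ID
    and customer secret pair (standard HTTP Basic auth is assumed here)."""
    token = base64.b64encode(f"{customer_id}:{customer_secret}".encode()).decode()
    return f"Basic {token}"


def build_create_agent_request(app_id: str, channel: str, rtc_token: str,
                               llm_api_key: str, tts_api_key: str):
    """Step 2: 'Create Conversational Agent'. The agent instance is passed
    the same channel name and Token used by the app in Step 1, so it joins
    the same RTC channel. Field names below are illustrative."""
    url = f"{AGORA_API_BASE}/projects/{app_id}/agents/join"  # assumed path
    body = {
        "properties": {
            "channel": channel,            # same channel as 'joinChannel'
            "token": rtc_token,            # same temporary Token
            "llm": {"api_key": llm_api_key},   # from the LLM provider
            "tts": {"api_key": tts_api_key},   # from the TTS provider
        },
    }
    return url, json.dumps(body)


def build_stop_agent_request(app_id: str, agent_id: str) -> str:
    """Step 3: 'Stop Conversational Agent'. After the conversation ends,
    this call makes the agent leave the RTC channel."""
    return f"{AGORA_API_BASE}/projects/{app_id}/agents/{agent_id}/leave"
```

In practice you would POST these requests with an HTTP client of your choice, attaching the Basic auth header to each; the Playground's 'View code' button generates the exact, ready-to-run version of these calls for you.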
For more detailed deployment processes, click on the “Read Original” link at the bottom of the article to view the official documentation center.
Additionally, Agora's conversational AI engine is free during the beta testing phase. Developers from all industries are welcome to try the deployment and share feedback with us. You can also apply for the latest demo application of Agora's conversational AI engine through the Agora official website or the QR code below.
Join Our Voice Agent Community
The RTE developer community continues to focus on Voice Agent and the next generation of voice-driven human-computer interaction interfaces. If you are also interested and look forward to communicating with more developers (online/offline meetups and study note sharing every month), feel free to join our community WeChat group to explore new paradigms of real-time interaction between humans and AI together.
Join us: add the WeChat ID Creators2022 and include a note stating who you are and why you are joining (company/project + position + "join group"); requests with complete notes will be approved first.
More Voice Agent Learning Notes:
How to Play with Multimodal AI? Here are 18 Ideas
AI Reshapes Religious Experiences, Can Voice Agents Be the Breakthrough?
TalktoApps Founder: Voice AI Increased My Productivity by 5 Times; Voice Input is the Future of Human-Computer Interaction
a16z Latest Voice AI Report: Voice Will Become a Key Entry Point, But Not the Final Product Itself (Including Latest Diagram)
What Do Conversational AI Hardware Developers Care About? Low-Latency Voice, Visual Understanding, Always-On, Edge Intelligence, Low Power Consumption… | RTE Meetup Review
2024, The Year of Voice AI; 2025, Voice Agents Are Set to Explode | Annual Report Released
Conversation with Google’s Project Astra Research Director: Building a General AI Assistant, Proactive Video Interaction and Full-Duplex Dialogue Are Future Focus Areas
This Voice AI Company Raises $27 Million and Predicts Voice Technology Trends for 2025
Voice as an Entry Point: How AI Voice Interaction Reshapes the Next Generation of Intelligent Applications
Gemini 2.0 is Here, and These Voice Agent Developers Have Already Started Exploring…
