WindSurf Update Testing & Open Source Multimodal AI Creation App

Hello everyone, I’m Kate.

Do you remember the English version of the AI creation app I shared yesterday? A user left a message asking if there is a Chinese voice version. Now, it’s finally here! And this time, it’s still open source!

In this video, I will take you on a deep dive into the latest updates of WindSurf and see what new breakthroughs it has made in web searching, automated memory, and code execution. More importantly, I will also share a multifunctional AI creation app built on WindSurf, allowing you to witness how to generate stories, voiceovers, sound effects, and painting prompts all in one click.

WindSurf Latest Update: More Than Just “Surfing the Web”

This update of WindSurf is not just about a nice name; its functionalities have been genuinely upgraded:

  • Web Search: Now you can converse with it in a more natural language to help you retrieve information. It supports direct URL input or recognizes webpage content through the <span>@web</span> command, making it convenient and quick.
  • Automated Memory: The newly added automated memory function allows the AI to remember your preferences and settings for smarter subsequent use.
  • Code Execution: The <span>Cascade</span> code execution function has been optimized to better utilize the terminal shell of the underlying ID. After conversing with WindSurf, an icon will appear in the upper right corner saying <span>Go to Terminal</span>, allowing you to see terminal information on the left side. This avoids reopening ports and makes management easier.

Testing WindSurf Updates: Is It Really Useful?

I have experienced the new features of WindSurf, and my overall impression is:

  • Search Function: The accuracy of search results has improved, capturing detailed webpage content and providing installation steps. However, it still uses some older commands, which is not intelligent enough.
  • Memory Function: The automated memory function currently has some issues and tends to forget previous content.
  • Code Execution: The improvements in <span>Cascade</span> code execution are indeed practical, allowing real-time visibility of terminal output, which is convenient for debugging.

Multimodal AI Creation Application: Build Your Own Studio

The main event is here! Based on WindSurf, I integrated the DeepSeek, MiniMax (Conch), Replicate, and Elevenlabs APIs to create a powerful AI creation application:

  • DeepSeek: Used for generating stories, AI painting prompts, and providing suggestions for sound effect prompts and their placement.
  • MiniMax (Conch): Provides high-quality Chinese voice synthesis.
  • Replicate: Used for generating English voice (the voice of Nicole in Kokoro TTS is great!).
  • Elevenlabs: Provides a rich library of sound effects.

So, how do you use it?

  1. Story Generation: Input the theme of the story you want to tell, choose the language and voice, and the application will automatically generate text and voiceover. You can also set the word count to make the story fit your needs better.

  2. AI Painting Prompts: While generating the story, the application can also generate AI painting prompts for you, including scenes, characters, lighting, actions, etc., allowing you to easily create beautiful images. The prompts are downloadable!

  3. Sound Effect Generation: Click the button to generate sound effects instantly, and you can also choose to generate individual sound effects. Meanwhile, the application will provide suggestions for the insertion positions of sound effects based on the story content, making your work more vivid.

  4. Multilingual Text-to-Speech: You can choose Chinese and English for text-to-speech, where Chinese uses Conch, and English uses Kokoro.

Version Management and Documentation Management: Making Development More Efficient

During the development process, I deeply realized the importance of version management and documentation management:

  • Version Management: Timely Git management allows for easy backtracking and collaborative development.
  • Documentation Management: Detailed documentation can help the AI better understand your needs. For example, I informed WindSurf of the API documentation for DeepSeek, MiniMax, and Replicate Kokoro, allowing it to better call these services.

Feature Demonstration: Seeing is Believing

Having said so much, it’s better to see the results directly!

(Time: 8:30)

  • Story Generation: Click the “Generate Story” button, and the AI instantly generates an interesting little story.
  • Text-to-Speech: Click the “Generate Audio” button, and you can hear the story read by Conch or Kokoro.
  • Sound Effect Generation: Click the “Generate Sound Effect” button to hear various background sound effects.
  • AI Painting Prompts: Click the “Generate Prompts” button, and the application will automatically generate multiple AI painting prompts.
  • Sound Effect Placement Suggestions: Based on the story content, the application will provide suggestions for the placement of sound effects.

This project is open source, and everyone is welcome to experience it.

Resource Link:

  • https://github.com/nicekate/Al-StoryLab

Advertisement

In the past, I have created over 270 original AI-themed articles, and I am confident in continuing to write because this is my hobby, and I am very passionate about it.

If you enjoy my articles and videos, feel free to join my knowledge planet, where I will share the latest AI news, source code, and answer your questions. See you next time!

WindSurf Update Testing & Open Source Multimodal AI Creation App
WindSurf Update Testing & Open Source Multimodal AI Creation App

For historical articles, please see here:

Multifunctional Content AI Creation Application Based on DeepSeek, Kokoro, and Replicate: One-Stop Generation of Stories, Podcasts, Images, and Audio

Detailed Explanation of Kokoro TTS: Efficient Text-to-Speech with 82M Parameters | Includes Local Deployment Tutorial

Using AI to Create a Video Content Analysis Tool: One-Click Video Insights | Integrating DeepSeek + AI IDE

WindSurf + DeepSeek Create AI Smart Flashcards: Full Process Demonstration from Interface Design to Function Implementation | Open Source Sharing

Open Source Rising Star! Microsoft 14B Strongest Small Model Phi-4 Local Deployment and Performance Testing

LiveKit + Groq: Creating Low-Latency, Real-Time Interruptible AI Voice Conversation Applications | Installation and Configuration Full Tutorial

Browser-Use WebUI Tutorial: Easily Achieve Browser Automation | Supports Gemini/DeepSeek and Other AI Models

3 Lines of Code to Implement AI Agent | Deep Analysis of smolagents: Hugging Face’s Latest Agent Construction Library

Replit Agent vs. Boltnew: Full Stack Development Practice of AI Sound Effect Generators, Who is Better?

[Testing] Is DeepSeek V3 Really That Amazing? Partnered with Roo Cline, Comparing Claude and o1 Programming Abilities

DeepSeek V3 Testing: Comparing with Claude 3.5 Sonnet and o1 Pro Coding Abilities

Leave a Comment