Hello everyone! This is a channel focused on AI agents~
The AI programmer Devin, priced at $500 per month, took the Chinese developer community by storm within days of launch. Feedback from users who have tried it has generally been positive, but the $500 price tag suggests it is aimed mainly at enterprise customers.
One interesting aspect of Devin is that it behaves like an organized intern: it formulates a plan first and then continuously updates the plan's progress during execution. This makes it easy for us to track what the AI is currently doing, keeps it from drifting away from the original plan, and encourages deeper thinking, which improves the quality of task completion.
Although this feature seems quite advanced, implementing it with Cursor is actually quite simple!
For Cursor, there is a special file called .cursorrules located in the root directory of the opened folder. Its uniqueness lies in allowing us to modify Cursor’s prompts to the backend LLM. In other words, everything in this file becomes part of the prompts sent to the backend AI (like GPT or Claude).
For example, we can put the planned content into this file, so that every time we interact with Cursor, it receives the latest version of the plan. We can also provide more detailed instructions in this file, such as prompting it to think and formulate a plan at the start of a task and update the plan after completing each step.
Since Cursor's Agent can modify files, and .cursorrules is itself a file, this forms a closed loop: Cursor automatically reads the file content on each interaction, picks up the latest state, and after thinking, writes the updated progress and next steps back into the file, so the plan always stays current.
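As a minimal sketch of this loop (the task and wording below are illustrative, not a real project's file), the plan-tracking part of `.cursorrules` might look like:

```markdown
# Scratchpad
Task: add a dark-mode toggle to the settings page
[X] Audit the existing CSS variables
[ ] Add the toggle component
[ ] Persist the user's choice

After finishing each step, mark it done above and note the next step.
```

Each time the Agent edits this section, the next prompt automatically contains the updated plan.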
The ability of self-evolution can also be achieved similarly. In the .cursorrules file, we add prompts that allow Cursor to reflect on its mistakes when users correct it and consider if there are reusable lessons to be documented. If so, it will update the relevant parts of the .cursorrules file. This way, it can accumulate knowledge specific to certain projects.
If you use Windsurf, things are slightly different: possibly for security reasons, it does not allow the AI to modify the .windsurfrules file directly. So we need to split the mechanism into two parts, using another file such as scratchpad.md. In the .windsurfrules file, we instruct the AI to check the Scratchpad before each round of thinking and to update the plan there. This indirect method may not work as well as modifying .cursorrules directly, because it still requires the AI to call the Agent and reason over the feedback, but in my testing it is feasible.
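A hypothetical sketch of that split (both file contents are illustrative): the rules file only points at the scratchpad, and the plan itself lives in a separate Markdown file the Agent is allowed to edit.

```markdown
<!-- .windsurfrules (excerpt) -->
Before planning any step, read scratchpad.md for the current plan and progress.
After completing each subtask, update scratchpad.md using todo markers.

<!-- scratchpad.md -->
[X] Set up the project skeleton
[ ] Implement the data loader
```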
Finally, there is the use of extended tools.
Compared to Cursor, one of Devin’s main advantages is the ability to use more tools. For example, it can call a browser for searches, browse web pages, and even use its own brain to intelligently analyze content via LLM. Although Cursor does not support these features by default, we can directly control Cursor’s prompts through .cursorrules, and it has command execution capabilities, creating another closed loop. We can prepare pre-written programs, such as Python libraries or command line tools, and then introduce their usage in .cursorrules, allowing it to learn and use them naturally to complete its tasks.
However, there are still three limitations.
- Devin has image understanding capabilities, so it can handle front-end interaction and testing. Due to Cursor's limitations, we cannot pass images as input to the backend AI; that would require changes to Cursor's implementation.
- Devin is not recognized as a bot by websites' anti-crawling measures, whereas in my tests of web search and data collection I often ran into CAPTCHAs or outright blocking. This needs further exploration.
- Devin runs in a fully virtualized cloud environment, so it can safely execute all kinds of commands without worrying about attacks on the LLM or dangerous commands being run by mistake; even if the entire system gets wiped, starting a new container restores it. Cursor, by contrast, runs on your local machine, which raises real security concerns, so its Agent requires our manual confirmation. Cursor launched Yolo mode on the 17th of last week, but it still does not support developing inside Docker, although the team is gradually considering it. Windsurf can connect directly to Docker containers and run commands in them; given a configuration file, it can even start a new container, perform initialization, and map local folders, which greatly improves safety.
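If you want a similar sandbox for a local agent today, one rough sketch (the image name, paths, and options are my assumptions, not a documented Cursor or Windsurf setup) is a Docker Compose service that maps only the project folder:

```yaml
# docker-compose.yml: a minimal sandbox for running agent commands.
services:
  agent-sandbox:
    image: python:3.10-slim      # any base image your project needs
    working_dir: /workspace
    volumes:
      - ./:/workspace            # expose only the project folder, not the whole disk
    network_mode: "none"         # optional: cut off network access entirely
    command: sleep infinity      # keep the container alive for exec'd commands
```

Commands then run via `docker compose exec agent-sandbox ...`, so even a destructive command only affects the container and the single mapped folder.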
Finally, here is my modified `.cursorrules` file for reference.
# Instructions
During your interaction with the user, if you find anything reusable in this project (e.g. version of a library, model name), especially about a fix to a mistake you made or a correction you received, you should take note in the `Lessons` section in the `.cursorrules` file so you will not make the same mistake again.
You should also use the `.cursorrules` file as a scratchpad to organize your thoughts. Especially when you receive a new task, you should first review the content of the Scratchpad, clear out old tasks if necessary, explain the new task, and plan the steps you need to take to complete it. You can use todo markers to indicate the progress, e.g.
```
[X] Task 1
[ ] Task 2
```
Also update the progress of the task in the Scratchpad when you finish a subtask.
Especially after you finish a milestone, reflecting and planning in the Scratchpad will improve the depth of your task accomplishment.
The goal is to help you maintain a big picture as well as the progress of the task. Always refer to the Scratchpad when you plan the next step.
# Tools
Note that all the tools are written in Python. So if you need to do batch processing, you can always consult the Python files and write your own script.
## LLM
You always have an LLM at your side to help you with the task. For simple tasks, you could invoke the LLM by running the following command:
```
py310/bin/python ./tools/llm_api.py --prompt "What is the capital of France?"
```
But usually, it's a better idea to check the content of the file and use the APIs in the `tools/llm_api.py` file to invoke the LLM if needed.
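For example, the batch-processing pattern mentioned above could look like the following sketch (the file-classification task and the injectable `query` parameter are my own illustration; `query_llm` is the function defined in `tools/llm_api.py`, shown in the Scratchpad below):

```python
# Illustrative batch-processing script built on tools/llm_api.py.
# The `query` parameter is injectable so the logic can be tested without a live LLM.
from pathlib import Path


def classify_files(paths, query=None):
    """Send each file's content to the LLM and collect one answer per file."""
    if query is None:
        from tools.llm_api import query_llm as query  # the real LLM call
    results = {}
    for path in paths:
        text = Path(path).read_text(encoding="utf-8")
        results[str(path)] = query(f"Answer TECH or OTHER for this text:\n{text}")
    return results
```

First test it on a few files with a stubbed `query`, then point it at the real LLM and parallelize.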
## Web browser
You could use the `tools/web_scraper.py` file to scrape the web.
```
py310/bin/python ./tools/web_scraper.py --max-concurrent 3 URL1 URL2 URL3
```
This will output the content of the web pages.
## Search engine
You could use the `tools/search_engine.py` file to search the web.
```
py310/bin/python ./tools/search_engine.py "your search keywords"
```
This will output the search results in the following format:
```
URL: https://example.com
Title: This is the title of the search result
Snippet: This is a snippet of the search result
```
If needed, you can further use the `web_scraper.py` file to scrape the web page content.
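Because that output format is line-based, a downstream script can parse it back into structured records. Here is a hypothetical helper (the function name is my own, not part of the tools):

```python
# Hypothetical parser for the URL/Title/Snippet blocks printed by search_engine.py.
def parse_search_output(text):
    """Turn the tool's plain-text output into a list of dicts."""
    results, current = [], {}
    for line in text.splitlines():
        line = line.strip()
        for key in ("URL", "Title", "Snippet"):
            prefix = key + ": "
            if line.startswith(prefix):
                if key == "URL" and current:  # a new URL starts a new record
                    results.append(current)
                    current = {}
                current[key.lower()] = line[len(prefix):]
    if current:
        results.append(current)
    return results
```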
# Lessons
## User Specified Lessons
- You have a python venv in ./py310.
- Include info useful for debugging in the program output.
- Read the file before you try to edit it.
- Use LLM to perform flexible text understanding tasks. First test on a few files. After success, make it parallel.
## Cursor learned
- For website image paths, always use the correct relative path (e.g., 'images/filename.png') and ensure the images directory exists
- For search results, ensure proper handling of different character encodings (UTF-8) for international queries
- Add debug information to stderr while keeping the main output clean in stdout for better pipeline integration
- When using seaborn styles in matplotlib, use 'seaborn-v0_8' instead of 'seaborn' as the style name due to recent seaborn version changes
# Scratchpad
Three tools:

- `llm_api` uses a vLLM-deployed Qwen model:

```python
from openai import OpenAI
import argparse


def create_llm_client():
    client = OpenAI(
        base_url="http://192.168.180.137:8006/v1",
        api_key="not-needed"  # API key might not be needed for local deployment
    )
    return client


def query_llm(prompt, client=None, model="Qwen/Qwen2.5-32B-Instruct-AWQ"):
    if client is None:
        client = create_llm_client()
    try:
        response = client.chat.completions.create(
            model=model,
            messages=[
                {"role": "user", "content": prompt}
            ],
            temperature=0.7,
        )
        return response.choices[0].message.content
    except Exception as e:
        print(f"Error querying LLM: {e}")
        return None


def main():
    parser = argparse.ArgumentParser(description='Query an LLM with a prompt')
    parser.add_argument('--prompt', type=str, help='The prompt to send to the LLM', required=True)
    parser.add_argument('--model', type=str, default="Qwen/Qwen2.5-32B-Instruct-AWQ",
                        help='The model to use (default: Qwen/Qwen2.5-32B-Instruct-AWQ)')
    args = parser.parse_args()

    client = create_llm_client()
    response = query_llm(args.prompt, client, model=args.model)
    if response:
        print(response)
    else:
        print("Failed to get response from LLM")


if __name__ == "__main__":
    main()
```
- `tools/search_engine` uses DuckDuckGo:

```python
import argparse
import sys
import traceback
from duckduckgo_search import DDGS


def search(query, max_results=10):
    """
    Search using DuckDuckGo and return results with URLs and text snippets.
    Uses the HTML backend which has proven to be more reliable.

    Args:
        query (str): Search query
        max_results (int): Maximum number of results to return
    """
    try:
        print(f"DEBUG: Searching for query: {query}", file=sys.stderr)
        with DDGS() as ddgs:
            results = list(ddgs.text(
                query,
                max_results=max_results,
                backend='html'  # Use only the HTML backend
            ))

            if not results:
                print("DEBUG: No results found", file=sys.stderr)
                return

            print(f"DEBUG: Found {len(results)} results", file=sys.stderr)
            for i, r in enumerate(results, 1):
                print(f"\n=== Result {i} ===")
                print(f"URL: {r.get('link', r.get('href', 'N/A'))}")
                print(f"Title: {r.get('title', 'N/A')}")
                print(f"Snippet: {r.get('snippet', r.get('body', 'N/A'))}")
    except Exception as e:
        print(f"ERROR: Search failed: {str(e)}", file=sys.stderr)
        print(f"ERROR type: {type(e)}", file=sys.stderr)
        traceback.print_exc(file=sys.stderr)
        sys.exit(1)


def main():
    parser = argparse.ArgumentParser(description="Search using DuckDuckGo API")
    parser.add_argument("query", help="Search query")
    parser.add_argument("--max-results", type=int, default=10,
                        help="Maximum number of results (default: 10)")
    args = parser.parse_args()
    search(args.query, args.max_results)


if __name__ == "__main__":
    main()
```
Alright, that’s all I wanted to share today. If you’re interested in building AI agents, don’t forget to like and follow!~