Comparing Generative AI in Academic Writing: DeepSeek, ChatGPT, and More

With the rapid development of artificial intelligence (AI), and especially of large language models (LLMs), generative AI is increasingly being applied to academic writing.

Researchers from the University of Waterloo and other institutions recently published a preprint titled “Generative AI in Academic Writing: A Comparison of DeepSeek, Qwen, ChatGPT, Gemini, Llama, Mistral, and Gemma”. The paper evaluates how well several mainstream large language models (DeepSeek v3, Qwen 2.5 Max, ChatGPT, Gemini, Llama, Mistral, and Gemma) perform in academic writing, particularly their ability to generate high-quality academic content.

The paper compares these generative AI models in detail on an academic writing task and offers a reference point for future research.

The authors used 40 academic articles on the topics of “digital twins” and “healthcare” as source material, generated text from them with the generative AI tools, and evaluated the generated text on five criteria:

1. Plagiarism detection: Using the iThenticate tool to check the plagiarism rate of the text.

2. AI detection: Using StealthWriter.ai and Quillbot.com to detect whether the text was generated by AI.

3. Word count comparison: Comparing the word counts of texts generated by different large language models.

4. Semantic similarity: Using ChatGPT, DeepSeek v3, and Qwen 2.5 Max to evaluate the semantic similarity of the generated text to the original text.

5. Readability assessment: Using Hemingway Editor, Grammarly, and WebFX tools to assess the readability of the text.
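
As a rough illustration of criteria 3 and 5, the sketch below counts words and computes a Flesch Reading Ease score for each model's output. It is a minimal sketch under stated assumptions, not the authors' procedure: the paper relied on the tools named above, and this summary does not say how those tools score text. The Flesch formula is used here only as a stand-in, the syllable counter is a crude heuristic, and the function names, model labels, and sample texts are all hypothetical.

```python
import re


def word_count(text: str) -> int:
    """Count whitespace-separated tokens (a simple stand-in for a word counter)."""
    return len(text.split())


def count_syllables(word: str) -> int:
    """Heuristic syllable count: groups of consecutive vowels, at least one."""
    lowered = word.lower()
    count = len(re.findall(r"[aeiouy]+", lowered))
    # Drop a likely silent trailing 'e' ("make"), but keep "-le" ("table").
    if lowered.endswith("e") and not lowered.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: higher scores indicate easier text.

    score = 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (
        206.835
        - 1.015 * (len(words) / len(sentences))
        - 84.6 * (syllables / len(words))
    )


if __name__ == "__main__":
    # Hypothetical outputs; in practice these would be the generated texts.
    outputs = {
        "model_a": "Digital twins mirror a patient. They help doctors plan care.",
        "model_b": "Digital twin technology facilitates individualized "
                   "therapeutic optimization in contemporary healthcare.",
    }
    for name, text in outputs.items():
        print(name, word_count(text), round(flesch_reading_ease(text), 1))
```

Scores from a heuristic like this will not match Hemingway Editor, Grammarly, or WebFX; the sketch only shows the kind of per-model comparison the two criteria involve.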

The results show:

In terms of cost-effectiveness, DeepSeek v3 ranked highest.

In terms of text generation capability, Qwen 2.5 Max and DeepSeek v3 generated the most text, with detailed content, while Mistral 7B and DeepSeek-Coder-V2 16B generated more concise texts.

In terms of plagiarism rate, the text generated by ChatGPT 4o mini had the highest plagiarism rate (57%), while Llama 3.1 8B had the lowest (9%).

In terms of AI detection, the detection tools identified almost all model-generated texts as AI-generated, though the degree of detectable AI traces varied.

In terms of readability, all model-generated texts performed poorly, with generally low scores from the Hemingway Editor in particular.

In terms of semantic similarity, all model-generated texts maintained a high semantic similarity to the original text, with Qwen 2.5 Max and DeepSeek v3 showing the most consistency.

The research indicates that Qwen 2.5 Max and DeepSeek v3 perform well in academic writing tasks, especially at generating detailed content. Different models nevertheless have advantages in different scenarios: Llama 3.1 8B, for example, does better on plagiarism rate and readability. Future research could explore how to optimize these models to improve the quality and readability of generated text and to reduce traces of AI generation.

The authors also point to several directions for improving generative AI in academic writing: evaluating models on larger datasets to test how well they generalize; exploring human-machine collaboration, so that AI tools and human users together improve the quality of academic writing; and optimizing the generation process to reduce traces of AI generation and make texts more natural. They add that the ethical and legal issues of AI-generated content, such as copyright and citation rules, need further research.

Paper link: https://www.researchgate.net/publication/388681921

