Anti-Plagiarism Image Recognition System Exposes Paper Falsifiers


Automated tools for detecting text plagiarism have long been in wide use, but detecting reused images is much harder, especially because images are often edited in Photoshop before being reused. With the latest image recognition systems, scientists have now developed methods for identifying deliberately edited images, which can catch many cases of image reuse that would normally go undetected. This could expose thousands of papers and leave many authors sleepless. According to preliminary research, approximately 0.6% of papers may contain fabricated images, which translates to about 14,000 of the roughly 2.5 million academic papers published each year.

The latest research was posted on February 22 on the preprint server bioRxiv. A machine learning team led by Daniel Acuna at Syracuse University, New York, used an algorithm to process hundreds of thousands of biomedical papers and identify reused images. If journal editors adopt similar methods, many papers with reused images could be stopped before publication.

The authors have not yet disclosed the algorithm, but the research team has been in contact with Lauran Qualkenbush, director of the integrity office at Northwestern University, about possibly using the technology. The team has also been in contact with the vice president of integrity at the American Research Association, and trial use may begin within the year.

In early 2015, Acuna and two colleagues used an algorithm to extract over 2.6 million images from 760,000 articles available in the PubMed database, including images of cells under the microscope, tissues, and gel blots. The algorithm then zoomed in on the most distinctive parts of each image, the areas where color and grayscale vary the most, and used that information to create a digital fingerprint for each image.
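
The fingerprinting idea can be illustrated with a small sketch. Since the team's algorithm has not been disclosed, the code below is not their method: it swaps in a simple "average hash", a well-known perceptual fingerprint, and every name in it (`shrink`, `average_hash`, `hamming`, the synthetic test images) is hypothetical. It only shows why a fingerprint built from coarse brightness patterns can survive a contrast change or a resize. It does not handle rotation.

```python
from typing import List

Image = List[List[float]]  # grayscale pixels, row-major (illustrative type)


def shrink(img: Image, size: int = 8) -> Image:
    """Downsample to size x size by averaging the pixels in each grid cell.

    Assumes the image is at least size x size pixels.
    """
    h, w = len(img), len(img[0])
    out = []
    for r in range(size):
        r0, r1 = r * h // size, (r + 1) * h // size
        row = []
        for c in range(size):
            c0, c1 = c * w // size, (c + 1) * w // size
            cell = [img[y][x] for y in range(r0, r1) for x in range(c0, c1)]
            row.append(sum(cell) / len(cell))
        out.append(row)
    return out


def average_hash(img: Image, size: int = 8) -> int:
    """Fingerprint: one bit per cell, set when the cell is brighter than the mean."""
    flat = [p for row in shrink(img, size) for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


# Synthetic 16x16 test images (made up for this illustration).
original = [[x * 10 + y * 3 for x in range(16)] for y in range(16)]
# Same picture with reduced contrast and shifted brightness.
adjusted = [[p * 0.5 + 40 for p in row] for row in original]
# Same picture enlarged 2x by repeating every pixel.
enlarged = [[original[y // 2][x // 2] for x in range(32)] for y in range(32)]
# A different picture: a coarse checkerboard.
unrelated = [[255 if (x // 2 + y // 2) % 2 == 0 else 0 for x in range(16)]
             for y in range(16)]

print(hamming(average_hash(original), average_hash(adjusted)))   # 0
print(hamming(average_hash(original), average_hash(enlarged)))   # 0
print(hamming(average_hash(original), average_hash(unrelated)))  # large distance
```

A small Hamming distance between two fingerprints marks a pair as a candidate duplicate. As in the study, such candidates would still need manual review.
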

After filtering out elements such as labeled arrows and flowchart components, the team was left with around 2 million images. To avoid comparing every image against every other image, they compared images only across papers that shared the same first and corresponding authors. Their method can identify duplicated images even after rotation, resizing, or contrast and color adjustments. They then manually reviewed 3,750 images that had been flagged as possible duplicates. According to their findings, about 1.5% of papers in the database contain suspicious images, and 0.6% may contain fabricated images.

To reprint this article, please contact the original author for authorization and indicate that it comes from Sun Xuejun's Science Net blog. Link: http://blog.sciencenet.cn/blog-41174-1100988.html
