The Traditional Chinese Medicine (TCM) Health Intelligent Research and Development Center was established in March 2014 and has long been committed to building TCM health information resources and to researching and applying intelligent technologies. Since 2018, the center has completed construction of a big data platform for TCM literature. To date, the platform has collected over 1.3 million journal articles and more than 1,000 books.
OCR (optical character recognition) is one of the main features of the literature big data platform. It converts page images into text and structures the result, which makes the data far easier to search and reuse.
To help users browse and locate content within books, the platform has implemented and refined a function for processing each book's table of contents (directory). After a book is scanned and recognized by OCR, its table of contents is organized into a tree structure, so that users can jump quickly to the chapter or section they need.
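The idea of organizing recognized table-of-contents lines into a tree can be illustrated with a minimal Python sketch. It is not the platform's actual implementation: the line format (leading `#` characters marking depth, a trailing page number), the `TocNode` class, and the `build_toc_tree` function are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TocNode:
    """One entry in the table-of-contents tree (hypothetical structure)."""
    title: str
    page: Optional[int] = None
    children: List["TocNode"] = field(default_factory=list)


# Assumed line format: one or more '#' give the depth, then the title,
# then optional dot leaders, then the page number at the end of the line.
LINE_RE = re.compile(r"^(#+)\s*(.+?)[\s.·…]*(\d+)\s*$")


def build_toc_tree(lines):
    """Build a tree from marked table-of-contents lines using a depth stack."""
    root = TocNode("ROOT")
    stack = [(0, root)]  # (depth, node) pairs from the root to the current branch
    for raw in lines:
        match = LINE_RE.match(raw.strip())
        if not match:
            continue  # skip lines that do not follow the assumed format
        depth = len(match.group(1))
        node = TocNode(match.group(2).strip(), int(match.group(3)))
        # Pop back until the top of the stack is a shallower entry (the parent).
        while stack[-1][0] >= depth:
            stack.pop()
        stack[-1][1].children.append(node)
        stack.append((depth, node))
    return root


# Illustrative input only; these titles are made up for the example.
sample_lines = [
    "# Chapter 1 Basic Theory ...... 1",
    "## Section 1 Yin and Yang .... 3",
    "## Section 2 Five Phases .... 12",
    "# Chapter 2 Diagnostics ...... 30",
]
toc = build_toc_tree(sample_lines)
```

In this sketch, `toc.children` holds the two chapters, and each chapter's `children` list holds its sections together with their page numbers. A stack keyed on depth is used so that each new entry attaches to the nearest shallower entry above it.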
As shown in the figure below, the left side displays the original table-of-contents page, while the right side shows the recognized text in the peach-colored editing box:
First, the recognized table of contents is proofread, and specific markers are added according to the chapter page numbers, as shown in the figure below:
Once the table of contents has been extracted successfully, the result appears as shown in the figure below:
If you need text extraction services for paper or electronic books, medical case records, or similar materials, please feel free to contact us!

Copyright Notice:
- This article uses images generated by AI; if there is any infringement, please contact us for removal.
- Copyright belongs to the relevant rights holders. If there is any improper use, please contact us.
- This public account shares articles for learning and communication purposes only; please do not self-medicate based on them.

Platform URL: http://www.tcmkb.cn
Customer Service Phone: 010-64089619
Customer Service QQ: 2778196938