e-Medical Pang Tao
Driven by deepening informatization and strong policy support, the medical industry is gradually moving from informatization toward intelligent systems. As a technology that makes producing information faster and more efficient, voice recognition is a foundation of that shift.
In fact, voice recognition has long been used in healthcare abroad, where it plays an important role in writing-heavy work: admission and discharge records, nursing reports, surgical records, and reporting in medical technology departments such as radiology and pathology. Its development is pushing medical informatization toward intelligent systems.
In China, a number of companies that hold independent intellectual property rights in voice recognition have recognized the medical industry’s potential. They now provide customized models and specialized corpora for the medical field, which improves recognition performance in clinical settings. Voice recognition is being integrated into many points of medical informatization, and as these companies go deeper into the industry, more application scenarios tied to clinical workflows will emerge.
Voice recognition technology was first applied in the healthcare field by Philips Electronics in 1994, mainly for voice-dialed appointment booking, voice command control, and report entry.
SpeechMagic, Philips Electronics’ voice recognition system for the medical field, has been integrated by more than 250 medical solution providers to deliver voice-to-text conversion. Siemens Medical Solutions provided a customized solution for Vienna Hospital that used voice recognition to streamline medical text entry, reducing the time medical staff needed to submit reports from 10.5 hours to 6.5 hours, a saving of about 38%.
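The 38% figure follows directly from the two report times quoted above. As a quick check, here is a minimal sketch; the function name `relative_saving` is only illustrative and does not come from any vendor tool.

```python
def relative_saving(before_hours: float, after_hours: float) -> float:
    """Return the fraction of time saved going from before_hours to after_hours."""
    if before_hours <= 0:
        raise ValueError("before_hours must be positive")
    return (before_hours - after_hours) / before_hours


# Report submission times quoted for the Vienna Hospital deployment.
saving = relative_saving(10.5, 6.5)
print(f"{saving:.1%}")  # 38.1%
```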
In 2010, IBM deployed the Watson system in doctors’ offices. Watson can “understand” doctors’ natural language questions while quickly analyzing a mountain of medical research data to provide answers.
Nuance also deserves mention. It is the largest company specializing in the development and sale of voice recognition, image processing, and input method software, holding nearly 80% of the global voice technology market and more than 1,000 related patents. Nuance provides medical voice transcription services for the U.S. healthcare industry, utilizing patented internet voice and data distribution technology, customer base technology, and medical language experts to convert doctors’ voice recordings into electronic medical record files. At present, however, the company has no Chinese-language voice recognition products for the medical field.
According to statistics, the adoption of voice recognition entry in American clinical practice has reached 40% to 60%, mainly in radiology, pathology, and emergency departments, significantly improving work efficiency. Previously, American doctors dictated medical orders and patient notes into recording devices; secretaries typed the recordings into computers, and the doctors then revised and signed the results.
With advancements in computer performance and improvements in voice recognition technology, recent years have seen research and applications of voice recognition technology in the medical field in China.
Central South University studied the use of isolated-word voice recognition algorithms in medical instruments, making devices such as portable pulmonary function testers and home cardiovascular monitors more convenient and natural for elderly and disabled users. However, this work recognizes only isolated words with independent meanings, which lowers the difficulty of recognition and suits it to simple application scenarios.
Huazhong University of Science and Technology studied voice processing in electronic medical records, focusing on compressing voice recordings and storing them with the records. The work needed only to identify which audio required compression and which did not, because its main purpose was to retain recordings as legal evidence in case of medical disputes; it did not recognize the content of the speech.
Zhoushan Third People’s Hospital explored voice recognition in medical equipment, running experiments in which high-pressure injection pumps were controlled by voice in order to study how medical devices can be operated through voice recognition. This application demands high recognition accuracy; otherwise, an incorrect device action could endanger a patient’s life. On the other hand, the recognizer only has to respond to a fixed set of commands, each corresponding to a device action.
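To illustrate why a fixed command set is the easier case, here is a minimal sketch of command resolution that refuses to act on uncertain input. It is not the hospital’s implementation: the command phrases, the `Recognition` type, and the 0.9 confidence threshold are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical command vocabulary: spoken phrase -> device action name.
COMMANDS = {
    "start injection": "START",
    "pause injection": "PAUSE",
    "stop injection": "STOP",
}


@dataclass
class Recognition:
    text: str          # transcript returned by some recognizer (assumed)
    confidence: float  # recognizer's confidence in [0, 1] (assumed)


def resolve_command(result: Recognition, min_confidence: float = 0.9) -> Optional[str]:
    """Return the action for an exact, high-confidence match, otherwise None."""
    if result.confidence < min_confidence:
        return None  # too uncertain: do nothing rather than risk a wrong action
    phrase = " ".join(result.text.lower().split())  # normalize case and spacing
    return COMMANDS.get(phrase)  # phrases outside the vocabulary yield None


print(resolve_command(Recognition("Stop  Injection", 0.97)))  # STOP
print(resolve_command(Recognition("stop injection", 0.55)))   # None
```

Returning `None` for anything that is not an exact, confident match reflects the safety concern described above: for a device such as an injection pump, ignoring a command is safer than executing the wrong one.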
Shanghai Medical Instrumentation College researched and designed an imaging department diagnostic report generation system based on voice recognition technology. The design combines a voice recognition engine and diagnostic report system, utilizing Microsoft Speech SDK for development. However, this is only a design, and we have not yet seen the final implementation and application of the design.
It was not until 2015 that voice recognition gained an opening for wide deployment across hospital departments. That year, Peking Union Medical College Hospital worked with Beijing Yunzhisheng Information Technology Co., Ltd. (hereinafter “Yunzhisheng”) to review the workflows of the hospital’s departments, compile more than 40 GB of medical text data, and process it for classification and retrieval. As a result, the customized voice model covers key information such as common diseases, drug names, and operational steps used in each department, and achieves a recognition accuracy of over 95%.
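The article does not say how the accuracy figure was measured. One common convention, assumed here purely for illustration, is character accuracy: one minus the edit distance between the reference text and the recognized text, divided by the reference length. The sample strings below are invented.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete a reference character
                            curr[j - 1] + 1,      # insert a hypothesis character
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def character_accuracy(ref: str, hyp: str) -> float:
    """Return 1 - (edit distance / reference length), floored at 0."""
    if not ref:
        raise ValueError("reference text must not be empty")
    return max(0.0, 1.0 - edit_distance(ref, hyp) / len(ref))


# Invented example: one character dropped from a drug name.
print(character_accuracy("amoxicillin 500 mg", "amoxicilin 500 mg"))
```

Under this measure, a missed or misspelled drug name costs only a few characters, which is one reason a headline accuracy figure alone says little about clinical safety.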
Customized Support for the Application of Voice Recognition Technology

When evaluating voice recognition technology, hospitals care most about accuracy, speed, accent handling, and effective noise reduction. Whether the technology is adopted more deeply and widely in the medical industry depends on how well it performs in these areas.
Regarding how voice recognition technology can better serve the medical industry, Dr. Liu Shengping, a senior AI technology expert at Yunzhisheng, stated that due to the special nature of the medical industry, a lot of customization and optimization work needs to be done.
Dr. Liu Shengping emphasized: “The special nature of the medical environment must be given great attention; general voice recognition systems cannot meet the application requirements in medical institutions.” At the same time, Dr. Liu Shengping also acknowledged that the continuous promotion of applications and the development of technology are mutually reinforcing. Taking Yunzhisheng as an example, the accuracy of the voice recognition engine continues to improve through the accumulation of user data on the cloud platform.
In terms of how the technology has developed, earlier voice recognition systems ran on PCs, while today’s are cloud-based. According to Dr. Liu Shengping, response speed is a very important consideration. With powerful cloud computing and algorithm optimization, the response time for recognition results can be kept to about 100 milliseconds, effectively solving the speed issue.
When training voice models, Yunzhisheng draws on accents from regions across China, gathering accent data from thousands of speakers. “By training a large model with this data, we can basically solve the adaptability to accents.”
Hospital information systems are increasingly automated, and templates improve document quality and input efficiency. But each patient’s record needs to reflect as much individualized content as possible, and this is where voice recognition will play a major role.
Voice recognition is an interdisciplinary field that draws on signal processing, pattern recognition, probability theory and information theory, the mechanisms of speech production and hearing, and artificial intelligence. It is not new, but it has entered practical use only in the last twenty years. There have been achievements, but it cannot yet claim a “perfect score,” especially in the medical industry.
The medical industry is undergoing a transformation, using informatization to raise productivity. There is a market and a demand; this is a great opportunity for voice recognition technology to enter the medical industry. However, given how specialized the industry is and how high the technical barrier of voice recognition is, companies with a “nail spirit,” meaning the persistence to dig deep into industry needs, are better placed to grow alongside the industry.
© The above content is an original article from e-Medical, August 2016. Please credit the source when reprinting.