Why Do We Need Medical Knowledge Graphs?

Artificial intelligence is becoming commonplace. In the medical field, doctors increasingly recognize the efficiency and therapeutic value that AI brings to disease diagnosis. Medical AI depends on knowledge, so achieving it requires constructing a medical knowledge graph that makes medical knowledge available to applications.

◆ ◆

What is a knowledge graph?

A knowledge graph, as the name suggests, represents knowledge in the form of a graph. In essence, it describes the attributes of things and the relationships between them in a structured form rather than as plain text.

A knowledge graph consists of nodes and edges. Nodes represent entities, concepts, or attribute values; any object, location, or person can be a node. Edges represent the attributes of entities or the relationships between entities. The triple is the basic unit of representation in a knowledge graph. For example, a node could be an organizational unit, such as cardiology, or a disease, such as hypertension; an edge describes the “disease-department” relationship between hypertension and cardiology, and (hypertension, department, cardiology) is a simple triple.

Example of a Knowledge Graph
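
The triple form described above can be sketched in a few lines of Python. This is only an illustration, not the storage format of any particular graph database: the `Triple` type and the `neighbors` helper are hypothetical names, and the single fact is the article's own (hypertension, department, cardiology) example.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    """One (head, relation, tail) fact; the field names are illustrative."""
    head: str
    relation: str
    tail: str


# The example triple from the text above.
triples = [Triple("hypertension", "department", "cardiology")]

# Index the edges by head node so relationships can be looked up directly.
edges = defaultdict(list)
for t in triples:
    edges[t.head].append((t.relation, t.tail))


def neighbors(node: str, relation: str) -> list[str]:
    """Return every tail node connected to `node` by `relation`."""
    return [tail for rel, tail in edges.get(node, []) if rel == relation]


print(neighbors("hypertension", "department"))  # ['cardiology']
```

Keeping each fact as a separate triple means new relationships can be added without changing any schema; the index is just one convenient way to traverse them.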

◆ ◆

Why do we need medical knowledge graphs?

With the gradual maturity of technologies such as natural language processing, medical artificial intelligence has become a strong driving force for industry development. However, to enable machines to possess intelligence and understand and think like humans, it is essential to equip machines with knowledge.

In the highly specialized medical field, such knowledge usually comes from content that has passed strict professional review and reflects consensus, such as textbooks, clinical guidelines, and clinical pathways. Most of this material is unstructured, however, which makes it complex to process and raises the threshold for development. That limits how much of the knowledge can be put to use, let alone understood and applied by machines.

If we organize this information in a machine-readable format in advance, would it be feasible for machines to read it directly?

The answer is yes. Building medical knowledge graphs from these medical materials transforms free-text content into a fully structured, machine-readable format. Compared with unstructured text, a structured knowledge graph represents human knowledge more clearly and accurately, making it easier for machines to read and apply.

Structured Example of a Knowledge Graph

◆ ◆

Application Scenarios of Medical Knowledge Graphs

The value of using medical knowledge graphs can also be explored from mainstream application scenarios. Currently, medical knowledge graphs are mainly applied in semantic search, knowledge Q&A, and clinical decision support.

1. Semantic Search. Traditional search relies primarily on keyword matching, which limits the search engine to the surface form of the words in a query and leaves it without any ability to process or understand knowledge. A knowledge graph describes the classification, attributes, and relationships of things. This rich semantic information can be applied to semantic search to capture the true intent behind the user’s input and return the results that best meet the user’s needs.

2. Knowledge Q&A. Knowledge-Based Question Answering (KBQA), also called knowledge Q&A, relies on a large knowledge base: it transforms a user’s natural language question into a structured query and extracts the answer directly from the knowledge base. Medical knowledge is more specialized than knowledge in most other fields, so non-professionals struggle to find accurate answers by working through a pile of resource documents on their own. KBQA based on medical knowledge graphs helps patients obtain answers more quickly and conveniently, and it suits medical knowledge popularization, intelligent guidance, and self-diagnosis.

3. Clinical Decision Support. Clinical Decision Support (CDS) uses relevant, systematic clinical knowledge together with a patient’s basic information and condition to give clinical staff information that improves medical decisions and actions, raising the quality of care and the level of service. Systems that provide this function are called Clinical Decision Support Systems (CDSS). A medical knowledge graph contains rich basic medical knowledge and can serve as an important component of the CDSS knowledge base. It supplies interpretable evidence for CDSS recommendations and helps the system support clinical staff in auxiliary diagnosis, treatment plan recommendation, and rational medication checks.
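
The KBQA flow in item 2, turning a natural language question into a structured query and reading the answer from the knowledge base, can be sketched as follows. This is a toy under stated assumptions: the regular-expression template, the `parse_question` and `answer` helpers, and the dictionary layout are all hypothetical, and a real system would use far more robust language understanding. The one fact in the knowledge base is the hypertension example from earlier in the article.

```python
import re

# Toy knowledge base keyed by (entity, relation); the single fact is the
# article's own example triple.
KB = {("hypertension", "department"): ["cardiology"]}

# Hypothetical question templates, each mapped to a relation name.
TEMPLATES = [
    (
        re.compile(
            r"which department (?:treats|handles) (?P<entity>.+?)\??$",
            re.IGNORECASE,
        ),
        "department",
    ),
]


def parse_question(question: str):
    """Turn a question into an (entity, relation) query, or None if no template fits."""
    for pattern, relation in TEMPLATES:
        match = pattern.search(question.strip())
        if match:
            return match.group("entity").lower(), relation
    return None


def answer(question: str) -> list[str]:
    """Look the structured query up in the knowledge base."""
    query = parse_question(question)
    return KB.get(query, []) if query else []


print(answer("Which department treats hypertension?"))  # ['cardiology']
```

The point of the sketch is the separation of steps: once the question has been reduced to a structured query, answering it is a direct lookup rather than a document search.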

As technology and the industry develop, knowledge graphs are also being applied in scenarios such as assisting drug development and early warning of public health events.

1. Drug Development. In recent years, AI-assisted drug development has gradually become a research hotspot. Knowledge graphs can be applied to the cluster analysis of knowledge in drug development, helping to propose new verifiable hypotheses, thereby accelerating the drug development process and reducing development costs.

2. Response to Public Health Events. In epidemiological investigations and in the analysis and early warning of epidemic events, storing data as a graph can provide significant assistance. For example, a knowledge graph can visually represent information such as personnel distribution, activity trajectories, and onset times from an epidemiological investigation. Presented this way, the information makes it easier to analyze correlations between cases and to identify sources of infection quickly.
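
As a rough sketch of the public health scenario in item 2, case trajectories can be stored as triples and queried for places that several cases have in common. All case and place names below are invented for illustration, and the `shared_places` helper is a hypothetical name; real epidemiological analysis would also weigh onset times and contact details.

```python
from collections import defaultdict

# Hypothetical (case, relation, place) triples; every name here is invented.
visits = [
    ("case_A", "visited", "market"),
    ("case_A", "visited", "clinic"),
    ("case_B", "visited", "market"),
    ("case_C", "visited", "market"),
    ("case_C", "visited", "school"),
]


def shared_places(triples, min_cases=2):
    """Return {place: sorted cases} for places linked to at least `min_cases` cases."""
    cases_by_place = defaultdict(set)
    for case, relation, place in triples:
        if relation == "visited":
            cases_by_place[place].add(case)
    return {
        place: sorted(cases)
        for place, cases in cases_by_place.items()
        if len(cases) >= min_cases
    }


print(shared_places(visits))  # {'market': ['case_A', 'case_B', 'case_C']}
```

A place shared by many cases is only a candidate for further investigation, not a confirmed source of infection.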

◆ ◆

What are the challenges in building medical knowledge graphs?

Medical knowledge is characterized by diverse terminology, dispersed sources, complexity, and uneven quality, all of which pose challenges for constructing and applying medical knowledge graphs.

1. Diverse Terminology. Medical terminology consists of the terms that denote general concepts in the medical profession, and it appears throughout medical records, medical literature, and medical textbooks. China has not yet established standardized, unified medical terminology standards, so practitioners use terms somewhat arbitrarily across contexts, and a single medical concept commonly has several different expressions. Typographical errors introduced in translation from foreign languages also add to this diversity.

2. Dispersed Knowledge. There is currently no single knowledge source that covers all the knowledge related to a specific disease. There are two main reasons. First, medical knowledge changes rapidly, and existing sources cannot incorporate new knowledge quickly. Second, the sources published by different institutions usually cover only specific dimensions of knowledge, depending on the purpose of publication.

3. Complexity of Knowledge. Medicine is an empirical science, divided mainly into three major disciplines: basic medicine, clinical medicine, and preventive medicine. Each studies complex human problems, and the corresponding knowledge is complex and variable. The complexity is visible even at the level of individual concepts, because a medical concept often has multiple upper-level and lower-level concepts. For example, pneumonia has two upper-level concepts, lung tissue inflammation and lung consolidation, and several lower-level concepts, such as neonatal pneumonia, interstitial pneumonia, and bronchopneumonia, as shown in the figure below.

Upper and Lower Concepts of Pneumonia

Expressing complex medical knowledge completely and clearly can make computers appear more intelligent in assisting humans in medical activities, allowing users to obtain more comprehensive medical knowledge.

4. Uneven Quality. Because medical knowledge comes from so many sources, its quality inevitably varies. Resources such as clinical pathways, clinical guidelines, and medical textbooks carry high authority, and clinical guidelines state their evidence levels, so they correspond to high-quality, credible medical knowledge. Knowledge on some open internet platforms, by contrast, may be of disputed authority, depending on who maintains it and how it is reviewed.
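
The pneumonia example in item 3 shows why a plain tree is not enough: a concept can have more than one upper-level concept. A minimal sketch of such a hierarchy, using only the concepts named in the text, might look like the following. The "is-a" reading of the upper/lower relationship and the `ancestors` helper are assumptions made for illustration.

```python
from collections import defaultdict

# (lower-level concept, upper-level concept) pairs from the pneumonia example.
IS_A = [
    ("pneumonia", "lung tissue inflammation"),
    ("pneumonia", "lung consolidation"),
    ("neonatal pneumonia", "pneumonia"),
    ("interstitial pneumonia", "pneumonia"),
    ("bronchopneumonia", "pneumonia"),
]

parents = defaultdict(set)
children = defaultdict(set)
for child, parent in IS_A:
    parents[child].add(parent)
    children[parent].add(child)


def ancestors(concept: str) -> set[str]:
    """Collect every upper-level concept reachable from `concept`."""
    found: set[str] = set()
    stack = [concept]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


print(sorted(ancestors("bronchopneumonia")))
print(sorted(children["pneumonia"]))
```

Because each concept maps to a set of parents, the structure is a directed graph rather than a tree, which is what lets pneumonia sit under both lung tissue inflammation and lung consolidation.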

These characteristics make medical knowledge graphs harder to construct and apply. In this context, OMAHA has independently constructed a Chinese medical knowledge graph, the “Hui Zhi” Medical Knowledge Graph (abbreviated as the “Hui Zhi” Graph), to meet the industry’s common need for digital medical knowledge. It focuses on four major areas: diseases, drugs, examinations, and surgical operations. Its knowledge sources are authoritative, high-quality materials such as clinical guidelines, clinical pathways, medical textbooks, drug instructions, and the Chinese Pharmacopoeia. Basic medical knowledge is extracted using a “machine + human” approach to form a high-quality knowledge graph, providing a foundation for intelligent and precise applications of medical knowledge.

Currently, the “Hui Zhi” Graph has published disease knowledge graphs and drug knowledge graphs, totaling about 120,000 entities and 960,000 triples.

Disease Knowledge Graph

The disease knowledge graph is centered on diseases and mainly models the relationships between diseases and examinations, surgical operations, and drugs. It covers 17 subfields, including cardiovascular system diseases, nervous system diseases, and digestive system diseases, and describes over 110 relationships, such as clinical manifestations, departments, related examinations, treatment methods, symptoms, therapeutic drugs, and complications. This gives AI applications in different disease fields more comprehensive knowledge support for auxiliary diagnosis and risk prediction. A total of 75,000 entities and 370,000 triples have been published, and 53% of the entities have been mapped to the “Qiqiao Board” medical terminology set.

Drug Knowledge Graph

The drug knowledge graph is centered on drugs and models the relationships between drugs and diseases, examinations, and surgical operations. It currently covers 30,000 drugs (by generic name), with a total of 45,000 entities and 600,000 triples, and 56% of the entities have been mapped to the “Qiqiao Board” medical terminology set. The published relationship types include insurance payment types, dosage forms, indications, applicable populations, contraindications, and contraindicated populations.

The “Hui Zhi” medical knowledge graph has been made available for download on the HiTA knowledge service platform (hita.omaha.org.cn) for OMAHA member service agency users. If you are not yet an OMAHA member service agency user, you can download sample data from the HiTA knowledge service platform [OMAHA Knowledge Base – Sample Data Download] interface for further understanding.

If you have any questions, contact OMAHA.

Note: Some content is excerpted from “Imit White Paper Issue 22: Medical Knowledge Graph: The Cornerstone of Medical Artificial Intelligence”

OMAHA HiTA: Metadata | Terminology | Knowledge Graph

To cure, we choose openness and sharing
