What is a knowledge graph?
A knowledge graph, as the name suggests, is a representation of knowledge in the form of a graph. Its essence is to describe the attributes of things and the relationships between them in a structured rather than purely textual manner.
A knowledge graph consists of nodes and edges. Nodes represent entities, concepts, or attribute values; any object, place, or person can be a node. Edges represent attributes of entities or relationships between entities. The triple is the basic unit of representation in a knowledge graph. For example, a node can be an organizational unit, such as cardiology, or a disease, such as hypertension; an edge describes the “disease-department” relationship between hypertension and cardiology, and (hypertension, department, cardiology) is a simple triple.
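The triple structure described above can be sketched in a few lines of Python. This is only an illustrative sketch: the (hypertension, department, cardiology) triple comes from the text, while the `type` triples and the helper names `objects_of` and `nodes` are assumptions introduced for the example.

```python
from collections import namedtuple

# A triple is (subject, predicate, object); the predicate is the edge label.
Triple = namedtuple("Triple", ["subject", "predicate", "object"])

triples = [
    Triple("hypertension", "department", "cardiology"),  # from the text
    Triple("hypertension", "type", "disease"),            # assumed for illustration
    Triple("cardiology", "type", "department"),           # assumed for illustration
]


def objects_of(triples, subject, predicate):
    """Return every object linked to `subject` by an edge labeled `predicate`."""
    return [t.object for t in triples
            if t.subject == subject and t.predicate == predicate]


def nodes(triples):
    """Collect every node that appears as a subject or an object."""
    found = set()
    for t in triples:
        found.add(t.subject)
        found.add(t.object)
    return found


print(objects_of(triples, "hypertension", "department"))
print(sorted(nodes(triples)))
```

Storing the graph as a flat list of triples keeps the example small; a production system would more likely use a graph database or an RDF store.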

Tracing the history of knowledge graphs shows that they are closely tied to concepts such as semantic networks, ontologies, the World Wide Web, the semantic web, and linked data. Within artificial intelligence, knowledge graphs extend concepts such as semantic networks and ontologies. At the same time, they carry the heritage of the internet, inherited from the World Wide Web, the semantic web, and linked data. A timeline of knowledge graph development is shown in Figure 2.
Source: Wang Haofen, Qi Guilin, Chen Huajun (Eds). Knowledge Graph: Methods, Practices, and Applications [M]. Electronics Industry Press, 2019
Figure 2 History of Knowledge Graph Development
Medical knowledge is characterized by diverse terminology, dispersed knowledge, complexity, and varying quality.
2. Dispersed Knowledge. Taking diseases as an example, there is currently no single knowledge source that can cover all knowledge regarding a specific disease. This phenomenon arises mainly for two reasons: one is that medical knowledge updates rapidly, and new knowledge cannot be quickly updated in the original knowledge sources; the other is that different institutions, due to varying publication purposes, often release knowledge sources that only contain certain specific dimensions of knowledge.
3. Complex Knowledge. Medicine is a science based on empirical summaries, mainly divided into three major categories: basic medicine, clinical medicine, and preventive medicine. Each of the three categories studies complex human issues, resulting in correspondingly complex and variable knowledge. Just from the perspective of medical concepts, one can see the complexity of medical knowledge; a medical concept often has multiple higher-level and lower-level concepts. Taking pneumonia as an example, pneumonia not only has two higher-level concepts of lung tissue inflammation and lung consolidation, but also multiple lower-level concepts such as neonatal pneumonia, interstitial pneumonia, and bronchopneumonia, as shown in Figure 3.
Figure 3 Multiple Higher-Level and Lower-Level Concepts of Pneumonia
Expressing complex medical knowledge completely and clearly can make computers appear more intelligent in assisting humans with medical activities, allowing users to gain more comprehensive medical knowledge.
4. Varying Quality. The numerous sources of medical knowledge inevitably lead to varying quality. Resources such as clinical pathways, clinical guidelines, and medical textbooks are highly authoritative, and clinical guidelines also indicate the level of evidence, so the medical knowledge they contain is of high quality and credibility. The authority of knowledge from some open internet platforms, by contrast, may be disputed, depending on who maintains it and how it is reviewed.
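The pneumonia example given under Complex Knowledge can be sketched as a small concept hierarchy. The concept names are taken from the text; the `PARENTS` mapping and the `ancestors` and `children` helpers are hypothetical names chosen for this sketch, not part of any specific knowledge graph product.

```python
# Child concept -> list of direct higher-level (parent) concepts.
# A concept may have more than one parent, as pneumonia does.
PARENTS = {
    "pneumonia": ["lung tissue inflammation", "lung consolidation"],
    "neonatal pneumonia": ["pneumonia"],
    "interstitial pneumonia": ["pneumonia"],
    "bronchopneumonia": ["pneumonia"],
}


def ancestors(concept, parents=PARENTS):
    """Return all higher-level concepts reachable from `concept`."""
    seen = set()
    stack = list(parents.get(concept, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def children(concept, parents=PARENTS):
    """Return the direct lower-level concepts of `concept`, sorted by name."""
    return sorted(child for child, ps in parents.items() if concept in ps)


print(ancestors("bronchopneumonia"))
print(children("pneumonia"))
```

Because a concept can have several parents, the structure is a directed acyclic graph rather than a tree, which is why the traversal tracks visited concepts.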
These characteristics also necessitate that medical knowledge graphs have some special designs and considerations in the three aspects of knowledge representation, knowledge acquisition, and knowledge application.
1. Knowledge Representation. Medical knowledge contains a large number of hierarchical relationships, so medical knowledge graphs need a rigorous definition of hierarchical relationships to express medical concepts accurately. Complex medical knowledge can be expressed completely by constructing Compound Value Types (CVT).
2. Knowledge Acquisition. The construction of medical knowledge graphs generally adopts a hybrid approach of human and machine involvement, requiring active participation from medical experts throughout the entire process to ensure data accuracy. Moreover, during the construction of medical knowledge graphs, it is necessary to fully incorporate current industry standards, textbooks, guidelines, and other authoritative knowledge sources, while also supplementing medical knowledge from the internet and clinical records to ensure the completeness of medical knowledge.
3. Knowledge Application. Different medical scenarios require the use of medical knowledge from different sources or with different levels of evidence. Each triple in the medical knowledge graph must record the source of the medical knowledge and its corresponding level of evidence to enhance the applicability of the medical knowledge graph in different scenarios.
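The representation and application points above can be sketched together: a compound value node groups several values that belong to one fact, and every triple carries its source and level of evidence. This is a minimal sketch under stated assumptions. The class names, the `treatment_1` node, the placeholder values "drug A" and "dose X", and the evidence label "A" are all hypothetical and carry no medical meaning.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Provenance:
    source: str                           # e.g. "clinical guideline"
    evidence_level: Optional[str] = None  # hypothetical scale; None if not stated


@dataclass(frozen=True)
class Triple:
    subject: str
    predicate: str
    obj: str
    provenance: Provenance


GUIDELINE = Provenance("clinical guideline", "A")
WEB = Provenance("open internet platform")

triples = [
    Triple("hypertension", "department", "cardiology", GUIDELINE),
    # "treatment_1" is a compound value node: it has no meaning on its own
    # and only groups the values that belong to one treatment fact.
    Triple("hypertension", "treatment", "treatment_1", GUIDELINE),
    Triple("treatment_1", "drug", "drug A", GUIDELINE),   # placeholder value
    Triple("treatment_1", "dose", "dose X", GUIDELINE),   # placeholder value
    Triple("hypertension", "related_term", "placeholder term", WEB),
]


def expand_cvt(triples, cvt_node):
    """Gather the attribute values attached to one compound value node."""
    return {t.predicate: t.obj for t in triples if t.subject == cvt_node}


def from_sources(triples, allowed_sources):
    """Keep only triples whose recorded source is acceptable for a scenario."""
    return [t for t in triples if t.provenance.source in allowed_sources]


print(expand_cvt(triples, "treatment_1"))
print(len(from_sources(triples, {"clinical guideline"})))
```

Attaching provenance to each triple rather than to the graph as a whole lets an application choose, per scenario, which sources it is willing to trust.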
Medical knowledge graphs are a combination of medicine and big data. Knowledge graphs possess powerful semantic processing and open connectivity capabilities. Because they are specialized, standardized, and well suited to representing knowledge, they can draw high-quality data from sources such as medical dictionaries, medical standards, and electronic medical records, and provide foundational support for semantic search, knowledge Q&A, and decision support. Research on medical knowledge graphs can advance the intelligent processing of massive data and give rise to more advanced intelligent medical applications. As industry applications deepen, medical knowledge graphs are expected to face higher demands in data quantity, data quality, standardization level, and knowledge grading.
1. Greater Quantity. As the foundational data for various application scenarios, the quantity and coverage of data in medical knowledge graphs are crucial. Medical knowledge is richer compared to other fields, and with the advancement of medicine, a large amount of new knowledge continues to emerge. Incomplete knowledge in applications can severely impact user experience and significantly reduce product value. To meet the quantity requirements of the medical industry, stable and rapid automated knowledge graph construction technologies should be developed, and knowledge graphs can also be co-constructed and shared within the industry. The industry should promote the sharing of high-quality knowledge sources, such as high-quality electronic medical records and new medical knowledge bases, to ensure quality sources for the construction of medical knowledge graphs.
2. Higher Quality. Medical knowledge graphs directly target applications in the medical field, where the tolerance for data errors is much lower than in other fields, thus requiring extremely high quality. To meet the quality requirements of the medical industry, a comprehensive quality control system for knowledge graphs should be established. Quality control should span the entire lifecycle of knowledge graph construction, with strict controls on data source quality before construction; a good human-machine collaboration model should be established during construction to ensure guidance and participation from medical professionals; and post-construction accuracy verification and review should be conducted, along with maintenance and updates based on application conditions.
3. Better Standardization. As knowledge graphs are applied more deeply in the medical industry, their standardization becomes crucial in scenarios where multiple knowledge graphs are used together. Highly standardized knowledge graphs interoperate better and can be used jointly more efficiently, which reduces application costs and improves user experience. Establishing a standardized management system for medical knowledge graphs can help put standardization into practice. At the industry level, all stakeholders should share resources and results, promote the use of existing achievements, avoid duplicated effort, and drive from the bottom up the establishment of standards for domestic medical knowledge graphs, including schema standards, technical standards, and other related standards. At the government level, the research and formulation of standards for medical knowledge graphs should be actively promoted, with support for the industry so that it can grow faster and more healthily.
4. More Distinct Graded Applications. Medical knowledge graphs serve many application scenarios, and different scenarios have different knowledge requirements, which creates the need for graded management of medical knowledge. Currently, in the medical field, knowledge derived from clinical guidelines is classified by level of evidence, which achieves graded knowledge management. However, knowledge management in medical knowledge graphs is still relatively crude, and most graphs do not yet implement knowledge grading. Graded management of knowledge is an important direction for future knowledge graph management. By establishing grading rules based on knowledge sources and levels of evidence, a sound grading management system can meet the diverse application needs of knowledge graphs in different scenarios.
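The idea of grading rules built from knowledge sources and levels of evidence can be sketched as a simple lookup. Everything numeric here is an assumption made for illustration: the rank values, the "A"/"B"/"C" evidence labels, and the function names `grade` and `usable` are hypothetical and do not reflect any published grading standard.

```python
# Hypothetical ranks: higher means more trusted. The text names clinical
# pathways, guidelines, and textbooks as highly authoritative sources.
SOURCE_RANK = {
    "clinical pathway": 3,
    "clinical guideline": 3,
    "medical textbook": 3,
    "open internet platform": 1,
}
EVIDENCE_RANK = {"A": 3, "B": 2, "C": 1}  # assumed evidence scale


def grade(source, evidence_level=None):
    """Grade a fact by its source, capped by its evidence level if one is given."""
    source_rank = SOURCE_RANK.get(source, 0)  # unknown sources get the lowest grade
    if evidence_level is None:
        return source_rank
    return min(source_rank, EVIDENCE_RANK.get(evidence_level, 0))


def usable(facts, min_grade):
    """Select the facts that meet the minimum grade required by a scenario."""
    return [f for f in facts
            if grade(f["source"], f.get("evidence_level")) >= min_grade]


facts = [
    {"id": 1, "source": "clinical guideline", "evidence_level": "A"},
    {"id": 2, "source": "clinical guideline", "evidence_level": "C"},
    {"id": 3, "source": "open internet platform"},
]

# A strict scenario (for example decision support) might require grade 3,
# while a more tolerant one (for example general search) might accept grade 1.
print([f["id"] for f in usable(facts, 3)])
print([f["id"] for f in usable(facts, 1)])
```

Taking the minimum of the two ranks is one possible design choice: a fact is only as trustworthy as its weakest signal. Other rules, such as weighted scores, would fit the same interface.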
In the development of artificial intelligence, data and knowledge are equally important. Although the current wave is driven primarily by data-driven deep learning, knowledge remains the cornerstone of artificial intelligence. Accurately identifying and selecting suitable knowledge, designing appropriate construction routes, adopting suitable construction technologies, and fostering a healthy ecosystem of sharing and interaction within the industry are the key focuses for the future development of medical knowledge graphs.
Medical knowledge graphs are a long-term project that requires joint participation from the industry. The development of medical knowledge graphs in China is still in its initial stage, and all parties in the industry urgently need to work together to build an ecosystem of knowledge accumulation and open sharing. This will promote the rapid development of domestic medical knowledge graphs, so that they can mature quickly and be applied in more products. We believe that in the near future, domestic medical knowledge graphs will enter a period of rapid development and widespread application.
Note: Some content is excerpted from “Release of the 22nd Issue of the imiti White Paper: Medical Knowledge Graphs: The Cornerstone of Medical AI”
OMAHA HiTA: Metadata | Terminology | Knowledge Graph
