Source: Smart Collection Club
This article contains approximately 5,520 words and is recommended for an 8-minute read.
This article traces the development history of knowledge engineering and knowledge graphs.

Text Mining and Graph Database | ©ontotext
[ Introduction ] Knowledge engineering is a typical representative of symbolic artificial intelligence, and the increasingly popular knowledge graph is a new generation of knowledge engineering technology. How will knowledge engineering influence the future development of artificial intelligence, even enabling computers to possess cognitive abilities similar to those of humans? Professor Xiao Yanghua from Fudan University, the founder of Knowledge Workshop, was invited to give a speech titled “Knowledge Graphs and Cognitive Intelligence” at the Tencent Research Institute × Smart Collection Club AI & Society salon. Inspired by Professor Xiao’s lecture, I have organized this account of the development history of knowledge engineering and knowledge graphs. A link to the lecture video can be found at the end of this article.
What is intelligence? For more than half a century, countless scientists have studied it, leading to the emergence of various factions; however, people’s attention seems to always focus on the currently “winning” side. In recent years, connectionism, represented by deep learning, has achieved fruitful results, and today, when artificial intelligence is mentioned, it is generally assumed to be based on deep learning and machine learning methods, while other research directions seem to be forgotten.
As the dividends of big data have been exhausted and the performance ceiling of deep learning models draws near, people are searching for new breakthroughs, and the conviction that “whoever has knowledge rules the world” is growing stronger. Symbolism, represented by knowledge graphs, is coming back into the spotlight: it holds a treasure trove of prior knowledge that has yet to be effectively mined.
For decades, symbolism and connectionism have alternated in prominence. Will this opposition continue in the future, or will we find an organic combination of the two, moving towards a path of cooperative development?
Knowledge engineering is a typical representative of symbolic artificial intelligence, and the increasingly popular knowledge graph is a new generation of knowledge engineering technology. How will knowledge engineering influence the future development of artificial intelligence, even enabling computers to possess cognitive abilities similar to those of humans? This article will start from history, systematically sorting out the nearly 40-year development history of knowledge engineering, and in conjunction with the technological and social background of the big data era, will look forward to the future prospects of knowledge engineering and knowledge graphs.
Breakthrough in Intelligence: Knowledge Engineering
It is generally believed that artificial intelligence can be divided into three levels: computational intelligence, perceptual intelligence, and cognitive intelligence. In brief, computational intelligence refers to fast computing, memory, and storage capabilities; perceptual intelligence refers to sensory capabilities such as vision, hearing, and touch; currently popular technologies like speech recognition, speech synthesis, and image recognition are examples of perceptual intelligence; cognitive intelligence refers to the ability to understand and explain.
The current research on intelligence aims to simulate human-like wisdom through computers to solve practical problems such as precise analysis, smart search, natural human-computer interaction, and deep relationship reasoning in the intelligent era. | © thoughtworks
Currently, computational intelligence aimed at fast computation and storage has basically been achieved. In recent years, driven by deep learning, perceptual intelligence, targeting recognition technologies such as vision and hearing, has also achieved good results. However, compared to the former two, achieving cognitive abilities is much more challenging. For instance, a kitten can “recognize” its owner, which involves perceptual abilities that most animals possess, while cognitive intelligence is a unique human ability.
One of the research goals in artificial intelligence is to enable machines to possess cognitive intelligence and think like humans.
This human-like thinking ability specifically manifests in the machine’s ability to understand, reason, explain, induce, and deduce data and language, reflecting all cognitive abilities unique to humans. Both academia and industry hope to simulate human-like wisdom through computers to tackle practical problems in the intelligent era, including precise analysis, smart search, natural human-computer interaction, and deep relationship reasoning.
Understanding that cognitive intelligence is the key to machine intelligence, we need to further consider how to achieve cognitive intelligence—how to enable machines to possess the ability to understand and explain.
In recent years, due to the disappearance of big data dividends, deep learning faces significant bottlenecks and needs to seek new breakthroughs. The statistical learning methods represented by deep learning heavily rely on samples and can only learn information from data. Some researchers have recognized that another very important breakthrough direction lies in knowledge, especially symbolic knowledge.
Professor Xiao Yanghua believes that knowledge graphs and the series of knowledge engineering technologies represented by knowledge graphs are at the core of cognitive intelligence. Knowledge engineering mainly includes knowledge acquisition, knowledge representation, and knowledge application. The directions we can attempt to break through lie in the utilization of knowledge and the application of combining symbolic knowledge with numerical models. The ultimate result of these efforts is to enable machines to possess understanding and explanation capabilities.
Professor Xiao Yanghua delivering a speech on knowledge graphs at the AI & Society salon, session 15
The Past and Present of Knowledge Engineering
The Origins of Knowledge Engineering
-
1950s—Early 1970s
Early artificial intelligence before the birth of knowledge engineering
So how exactly will knowledge graphs assist artificial intelligence? Looking back at history always helps us better understand the future. Let’s roll back the wheels of time to August 1956, when a group of psychologists, mathematicians, computer scientists, and information theorists gathered for a two-month seminar at Dartmouth College, in the quiet town of Hanover, New Hampshire, USA, to discuss how to simulate human intelligence with machines. They gave the seminar a resounding name: artificial intelligence.
Thus, the discipline of artificial intelligence was born.
Representative Figures and Achievements in Traditional Knowledge Engineering
After the Dartmouth Conference, its participants achieved a series of remarkable research results. Representative achievements include: A. Newell, J. Shaw, and H. Simon developed the Logic Theorist (LT) program, which proved 38 mathematical theorems from Principia Mathematica; around 1960 they defined the logical reasoning architecture of the General Problem Solver (GPS) and proposed the idea of heuristic search; and in 1956 Samuel developed a checkers program with self-learning capabilities, allowing it to continuously improve its play from game experience. Many exciting achievements followed, marking the first peak in the development of artificial intelligence.
Among them, the symbolist school, represented by Newell and Simon, was the first to achieve fruitful results, its most famous product being the Logic Theorist (LT).
What is the core idea of symbolism? Symbolism believes that artificial intelligence originates from mathematical logic and that the essence of intelligence lies in the manipulation and operation of symbols. Symbolism, during the subsequent competition among several major schools, stood out for a long time and made significant contributions to the development of artificial intelligence. Of course, it also laid the foundation for the later flourishing knowledge engineering.
Next, let’s shift our focus to the early 1960s to the early 1970s, when the academic community was still celebrating the initial victories of artificial intelligence. However, unrealistic research goals led to a series of project failures and unmet expectations. Overly high expectations always result in more destructive disappointments, and eventually, artificial intelligence faced its first winter.
-
1977
Knowledge engineering is born
After these setbacks in artificial intelligence, researchers had to calm down and rethink the path forward. At this point Simon’s student, Edward A. Feigenbaum, stepped forward. His analysis was that traditional artificial intelligence had overlooked domain-specific knowledge, and that artificial intelligence must incorporate knowledge.
Edward A. Feigenbaum (1936-), an American computer scientist, is known as the father of expert systems and the founder of knowledge engineering, and he received the Turing Award in 1994. One of his famous quotes is widely circulated: “Knowledge is the power in AI.”
Under Feigenbaum’s leadership, expert systems were born. As an important branch of early artificial intelligence, expert systems are programs that have expert-level problem-solving capabilities in specific domains.
Expert systems generally consist of two parts: a knowledge base and an inference engine. They use the knowledge and experience provided by one or more experts and simulate the experts’ thinking process to reason about and judge problems. The first successful expert system, DENDRAL, was launched in 1968. In 1977, Feigenbaum formally gave this line of work its name: knowledge engineering.
Integrating knowledge into machines, enabling them to utilize human and expert knowledge to solve problems, is the goal of knowledge engineering.
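To make the knowledge-base-plus-inference-engine structure concrete, here is a minimal sketch of a forward-chaining rule engine in Python. The rules and facts are invented for illustration and are not drawn from any real expert system such as DENDRAL or XCON.

```python
# Minimal forward-chaining inference engine; rules and facts are illustrative only.

# Knowledge base: rules written as (set of premises, conclusion).
rules = [
    ({"fever", "cough"}, "flu_suspected"),
    ({"flu_suspected", "shortness_of_breath"}, "see_doctor"),
]

def forward_chain(facts, rules):
    """Repeatedly fire rules whose premises hold until no new fact can be derived."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= facts and conclusion not in facts:
                facts.add(conclusion)  # the inference engine derives a new fact
                changed = True
    return facts

# Observed facts come from the user; derived facts come from the expert rules.
print(forward_chain({"fever", "cough", "shortness_of_breath"}, rules))
```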
The Rise and Development of Knowledge Engineering
-
1970s—1990s
Knowledge engineering flourished
After the birth of knowledge engineering in 1977, the field continued to develop, producing new logical languages and methods. One important milestone occurred during this period.
As mentioned in the previous section, expert systems had taken shape, but how did they develop? Could knowledge engineering be industrialized? The expert configuration system XCON, built by the American company DEC (Digital Equipment Corporation), provided a preliminary answer. When customers ordered DEC’s VAX series computers, XCON could automatically configure the components according to their requirements. In six years of operation it processed some eighty thousand orders, saving the company a great deal of money.
Thus, artificial intelligence gradually began its commercial application.
Other notable systems include Cyc, established by Douglas Lenat in 1984 with the aim of collecting everyday common-sense knowledge into an ontology-based knowledge base. Cyc not only contains knowledge but also provides many inference engines, encompassing some 500,000 concepts and 5 million pieces of knowledge. There is also WordNet, an English lexical database maintained at Princeton University under the psychology professor George A. Miller; the Chinese “Synonym Forest” (Tongyici Cilin) and its extended versions, as well as HowNet, are dictionaries of the same kind. Unfortunately, alongside the failure of Japan’s Fifth Generation Computer Systems project, expert systems gradually declined after a decade of prosperity, unable to overcome the high cost of manual construction and the difficulty of knowledge acquisition.
-
1998
The World Wide Web and Linked Data
The emergence of the World Wide Web greatly facilitated knowledge acquisition. In 1998, the father of the World Wide Web, Tim Berners-Lee, went on to propose the Semantic Web. Its core idea is to provide machines directly with knowledge that programs can process: by expressing documents on the web in formats that computers can understand, the web becomes a medium through which machines themselves can exchange information. However, the Semantic Web is a rather grand concept that requires “top-down” design, making it difficult to implement.
Semantic Analysis and Knowledge Networks
Due to the difficulties of top-down design, scholars shifted their focus to the data itself and proposed the concept of linked data. Linked data aims not only to publish data on the semantic web but also to establish links between datasets, forming a vast web of interconnected data. Among these efforts, the DBpedia project, which extracts structured content from Wikipedia, is generally regarded as the first large-scale open-domain linked dataset; similar structured knowledge repositories include YAGO and others.
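As a rough illustration of what linked data looks like at the data level, the sketch below stores knowledge as subject-predicate-object triples and links one entity across two datasets with an owl:sameAs statement. The URIs and facts are invented for the example and are not taken from DBpedia or YAGO.

```python
# RDF-style subject-predicate-object triples; all URIs here are invented for illustration.
triples = [
    ("ex:Tim_Berners-Lee", "ex:inventorOf", "ex:World_Wide_Web"),
    ("ex:Tim_Berners-Lee", "ex:birthPlace", "ex:London"),
    # A link across datasets: the same real-world entity under two identifiers.
    ("ex:Tim_Berners-Lee", "owl:sameAs", "other:TimBL"),
]

def query(triples, s=None, p=None, o=None):
    """Return all triples matching the given pattern; None acts as a wildcard."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

# Everything this tiny dataset knows about one entity, including its link to another dataset.
for t in query(triples, s="ex:Tim_Berners-Lee"):
    print(t)
```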
-
2012
Knowledge graphs: a new development period of knowledge engineering
Alongside Wikipedia there was also Freebase. While Wikipedia is written for human readers, Freebase emphasized machine readability, containing representations of some 40 million entities. After Google acquired Metaweb, the company behind Freebase, it built on this resource and in 2012 introduced the result under a catchy new name: the “Knowledge Graph.”
Why Did Traditional Knowledge Engineering Run into So Many Difficulties?
In the 1970s and 1980s, traditional knowledge engineering did solve many problems, but those problems had a distinct characteristic: most of the successes were achieved in scenarios with clear rules, well-defined boundaries, and closed applications, such as proving mathematical theorems or playing chess. Once open-ended problems are involved, success becomes nearly impossible.
Why do traditional knowledge engineering practices have such stringent conditions? Because traditional knowledge engineering is a typical top-down approach that heavily relies on expert intervention. The fundamental goal of knowledge engineering is to impart expert knowledge to machines, hoping that machines can utilize expert knowledge to solve problems. In traditional knowledge engineering, it is first necessary to have domain experts who can express their knowledge; furthermore, knowledge engineers must convert the knowledge expressed by experts into a form that computers can process.
Such heavy reliance on experts to express, acquire, and apply knowledge leads to many problems. On one hand, the knowledge base behind the machine is very limited in scale; on the other hand, its quality also raises many questions, which is why we say traditional knowledge engineering faces many difficulties.
In addition to the aforementioned issues, traditional knowledge engineering faces two major difficulties:
-
First: Difficulty in Knowledge Acquisition
Implicit knowledge, procedural knowledge, and the like are difficult to express. For example, how do we express the knowledge an experienced traditional Chinese medicine doctor uses to diagnose patients? Different experts also differ subjectively: only a small percentage of diseases in China have clear treatment protocols, and for most of the rest diagnosis relies on the individual doctor’s judgment.
-
Second: Difficulty in Knowledge Application
Many applications, especially open-ended ones, easily exceed the pre-established boundaries of the knowledge base. Many applications also require support from common sense, which is precisely what artificial intelligence struggles with most, because common sense is difficult to define, express, and represent. In addition, updating knowledge is hard and depends heavily on domain experts, and there are always exceptions and situations that are difficult to handle.
Internet Applications Stimulate Knowledge Engineering in the Big Data Era
Due to the various reasons mentioned in the previous section, knowledge engineering became virtually silent after the 1980s.
Although the problem-solving ideas of knowledge engineering are highly forward-looking, the limited scale of traditional knowledge representation makes it difficult to meet the demands of large-scale open applications in the internet era. To address these issues, knowledge engineering researchers in academia and industry have sought new solutions.
The first major breakthrough came from Google. Google Search is the core product service of Google, and such internet applications mainly have the following characteristics:
-
Large-scale open applications, where the next search keyword from users is always unknown;
-
Low precision requirements; most search understanding and responses only require simple reasoning, while complex reasoning is extremely rare.
Under such demands, Google launched its own knowledge graph, using semantic retrieval to collect information from various sources to improve search quality. The launch of the knowledge graph essentially announced the entry of knowledge engineering into a new era, which we refer to as the knowledge engineering phase of the big data era. Google expressed its resolute break from traditional knowledge representation with a completely new name.
Snapshot of Google’s Knowledge Graph
Knowledge Graphs Leading the Renaissance of Knowledge Engineering
The emergence of knowledge graphs in the big data era is no accident; the big data era provides rich soil for the development of knowledge graph technology. You might ask: is there any essential difference between knowledge graphs and traditional semantic networks? What unique favorable conditions does the big data era bring us? Recent progress offers the answer—
Big data technology makes large-scale knowledge acquisition possible, and knowledge graphs represent a form of large-scale semantic networks. Such a quantitative change in knowledge scale brings a qualitative change in knowledge utility.
We have massive data, powerful computing capabilities, collective intelligence, and a plethora of emerging models. With the support of these external forces, we have solved a bottleneck problem of traditional knowledge engineering—knowledge acquisition. We can utilize algorithms to achieve large-scale automated knowledge acquisition driven by data.
Symbolism, represented by knowledge graphs, is gaining momentum, and this treasure trove of prior knowledge is being unlocked by big data technology. | ©ontotext
Unlike traditional knowledge acquisition, which relied on experts in a top-down manner, now we utilize data bottom-up to mine and extract knowledge from data. Additionally, crowdsourcing and collective intelligence have become a new path for large-scale knowledge acquisition. High-quality UGC content provides high-quality data sources for automated knowledge mining.
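A crude illustration of this bottom-up, data-driven acquisition is pattern-based extraction from text. The sketch below uses a single “such as” pattern, in the spirit of Hearst patterns, to mine is-a facts from example sentences; both the pattern and the sentences are toy examples, far simpler than a production extraction pipeline.

```python
import re

# One Hearst-style pattern: "<concept> such as <entity>, <entity> and <entity>."
PATTERN = re.compile(r"(\w+) such as ([\w ,]+?)\.")

def extract_is_a(sentence):
    """Extract (entity, is_a, concept) triples from a single sentence."""
    triples = []
    for concept, entity_list in PATTERN.findall(sentence):
        for entity in re.split(r",| and ", entity_list):
            if entity.strip():
                triples.append((entity.strip(), "is_a", concept))
    return triples

corpus = [
    "Search engines include systems such as Google and Bing.",
    "A knowledge graph stores facts about fruits such as apples, pears and oranges.",
]
for sentence in corpus:
    print(extract_is_a(sentence))
```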
In summary, knowledge engineering has entered a new stage under the leadership of knowledge graph technology, termed the knowledge engineering phase of the big data era. Professor Xiao Yanghua proposed a simple formula to illustrate the connection and distinction between traditional knowledge engineering and the new generation of knowledge engineering represented by knowledge graphs:
Small knowledge + Big data = Big knowledge
Big data knowledge engineering, or BigKE, will significantly raise the level of machine cognitive intelligence. So what is the fundamental significance of big data knowledge engineering for artificial intelligence? It is precisely this: to raise the cognitive intelligence of machines. We are experiencing a transition from perceptual intelligence to cognitive intelligence, and realizing cognitive intelligence will be the most important technology of the future.
In the big data era, what unique charm does the knowledge graph hold? Why has it garnered such widespread attention?
Knowledge graphs make machine language cognition possible. For machines to recognize and understand language, they require support from background knowledge. Knowledge graphs, rich in relationships between entities and concepts, can serve as background knowledge to support machines in understanding natural language.
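As a toy illustration, under invented data, of how a knowledge graph can serve as background knowledge for language understanding, the sketch below disambiguates the word “apple” by checking which candidate entity’s graph neighbourhood overlaps the words of the sentence. The miniature graph and the sentences are made up for the example.

```python
# Miniature "knowledge graph": each candidate entity mapped to the terms it is linked to.
# All entries are invented for illustration.
kg_neighbors = {
    "Apple_Inc.":    {"iphone", "company", "mac", "software"},
    "Apple_(fruit)": {"fruit", "tree", "pie", "juice", "orchard"},
}

def disambiguate(sentence, candidates):
    """Pick the candidate entity whose graph neighbourhood best overlaps the sentence context."""
    context = set(sentence.lower().split())
    return max(candidates, key=lambda entity: len(candidates[entity] & context))

print(disambiguate("She baked an apple pie with fruit from the orchard", kg_neighbors))
# -> Apple_(fruit): the background knowledge resolves the ambiguity
print(disambiguate("Apple released a new iphone and updated its mac software", kg_neighbors))
# -> Apple_Inc.
```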
Knowledge graphs make explainable artificial intelligence possible. At any stage of artificial intelligence development, we need the explainability of things; current deep learning often faces criticism for lacking explainability. The concepts, attributes, and relationships contained in knowledge graphs are naturally suitable for explanations.
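To sketch why graph structure lends itself to explanation, the snippet below, over an invented miniature graph, finds a path between two entities with breadth-first search and renders that path as a human-readable justification, for example for a recommendation.

```python
from collections import deque

# Tiny illustrative graph of (head, relation, tail) triples; all facts are invented.
triples = [
    ("Alice", "likes", "Inception"),
    ("Inception", "directed_by", "Christopher Nolan"),
    ("Christopher Nolan", "directed", "Interstellar"),
]

def explain_path(start, goal):
    """Breadth-first search over the triples; the relation path found is the explanation."""
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        node, path = frontier.popleft()
        if node == goal:
            return ", and ".join(f"{h} {r} {t}" for h, r, t in path)
        for h, r, t in triples:
            if h == node and t not in visited:
                visited.add(t)
                frontier.append((t, path + [(h, r, t)]))
    return None

# Why might Interstellar be recommended to Alice? The path itself is the explanation.
print(explain_path("Alice", "Interstellar"))
```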
By empowering machine learning with prior knowledge such as knowledge graphs, we can reduce the dependency of machine learning on samples and enhance its capabilities.
Knowledge will significantly enhance the capabilities of machine learning. Traditional machine learning relies on a large number of samples to acquire knowledge, and as the dividends of big data gradually diminish, it is encountering developmental bottlenecks. Empowering machine learning with prior knowledge such as knowledge graphs can reduce its dependency on samples and enhance its capabilities, which may be a path for the coexistence of connectionism and symbolism in the new era.
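One simple way to picture this empowerment, sketched here under invented scores and constraints rather than any specific published method, is to let the knowledge graph act as a hard constraint on a statistical model’s output: candidates that violate known type constraints are filtered out, so the model needs fewer training samples to get the remaining choices right.

```python
# Invented model scores for completing the relation capital_of("Paris", ?).
model_scores = {"France": 0.44, "Texas": 0.41, "Europe": 0.15}

# Prior knowledge from a knowledge graph: the object of capital_of must be a country.
entity_types = {"France": "country", "Texas": "state", "Europe": "continent"}
relation_range = {"capital_of": "country"}

def predict_with_knowledge(relation, scores):
    """Keep only candidates whose type satisfies the relation's range constraint, then pick the best."""
    allowed = {e: s for e, s in scores.items()
               if entity_types.get(e) == relation_range[relation]}
    return max(allowed, key=allowed.get) if allowed else max(scores, key=scores.get)

print(predict_with_knowledge("capital_of", model_scores))  # -> France
```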
In addition to the advantages mentioned above, knowledge graphs are highly useful in a range of practical applications, such as search, precise recommendation, risk identification, and deeper understanding of and insight into industry data, and they will play a role in ever more application scenarios.
The information technology revolution continues, and data will continue to develop towards larger scales and more connections. Against this backdrop, knowledge graphs will lead knowledge engineering towards a renaissance, promoting the realization of cognitive intelligence in machines.
References
[1] Xiao Yanghua Lecture: Knowledge Graphs and Cognitive Intelligence
[2] Smart Collection Club. The Ultimate of Science: A Discussion on Artificial Intelligence [M]. People’s Posts and Telecommunications Press, 2015.
[3] Nick. A Brief History of Artificial Intelligence [M]. People’s Posts and Telecommunications Press, 2017
[4] Overview of the Development of Knowledge Graphs
Xiao Yanghua Lecture Review
Lecture review video link:
https://campus.swarma.org/gcou=415
Editor: Wen Jing