This article is excerpted from Chapter 3, “From Expert Systems to Knowledge Graphs,” of Nick’s “A Brief History of Artificial Intelligence.” It reviews and comments in depth on the development of knowledge graphs, from DENDRAL, the first expert system, to the Semantic Web and Google’s Knowledge Graph.
Nick once worked at Harvard and HP; he later turned to entrepreneurship and investment, shuttling between mainland China and Silicon Valley. However busy, he never stops reading and writing, and many of his essays have appeared in the “Shanghai Review of Books.” His books include “UNIX System V Kernel Analysis” and “Philosophical Storytelling.”
1. Feigenbaum and DENDRAL
Feigenbaum entered Carnegie Tech (the predecessor of Carnegie Mellon) at 16 to study electrical engineering (EE). In his junior year, a course taught by Herbert Simon, “Mathematical Models in the Social Sciences,” set the trajectory of his life. After graduating, he stayed on for a PhD at the Graduate School of Industrial Administration, where Simon was dean. After earning his PhD, he taught at the Haas School of Business at UC Berkeley. With his fellow student Julian Feldman he co-edited a collection of papers titled “Computers and Thought”; its royalties later funded the “Computers and Thought” Award of the International Joint Conference on Artificial Intelligence (IJCAI), which became the most prestigious award for young AI scholars under 35, somewhat akin to the Fields Medal in mathematics. The first recipient was Terry Winograd, followed by Douglas Lenat, the late David Marr, and Andrew Ng; the most recent (2016) was Stanford’s natural language processing star Percy Liang. In 1962, McCarthy moved from MIT on the East Coast to the beautiful San Francisco Bay Area to establish the Computer Science Department at Stanford. In 1964, Feigenbaum answered McCarthy’s call and left Berkeley to join him at Stanford.

Feigenbaum (1936— )
Joshua Lederberg won the Nobel Prize in Physiology or Medicine in 1958 at the age of 33. The following year he left the University of Wisconsin, where he had been teaching, for California, invited to rebuild Stanford’s medical school and head its genetics department. At the time, Stanford’s medical school was still in San Francisco, as was the public UC San Francisco; unlike the other UC campuses, the San Francisco campus consists entirely of a medical school. In the late 1990s, Stanford’s medical school and UCSF attempted to merge, but the merger ultimately failed. Back to the point: while studying at Columbia University, Lederberg had been influenced by “Leibniz’s Dream,” attempting to find universal rules of human knowledge. In the summer of 1962, Lederberg was still taking programming classes at the Stanford Computing Center, where his first language was BALGOL. He quickly met McCarthy, who had just arrived from MIT, and together they tried to attract Minsky to Stanford Medical School as well.

Lederberg (1925—2008)
In 1964, Feigenbaum met Lederberg at a conference at Stanford’s Center for Advanced Study in the Behavioral Sciences, and their shared love of the philosophy of science fostered a long and fruitful collaboration. At the time, Lederberg’s research direction was the detection of extraterrestrial life, specifically analyzing mass spectrometer data collected from Mars to see whether life there was possible. Feigenbaum’s interest was machine induction, what we now call machine learning. One had data, the other had tools, and they hit it off. Historically, this was an interdisciplinary collaboration in which Lederberg’s influence and leadership played the central role. According to Buchanan, the task of Feigenbaum’s computing team was to algorithmize Lederberg’s ideas. Once Lederberg had worked out the philosophical framework, his interest moved on; his initial idea took Feigenbaum and the team five years to realize, and Lederberg complained that they were too slow.
Feigenbaum quickly discovered that Lederberg, a geneticist, was actually clueless about chemistry, so they enlisted Carl Djerassi, also at Stanford: chemist, writer, and inventor of the oral contraceptive. Djerassi never won a Nobel Prize but received both the National Medal of Science (whose recipients include Wiener, Gödel, Shannon, and Shing-Tung Yau) and the National Medal of Technology and Innovation (whose recipients include HP founder Packard, Intel founder Noyce, and Microsoft founder Gates), which is quite rare. Another person to receive both awards was John Cocke, inventor of the RISC computer architecture. Djerassi had just moved to Stanford from Wayne State University and was the first friend Lederberg made in California. The three collaborated to create the first expert system, DENDRAL, which took mass spectrometer data as input and output the chemical structure of the substance in question. Feigenbaum and his students captured the chemical-analysis knowledge of Djerassi and his students and distilled it into rules; the resulting system sometimes outperformed Djerassi’s own students. Djerassi’s lengthy autobiography gives DENDRAL only a brief mention: the project hardly registers in his illustrious academic career and colorful life. Djerassi noted in the autobiography that Feigenbaum always referred to the core of DENDRAL as the “Djerassi algorithm,” while Buchanan recalled that everyone considered Lederberg the provider of the domain knowledge; perhaps Feigenbaum was being diplomatic, or perhaps the computing team simply had more contact with Lederberg.
Feigenbaum was an academic activist, becoming director of the computing center shortly after arriving at Stanford, a position probably more influential then than that of computer science department head. In the early to mid-1960s, Feigenbaum visited the Soviet Union twice and was impressed by Soviet research in computer science and cybernetics; he had long observed that Soviet research was stronger on theory than on practice, though the success of Soviet chess programs did surprise the world. The Soviet definition of cybernetics was too broad and all-encompassing, resulting in a lack of focus and breakthroughs; the automation discipline in China at the time was modeled on the Soviet one. The United States had no automation discipline, but its all-encompassing EE overlapped heavily with automation. Feigenbaum realized that his Soviet colleagues were trying to use his reputation to endorse their work and secure funding. Meanwhile, in the United States, Bellman, the inventor of dynamic programming, advised the Air Force through the RAND Corporation that the U.S. should be wary of Soviet computer science research. Feigenbaum was displeased with Bellman’s report, believing Bellman was using the Soviet threat to secure research money for himself. Years later, however, Feigenbaum himself leveraged Japan’s Fifth Generation project to promote a Japanese threat, which raises questions about his own motivations. Feigenbaum’s several companies did not achieve notable success, for various reasons. Teknowledge failed, but its byproduct, the knowledge-base project SUMO, survived; now open source, it is one of the foundational common-sense knowledge graphs.
2. MYCIN
Buchanan, the leader of MYCIN, was also a core member of DENDRAL. He had a philosophy background and wide-ranging interests. In 1964, while studying philosophy at Michigan State University, he applied for a summer internship at the System Development Corporation (SDC), but SDC forwarded his resume to the RAND Corporation; resumes were apparently shared within the defense sector. Feigenbaum, who was spending the summer at RAND, called Buchanan, and so Buchanan interned at RAND and connected with Feigenbaum. Buchanan’s research interest was scientific discovery, which he approached logically rather than psychologically, not realizing that Feigenbaum also had a strong interest in the philosophy of science; in fact, the earliest articles on DENDRAL by Feigenbaum and Lederberg already spoke of “mechanizing scientific inference.” After earning his PhD, Buchanan wanted to teach philosophy and asked Feigenbaum for a recommendation letter, but Feigenbaum persuaded him to come to Stanford and engage in real scientific discovery instead. Buchanan’s philosophical training served him well: at the start of the DENDRAL project, neither Lederberg nor Feigenbaum appreciated the difference between hypothesis generation and theory formation, and Buchanan also recognized that the Carnapian theory he had learned in philosophy class did not work computationally. No one on the DENDRAL team fully understood the chemistry involved; everyone assumed the others did. Buchanan’s early talks always had to open with some chemistry background, which audiences found incomprehensible and grew impatient with. He recalls McCarthy once standing up and shouting at the audience, “Just listen, will you?” McCarthy’s reputation saved him.
After DENDRAL’s success, Buchanan began looking for new directions. Experimental science is relatively primitive compared with theoretical science, and primitive experience is relatively easy to convert into rules. Besides chemistry and biology, medicine was another field where expert systems could be put to immediate use. Around then, Stanford Medical School welcomed a talent from Harvard, Edward Shortliffe, who earned his M.D. at Stanford in 1976 but had already obtained a PhD in computer science a year earlier under Buchanan’s guidance; his thesis was the expert system MYCIN, a diagnostic system for bacterial infections. MYCIN’s prescriptions were correct 69% of the time, versus 80% for specialists, but it already outperformed non-specialist doctors. Shortliffe received ACM’s 1976 Grace Murray Hopper Award for young computer scientists. He later spent three years as an internal medicine resident at Massachusetts General Hospital before returning to Stanford as a joint professor in the medical school and the computer science department.
The MYCIN team regarded DENDRAL as the progenitor of expert systems, partly because DENDRAL did come first and partly because Buchanan himself came from DENDRAL. Newell, as an outsider, believed MYCIN was the true progenitor, because MYCIN pioneered what later became the defining elements of expert systems: production rules and reasoning under uncertainty. DENDRAL’s original aim had been machine induction from data, that is, machine learning. Although MYCIN was never used clinically, the principles developed in it were gradually distilled into EMYCIN, a shell containing the core machinery of an expert system. The motivation for EMYCIN was twofold: besides generalization, there was government funding. In the early 1970s, DARPA cut funding for artificial intelligence, switching from long-term grants to annual reviews. Each time they reported to DARPA, Feigenbaum’s team had to choose their words carefully, not daring to admit that research money was going to medicine-related work. Only later, when funding came from the National Institutes of Health (NIH) and the National Library of Medicine (NLM), did the situation improve.
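MYCIN handled uncertainty by attaching “certainty factors” to rules and conclusions. As a rough illustration, here is a minimal Python sketch of the commonly cited combination rules for certainty factors in [−1, 1]; the numbers and the diagnosis are invented, and historical MYCIN had machinery beyond this.

```python
# Combining MYCIN-style certainty factors (CF values in [-1, 1]).
def combine_cf(x: float, y: float) -> float:
    """Merge two certainty factors bearing on the same hypothesis."""
    if x >= 0 and y >= 0:
        return x + y * (1 - x)                  # two supporting pieces of evidence
    if x <= 0 and y <= 0:
        return x + y * (1 + x)                  # two opposing pieces of evidence
    return (x + y) / (1 - min(abs(x), abs(y)))  # mixed evidence

# Two rules suggest the same diagnosis with CF 0.6 and 0.4:
print(combine_cf(0.6, 0.4))   # 0.76 -- stronger than either rule alone
print(combine_cf(0.7, -0.3))  # ~0.57 -- counter-evidence pulls it down
```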
3. Maturity of Expert Systems
One of the main measures of a field’s maturity is whether it can turn a profit, and artificial intelligence had long been criticized for its lack of commercial applications. The most successful case of the expert system era was DEC’s configuration expert system XCON. DEC, the darling of the pre-PC era, used minicomputers to challenge IBM; when customers ordered DEC’s VAX series computers, XCON automatically configured the components to their requirements. From its launch in 1980 through 1986, XCON processed a total of 80,000 orders.
Exactly how much money XCON saved DEC has always been a mystery: the highest claim is $40 million a year, some estimates say $25 million, and the lowest put it at a few million at most. Regardless, DEC promoted XCON as a commercial success. XCON did reflect real technical progress, originating in CMU’s R1. Interestingly, the earliest configurator was written in Fortran, and after that failed it was, surprisingly, rewritten in BASIC. Newell’s PhD student Charles Forgy invented the Rete algorithm and the OPS language, greatly improving the efficiency of expert systems, and XCON soon adopted OPS and later OPS5.
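To make “production rules” concrete: a production system repeatedly matches rule conditions against a working memory of facts and fires the rules that match. Below is a minimal, naive match-act loop in Python (the facts and rule contents are invented for illustration); Forgy’s Rete algorithm exists precisely to avoid this kind of brute-force re-matching of every rule on every cycle.

```python
# A naive forward-chaining production system: match rules, fire, repeat.
facts = {("order", "vax780")}

# Each rule: (name, condition on the fact set, facts to assert when fired)
rules = [
    ("add-cpu",
     lambda f: ("order", "vax780") in f,
     {("needs", "vax780", "cpu")}),
    ("add-backplane",
     lambda f: ("needs", "vax780", "cpu") in f,
     {("needs", "vax780", "backplane")}),
]

changed = True
while changed:                          # iterate until a fixed point
    changed = False
    for name, condition, additions in rules:
        if condition(facts) and not additions <= facts:
            print("firing:", name)
            facts |= additions          # "act": assert the new facts
            changed = True

print(sorted(facts))
```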
From the early 1980s to the early 1990s, expert systems enjoyed a golden decade. With the disillusionment over Japan’s Fifth Generation project, the term “expert system” went from trendy to pejorative. The rise of e-commerce on the internet then created many XCON-like scenarios, and expert systems were repackaged as “rule engines,” becoming standard middleware. Credit scoring, fraud detection, and risk control have always been strengths of rule systems, and the credit-scoring company FICO acquired a series of struggling expert-system companies, including Forgy’s RulesPower. Today very few independent expert-system companies remain.
4. Knowledge Representation
Knowledge representation, born of expert systems and natural language understanding, has always been a lukewarm corner of artificial intelligence. KRL (Knowledge Representation Language) was one of the earliest knowledge representation languages, influential but not successful. Winograd, who worked on the KRL project at Xerox PARC, summarized the lessons years later: KRL tried to solve two problems at once. First, usability for knowledge engineers, meaning it had to be human-readable and writable; second, a McCarthy-style logical foundation to support its semantics. Trying to reconcile these two contradictory goals inevitably produced something complex beyond recognition, leaving both knowledge engineers and logicians dissatisfied.
Logic
Logic is the most convenient knowledge representation language: it has been familiar since Aristotle and has well-studied mathematical properties. Any introductory logic book includes the famous example about Socrates: All men are mortal; Socrates is a man; therefore, Socrates is mortal. In modern mathematical logic, this syllogism can be expressed as follows.
Major premise: (∀x)(Man(x) ⊃ Mortal(x))
Minor premise: Man(Socrates)
Conclusion: Mortal(Socrates)
First-order logic, also known as predicate logic, is the result of Hilbert’s simplification of Russell’s “Principia Mathematica.” Predicate logic has no ontology, that is, no axioms about any specific world. For this reason, philosophers and logicians like Quine equate logic with first-order logic: first-order logic is pure syntax, without ontology or semantics, while higher-order logic, in Quine’s eyes, is merely “set theory in disguise.” The knowledge Feigenbaum spoke of is precisely ontology. Of course, Feigenbaum approached the issue from a psychological rather than a logical perspective, clearly influenced by his teachers Newell and Simon.
Computability and computational complexity theory are closely tied to logic: first-order logic is undecidable, and the satisfiability problem for propositional logic is NP-complete. A core problem in knowledge representation is finding a subset of first-order logic that is decidable and as efficient as possible; description logic emerged to address this. Description logic can express entities and classes as well as the relationships between classes. Entities correspond to constants in first-order logic. Assertions about entities form what is called the ABox; for example, “Newton is a physicist” can be represented as:
Physicist(Newton)
In description logic, there is no need for variables, and the terminology resembles set theory. The relationships between classes are referred to as TBox. For instance, in an ontology, law firms (Lawfirm) are subsets of companies (Company), companies are subsets of organizations (Organization), organizations are subsets of agents (Agent), and agents are subsets of things (Thing). This series of relationships can be represented as:
Lawfirm ⊑ Company ⊑ Organization ⊑ Agent ⊑ Thing
The corresponding first-order logic expressions are: Lawfirm(x) → Company(x), Company(x) → Organization(x), Organization(x) → Agent(x), Agent(x) → Thing(x)
In first-order theorem proving, term indexing includes the notion of subsumption, the subset relation between terms; the TBox expresses a simplified form of subsumption. Besides the ABox and TBox there is the RBox, which represents relations (roles); relations support the operations familiar from set theory, such as subset, intersection, and union. For example, “the father of one’s father is one’s grandfather” can be represented as: hasFather ∘ hasFather ⊑ hasGrandfather. The corresponding first-order logic expression is:
hasFather(x, y) ∧ hasFather(y, z) → hasGrandfather(x, z)
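A small sketch of how ABox, TBox, and RBox inference can be emulated in plain Python with sets and transitive closure; the class names follow the text’s examples, the individuals are invented, and real description-logic reasoners use far more sophisticated tableau or consequence-based algorithms.

```python
# Toy ABox/TBox/RBox reasoning via transitive closure (illustration only).
tbox = {  # direct subsumptions: class -> its immediate superclasses
    "Lawfirm": {"Company"}, "Company": {"Organization"},
    "Organization": {"Agent"}, "Agent": {"Thing"}, "Thing": set(),
}
abox = {("Physicist", "Newton"), ("Lawfirm", "DeweyLLP")}  # class assertions
role_facts = {("hasFather", "Tom", "Dick"), ("hasFather", "Dick", "Harry")}
rbox = {("hasFather", "hasFather"): "hasGrandfather"}       # role composition

def superclasses(c):
    """All classes subsuming c, by transitive closure over the TBox."""
    out, todo = set(), [c]
    while todo:
        for sup in tbox.get(todo.pop(), ()):
            if sup not in out:
                out.add(sup)
                todo.append(sup)
    return out

# TBox inference: DeweyLLP is also a Company, an Organization, an Agent...
inferred = {(sup, ind) for cls, ind in abox for sup in superclasses(cls)}

# RBox inference: hasFather(Tom, Dick) and hasFather(Dick, Harry)
# yield hasGrandfather(Tom, Harry).
for (r1, r2), r3 in rbox.items():
    for ra, x, y in list(role_facts):
        for rb, y2, z in list(role_facts):
            if (ra, rb) == (r1, r2) and y == y2:
                role_facts.add((r3, x, z))

print(sorted(inferred | abox))
print(sorted(role_facts))
```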
Psychology and Linguistics
Another source of knowledge representation is psychology and linguistics. For example, the most convenient way to represent the hierarchical inheritance of concepts is a tree rather than first-order logic. Psychological experiments show that people take longer to answer “Can canaries fly?” than “Can birds fly?”: to answer the first question, one must additionally infer that a canary is a bird. This is because, when storing knowledge, people store only the abstraction, an economy of storage. The psychologist George Miller, who together with Chomsky pioneered cognitive science, is perhaps best known for his paper “The Magical Number Seven.” Beyond his theoretical contributions, in his later years he led Princeton University’s cognitive science laboratory to create WordNet. WordNet is not just a thesaurus: it also defines hierarchical relations between words; for example, one hypernym of car is motor vehicle, which leads up through wheeled vehicle all the way to entity. WordNet has become a fundamental tool in natural language processing.
Figure: WordNet
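WordNet’s hypernym (is-a) chains can be walked programmatically. A minimal sketch using NLTK’s WordNet interface, assuming nltk is installed and the wordnet corpus has been downloaded:

```python
# Walk WordNet's hypernym (is-a) hierarchy with NLTK.
# Setup (once): pip install nltk; then nltk.download("wordnet")
from nltk.corpus import wordnet as wn

car = wn.synset("car.n.01")            # the "automobile" sense of "car"
for path in car.hypernym_paths():      # each path runs from the root to "car"
    print(" -> ".join(s.name() for s in path))
# One path ends: ... -> wheeled_vehicle.n.01 -> self-propelled_vehicle.n.01
#                -> motor_vehicle.n.01 -> car.n.01
```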
Minsky’s Frame
A frame is a type. A canary is a bird, and all the properties of birds transfer automatically to canaries: if birds can fly, canaries can fly too. An iPhone is a phone, and if phones can make calls, iPhones can make calls too. Frames led to the object-oriented design philosophy, and the related programming languages were influenced by them. In this sense, it confirms that once a concept has a mature implementation, it automatically detaches itself from artificial intelligence. The semantic network (semantic net), which appeared around the same time, is a representation equivalent to frames: each node in a semantic network is a frame, and each edge on a node can be viewed as a slot.
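The inheritance behavior of frames maps directly onto classes in object-oriented languages. A minimal Python sketch (class names invented for illustration); the penguin shows the classic wrinkle of default properties that subtypes must override:

```python
# Frames as classes: slots and their defaults inherit down the type hierarchy.
class Bird:
    can_fly = True                # a slot with a default value

class Canary(Bird):               # "a canary is a bird"
    color = "yellow"              # an extra slot of its own

class Penguin(Bird):              # exceptions override the inherited default,
    can_fly = False               # the classic headache for frame systems

print(Canary().can_fly)           # True -- inherited from Bird
print(Penguin().can_fly)          # False -- overridden locally
```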
Sowa’s Conceptual Graph
John Sowa of IBM proposed “conceptual graphs” in the early 1980s, attempting to put knowledge representation on a more solid mathematical and logical foundation. Around the same time or slightly earlier, the German mathematician Rudolf Wille proposed “formal concept analysis,” grounded in algebra; programming language theory was also becoming increasingly rigorous. In conceptual graphs, a type hierarchy with multiple inheritance can be represented by the algebraic partial order known as a “lattice.” A total order is a special case of a partial order: in a totally ordered set, any two members a and b satisfy either a ≤ b or b ≤ a. A partial order allows a member to have multiple immediate superiors and subordinates, whereas in a total order each member has at most one of each, which is why total orders are sometimes called linear orders. When a lattice is used for knowledge representation, each concept is a member of the lattice, and the concepts stand in a partial order.
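A hedged sketch of such a type lattice in Python (type names invented): the order relation is checked by reachability over parent links, multiple inheritance is just multiple parents, and the join of two types, their least upper bound, is what makes the partial order a lattice.

```python
# A type hierarchy as a partial order: a type may have several parents.
parents = {
    "AmphibiousCar": {"Car", "Boat"},   # multiple inheritance
    "Car": {"Vehicle"}, "Boat": {"Vehicle"},
    "Vehicle": {"Thing"}, "Thing": set(),
}

def leq(a, b):
    """a <= b iff b is reachable from a by following parent links."""
    return a == b or any(leq(p, b) for p in parents[a])

def ancestors(a):
    out = {a}
    for p in parents[a]:
        out |= ancestors(p)
    return out

def join(a, b):
    """Least upper bound(s): the minimal common ancestors of a and b."""
    common = ancestors(a) & ancestors(b)
    return {c for c in common
            if not any(d != c and leq(d, c) for d in common)}

print(leq("AmphibiousCar", "Vehicle"))  # True: the order relation holds
print(leq("Car", "Boat"))               # False: Car and Boat are incomparable
print(join("Car", "Boat"))              # {'Vehicle'}
```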
5. Lenat and Large Knowledge Systems
Amid the frenzy set off by Japan’s Fifth Generation project, the U.S. government decided to unite several high-tech companies to establish the Microelectronics and Computer Technology Corporation (MCC) in Austin, Texas, to counter Japan. Admiral Inman was appointed CEO, and Woody Bledsoe, a senior professor at UT Austin working on machine theorem proving, joined MCC full-time for R&D, a division of labor reminiscent of General Groves and Oppenheimer in the WWII Manhattan Project. Feigenbaum had proposed that the U.S. establish a national center for knowledge technology to build, as Diderot did with the encyclopedia, a repository of all human knowledge, and this naturally had a significant influence on MCC’s plans. Bledsoe recommended Feigenbaum’s student Douglas Lenat.
At the time, Lenat, in his early thirties, was a rising star of artificial intelligence. After taking dual degrees in mathematics and physics at the University of Pennsylvania, he lost interest in academic work in both fields; facing the draft upon graduation, however, he continued his studies in a PhD program at Caltech. There he developed a strong interest in artificial intelligence and transferred to Stanford to study under McCarthy, but it happened to be McCarthy’s sabbatical year, so he became a student of Feigenbaum and Buchanan instead. His doctoral thesis implemented a program called AM, short for Automated Mathematician, which could automatically “discover” theorems, and the year after his PhD it won him IJCAI’s “Computers and Thought” Award. Lenat did not use the word “invent,” which in some sense reflects his philosophical stance. After a series of criticisms of AM’s lack of rigor, Lenat produced its successor, Eurisko, which had broader applications, including games.

Lenat (1950— )
When Lenat arrived at MCC, he had a new idea: encode human common sense into a knowledge base. The new project was called Cyc, the three letters taken from the English word “encyclopedia.” This was essentially the earliest knowledge graph. Lenat firmly supported his teacher Feigenbaum’s Knowledge Principle: a system exhibits advanced intelligent understanding and behavior primarily because of the specific knowledge it embodies about its domain: concepts, facts, representations, methods, metaphors, and heuristics. Lenat even stated, “Intelligence is a million rules.”
Sowa proposed the notion of a “knowledge soup”: the knowledge in our heads is not one lump but many lumps, each internally consistent, possibly inconsistent with one another, and only loosely coupled. Guha, whose doctoral advisors at Stanford were McCarthy and Feigenbaum, wrote about how to decompose one large theory into multiple “microtheories,” and about using Cyc as a front end to multiple different data sources rather than as a single monolithic whole, which is precisely an implementation of Sowa’s knowledge soup. Cyc could thus become a tool for data or information integration. Lenat was somewhat displeased with this, but he recruited Guha anyway.
Lenat had high hopes for Cyc. In 1984, he predicted that within 15 years, by 1999, every computer sold on the street would come with Cyc pre-installed. In 1986, he further predicted that completing Cyc would require at least 250,000 rules and at least 350 person-years, say 35 people working for ten years. The Cyc project initially had around 30 knowledge engineers, whose daily work was to encode everyday common sense in Cyc’s language CycL, covering areas like education, shopping, entertainment, and sports. By 1995, with Japan’s Fifth Generation project gone, the U.S. government also reduced its support for MCC. Lenat took Cyc out of MCC and founded Cycorp, embarking on a long entrepreneurial march. The core member Guha also left MCC, subsequently joining Apple, Netscape, and Google.
WordNet, by contrast, can easily be found in the package repositories of various Linux distributions. WordNet is more basic and easier to use than Cyc, though of course it lacks Cyc’s reasoning capabilities. Fifty years from now, people may not be as familiar with first-order logic as they are with Shakespeare; perhaps WordNet is not a good comparison. Cyc’s original goal was closer to today’s Wikipedia, except that Wikipedia’s audience is humans while Cyc’s users are machines. In the early 1990s, Cyc was criticized for having no success stories, while other expert systems had found applications to varying degrees. Lenat’s defense was that Cyc would pay off only after its knowledge reached a critical mass. Setting aside the criticisms of that era: more than twenty years have passed, and we still see no significant applications.
Cyc now comes in two versions: a paid enterprise version and a research version open to researchers. There was once an open-source version, OpenCyc, a simplified edition, but it was discontinued after trials surfaced too many problems; Cycorp plans to replace OpenCyc with a cloud version. Lenat once said, “Learning only occurs at the edge of what is known, so people can only learn new things similar to what they already know. If what you are trying to learn is not far from what you already know, you can learn it; and the larger that edge (the more you know), the more likely you are to discover new things.” This not only reflects his early insights into machine learning but can also be read as his understanding of the later Cyc project. When Lenat began Cyc in 1984, he was barely in his thirties; now, more than thirty years later and nearing seventy, he still serves as the CEO of Cycorp.
6. Semantic Web
The expert-system lineage was congenitally weak in logic and had long been at odds with the theorem-proving faction. After the expert-system wave receded, the line became an undercurrent, until Tim Berners-Lee, the World Wide Web’s accidental hero, proposed the “Semantic Web” (see Berners-Lee 2001), and its adherents believed their moment had arrived. Berners-Lee became famous for the grassroots, convenient HTTP protocol and the hypertext markup standard HTML, and was labeled by various media as the inventor of the World Wide Web. With the first wave of internet enthusiasm, he promptly left CERN, the European particle physics laboratory, for a position at the newly established World Wide Web Consortium (W3C) at MIT. MIT housed him in its then Laboratory for Computer Science (since merged into CSAIL, the Computer Science and Artificial Intelligence Laboratory), evidently to enhance the school’s influence amid the internet wave: the boom had widened the gap between Silicon Valley, the U.S. capital of tech innovation, and Boston’s Route 128, where MIT sits. Twenty years later, Berners-Lee lived up to expectations, receiving the 2016 Turing Award, perhaps the least weighty Turing Award in history.
In fact, greater credit for the World Wide Web should go to the genius programmer Marc Andreessen, whose revolutionary Mosaic browser set off the internet revolution. The young Andreessen aimed to change the world rather than merely gain fame. Under the guidance and with the help of Jim Clark, he founded the iconic internet company Netscape, and later went through several arduous but not particularly successful ventures. When the second internet peak arrived, Andreessen founded, with impeccable timing, the new-generation venture capital firm Andreessen Horowitz, achieving results and influence to rival the established firms KPCB and Sequoia Capital.
Back to the point. HTML, the hypertext markup standard, was a somewhat shortsighted simplification of SGML, the document standard that had matured in the 1980s; HTTP was at best a trivial toy hanging off the browser’s sturdy body, until several revisions by the IETF, the internet standards body, made it more professional. The goal of the World Wide Web Consortium (W3C) is to set standards for the Web. A group of long-ignored, non-mainstream IT practitioners quickly gathered around Berners-Lee, and the assorted slapdash standards that came out of W3C duly reflected their lack of theoretical foundation. In W3C meetings, senior practitioners from the big tech companies who had drifted to the margins sat on the working groups representing their employers, contributing little technically while constantly seeking lofty justifications for their own existence, detached from management back home. At AAAI 2006, during Berners-Lee’s keynote, Peter Norvig, then Google’s research director, questioned him sharply, which was read as a merciless critique of the Semantic Web.
After some quasi-logicians joined, W3C’s Semantic Web work introduced description logic and took on an appearance of rigor; after several iterations it evolved into a hodgepodge, theoretically unsound and practically unusable. As the saying goes, all beginnings are hard; but a bad beginning is a disaster, erecting artificial obstacles to later correction. Compare the Semantic Web’s output with the early DENDRAL and MYCIN projects: in theory, in practice, and in socio-political environment, there is simply no comparison. Almost every “Semantic Web” project bears Guha’s shadow; in 2013, while at Google, he gave a talk titled “Light at the End of the Tunnel,” which read less as a boast of success than as a summary of lessons learned.
7. Google and Knowledge Graphs
Alongside Wikipedia there was also Freebase. Wikipedia’s audience is humans; Freebase emphasized machine readability. By 2016, Wikipedia had reached 10 million articles, 5 million of them in the English edition, while Freebase held 40 million entities. Behind Freebase was a startup called Metaweb, one of whose founders was Danny Hillis. In 2010, Metaweb was acquired by Google, which gave the technology the catchy name “Knowledge Graph.” In 2016, Google stopped updating Freebase and donated all its data to Wikidata, a project of the Wikimedia Foundation, Wikipedia’s parent organization, supported by the Allen Institute for Artificial Intelligence, founded by Microsoft co-founder Paul Allen.
Besides Wikidata, there are several other open knowledge graphs, such as DBpedia, YAGO, and SUMO; notably, SUMO is the legacy of Teknowledge, the failed company Feigenbaum founded. One major source of foundational data for all the open knowledge graphs is Wikipedia. Take the Wikipedia entry for Marie Curie: on the right side of the “Marie Curie” page sits a box called the infobox, which contains data about her, such as birth and death dates, birthplace, alma mater, advisors, and students, and this data is close to structured data in quality.

Wikipedia entry for “Marie Curie”
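Such infobox fields can be extracted programmatically. A sketch using the MediaWiki API together with the mwparserfromhell wikitext parser (both real libraries; the exact field names vary by article, and error handling is omitted):

```python
# Fetch a Wikipedia page's wikitext and print its infobox fields.
# Setup: pip install requests mwparserfromhell
import requests
import mwparserfromhell

resp = requests.get(
    "https://en.wikipedia.org/w/api.php",
    params={"action": "parse", "page": "Marie Curie",
            "prop": "wikitext", "format": "json"},
)
wikitext = resp.json()["parse"]["wikitext"]["*"]

for tpl in mwparserfromhell.parse(wikitext).filter_templates():
    if tpl.name.strip().lower().startswith("infobox"):
        for param in tpl.params[:10]:             # first few fields only
            print(param.name.strip(), "=",
                  param.value.strip_code().strip()[:60])
        break
```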
The underlying structure of IBM Watson integrates two open knowledge graphs, YAGO and DBpedia. On top of such common-sense graphs, specialized graphs can be constructed for vertical domains (e.g., biology, healthcare, finance, e-commerce, transportation).
Newell and Simon were the symbolicists of artificial intelligence. In fact, even within the symbolic camp there were sub-camps: more “symbolic” than Simon’s faction was machine theorem proving, and Newell and Simon clashed with a group of logicians early in their careers, while Feigenbaum inherited his teachers’ genes, vigorously attacking the second-generation standard-bearer of theorem proving, Alan Robinson. Norberg, who conducted oral histories at the Charles Babbage Institute of the University of Minnesota, liked to reduce the symbolic camp to a rivalry between MIT and Carnegie Mellon, with Stanford’s McCarthy and SRI’s Nilsson leaning toward MIT, and Stanford’s Feigenbaum favoring his alma mater, Carnegie Mellon. Of course, one can trace this back to the rivalry McCarthy and Simon developed at the earlier Dartmouth Conference.

Ultimately, however, the theoretical foundation of expert systems remains machine theorem proving. Although Feigenbaum in a sense manufactured the topic of “knowledge vs. reasoning” and emphasized the importance of knowledge to logical reasoning, knowledge and reasoning are an inseparable pair; emphasizing knowledge does not remove you from the symbolic camp. Viewed purely from the perspective of theorem proving, knowledge is simply axioms: the more axioms, the fewer reasoning steps. The so-called opposition between knowledge and reasoning is really the distinction between the narrow (special purpose) and the broad (general purpose). Knowledge is narrow; reasoning, which does not require many axioms, is broad. Narrow approaches can achieve high machine efficiency in the short term, but the learning threshold for humans is relatively high; conversely, broad approaches are naturally less efficient for machines but easier for humans to learn. First-order logic has the lowest learning threshold, but as the knowledge base grows, the reasoning engine must become ever more specialized to remain efficient.