Knowledge graphs: Introduction, history, and perspectives

Knowledge graphs (KGs) have emerged as a compelling abstraction for organizing the world’s structured knowledge and for integrating information extracted from multiple data sources. They are also beginning to play a central role in representing information extracted by AI systems, and for improving the predictions of AI systems by giving them knowledge expressed in KGs as input. The goals of this article are to (a) introduce KGs and discuss important areas of application that have gained recent prominence; (b) situate KGs in the context of the prior work in AI; and (c) present a few contrasting perspectives that help in better understanding KGs in relation to related technologies.


INTRODUCTION
The term knowledge graph (KG) has gained several different meanings across a range of usage scenarios. This paper focuses on the use of KGs in the context of two important current trends: the desire and need to harness the large and diverse data that are now available, and the advent of new machine learning capabilities for extracting meaning from unstructured text and images. It provides the authors' perspective on this area and tracks recent efforts in the NSF Convergence Accelerator Track A on Open Knowledge Network (OKN), where the first author was a participant.
In a KG viewed as a directed labeled graph, a node could represent any real-world entity, for example, people, companies, and computers. An edge label captures the relationship of interest between the two nodes, for example, a friendship relationship between two people; a customer relationship between a company and a person; or a network connection between two computers.
There are multiple approaches for associating meanings with the nodes and edges. At the simplest level, the meanings could be stated as documentation strings expressed in a human-understandable language such as English. At a computational level, the meanings can be expressed in a formal specification language such as first-order logic. An active area of current research is to automatically compute the meanings captured in a vector consisting of a sequence of numbers. We will contrast these approaches for capturing meaning in a later section on symbolic versus vector representations.
Information can be added to a KG via a combination of human-driven, semiautomated, and/or fully automated methods. Regardless of the method, it is expected that the recorded information can be easily understood and verified by humans. We will contrast different approaches to creating a KG in a later section on human curation versus machine curation.
Search and query operations on KGs can be reduced to graph navigation. For example, in a friendship KG, to obtain the friends of the friends of a person A, one can first navigate the graph from A to all nodes B connected to it by a relation labeled as friend. One can then recursively navigate to all nodes C connected by the friend relation to each B. Directed labeled graph representation and graph algorithms are effective for several classes of problems. They are, however, insufficient to capture all inferences of interest. We will discuss this in more detail in a later section on big semantics versus little semantics.
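The friends-of-friends navigation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of any particular KG engine: the triples, the node names, and the helper functions build_index, neighbors, and friends_of_friends are all hypothetical.

```python
from collections import defaultdict

# Hypothetical friendship triples (subject, edge label, object); names are illustrative only.
TRIPLES = [
    ("A", "friend", "B1"),
    ("A", "friend", "B2"),
    ("B1", "friend", "C1"),
    ("B2", "friend", "C2"),
    ("B2", "friend", "A"),
    ("A", "colleague", "D"),
]


def build_index(triples):
    """Index outgoing edges by (source node, edge label)."""
    index = defaultdict(set)
    for subject, label, obj in triples:
        index[(subject, label)].add(obj)
    return index


def neighbors(index, node, label):
    """Nodes reachable from `node` by one edge carrying `label`."""
    return index.get((node, label), set())


def friends_of_friends(index, person):
    """Two-hop navigation along `friend` edges, excluding the starting person."""
    result = set()
    for b in neighbors(index, person, "friend"):
        result |= neighbors(index, b, "friend")
    result.discard(person)
    return result


INDEX = build_index(TRIPLES)
print(sorted(friends_of_friends(INDEX, "A")))
```

Indexing edges by source node and label keeps each navigation step a dictionary lookup; edges with other labels, such as colleague, are simply never visited.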
Practical systems adapt the directed labeled graph representation to suit specific application requirements. For example, a KG model prominently used over the World Wide Web, called the Resource Description Framework (RDF) (Cyganiak, Wood, and Lanthaler 2014), uses Internationalized Resource Identifiers (IRIs) to uniquely identify "things" (entities). Property graph models (Robinson, Webber, and Eifrem 2015) associate properties and values with each node and each edge. Edge properties can be used for a variety of purposes: to represent facts that are in dispute (for example, a country in which Kashmir resides); highly time-dependent information (for example, the president of the USA); or genuine diversities (for example, user behaviors). With the recent emphasis on responsible AI, annotating the edges with information on how they were obtained plays a key role in explaining inferences based on the KG. For example, an edge property of confidence could be used to represent the probability with which that relationship is known to be true. Finally, query languages, such as SPARQL (Pérez et al. 2006) for RDF and Graph Query Language for property graph models, provide the ability to query the information in RDF and property graph KGs, respectively.
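To make the idea of edge properties concrete, the following sketch attaches a confidence value and a provenance note to each edge and filters on them. It is a plain-Python illustration rather than the API of any property graph database; the Edge class, the property names, the company names, and the numeric values are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Edge:
    """A labeled edge that carries its own properties, as in a property graph."""
    source: str
    label: str
    target: str
    properties: dict = field(default_factory=dict)


# Hypothetical edges; confidence values and provenance strings are made up for illustration.
EDGES = [
    Edge("PersonP", "customer_of", "CompanyX", {"confidence": 0.95, "obtained_from": "internal database"}),
    Edge("PersonQ", "customer_of", "CompanyX", {"confidence": 0.40, "obtained_from": "text extraction"}),
    Edge("PersonP", "friend", "PersonQ", {"confidence": 0.80, "obtained_from": "survey"}),
]


def edges_with_min_confidence(edges, label, threshold):
    """Return edges with the given label whose confidence is at least `threshold`."""
    return [
        e for e in edges
        if e.label == label and e.properties.get("confidence", 0.0) >= threshold
    ]


for edge in edges_with_min_confidence(EDGES, "customer_of", 0.5):
    print(edge.source, edge.label, edge.target, "via", edge.properties["obtained_from"])
```

Keeping provenance next to the confidence value means that any inference built on an edge can report where that edge came from.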

APPLICATIONS OF KNOWLEDGE GRAPHS
Two key applications that have led to a surge in popularity of KGs are: (1) integration and organization of information about known "entities," either as an openly accessible resource on the web, or as a proprietary resource within an enterprise/organization; and (2) representation of input and output information for AI/ML algorithms. These application use cases are explored further in the following sections.

Organizing open information
Wikidata is a collaboratively edited open KG that provides data for Wikipedia and for other uses on the web (Vrandečić and Krötzsch 2014). As illustrated in the following example, the Wikidata KG can help enhance and improve the quality of information in Wikipedia. Consider the Wikipedia page for the town, Winterthur, which includes a list of all of Winterthur's twin towns: two are in Switzerland, one in the Czech Republic, and one in Austria. Wikipedia also has an entry for the city, Ontario, in California, which lists Winterthur as its sister city. The "sister city" and "twin city" relationships are meant to be identical as well as reciprocal. Thus, if a city A is a sister (twin) of another city B, then B must be a sister (twin) of A. In Wikipedia, "Sister cities" and "Twin towns" are simply section headings without any relationship/linkage specified between the two. Therefore, it is difficult to detect this discrepancy automatically. In contrast, the Wikidata representation of Winterthur includes a relationship called twinned administrative body, which includes the city of Ontario, CA. As this relationship is defined to be a symmetrical relationship in the KG, a SPARQL query engine can infer that the Wikidata page for the city of Ontario, CA is to be linked to the Wikidata page of Winterthur.
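The effect of declaring a relation symmetric can be illustrated with a small sketch. It does not use Wikidata's actual data model or a SPARQL engine; the statement set, the SYMMETRIC_RELATIONS declaration, and both functions are hypothetical stand-ins that show how a reverse link follows once symmetry is stated.

```python
# Hypothetical statements; only the Winterthur -> Ontario direction is recorded explicitly.
STATEMENTS = {
    ("Winterthur", "twinned administrative body", "Ontario, CA"),
    ("Winterthur", "country", "Switzerland"),
}

# Relations declared symmetric by curators (an assumption for this sketch).
SYMMETRIC_RELATIONS = {"twinned administrative body"}


def symmetric_closure(statements, symmetric_relations):
    """Add the reverse statement for every statement that uses a symmetric relation."""
    inferred = set(statements)
    for subject, relation, obj in statements:
        if relation in symmetric_relations:
            inferred.add((obj, relation, subject))
    return inferred


def missing_reciprocals(statements, symmetric_relations):
    """Reverse statements implied by symmetry but absent from the recorded data."""
    return symmetric_closure(statements, symmetric_relations) - set(statements)


print(missing_reciprocals(STATEMENTS, SYMMETRIC_RELATIONS))
```

The same check is not possible over free-text section headings, because nothing there states that the two headings denote one reciprocal relation.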
Wikidata solves the problem of identifying inverse relationships through the relation definitions created by curators and by using inference made possible through a KG inference engine. More advanced forms of such inference are illustrated in the Environmental Intelligence OKN (Janowicz et al. 2022) and the flood impact evaluation OKN (Johnson et al. 2022) reported in this issue. To the degree that the Wikidata KG is fully integrated into Wikipedia, the discrepancy of missing links in the example provided here would not be present. Figure 1 depicts the two-way relationship between Winterthur and Ontario and shows some of the other objects to which Winterthur and Ontario are connected. Wikidata includes information from several independent providers including, for example, the Library of Congress. By using unique internal identifiers for distinct entities, for example, Winterthur, from a variety of sources, such as the Library of Congress and others, the information about an entity can be easily linked together. Wikidata makes it easy to integrate the different data sources by publishing a mapping of the Wikidata relations to the schema.org ontology. Such tools were recently leveraged to add information about COVID-19 to Wikidata (Waagmeester et al. 2021). Mappings from relation names in Wikidata to relation names in other sources enable formulation and processing of queries spanning multiple datasets across such sites using relations that are common to that set of sites (Peng et al. 2018). An example of such a request is: Display on a map the birth cities of people who died in Winterthur. Without a common relation vocabulary, for example, birth city, it would be necessary to create appropriate translations between relations used in one site to the relations used in other sites. Search engines are routinely using the results of such queries to enhance their results (Noy et al. 2019).
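The role of a shared relation vocabulary in cross-dataset queries can be sketched as follows. Everything in the example is hypothetical: the two sources, their relation names, the people, and the facts are invented, and the sketch assumes that both sources already use the same identifiers for the same entities.

```python
# Hypothetical per-source relation names mapped to a shared vocabulary.
RELATION_MAP = {
    "source_a": {"place of birth": "birth_city", "place of death": "death_city"},
    "source_b": {"bornIn": "birth_city", "diedIn": "death_city"},
}

# Hypothetical records; entity identifiers are assumed to be shared across sources.
SOURCE_TRIPLES = {
    "source_a": [("Person1", "place of death", "Winterthur")],
    "source_b": [("Person1", "bornIn", "Zurich"), ("Person2", "bornIn", "Bern")],
}


def normalize(source_triples, relation_map):
    """Rewrite every triple so that its relation uses the shared vocabulary."""
    normalized = set()
    for source, triples in source_triples.items():
        mapping = relation_map.get(source, {})
        for subject, relation, obj in triples:
            normalized.add((subject, mapping.get(relation, relation), obj))
    return normalized


def birth_cities_of_people_who_died_in(triples, city):
    """Join death_city and birth_city statements on the person."""
    deceased = {s for s, r, o in triples if r == "death_city" and o == city}
    return {o for s, r, o in triples if r == "birth_city" and s in deceased}


MERGED = normalize(SOURCE_TRIPLES, RELATION_MAP)
print(birth_cities_of_people_who_died_in(MERGED, "Winterthur"))
```

Once the relation names are normalized, the query itself never needs to know which source contributed which statement.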

Graph Underlying Wikidata
As of 2021, Wikidata contained over 90 million distinct objects with over one billion relationships among those objects. Wikidata makes connections across over 4872 different catalogs in 414 different languages published by independent data providers. As per a recent estimate, 31% of all websites and over 12 million data providers are currently using the vocabulary of schema.org to publish annotations to their web pages (Guha, Brickley, and Macbeth 2016).
There are many new and exciting aspects of the Wikidata KG. First, it is a public graph of unprecedented scale, and one of the largest KGs openly available today. Second, even though it is manually curated, the cost of curation is shared by a community of contributors. Third, while some of the data in Wikidata may be automatically extracted from sources (Wu, Hoffmann, and Weld 2008), all information is required to be easily understandable and verifiable as per the Wikidata editorial policies. Lastly, and importantly, there is a commitment to providing semantic definitions of relation names through the vocabulary in schema.org.
A recent example of another openly accessible KG is from the Data Commons effort, whose goal is to make publicly available data readily accessible and usable. Data Commons performs the necessary cleaning and joining of data from a variety of publicly available government and other authoritative data sources and provides access to the resulting KG. It currently incorporates data on demographics (US Census, Eurostat), economics (World Bank, Bureau of Labor Statistics, Bureau of Economic Analysis), health (World Health Organization, Centers for Disease Control and Prevention), climate (Intergovernmental Panel on Climate Change, National Oceanic and Atmospheric Administration), and sustainability.

Organizing enterprise information
Data integration is essential to the functioning of modern enterprises where corporate data typically reside across many distinct databases and unstructured sources. Furthermore, the broad shift to online operations for almost all enterprises has resulted in the accumulation of very large amounts of valuable user behavior data across distributed locations. In addition, a proliferation of data available from third-party data vendors is providing enterprises highly valuable information which needs to be integrated with internal data for more effective business operations.
Consider the following example: a financial news report has been released stating that "Acma Retail Inc" has filed for bankruptcy due to the pandemic, and as a result many of its suppliers will face financial stress (Ding et al. 2021). If company C, which is a supplier to Acma, is undergoing financial stress, one might expect that a similar stress is also experienced, in turn, by suppliers to C. Such supply chain relationships are currently being curated as part of a commercially available dataset called Factset.
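The way stress may ripple from Acma to its suppliers, and then to their suppliers, amounts to a traversal of the supplier relation. The sketch below uses a breadth-first search for this; the supplier edges and the company names other than Acma are invented, and the hop count is only a rough proxy for exposure, not a financial risk model.

```python
from collections import deque

# Hypothetical (supplier, customer) edges; names other than Acma are invented.
SUPPLIES_TO = [
    ("C", "Acma Retail Inc"),
    ("D", "C"),
    ("E", "D"),
    ("F", "Unrelated Co"),
]


def upstream_suppliers(supplies_to, company):
    """Breadth-first search for direct and indirect suppliers, with hop distance."""
    suppliers_of = {}
    for supplier, customer in supplies_to:
        suppliers_of.setdefault(customer, []).append(supplier)

    distance = {}
    queue = deque([(company, 0)])
    while queue:
        current, hops = queue.popleft()
        for supplier in suppliers_of.get(current, []):
            # Skip suppliers already reached, which also guards against cycles.
            if supplier not in distance and supplier != company:
                distance[supplier] = hops + 1
                queue.append((supplier, hops + 1))
    return distance


print(upstream_suppliers(SUPPLIES_TO, "Acma Retail Inc"))
```

Suppliers one hop away are the ones named directly in the news scenario; larger hop counts correspond to the "in turn" effect on suppliers of suppliers.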
A "360-degree view" of a customer of a company includes the data about that customer from within the company and the data about the customer from sources outside the company. A company could create a "360-degree view" of its customers by combining third-party data, for example, Factset and information from the open financial news, with the company's own internal databases. This often requires solving the entity disambiguation problem to uniquely identify the entities in question, which is also a problem being addressed in the OKN-related projects described in (Cafarella et al. 2022) and (Pah et al. 2022) in this special issue. The resulting KG could be used to track the Acma supply chain and help identify stressed suppliers whose risk may be worth monitoring.
The data integration process for creating the 360-degree view of a customer might begin with knowledge engineers working with business analysts to sketch out a schema of the key entities, events, and the relationships that they are interested in tracking (see Figure 2). An essential part of this process is for the users to agree on the meanings of the terms. For example, when does an "organization" become a "customer": at the time of placing an order, or at the time when the product is delivered? In practice, the visual nature of the graph-oriented KG schemas facilitates whiteboarding of the schemas by the business users and subject matter experts in specifying their requirements. Next, the KG schema needs to be mapped to the schemas of the underlying sources so that the respective data can be loaded into the KG engine. The meaning of the data stored in enterprise databases is hidden in logic embedded in queries, data models, application code, written documentation, or simply in the minds of subject matter experts, so the mapping process requires both human and machine effort (Sequeda and Lassila 2021).
Let us consider new and exciting aspects of the use of KGs for data integration. First, the integrated information may come from text and other unstructured sources (for example, news, social media, and others) as well as structured data sources (for example, relational databases). As many information extraction systems already output information in triples, using a generic schema of triples substantially reduces the cost of starting such data integration projects. Second, it can be easier to adapt a triple-based schema in response to changes than the comparable effort required to adapt a traditional relational database. This is because a relational system is typically modeled to support the application (McComb 2018), and thus, schema changes often require database reorganization. On the other hand, in a KG system, the schema is modeled to represent the enterprise (McComb 2019), and its representation in triples remains fixed. Lastly, modern KG engines are highly optimized for answering questions that require traversing the graph relationships in the data. For the example schema of Figure 2, a typical graph engine would be able to employ built-in operations for identifying (1) the central suppliers in a supply chain network, (2) closely related groups of customers or suppliers, and (3) spheres of influence of different suppliers. All these computations leverage domain-independent graph algorithms such as centrality detection and community detection.
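As a small illustration of a domain-independent graph algorithm, the sketch below computes degree centrality, one of the simplest centrality measures, over a made-up supplier network. Graph engines typically offer richer measures; the choice of degree centrality, the edge list, and the names used here are assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical (supplier, customer) edges.
SUPPLY_EDGES = [
    ("S1", "Acma"), ("S1", "RetailB"), ("S1", "RetailC"),
    ("S2", "Acma"),
    ("S3", "RetailB"),
]


def degree_centrality(edges):
    """Degree of each node divided by (n - 1), treating edges as undirected."""
    adjacent = defaultdict(set)
    for source, target in edges:
        adjacent[source].add(target)
        adjacent[target].add(source)
    n = len(adjacent)
    if n < 2:
        # With fewer than two nodes the normalization is undefined; report zero.
        return {node: 0.0 for node in adjacent}
    return {node: len(others) / (n - 1) for node, others in adjacent.items()}


scores = degree_centrality(SUPPLY_EDGES)
most_central = max(scores, key=scores.get)
print(most_central, round(scores[most_central], 2))
```

Nothing in the function refers to suppliers or customers, which is the sense in which such algorithms are domain independent: the same code applies to any edge list.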
Due to the relative ease of creating and visualizing the schema and the availability of built-in analytics operations, KGs are becoming a popular solution for turning data into intelligence in enterprises. For example, the precision medicine OKN reported later in this special volume makes extensive use of graph-based visualization and inference for solving problems in biomedicine (Baranzini et al. 2022).

Representing information for AI algorithms
KGs are an essential technology for natural language processing (NLP), computer vision (CV), and commonsense reasoning. As a result of recent advances in deep learning for NLP and CV, algorithms in these domains are moving beyond basic recognition tasks to extracting relationships among objects, thereby requiring a representation scheme in which the extracted relations could be stored for further processing and reasoning. In commonsense reasoning, the success of hybrid methods employed in IBM's Watson (Ferrucci et al. 2010) has prompted many to pursue a combination of symbolic and statistical approaches for commonsense reasoning that requires the use of KGs. In CV, an image is represented as a set of objects with a set of properties, where each object corresponds to a bounding box, identified by an object detector, and the objects are interconnected by a set of named relationships that are predicted by a model trained for identifying visual relationships. In Figure 4, a CV algorithm produces the KG shown to the right with objects such as a woman, a cow, and a mask, and relationships such as holding, feeding, and others. In modern CV research, such a KG is referred to as a scene graph (Chen et al. 2019), which has become a central tool for achieving compositional behavior in CV algorithms. That is, once a CV algorithm has been trained to recognize certain objects, then by leveraging scene graphs, it can be trained to recognize any combination of those objects with fewer examples. Scene graphs also provide the foundation for tasks such as visual question answering (Zhu et al. 2016).
We next take the example of a specific kind of commonsense reasoning known as cause-and-effect reasoning. Given an event such as X repels Y's attack, humans can make many commonsense inferences: Why did the repel happen? How does X feel about the attack? What might be the likely effect of such a repel? A general strategy to program such reasoning is to first curate a KG manually and then use it in conjunction with a machine learning algorithm to predict the effects for events that do not exist in the KG. For example, given a new event such as X leaving without Y, the system makes inferences such as X wanting to be alone, X wanting to go home, Y might miss his friend, etc. Two examples of such systems are ATOMIC, which contains over 300,000 event nodes and over 800,000 cause-effect triples (Sap et al. 2019), and GLUCOSE, which contains over 670,000 cause-effect triples (Mostafazadeh et al. 2020).
In these uses of KGs in AI, two aspects stand out. First, automated creation of the KG is a central component of the approach. For the commonsense reasoning KGs, even though there is a significant upfront manual effort to create the training set, once trained, the learning algorithm would deal with many new cases at no additional cost. Second, there is a clear recognition that KG representations are a central ingredient to achieving compositional behavior in AI systems. This is clearly illustrated in the context of a scene graph, but also in capturing the output of NLP and in the rationale for creating cause-effect KGs.

PRIOR RESEARCH RELATED TO KNOWLEDGE GRAPHS
Graph-based representations of data are employed widely throughout computer science (Borgida and Mylopoulos 2009). AI agents maintain representations of real/simulated worlds and utilize these representations for reasoning in the domain. Indeed, choosing representations that allow agents to store information and derive new conclusions is a problem that is central to AI.
The earliest research in AI used frame representations, known as semantic networks, which were directed labeled graphs (Woods 1975). This directed labeled graph representation has been adapted depending on the needs of a given application. A directed labeled graph where the nodes are, say, people, and the edges capture the parent relationship is sometimes referred to as a relational structure. A directed labeled graph where the nodes are classes of objects (for example, Book, Textbook, and others), and the edges capture the subclass relationship, is known as a taxonomy. In some data models, given a triple (A, B, C), we refer to A, B, C as the subject, the predicate, and the object of the triple, respectively. For example, given the triple ("Biden," "President," "USA"), "Biden" is the subject, "President" is the predicate, and "USA" is the object of the triple. A directed labeled graph containing data and taxonomy is often referred to as an ontology.
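The distinction between data triples and taxonomy triples can be made concrete with a short sketch. The edge labels subclass_of and instance_of, the class Publication, and the entity AI_Primer are hypothetical names chosen for this illustration; only the Book/Textbook classes and the Biden triple come from the text above.

```python
# Hypothetical triples mixing data and taxonomy.
TRIPLES = [
    ("Textbook", "subclass_of", "Book"),
    ("Book", "subclass_of", "Publication"),
    ("AI_Primer", "instance_of", "Textbook"),
    ("Biden", "President", "USA"),
]


def superclasses(triples, cls):
    """All classes reachable from `cls` by following subclass_of edges."""
    parents = {}
    for subject, predicate, obj in triples:
        if predicate == "subclass_of":
            parents.setdefault(subject, set()).add(obj)
    found, stack = set(), [cls]
    while stack:
        for parent in parents.get(stack.pop(), set()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def classes_of(triples, entity):
    """Direct classes of an entity plus everything inherited through the taxonomy."""
    direct = {o for s, p, o in triples if s == entity and p == "instance_of"}
    inherited = set()
    for cls in direct:
        inherited |= superclasses(triples, cls)
    return direct | inherited


subject, predicate, obj = TRIPLES[3]
print(subject, predicate, obj)
print(sorted(classes_of(TRIPLES, "AI_Primer")))
```

The taxonomy edges let a single instance_of statement imply membership in every ancestor class, which is the kind of conclusion an ontology adds on top of the raw data.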
While some researchers used first-order logic (FOL) to computationally understand semantic networks (Hayes 1981), others advocated that FOL was required to represent the knowledge needed for AI agents (McCarthy 1989). Because of the computational difficulty of reasoning with FOL, different subsets of FOL, such as description logics (Brachman and Levesque 1984) and logic programs (Kowalski 2014), were investigated. There was an analogous development in databases, where the initial data systems were based on a network data model (Taylor and Frank 1976), but a desire to achieve independence between the data model and the query processing eventually led to the development of the relational data model (Codd 1982), which shares its mathematical core with logic programming. A need to handle semistructured data (Buneman 1997) inspired the investigation of "schema-free" systems or triple stores that capture an important class of problems addressed by modern KG systems.
Implemented KR systems accompanied the foundational research. For example, the representation system CycL (Lenat and Guha 1991) combined ideas from FOL and semantic networks in the context of the practical requirements of coding knowledge on a spectrum of topics (Lenat 1995). These early systems were used to capture the knowledge of an intelligent agent, including the rules of causality, implications of relationships between entities, commonsense rules, expert rules, and others. This trajectory of development in AI can be loosely characterized as starting from the need for explicit representations (McCarthy 1989; Newell 1982) to expert systems (Feigenbaum 1984) to large commonsense knowledge bases (Lenat 1995). These systems had complex axioms with sophisticated inference mechanisms, but the overall scale, measured in terms of the number of axioms, has been relatively small. The goal was to use the rules to model human reasoning.
The mid-1990s saw an explosion of information on the web, and better methods to access and search this information were needed. There was tremendous success in using information retrieval methods such as the PageRank algorithm (Page et al. 1999), and yet it was felt that more was possible if there was a way for us to convey the semantics to our search algorithms (Berners-Lee, Hendler, and Lassila 2001). That vision is coming to fruition with the improvement in search results with the help of resources such as Wikidata and Data Commons, which use representations heavily influenced by an earlier language called the Meta Content Format (Guha 1996). In contrast to the early AI systems, today's KGs emphasize capturing many ground facts that are used in applications such as search and analytics, with much less emphasis on complex inference. A broader account of the historical developments of KGs outside AI is available elsewhere (Gutiérrez and Sequeda 2021).
Table 1 describes KG models currently being used by the OKN projects described in this special issue. These include RDF and property graph data models, as well as key-value representation in JSON, and mapping of data into a relational database through suitable translations. Each project addresses semantics either through the development of new ontologies or through leveraging existing ontologies.

CONTRASTING PERSPECTIVES
With the increasing adoption and use of KGs in different scenarios and use cases, three contrasting perspectives have emerged: symbolic representation versus vector representation, human curation versus machine curation, and "little semantics" versus "big semantics." There are spirited debates in the community about the effectiveness and efficacy, and sometimes even the validity, of each approach, with the adherents of one perspective claiming superiority of their approach over the other. Given the breadth of potential applications, it is not necessary for us to settle these debates, but it is important to try many different approaches in parallel and explore means of combining various approaches to advantage.
Our objective in presenting the differing perspectives here is to enable a better understanding of each and articulate the problems where a solution of a certain kind is appropriate.

Symbolic representation versus vector representation
Machine learning algorithms used for NLP and CV rely on a vector representation of text and images. The recent success of deep learning on multiple tasks has prompted many to reject the need for any symbolic representation. We will examine these alternative views more closely.
A commonly used vector representation in NLP is word embedding. For example, given a corpus of text, one can count how often a word appears next to every other word, resulting in a vector of numbers. Sophisticated algorithms are available for reducing the dimensions of the vectors to calculate a more compact vector, known as a word embedding (Mikolov et al. 2013). Word embeddings capture the semantic meaning of the word in a way that can be computationally leveraged in tasks such as word similarity calculation, entity extraction, and relation extraction. Analogously, the CV algorithms operate on vector representations of images. Graph embedding is a generalization of word embedding, but for graph-structured input (Hamilton 2020).
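The counting step described above, before any dimensionality reduction, can be sketched as follows. This is not the embedding method of Mikolov et al.; it only builds raw co-occurrence count vectors over a tiny made-up corpus and compares them with cosine similarity. The corpus, the window size, and the function names are assumptions for illustration.

```python
import math
from collections import Counter, defaultdict

# A tiny made-up corpus, used only to illustrate the counting step.
CORPUS = [
    "the cat drinks milk",
    "the dog drinks water",
    "the cat chases the dog",
]


def cooccurrence_vectors(sentences, window=1):
    """For each word, count the words appearing within `window` positions of it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


VECTORS = cooccurrence_vectors(CORPUS)
print(round(cosine(VECTORS["cat"], VECTORS["dog"]), 3))
print(round(cosine(VECTORS["cat"], VECTORS["milk"]), 3))
```

In this toy corpus, cat and dog share more neighboring words than cat and milk do, so their count vectors are closer; a learned embedding compresses the same kind of signal into far fewer dimensions.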
Algorithms using vector representations have excelled at many tasks, for example, web search and image recognition. Using web search of today, we can answer questions such as: Who was the prime minister of the UK in October of 1956? But the search fails if the question is modified to an unusual combination of inference steps, for example, Who was the prime minister of the UK when Theresa May was born? Humans have little difficulty in understanding such questions (Lenat 2019a; Lenat 2019b). The limitations of vector representations can be addressed by encoding the information extracted from text and images into a KG, as we saw in Figures 3 and 4. Complementing the vector and symbolic representations enables the programs to achieve compositional behavior and facilitates inference and reasoning. Graph embeddings combined with a neural network, an approach also known as machine learning with graphs, are being used for handling unseen actions in the cause-effect KGs we considered earlier.
Neuro-symbolic reasoning is a fast-emerging area of research that leverages the benefits of automatic calculation of embeddings while recognizing the need for a discrete KG to produce a human-understandable representation. We illustrate neuro-symbolic reasoning on a story understanding task (Dunietz et al. 2020). Consider the following story: Fernando went to a plant shop. He liked the minty smell of the leaves. He bought a plant and placed it next to a window. Given this story, we want to answer the question: Why did Fernando buy the plant? A possible human-understandable chain of reasoning to answer this question involves the following steps: (a) If A (plant) has part B (leaf), and B has property P (minty), then A has property P; (b) If A (person) likes property P (minty leaves) of B (plant), then A likes B; and (c) If A likes B, A may buy B. In this chain of reasoning, steps (a) and (b) are examples of the rules that may exist in a traditional symbolic knowledge base, whereas (c) is a probabilistic rule of the sort that we may find in a cause-effect KG that we considered in the earlier section. Such rules may already exist as part of the curated portion of the KG, or could be inferred ahead of time using a graph neural network, or could be inferred dynamically in response to a query. A neuro-symbolic reasoner can manage and execute this reasoning process (Kalyanpur et al. 2020).
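The symbolic half of the chain of reasoning above can be sketched as simple forward chaining over triples. This is only an illustration of rules (a) through (c), not the reasoner of Kalyanpur et al.: the relation names are hypothetical, rule (c) is treated as a plain rule rather than a probabilistic one, and no neural component is involved.

```python
# Facts extracted from the story, written as hypothetical triples.
FACTS = {
    ("plant", "has_part", "leaf"),
    ("leaf", "has_property", "minty"),
    ("Fernando", "likes_property", "minty"),
}


def apply_rules(facts):
    """One pass over rules (a), (b), and (c); returns the newly derived triples."""
    derived = set()
    for a, rel1, b in facts:
        # (a) If A has part B and B has property P, then A has property P.
        if rel1 == "has_part":
            for b2, rel2, p in facts:
                if rel2 == "has_property" and b2 == b:
                    derived.add((a, "has_property", p))
        # (b) If A likes property P and B has property P, then A likes B.
        if rel1 == "likes_property":
            for b2, rel2, p in facts:
                if rel2 == "has_property" and p == b:
                    derived.add((a, "likes", b2))
        # (c) If A likes B, then A may buy B (a plain rule here, not a probability).
        if rel1 == "likes":
            derived.add((a, "may_buy", b))
    return derived - facts


def forward_chain(facts):
    """Apply the rules repeatedly until no new triples appear."""
    known = set(facts)
    while True:
        new = apply_rules(known)
        if not new:
            return known
        known |= new


CONCLUSIONS = forward_chain(FACTS)
print(("Fernando", "may_buy", "plant") in CONCLUSIONS)
```

Applied literally, the same rules also conclude that Fernando may buy a leaf, which hints at why rule (c) is better treated as probabilistic and weighed against other evidence.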

Human curation versus machine curation
Industrial KGs, such as the Google KG, Amazon Product Graph (APG), and Microsoft Academic Graph (MAG), are of unprecedented scale (Noy et al. 2019). There has often been debate on the degree to which one could create such KGs exclusively through automated methods (also referred to as machine curation) versus creation through human effort. This tradeoff is illustrated via two examples based on the MAG and APG, which leveraged significant automation; and two examples based on the Wikidata KG and the Cyc knowledge base (Lenat 1995), which were primarily created through human curation.
The MAG team used machine curation to solve the problem of uniquely identifying authors and their publications (Wang et al. 2020). A human curation strategy advocates setting up standards such as Digital Object Identifier (DOI) for uniquely identifying publications, and Open Researcher and Contributor ID (ORCID) for uniquely identifying authors. This approach relies on the authors and publishing organizations contributing manual effort to annotating documents with DOIs and ORCIDs. However, human curation of even such simple tasks has been problematic for several reasons. First, such identifiers have had low human readability, discouraging their use. Second, frequent typographical errors have created an adoption barrier. Third, not having DOIs for the publications has not hampered their accessibility, as there are multiple ways to find publications on the web. Finally, there is some abuse of the uniform identifiers. For example, some individuals acquire multiple identifiers to partition their publications into separate profiles, defeating the design goal of ORCID being a unique identifier. The MAG team consequently leveraged machine curation by identifying a publication by its contents and disambiguating authors based on their field(s) of research, affiliation(s), coauthor(s), and other factors that are more natural to humans.
The APG is multilingual and aims to collect product knowledge for millions of categories of products and thousands of attributes of each of those products. While one might reasonably assume that vendors interested in selling their products via Amazon might volunteer structured information that could be directly input into the APG, that is not the case in practice, and the structured data are sparse and noisy. However, creating the APG entirely through human curation would have required hundreds of person-years of effort. Machine curation techniques were leveraged at different levels of scaling. To get the project off the ground, highly accurate automated knowledge extraction models were created to generate trustworthy data on a small scope of products, where each model extracted knowledge for a single attribute from a single product domain (Zheng et al. 2018). Even though neural networks were explored to automate the process, tremendous manual work was involved to create training data, conduct human evaluation, and identify postprocessing rules to remove extraction noise. The next level of scaling aimed to reduce modeling cost through AutoML and automatic cleaning techniques (Wang et al. 2020) so that manual tuning for each knowledge extraction model could be significantly reduced. Scaling further required reducing the total number of models required for the variety of knowledge to be extracted, which was achieved through transfer learning techniques such that a model can extract knowledge for multiple attributes and for multiple domains (Karamanolakis, Ma, and Dong 2020). The final level of scaling aimed to increase knowledge extraction yield through multimodal information, for example, extraction from text as well as images (Lin et al. 2021; Yan et al. 2021). Human-created and highly precise models were the foundation of this process. Different levels of scaling required leveraging techniques such as named entity recognition, closed information extraction, knowledge cleaning, and knowledge-based question answering.
The Wikidata KG was launched to address the problem that data in Wikipedia is buried across 30 million articles in 287 different languages from which automatic extraction is inherently difficult. The same information often appears in articles in many languages and in many articles within a single language. Population numbers for Rome, for example, can be found in English and Italian articles about Rome but also in the English article, "Cities in Italy." The data is inconsistent: the population numbers in these different Wikipedia documents are all different. Having been founded on the principle of plurality, it is not easy, or even possible, to arrive at a global consensus on the "true" data, since many facts are disputed or simply uncertain. Unlike MAG and APG, Wikidata allows conflicting data to coexist and provides mechanisms to organize this plurality in values. Checking, verifying, and allowing such a plurality of data is something the Wikipedia community has been doing for years. Wikidata's human curation effort involves a community of over 400,000 editors, with over 20,000 active editors. In this process, Wikidata has leveraged standard published identifiers, including the International Standard Name Identifier (ISNI), China Academic Library and Information System (CALIS), International Air Transport Association (IATA), MusicBrainz for albums and performers, and North Atlantic Basin's Hurricane Database (HURDAT). Wikidata itself publishes a list of standard identifiers for items that appear in its corpus, which are now increasingly being used in commercial KGs.
Finally, consider Cyc, the largest available knowledge base that captures complex human common sense. The Cyc knowledge base was largely created through human curation because the project aims to capture "hidden" knowledge that is not explicitly written down in text and, thus, cannot be automatically extracted. Early versions of Cyc employed representations like present-day KGs. Since 1989, Cyc has used a representation language called CycL, which is based on higher-order logic and nested modals (Lenat and Guha 1991). CycL was needed to represent and reason about answers to queries like: When Juliet drank her potion, what did she expect that Romeo would believe once he heard that she was dead, and why? (Lenat 2019a). Automatically extracting knowledge into such highly expressive languages is out of the reach of present NLP techniques even if the knowledge to be entered had been explicitly written down. Cyc is building increasingly automatic tools that help lower the bar for creation and modification of its KB. The project's Knowledge Axiomatization Institute (KNAXI) is also interested in education and professional training in "ontological engineering" at all education levels to facilitate creation of CycL knowledge bases.

Little semantics versus big semantics
The big semantics perspective advocates capturing more of the meaning of concepts, whereas the little semantics perspective focuses on recording the basic facts rather than the concept meanings. A KG defined as a directed labeled graph is a representative technique of the little semantics approach. The representation language CycL is a representative technique of the big semantics approach. Using only a directed labeled graph representation for KGs has inherent limitations. A simple example of such a limitation arises in representing the statement: Los Angeles is between San Diego and San Jose along US 101. This statement can be captured in a directed labeled graph using a technique known as reification, but doing so requires multiple triples (see Figure 5A). The statement can be captured directly if we allow four-place predicates, which are not supported in directed labeled graphs, although many implementations of graph and semantic web databases do include this capability. For this example, the KG representation is akin to using assembly language as opposed to a higher-level programming language. The use of triples and reification makes downstream tasks such as natural language generation more difficult, as they must now assemble information spread across multiple triples. As a more involved example, consider the statements Every Swede has a King and Every Swede has a mother. They are syntactically similar in English, and many KGs would represent them identically, but they have very different computational meanings (see Figure 5B). It is possible to extend the directed graphs in a variety of ways to correctly capture the semantics of the example considered in Figure 5B (Chaudhri et al. 2004; Sowa 2008), but such extensions lose the simplicity offered by the triple representation. Not surprisingly, similar efforts are underway for machine learning of nonbinary relationships as well (Fatemi et al. 2019).
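The reification of the Los Angeles statement can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not a reproduction of Figure 5A: the statement identifier `stmt1`, the predicate names (`type`, `middle`, `endpoint`, `along`), and the helper functions are hypothetical choices made for this example.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One edge of a directed labeled graph: subject --predicate--> obj."""
    subject: str
    predicate: str
    obj: str


def reify_between(statement_id, middle, end_a, end_b, route):
    """Encode the four-place fact between(middle, end_a, end_b, route).

    A new node (statement_id) stands for the statement itself, and each
    argument of the four-place predicate becomes one binary edge from it.
    Predicate names here are hypothetical.
    """
    return [
        Triple(statement_id, "type", "Betweenness"),
        Triple(statement_id, "middle", middle),
        Triple(statement_id, "endpoint", end_a),
        Triple(statement_id, "endpoint", end_b),
        Triple(statement_id, "along", route),
    ]


def reassemble(triples, statement_id):
    """Gather the scattered triples back into one record.

    This is the extra work a downstream task (for example, natural
    language generation) has to do before it can use the fact.
    """
    record = {}
    for t in triples:
        if t.subject == statement_id:
            record.setdefault(t.predicate, []).append(t.obj)
    return record


triples = reify_between("stmt1", "Los Angeles", "San Diego", "San Jose", "US 101")
fact = reassemble(triples, "stmt1")
print(len(triples), "triples encode one four-place fact")
print(fact)
```

With a four-place predicate the same fact would be a single tuple; the sketch shows that the triple encoding needs an extra node and a reassembly step to recover it.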
Despite the above-stated limitations of the directed labeled graph representation for KGs, it has been found useful for solving many practical problems that are well served by little semantics. Wikidata, Data Commons, MAG, and APG all employ a directed labeled graph representation at their core, and their existence and commercial usefulness are strong evidence that a little semantics goes a long way (Hendler 2007). Furthermore, even for the simple directed labeled graph representation, there are numerous unsolved problems. For example, how might we create open KGs? This is precisely the question being addressed by multiple OKN projects in this special issue. What common naming conventions will allow users to interact with multiple existing KGs and create their own combined products, which in turn can be used by others and combined still further, ad infinitum? How do we

SUMMARY AND CONCLUSION
KGs have emerged as indispensable information structures that enable access, integration, and use of the vast amounts of data that are currently being generated. A KG also serves the purpose of capturing knowledge learned and used by modern machine learning methods. The most notable uses of directed labeled graphs in AI and databases (data modeling) have taken the form of data graphs, taxonomies, and ontologies. While this representation schema may fall short of the full capability of reasoning and inferencing that is required by general-purpose repositories of knowledge for AI programs, it still provides a scalable and powerful representation that serves many needs.
Even though a directed labeled graph is a common thread linking present-day KGs with the early semantic networks in AI, there are some important differences in the research methodology and technical problems addressed. Early semantic networks were created by top-down design methods and manual knowledge engineering processes. They never reached the size and scale of today's KGs. In contrast, modern KGs tend to be large in scale; employ bottom-up development techniques; and employ manual as well as automated strategies for their construction. The differences are summarized in Table 2.
The emphasis in the early AI semantic networks was on complex logical inferencing, in contrast to the focus on supporting analytics operations in modern KGs. Furthermore, the vast proliferation of available data, the difficulty of arriving at a top-down schema design for data integration, and the data-driven nature of machine learning have all led to a bottom-up methodology for creating KGs. Contemporary KGs are also supplementing manual knowledge engineering techniques with crowdsourcing and with the significant automation that is now possible through progress in machine learning. However, we posit that modern KG construction methods should also learn the lessons of classical knowledge representation, as there is much to gain from the substantial body of prior research without reinventing available methods and tools.
We conclude by noting that making progress does not require us to settle all the debates, for example, on symbolic representation versus vector representation, manual curation versus machine curation, and little semantics versus big semantics. Indeed, as reflected by the ethos of the NSF Convergence Accelerator program, we should drive future research by exploring and prototyping various approaches in the context of real-world use cases. Setting a use-inspired context enables us to justify the need and helps specify the requirements for the specific innovations for KGs to have the maximum societal and scientific impact.

ACKNOWLEDGMENTS
This work has been partially supported by the National Science Foundation's Convergence Accelerator program. We sincerely thank Dr. RV Guha for his contributions and insightful comments on the paper.

CONFLICT OF INTEREST
No conflict of interest has been declared by the author(s).

Figure 3
Figure 3 depicts an example of the use of KGs to represent knowledge extracted by NLP. It shows a sentence from which one can extract the entities: Albert Einstein, Germany, Theoretical Physicist, and Theory of Relativity; and the relations born in, occupation, and developed. Once this snippet of knowledge is incorporated into a larger KG, we can use logical inference to derive additional links (shown by dotted edges), such as a Theoretical Physicist is a kind of Physicist who practices Physics, and that Theory of Relativity is a branch of Physics. The court records OKN project described in this special issue makes extensive use of similar entity extraction techniques (Pah et al. 2022). In CV, an image is represented as a set of objects with a set of properties, where each object corresponds to a bounding box, identified by an object detector, and the objects are interconnected by a set of named relationships that are predicted by a model trained for identifying visual relationships. In Figure 4, a CV algorithm produces the KG

Figure 4 A knowledge graph created using computer vision techniques

23719621, 2022, 1, Downloaded from https://onlinelibrary.wiley.com/doi/10.1002/aaai.12033, Wiley Online Library on [25/01/2023]. See the Terms and Conditions (https://onlinelibrary.wiley.com/terms-and-conditions) on Wiley Online Library for rules of use; OA articles are governed by the applicable Creative Commons License

Example sentences and their representation in knowledge graph and first order logic

Difference between research on semantic networks and knowledge graphs

viii https://id.loc.gov/authorities/names/n50013808.html
ix http://datacommons.org
x http://factset.com
xi https://nlp.stanford.edu/projects/glove/

REFERENCES
Baranzini, S., K. Börner, J. Morris, C. A. Nelson, K. Soman, E. Schleimer, M. Keiser, M. Musen, R. Pearce, T. Reza, B. Smith, B. Herr, B. Oskotsky, A. Rizk-Jackson, K. Rankin, S. Sanders, R.