Ontology

The Ontology as a Smart, High-Performance Computer-Based Information Structure

Purpose

The purpose of this article is to describe and explain an ontology and how, as a type of computer-based information structure, it has capabilities that go beyond data interoperability (i.e., beyond the capabilities of XML schema). This article will discuss ontology in its intended role of facilitating communications between applications (i.e., intelligent agents - discussed in another E-MAPS article) and users with some level of automatic reasoning.

We will explain how ontology-based information structures facilitate the transfer of knowledge.

Basic Concept

The subject of ontology is complex. We are going to define the expression twice: once now as a basic concept and later we’ll define it more fully.

James Hendler defines an ontology as “…a formal definition of a body of knowledge.” ⁽¹⁾

This is an excellent definition because it captures everything clearly without getting too complicated. We might add, however, that an ontology is an abstraction that can take the form of a glossary or other set of concepts or objects that relate to a particular field of knowledge or collection of objects.

Is this reminiscent of a taxonomy? Yes, the two are similar. It is feasible, although not necessarily preferable, to develop an ontology by first creating a taxonomy of the subject matter and then adding more details later.

While one cannot “see” an ontology, just as one cannot “see” a taxonomy, one can visualize it by looking at a representation or model. In our article on taxonomy, we introduce the concept with a representation of a taxonomy involving areas of the United States, individual states, and individual cities.

Now we’ll model a simple ontology.

Background: Need for Greater Efficiency

We have seen how, over the past several years, XML has evolved into a robust technology for the efficient exchange of data.

The supporters of XML and its schemas include users from commerce, government, the military establishment, and the World Wide Web. A number of practitioners from these communities perceive what they characterize as an information glut arising from our own past successes: near-ubiquitous computing, the world-wide shift to the World Wide Web, and the rise of an impressive communications infrastructure to provide ever-increasing bandwidth.

At the same time, the United States, since the September 11th tragedy, has experienced the unprecedented effects of global organized terrorism. The challenge to defense-related computer-based information systems has never been so formidable; these challenges are becoming more numerous and more life-threatening. They must be solved more quickly and more reliably.

At this point in the ongoing computer revolution, we need the help of more capable technologies to (1) respond to the urgency of obtaining fast, reliable solutions from information systems and to (2) surmount the massive amounts of information we are required to search before doing anything with the results.

Background: A Smarter Approach

It is our conviction that bringing a higher degree of intelligence to our processes, intelligence derived from the worlds of knowledge representation and artificial intelligence, will be of significant benefit for developing smarter systems capable of greater speed in finding the proverbial needle in the ever-enlarging haystack.

Our focus on a smarter approach requires that we clarify some of the differences between knowledge and information. We will attempt to address only some of them.

One difference is the larger relative degrees of processing, refinement, corroboration and persistence that characterize knowledge in comparison to information. Knowledge tends to accrete as we process information. One also tends to be interested in the state of knowledge about something. Is it complete? Is it accurate? Is it ready for use by decision-makers?

One can readily understand how knowledge can result from the comparison, synthesis, and evaluation of information. After processing information, it is often advantageous to store it in an appropriate repository (i.e., library, knowledge base) to preclude repeating the process needlessly and also to make the knowledge available to others.

Today we are able to process and to share knowledge through the logic of ontology, ontology languages, and intelligent agents engineered to exploit the semantic precision of ontology-embedded semantic markup.

What, then, is an ontology and how does it, as an information structure with enhanced semantic precision, help facilitate the transfer of knowledge?

Explanation

The following definitions, examples, and exhibits will help explain how an ontology is an efficient structure for communicating knowledge because of (1) its semantic advantage (i.e., its ability to be exceptionally precise in describing data or information; in other words, its ability to describe exact meaning as opposed to simply providing a match between strings of characters), and (2) its ability to support rule-based inferential reasoning exercised by “intelligent agents,” pieces of software also known as “crawlers”, “bots” or “knowbots”.

A. Definitions of Ontology

In its original form, ontology has existed since the time of Aristotle. Ontology, in antiquity, referred to the study of “being” or “existence”.

Today, ontology “… includes data categorization schemes, thesauruses, vocabularies, key-word lists, and taxonomies. Ontologies promote semantic and syntactic understanding of data.” ⁽²⁾

The classic definition of ontology is Gruber’s: “… a specification of a conceptualization”. ⁽³⁾ More loosely speaking, it defines “… the things and rules that exist within a respective domain”. ⁽⁴⁾

The following comprise the essential elements of ontology:

(1) It is an abstraction, a logical concept.

(2) Its scope exists within a specific area of knowledge, concepts or objects.

(3) The objects or concepts in an ontology have relationships between them; the relationships are usually but not necessarily hierarchical.

Ontologies may:

(1) Be used to describe the components of a given system, area, or domain.

(2) Be used to describe how the components of a knowledge system relate to one another (i.e., how they interact).

(3) Be linked together or “chained.” It is possible to have a master ontology comprising several constituent ontologies.

(4) Be modeled or represented in a model as a means of describing them.

(5) Have their logic embedded in software to achieve varying degrees of artificial intelligence.

B. Visualizing an Ontology

The best way to visualize an ontology is to model an oversimplified one.

Recall Moschella’s definition of ontology: “…the things and rules that exist within a respective domain.” By “things”, we can include the abstract or the concrete, such as concepts or real-life objects. By “rules”, we mean rules governing the relationships between the objects or about the objects; such as “automobiles have wheels - more than two and fewer than five” and “sports cars are types of automobiles.” By “domain”, we mean a particular realm or a particular “world” carved out of the universe of “everything”.
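As a rough illustration of how such rules might be written down for a machine, the sketch below encodes the two example rules in Python. The names used (SUBCLASS_OF, is_a, valid_wheel_count) are hypothetical and were chosen only for this example; they do not come from any ontology language.

```python
# Hypothetical sketch: the two example rules expressed as data plus checks.

# Rule: "sports cars are types of automobiles" (a sub-class relationship).
SUBCLASS_OF = {"sports car": "automobile"}


def is_a(cls, ancestor):
    """Return True if cls is the same as, or a descendant of, ancestor."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)  # move one level up; None at the top
    return False


def valid_wheel_count(wheels):
    """Rule: automobiles have more than two and fewer than five wheels."""
    return 2 < wheels < 5


print(is_a("sports car", "automobile"))  # True
print(valid_wheel_count(4))              # True
print(valid_wheel_count(2))              # False
```

The point of the sketch is only that a rule, once stated precisely, can be checked mechanically.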

It is important to note that while ontology is an abstraction, it is usually used as an aid to describe something that is real, such as a real body of knowledge about animal husbandry or a real arsenal of weapons used by a particular army’s artillery corps.

The simplified ontology modeled below shows entities (things) comprising part of the American automobile manufacturing industry (domain) and some of the relationships (rules) between them.

The language of ontologies uses many concepts found in the world of object-oriented programming (OOP). We will use and define some OOP terms below.

Figure (1) A Simplified Ontology.

The arrows represent rules or relationships between concepts or objects. Arrows point away from the “owner” of the relationship; they can be viewed as properties attached to the figures from which they emanate. Ford Motor Company is an instance of sub-class “automobile manufacturer”; “automobile” is a sub-class of class “motor vehicle”. The Taurus’s manufacturer is Ford Motor Co. Ford Motor Co. produces the Ford Taurus.

Both “motor vehicle” and “corporation” are classes in the domain of American automobile manufacturing. Classes are groups or categories having distinct characteristics. “Automobile” is a sub-class of the motor vehicle class. Other sub-classes of “motor vehicle”, although not pictured in the diagram, could conceivably include “truck”, “tractor”, or “forklift.” “Automobile manufacturer” is a sub-class of the manufacturer class. Other sub-classes of class “manufacturer” in this domain could conceivably include “tire manufacturer” or “glass manufacturer”.

Remember that “things” in ontology can be abstract or concrete. Classes and sub-classes are abstract; they don’t actually exist as anything outside our minds. They exist only as representations in our minds of something or things that do exist in reality.

The Ford Taurus LX is a real entity. One can see and touch one. It is shown as an instance of the automobile sub-class. In OOP terminology, an instance is a real thing. It does exist outside our minds. It is produced by a class or a sub-class. The act of producing an instance is called instantiation.

We have one final OOP concept to present: an object. In real life, an object can be anything. Generally it’s something that’s real; it can be touched or perceived in some other physical (versus conceptual) way. In OOP, an object is something that’s instantiated from, or is an instance of, a class or sub-class.

Please note the consistency between this definition and the definition of ontology citing the latter as “…an abstraction that can take the form of a glossary or other set of concepts or objects that relate to a particular field of knowledge.”

We have discussed classes, instances, and objects only to enhance one’s understanding of the following basic concept: ontologies comprise things, both real and conceptual, and the relationships between them within a particular portion of the real world. Ontologies attempt to simplify the complexities of real life; they can be thought of as robust models or representations of reality within a specified scope.
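One possible way to hold the simplified automobile ontology in a program is as a set of (subject, relation, object) statements. The sketch below is a minimal illustration under that assumption; the relation names and the helper functions objects and classes_of are hypothetical, and the statements include only relationships named in the text.

```python
# Hypothetical sketch: the simplified ontology as (subject, relation, object)
# statements. Relation names are illustrative assumptions.
TRIPLES = {
    ("automobile", "subclass_of", "motor vehicle"),
    ("automobile manufacturer", "subclass_of", "manufacturer"),
    ("Ford Taurus", "instance_of", "automobile"),
    ("Ford Motor Co.", "instance_of", "automobile manufacturer"),
    ("Ford Taurus", "manufacturer", "Ford Motor Co."),
    ("Ford Motor Co.", "produces", "Ford Taurus"),
}


def objects(subject, relation):
    """Everything the subject points to through the given relation."""
    return {o for s, r, o in TRIPLES if s == subject and r == relation}


def classes_of(instance):
    """All classes an instance belongs to, following subclass_of upward."""
    found = set()
    pending = list(objects(instance, "instance_of"))
    while pending:
        cls = pending.pop()
        if cls not in found:
            found.add(cls)
            pending.extend(objects(cls, "subclass_of"))
    return found


print(sorted(classes_of("Ford Taurus")))        # ['automobile', 'motor vehicle']
print(objects("Ford Taurus", "manufacturer"))   # {'Ford Motor Co.'}
```

Each arrow in a diagram of this kind corresponds to one statement, with the “owner” of the relationship in the first position.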

A more complete discussion of ontology, using a wine knowledge base, and a discussion of ontology and NATO’s generic hub are available separately.

C. Purpose/Uses of Ontology-driven Software

The properties, characteristics, and logic of ontologies may be encapsulated in software created to accomplish a variety of goals. The most prominent such uses of ontology-driven software today are in the fields of knowledge representation, artificial intelligence, and information technology. These disciplines have come together with a focus on the Web to produce remarkable gains in our ability to transfer knowledge.

We wish to parenthetically note that the following discussion recognizes the large role that the Web and web-related technology play in the exploitation of ontologies and intelligent agents. This is not meant to imply, however, that application of the technology is restricted to the Web. The technology can be applied in virtually any environment beyond the web, including mainframe and networked environments. If the chosen environment is compatible with Java and marked-up text files, it is compatible with this technology. Note that although Java is not specifically required, it tends to be the language of choice for these applications. The reason we offer web-based examples is because, having started there, this is where the technology is particularly robust at the present time.

Thanks to the work of the World Wide Web Consortium, DARPA, and various members of the knowledge representation/artificial intelligence communities of the European Union, a phenomenon known as the Semantic Web is emerging. It should be made clear that the Semantic Web is the brainchild of Tim Berners-Lee, the same visionary who created the original Web. Just as the original Web was document-centric with HTML marked-up pages, the Semantic Web will be document-centric with semantically marked-up pages. In the following text, we refer to such pages as semantic markup, whether web-based or not.

The difference between conventional HTML markup and the new semantic markup lies in the language used to create the markup (or annotation) in the first place. Originally, the language used was the Hypertext Markup Language, HTML. It is a very limited language - all it does is tell how documents should appear when viewed through a browser. There’s no data involved.

In contrast, semantic markup empowers documents not only to contain data, but also to contain data in a way that is far more refined, predictable, and understandable than anything previously possible. Semantic markup confers the ability to search and perform other operations with content (including content outside the web) based on the meanings of words, or semantics, instead of using simple character strings having no intrinsic meaning. For example, instead of searching on “stars” and getting everything from constellation objects to Hollywood figures, as we do now when we use the typical search engine, we are able to restrict the search to stars of a constellation, if that’s what we actually want.
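The contrast between matching character strings and matching meanings can be sketched in a few lines. The example below is purely illustrative: the three documents, the concept labels (such as astronomy:Star), and both search functions are assumptions made for the sketch, not part of any real markup language or search engine.

```python
# Hypothetical documents, each annotated with the concept it is about.
DOCUMENTS = [
    {"text": "The brightest stars in Orion", "concept": "astronomy:Star"},
    {"text": "Hollywood stars attend premiere", "concept": "film:Celebrity"},
    {"text": "Betelgeuse is a red supergiant", "concept": "astronomy:Star"},
]


def string_search(term):
    """Conventional search: match the characters, ignore the meaning."""
    return [d["text"] for d in DOCUMENTS if term.lower() in d["text"].lower()]


def semantic_search(concept):
    """Semantic search: match the annotated meaning, ignore the wording."""
    return [d["text"] for d in DOCUMENTS if d["concept"] == concept]


print(string_search("stars"))
# ['The brightest stars in Orion', 'Hollywood stars attend premiere']
print(semantic_search("astronomy:Star"))
# ['The brightest stars in Orion', 'Betelgeuse is a red supergiant']
```

The string search returns the Hollywood item and misses the Betelgeuse item; the concept-based search does the opposite, because it relies on what each document has been declared to mean.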

It is the ontology that is directly and uniquely responsible for the logic behind such greatly enhanced precision, and it is ontology-based markup languages that are directly and uniquely responsible for the implementation of such logic, or embedding it, in semantic markup.

D. How Does Embedding The Logic of Ontology in Semantic Markup Promote Precision Semantics in the Exchange of Information or Knowledge?

This is the point of embedding ontology-driven logic into semantic markup – to ensure that the consumer of information understands precisely what the supplier is providing. Any potential ambiguity in meaning is minimized because the ontology has defined its content so precisely.

The ontology facilitates this process because agreement on its use, as well as a detailed understanding of its precise meanings, is secured in advance of the exchange of information. In other words, each party to an exchange (the semantic markup serving as the source of supply, and the intelligent agent as consumer exercising rule-based inferential reasoning) knows precisely what the other is talking about because they have made a prior commitment to use the same ontology.

The ontological logic and knowledge embedded in semantic markup is machine-readable by an intelligent agent as well as by browsers specifically written for the purpose.

“If an ontology can be made machine readable, it allows a computer to manipulate the terms used in the ontology, terms that make sense to users who understand this information. The computer doesn't understand this information, in any deep sense of the term, but it manipulates terms that the user understands. This allows for a form of communication between user and computer, which in turn enables the creation of software products … that can represent the needs, preferences, and constraints of the user.” ⁽¹⁾

The exchange of knowledge may be viewed as two separate processes. In the first instance, knowledge is exchanged that is intrinsic to the ontology and to the semantic markup. This involves knowledge that is specified, domain knowledge already embedded in the semantic markup.

The second instance involves the use of an inference engine employed by an intelligent agent. When the existing domain knowledge is accessed by the agent, new knowledge is inferred or deduced after applying the rules or rules of thumb that reside in the engine.

This is new knowledge. Note - the role of intelligent agents and inference engines is the subject of another paper.
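To make the second process more concrete, the following is a minimal sketch of rule-based inference in Python. It is not how any particular inference engine works; the infer function, the relation names, and the two rules it applies are assumptions chosen for illustration, and the facts are borrowed from the automobile examples in this article.

```python
# Domain knowledge already specified (hypothetical relation names).
FACTS = {
    ("sports car", "subclass_of", "automobile"),
    ("automobile", "subclass_of", "motor vehicle"),
    ("Ford Taurus", "instance_of", "automobile"),
}


def infer(facts):
    """Apply two rules repeatedly until nothing new appears (forward chaining).

    Rule 1: subclass_of is transitive.
    Rule 2: an instance of a class is also an instance of its super-class.
    Returns only the statements that were not in the original facts.
    """
    known = set(facts)
    while True:
        new = set()
        for a, r1, b in known:
            for c, r2, d in known:
                if b == c and r2 == "subclass_of" and r1 in ("subclass_of", "instance_of"):
                    new.add((a, r1, d))
        if new <= known:          # nothing new was derived: stop
            return known - set(facts)
        known |= new


for statement in sorted(infer(FACTS)):
    print(statement)
# ('Ford Taurus', 'instance_of', 'motor vehicle')
# ('sports car', 'subclass_of', 'motor vehicle')
```

Neither printed statement was stored anywhere in advance; both were derived by applying rules to the specified domain knowledge.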

E. What Are The Principles or Steps Involved in Creating an Ontology?

The process of creating an ontology is currently referred to as “ontology engineering”. Ontologies are engineered by people with appropriate training, who often perform their work in collaboration with subject-matter experts.

Broadly speaking, we capture an ontology by defining its purpose and scope, identifying its key components (objects, concepts, classes) and relationships, identifying terms to refer to such concepts and relations, and by producing “unambiguous text definitions” of each.⁽⁵⁾

Noy and McGuinness⁽⁶⁾ recommend the following seven steps:

1. Identify the purpose, the domain, and the scope of the ontology.

2. Consider using a pre-existing ontology, if one exists.

3. Identify the “important terms in the ontology” (i.e., its properties and relationships).

4. Identify the classes, sub-classes, and their hierarchy.

5. Identify the properties of the classes, objects or concepts.

6. Identify the allowed values, value type, etc., of each property.

7. Create instances of each class or sub-class by defining real objects having values conforming to the objects’ allowable properties.

The wine knowledge base ontology discussion mentioned previously uses the same seven steps. A reader who wishes to actually develop an ontology should consider it as a primary starting point.
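The seven steps above can be loosely sketched in code. The example below is only an illustration under stated assumptions: the dictionary layout, the add_instance helper, and the wheels property (borrowed from the earlier automobile rule) are hypothetical, and a real project would normally use an ontology language and editor rather than plain Python.

```python
# Steps 1-2: state the domain and scope; this sketch reuses no existing ontology.
ontology = {
    "domain": "American automobile manufacturing",
    "classes": {},
    "instances": {},
}

# Steps 3-4: important terms become classes arranged in a hierarchy.
ontology["classes"] = {
    "motor vehicle": {"parent": None, "properties": {}},
    "automobile": {"parent": "motor vehicle", "properties": {}},
}

# Steps 5-6: give a class a property, with its value type and allowed values.
ontology["classes"]["automobile"]["properties"]["wheels"] = {
    "type": int,
    "allowed": range(3, 5),  # more than two and fewer than five
}


# Step 7: create an instance only if its values conform to the class's properties.
def add_instance(name, cls, **values):
    for prop, spec in ontology["classes"][cls]["properties"].items():
        value = values.get(prop)
        if not isinstance(value, spec["type"]) or value not in spec["allowed"]:
            raise ValueError(f"{name}: invalid value for {prop!r}: {value!r}")
    ontology["instances"][name] = {"class": cls, **values}


add_instance("Ford Taurus", "automobile", wheels=4)
print(ontology["instances"]["Ford Taurus"])
```

An attempt to add an automobile with six wheels would raise a ValueError, which is one simple way of enforcing the allowed values identified in step 6.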

Conclusion

The thesis of this article is that today’s computer technology can exploit the capabilities of an ontology that goes beyond the data interoperability capabilities of XML.

Embedding the logic of ontologies in semantic markup -- whether the content is available via the web, through a corporate intranet, or in a completely closed, offline system located elsewhere -- is an effective means of storing, processing, and transferring knowledge.

Knowledge is involved in both parts of technology’s consumer-supplier relationship. On the supply side, ontologies embedded in semantic markup capture domain knowledge by virtue of the construction judgments of ontology engineers in conjunction with the accumulated wisdom of experienced subject-matter experts.

On the consumer side, the “expertise” of intelligent agents, exercised through rule-based inferential reasoning, is able to exploit the same ontology as the source and, in effect, figure out what source content to use, what to do with it, and how to make sense of it. As a “crawler”, “bot”, or “knowbot”, an intelligent agent is able to make judgments about the knowledge it mines from the source, draw inferences from it, and thereby create new knowledge.

XML schema involves the storage, processing, and movement of self-described data. Ontologies created with today’s ontology languages involve the capture and encapsulation of domain knowledge; when mined by intelligent agents, new knowledge can be created from the source content so encapsulated.

The following is a partial list of potential military applications that can exploit ontology-based systems but could not be accomplished relying upon XML schema technology:

• Automated calculation of thresholds for threat-levels based on ontologies of battlefield conditions, terrorist activities or geopolitical crises.

• Automated calculation of optimizations for logistical support based on ontologies of military units and their objectives and conditions.

• Automated calculation of prioritized target lists based on ontologies of targets, availability of strategic resources and updated battle damage assessments.

• Automated anticipation of enemy response based on ontology of theatre battle conditions in conjunction with various what-if scenarios.

• Automated identification of friendly forces based on ontology of relevant organizations and equipment.

References

(1) James A. Hendler, “Is There An Intelligent Agent in Your Future?”, Nature, March 11, 1999.

(2) John P. Stenbit, DoD Chief Information Officer, Memorandum dated May 9, 2003, Subject: DoD Net-Centric Data Strategy, see page 15.

(3) Thomas R. Gruber, “A Translation Approach to Portable Ontologies”, Knowledge Acquisition, 5(2):199-220, 1993.

(4) David Moschella, “Semantic Applications, or Revenge of the Librarians”, Darwin, March, 2003.

(5) Mike Uschold and Michael Gruninger, “Ontologies: Principles, Methods and Applications”, Knowledge Engineering Review, Volume 11, Number 2, June 1996.

(6) Natalya F. Noy and Deborah L. McGuinness, “Ontology Development 101: A Guide to Creating Your First Ontology”, Stanford University Knowledge Systems Laboratory web site.

Select Links

OntoWeb homepage.

Tutorial on using Simple Hypertext Ontology Extension (SHOE) markup language.

DARPA Agent Markup Language homepage.

DAML Ontologies by keyword.

University of Maryland's Maryland Information and Network Dynamics Lab web site.

National Cancer Institute Thesaurus and ontology of cancer including diseases, drugs, chemicals, diagnoses, genes, treatments, anatomy, organisms, and proteins web site.

OntoResearch.org homepage.

“The Semantic Web” (May 2001) paper in Scientific American by Tim Berners-Lee, James Hendler, Ora Lassila describing role of ontologies and intelligent agents in the Semantic Web.

Introduction to markup languages and ontologies.

Web-Ontology Working Group of W3C homepage.

World Wide Web Consortium homepage.
