Categorization is the process in which ideas and objects are classified or differentiated into a set of basic concepts. Categorization is one of the most fundamental operations of the mind that underlies human understanding.

The study of categorization is pertinent to various fields, including philosophy, linguistics, cognitive psychology, information science, artificial intelligence, and information technology. Classical philosophical treatments of categorization by philosophers such as Aristotle and Kant were reformulated in the twentieth century under topics such as conceptual clustering and prototype theory. The development of information science and information technology requires an explicit account of the mechanisms of human reasoning, the decision-making process, and other processes of reasoning.

There are many categorization theories and techniques. In a broader historical view, however, three general approaches to categorization may be identified:

  • Classical categorization
  • Conceptual clustering
  • Prototype theory

The classical view


Classical categorization comes to us first from Plato, who, in his Statesman dialogue, introduces the approach of grouping objects based on their similar properties. This approach was further explored and systematized by Aristotle in his Categories treatise, where he analyzes the differences between classes and objects. Aristotle also applied the classical categorization scheme intensively in his classification of living beings, which uses the technique of asking successive narrowing questions such as "Is it an animal or vegetable?", "How many feet does it have?", "Does it have fur or feathers?", and "Can it fly?", and in this way established the basis for natural taxonomy.
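The technique of successive narrowing questions can be sketched as a small decision tree. The following is only an illustration: the questions echo the examples above, while the tree layout and the leaf labels ("bird", "other animal", "plant") are hypothetical and not taken from Aristotle.

```python
# Hypothetical question tree. Each inner node asks a yes/no question;
# each leaf is a category label invented for this illustration.
TREE = {
    "question": "Is it an animal?",
    "yes": {
        "question": "Does it have feathers?",
        "yes": "bird",
        "no": "other animal",
    },
    "no": "plant",
}


def classify(answers, node=TREE):
    """Follow yes/no answers down the tree until a leaf label is reached."""
    while isinstance(node, dict):
        branch = "yes" if answers[node["question"]] else "no"
        node = node[branch]
    return node


sparrow = {"Is it an animal?": True, "Does it have feathers?": True}
print(classify(sparrow))
```

Each answer discards part of the remaining possibilities, so an entity ends at exactly one leaf.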

The classical Aristotelian view claims that categories are discrete entities characterized by a set of properties which are shared by their members. In analytic philosophy, these properties are assumed to establish the conditions which are both necessary and sufficient to capture meaning.

Kant basically inherited the table of categories established by Aristotle. Kant, however, interpreted categories not as ontological principles of nature but as principles of how the mind organizes experience. Categorization is, for Kant, an inherent mental mechanism that organizes given sense experiences. Concepts are these organizing principles of the mind, and categories are the most fundamental concepts.

According to the classical view, categories should be clearly defined, mutually exclusive and collectively exhaustive. This way, any entity of the given classification universe belongs unequivocally to one, and only one, of the proposed categories.
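The classical requirements can be made concrete with a minimal sketch. It assumes that each category is given by a single membership test standing in for its necessary and sufficient conditions. The universe (integers) and the category names are hypothetical, chosen only because their boundaries are unambiguous.

```python
# Each category is defined by one test: an entity belongs to the category
# if and only if the test holds (necessary and sufficient condition).
CATEGORIES = {
    "negative": lambda n: n < 0,
    "zero": lambda n: n == 0,
    "positive": lambda n: n > 0,
}


def categorize(entity, categories=CATEGORIES):
    """Return the single category the entity belongs to, or raise an error."""
    matches = [name for name, test in categories.items() if test(entity)]
    if len(matches) != 1:
        raise ValueError(f"{entity!r} matches {len(matches)} categories")
    return matches[0]


def is_partition(universe, categories=CATEGORIES):
    """True if every entity falls into exactly one category, that is, the
    categories are mutually exclusive and collectively exhaustive."""
    return all(
        sum(test(x) for test in categories.values()) == 1 for x in universe
    )


print(categorize(-3))
print(is_partition(range(-5, 6)))
```

A scheme whose tests overlap, or which leaves some entity uncovered, fails the `is_partition` check and so does not meet the classical requirements.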



Categories (Lat. Categoriae, Greek Κατηγορίαι Katēgoriai) is a text from Aristotle's Organon that enumerates all the possible kinds of thing which can be the subject or the predicate of a proposition.

The Categories places every object of human apprehension under one of ten categories (known to medieval writers as the praedicamenta). They are intended to enumerate everything which can be expressed without composition or structure, thus anything which can be either the subject or the predicate of a proposition.

The text begins with an explication of what is meant by "synonymous," or univocal words, what is meant by "homonymous," or equivocal words, and what is meant by "paronymous," or denominative words. It then divides forms of speech as being:

  • Either simple, without composition or structure, such as "man," "horse," "fights," etc.
  • Or having composition and structure, such as "a man fights," "the horse runs," etc.

Next, we distinguish between a subject of predication, namely that of which anything is affirmed or denied, and a subject of inhesion. A thing is said to be inherent in a subject, when, though it is not a part of the subject, it cannot possibly exist without the subject, e.g., shape in a thing having a shape.

Of all the things that exist,

  1. Some may be predicated of a subject, but are in no subject; as "man" may be predicated of James or John, but is not in any subject.
  2. Some are in a subject, but can be predicated of no subject. Thus my knowledge in grammar is in me as its subject, but it can be predicated of no subject; because it is an individual thing.
  3. Some are both in a subject, and may be predicated of a subject, as science, which is in the mind as its subject, and may be predicated of geometry.
  4. Last, some things can neither be in a subject nor be predicated of any subject. These are individual substances, which cannot be predicated, because they are individuals; and cannot be in a subject, because they are substances.
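The four divisions above result from crossing two yes/no questions: whether a thing can be predicated of a subject, and whether it is in a subject. The sketch below encodes that cross-classification. The type and field names are hypothetical, and the examples are the ones given in the list.

```python
from typing import NamedTuple


class Thing(NamedTuple):
    name: str
    predicable_of_subject: bool  # may be predicated of a subject
    in_subject: bool             # inheres in a subject


def division(thing):
    """Return the number (1 to 4) of the division the thing falls under."""
    if thing.predicable_of_subject and not thing.in_subject:
        return 1
    if thing.in_subject and not thing.predicable_of_subject:
        return 2
    if thing.predicable_of_subject and thing.in_subject:
        return 3
    return 4  # neither: an individual substance


EXAMPLES = [
    Thing("man", True, False),
    Thing("my knowledge of grammar", False, True),
    Thing("science", True, True),
    Thing("James", False, False),
]

for thing in EXAMPLES:
    print(thing.name, division(thing))
```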

Then we come to the categories themselves, (1-4) above being called by the scholastics the antepraedicamenta. Note, however, that although Aristotle has apparently distinguished between being in a subject, and being predicated truly of a subject, in the Prior Analytics these are treated as synonymous. This has led some to suspect that Aristotle was not the author of the Categories.

Ten Categories

The ten categories, or classes, are

  1. Substance. As mentioned above, the notion of "substance" is defined as that which can neither be predicated of anything nor be said to be in anything. Hence, "this particular man" or "that particular tree" are substances. Later in the text, Aristotle calls these particulars "primary substances," to distinguish them from "secondary substances," which are universals. Hence, "Socrates" is a primary substance, while "man" is a secondary substance.
  2. Quantity. This is the extension of an object, and may be either discrete or continuous. Further, its parts may or may not have relative positions to each other. All medieval discussions about the nature of the continuum, of the infinite and the infinitely divisible, are a long footnote to this text. It is of great importance in the development of mathematical ideas in the medieval and late scholastic period.
  3. Quality. This is a determination which characterizes the nature of an object.
  4. Relation. This is the way in which one object may be related to another.
  5. Place. Position in relation to the surrounding environment.
  6. Time. Position in relation to the course of events.
  7. Position. The examples Aristotle gives indicate that he meant a condition of rest resulting from an action: 'Lying', 'sitting'. Thus position may be taken as the end point for the corresponding action. The term is, however, frequently taken to mean the relative position of the parts of an object (usually a living object), given that the position of the parts is inseparable from the state of rest implied.
  8. State. The examples Aristotle gives indicate that he meant a condition of rest resulting from an affection (i.e. being acted on): 'shod', 'armed'. The term is, however, frequently taken to mean the determination arising from the physical accoutrements of an object: one's shoes, one's arms, etc. Traditionally, this category is also called a "habitus" (from Latin "habere" "to have").
  9. Action. The production of change in some other object.
  10. Affection. The reception of change from some other object. It is also known as passivity. It is clear from the examples Aristotle gave for action and for affection that action is to affection as the active voice is to the passive. Thus for action he gave the examples 'to lance' and 'to cauterize'; for affection, 'to be lanced' and 'to be cauterized.' The term is frequently misinterpreted to mean a kind of emotion or passion.

The first four are given a detailed treatment in four chapters; the remaining six are passed over lightly, as being clear in themselves. Later texts by scholastic philosophers also reflect this disparity of treatment.

After discussing the categories, four ways are given in which things may be considered contrary to one another. Next, the work discusses five senses wherein a thing may be considered prior to another, followed by a short section on simultaneity. Six forms of movement are then defined: generation, destruction, increase, diminution, alteration, and change of place. The work ends with a brief consideration of the word 'have' and its usage.


In Kant's philosophy, a category is a pure concept of the understanding. A Kantian category is an a priori principle or function of the mind by which the mind organizes experience. These principles determine how things appear to human beings. In this sense, a category is a characteristic of the appearance of any object in general. Kant wrote that he wanted to provide "… a word of explanation in regard to the categories. They are concepts of an object in general… ." [1] Kant also wrote that "… pure concepts [Categories] of the understanding… apply to objects of intuition in general… ." [2] Such a category is not a classificatory division, as the word is commonly used. It is, instead, the condition of the possibility of objects in general [3], that is, objects as such, any and all objects.

Conceptual clustering

(see main article: Conceptual clustering)

Conceptual clustering is a modern variation of the classical approach, and derives from attempts to explain how knowledge is represented. In this approach, classes (clusters or entities) are generated by first formulating their conceptual descriptions and then classifying the entities according to the descriptions.

Conceptual clustering developed mainly during the 1980s as a machine learning paradigm for unsupervised learning. It is distinguished from ordinary data clustering by generating a concept description for each generated category.
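The distinguishing feature, a concept description attached to each cluster, can be illustrated with a deliberately simplified sketch. It is not a published conceptual clustering algorithm: it assumes that objects are dictionaries of attribute values, groups them by one chosen attribute, and takes as the description the attribute-value pairs shared by every member. The function names and the sample data are hypothetical.

```python
from collections import defaultdict


def describe(members):
    """Concept description: attribute-value pairs shared by every member."""
    shared = set(members[0].items())
    for member in members[1:]:
        shared &= set(member.items())
    return dict(shared)


def conceptual_clusters(objects, split_on):
    """Group objects by one attribute and describe each resulting group."""
    groups = defaultdict(list)
    for obj in objects:
        groups[obj[split_on]].append(obj)
    return [(describe(members), members) for members in groups.values()]


animals = [
    {"covering": "feathers", "flies": True, "legs": 2},
    {"covering": "feathers", "flies": False, "legs": 2},
    {"covering": "fur", "flies": False, "legs": 4},
]

for description, members in conceptual_clusters(animals, "covering"):
    print(description, len(members))
```

Ordinary data clustering would stop at the groups themselves; here each group also carries a description that a new object can be checked against.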

Categorization tasks in which category labels are provided to the learner for certain objects are referred to as supervised classification, supervised learning, or concept learning. Categorization tasks in which no labels are supplied are referred to as unsupervised classification, unsupervised learning, or data clustering. The task of supervised classification involves extracting information from the labeled examples that allows accurate prediction of class labels of future examples. This may involve the abstraction of a rule or concept relating observed object features to category labels, or it may not involve abstraction (e.g., exemplar models). The task of clustering involves recognizing inherent structure in a data set and grouping objects together by similarity into classes. It is thus a process of generating a classification structure.
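The contrast between the two kinds of task can be sketched on one-dimensional data. The supervised part abstracts a centroid (mean value) for each label and predicts the label of the nearest centroid. The unsupervised part receives no labels and groups values separated by small gaps. Both procedures, and all names and numbers, are simplified assumptions chosen for illustration.

```python
from collections import defaultdict


def fit_centroids(labeled):
    """Supervised: average the values observed for each category label."""
    values_by_label = defaultdict(list)
    for value, label in labeled:
        values_by_label[label].append(value)
    return {label: sum(vs) / len(vs) for label, vs in values_by_label.items()}


def predict(centroids, value):
    """Predict the label whose centroid lies closest to the value."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))


def cluster_by_gap(values, max_gap):
    """Unsupervised: sort the values and start a new cluster whenever the
    gap to the previous value exceeds max_gap."""
    clusters = []
    for v in sorted(values):
        if clusters and v - clusters[-1][-1] <= max_gap:
            clusters[-1].append(v)
        else:
            clusters.append([v])
    return clusters


centroids = fit_centroids(
    [(1.0, "small"), (2.0, "small"), (9.0, "large"), (11.0, "large")]
)
print(predict(centroids, 3.0))
print(cluster_by_gap([1.0, 2.0, 9.0, 11.0, 2.5], max_gap=2.0))
```

The supervised functions return labels supplied by the examples, whereas `cluster_by_gap` returns unnamed groups: the classification structure is produced from the data alone.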

Conceptual clustering is closely related to fuzzy set theory, in which objects may belong to one or more groups, with varying degrees of membership.
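In fuzzy set theory, membership is a degree between 0 and 1 rather than a yes/no fact, so one object can belong to several groups at once. A minimal sketch, assuming a linear ramp as the membership function and hypothetical "cold" and "warm" groups with invented thresholds:

```python
def ramp(x, low, high):
    """Degree of membership rising linearly from 0 at low to 1 at high."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)


def memberships(temperature_c):
    """Degrees of membership of a temperature in two overlapping groups."""
    warm = ramp(temperature_c, 10.0, 25.0)
    return {"cold": 1.0 - warm, "warm": warm}


print(memberships(17.5))
```

At 17.5 degrees the value belongs to both groups to the same degree, which a classical scheme with mutually exclusive categories cannot express.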

Prototype theory

(see main article: Prototype theory)

Since the research by Eleanor Rosch and George Lakoff in the 1970s, categorization can also be viewed as the process of grouping things based on prototypes. On this view, the idea of necessary and sufficient conditions is almost never met in categories of naturally occurring things. It has also been suggested that categorization based on prototypes is the basis for human development, and that this learning relies on learning about the world via embodiment.

A cognitive approach accepts that natural categories are graded (they tend to be fuzzy at their boundaries) and inconsistent in the status of their constituent members.
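Graded membership can be sketched by scoring an item against a prototype instead of testing it against a definition. The measure below, the fraction of prototype features that an item shares, is only one simple assumption among many possible similarity measures, and the feature sets are hypothetical.

```python
# Hypothetical prototype for the category "bird".
BIRD_PROTOTYPE = {"has feathers", "flies", "sings", "small"}


def typicality(features, prototype=BIRD_PROTOTYPE):
    """Fraction of the prototype's features that the item shares (0.0 to 1.0)."""
    return len(features & prototype) / len(prototype)


robin = {"has feathers", "flies", "sings", "small"}
penguin = {"has feathers", "swims"}

print(typicality(robin))
print(typicality(penguin))
```

Both items count as birds, but to different degrees, which is the sense in which the category is graded and fuzzy at its boundary.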

Systems of categories are not objectively "out there" in the world but are rooted in people's experience. Conceptual categories are not identical for different cultures, or indeed, for every individual in the same culture.

Categories form part of a hierarchical structure when applied to subjects such as taxonomy in biological classification, which has a higher level (the life-form level), a middle level (the generic or genus level), and a lower level (the species level). These levels can be distinguished by certain traits that put an item in its distinctive category, but even these traits can be arbitrary and are subject to revision.

Categories at the middle level are perceptually and conceptually the most salient. The generic level of a category tends to elicit the most responses and the richest images, and seems to be the psychologically basic level. Typical taxonomies in zoology, for example, exhibit categorization at the embodied level, with similarities leading to the formulation of "higher" categories, and differences leading to differentiation within categories.

