Mixing Memory: Concepts II: Prototypes

With the downfall of the classical theory after more than 2000 years, there was a need for an entirely new approach to concepts. The new approach would have to account for the same things that the classical theory was thought to explain (classification, category learning, concept representation), but also for several newly discovered empirical phenomena, including:

Typicality effects: Some members of a category are treated as "better" members than others. This effect embodies the insight of Wittgenstein and others that concepts may have a family resemblance structure.
Fuzzy boundaries: Some explanation of why category membership is not always an either-or relationship.
Relationships between features: Eleanor Rosch, using her research on typicality effects and the hierarchical organization of categories¹, argued that our concepts must represent the structure of the world. She wrote²:
The world is structured because real-world attributes do not occur independently of one another. Creatures with feathers are more likely also to have wings than creatures with fur, and objects with the visual appearance of chairs are more likely to have functional sit-on-ableness than objects with the appearance of cats. That is, combinations of attributes of real objects do not occur uniformly. Some pairs, triples, or n-tuples are quite probable, appearing in combination sometimes with one, sometimes with another attribute; others are rare; others logically cannot or empirically do not occur.
In other words, concept representations must capture the causal, thematic, and co-occurrence relationships between features that are present in the structure of the world. Concepts must "carve the world at its joints."
Contrasts between categories: Concepts are not learned or represented in isolation. Instead, they are born within a web of concepts, and the relationships between concepts can influence the representation of individual concepts, as well as the classification of individual exemplars as being members of one concept or another³.

The solution, first proposed by Rosch, was to treat concepts as "clusters in a vast high-dimensional similarity space that were devised to maximize the similarity within a cluster and minimize the similarity between clusters"⁴. This marked the beginning of the era of similarity-based theories of concepts. The fundamental idea behind all similarity-based theories is that concepts can be represented by a set of features, and that the membership of an exemplar in a particular category can be determined by measuring the similarity between the exemplar and the category representation (through a comparison of their features). This incredibly powerful idea allows us to account for all of the phenomena listed above, and many others. The question, then, becomes: how do we represent the features of a concept? If we represent concepts as collections of features, and category members as clusters in a similarity space (in which the dimensions are defined by, e.g., the features), then there are two easy ways to represent a concept: as the central tendency of its cluster, or as the collection of exemplars that make up the cluster. The first similarity-based theories took the first approach, and are usually called prototype theories. Because they were first chronologically, and because I like them more, I'll deal with them first. The next post will be on the second type of similarity-based approach, usually called exemplar theories.

Prototype Theories

Prototype theories are pretty easy to conceptualize. In essence, a prototype is an average of all of the (thus far observed) exemplars of a category. Unlike definitions (under most views), prototype representations are dynamic, in the sense that for each new exemplar we classify as a member of a category, the average will change. In most prototype models, features in the prototype will be weighted, either by salience (a very vague term) or by the frequency with which they occur in category members. This provides an easy way to understand how concepts are learned. Exemplars are classified as members of a category by comparing them to the prototype. Similarity will be determined by a feature match in which the feature weights figure into the similarity calculation, with more salient or frequent features contributing more to similarity. The similarity calculation might be described by an equation like the following⁵:

S_j = Σ_i (w_i · v(i,j))

In this equation, S_j represents the similarity of exemplar j to a prototype, w_i represents the weight of feature i, and v(i,j) represents the degree to which exemplar j exhibits feature i; the sum runs over all features i of the prototype. Exemplars that reach a required level of similarity with the prototype will be classified as members of the category, and those that fail to reach that level will not.
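The weighted feature match just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`frequency_weights`, `similarity`, `is_member`), the toy feature lists, and the threshold of 1.0 are all made up for the example, and weighting by raw feature frequency is just one of the weighting schemes mentioned above.

```python
from collections import Counter

def frequency_weights(exemplars):
    """Weight each feature by the fraction of observed exemplars that show it."""
    counts = Counter(f for ex in exemplars for f, v in ex.items() if v > 0)
    return {f: c / len(exemplars) for f, c in counts.items()}

def similarity(weights, exemplar):
    """Sum, over the prototype's features, of weight times degree of presence."""
    return sum(w * exemplar.get(f, 0.0) for f, w in weights.items())

def is_member(weights, exemplar, threshold):
    """Classify as a member when similarity reaches the required level."""
    return similarity(weights, exemplar) >= threshold

# Hypothetical toy exemplars: feature -> degree to which it is exhibited.
robin = {"feathers": 1.0, "flies": 1.0, "sings": 1.0}
sparrow = {"feathers": 1.0, "flies": 1.0, "sings": 1.0}
penguin = {"feathers": 1.0, "swims": 1.0}
trout = {"swims": 1.0, "scales": 1.0}

# The prototype is built from the exemplars observed so far.
bird_weights = frequency_weights([robin, sparrow, penguin])
THRESHOLD = 1.0  # assumed cutoff, chosen only for this example

for name, ex in [("robin", robin), ("penguin", penguin), ("trout", trout)]:
    score = similarity(bird_weights, ex)
    print(name, round(score, 2), is_member(bird_weights, ex, THRESHOLD))
```

With these toy numbers the robin scores higher than the penguin, and both clear the threshold while the trout does not, which is the graded membership the equation is meant to produce.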

This characterization allows us to easily account for typicality effects. Because none of the features of a prototype are necessary for category membership, exemplars that exhibit some, but not all, of the features may reach a level of similarity with the prototype sufficient for classification. A family resemblance structure will then arise naturally, because while all members of a category will share features with the prototype, not all members will share features with each other. It also allows us to encode the objective structure of categories in the world, because prototypes will represent the features that co-occur most frequently. Finally, it allows us to explain the effects that relationships between categories can have on classification. An exemplar that, under some circumstances, might be most similar to the prototype of category A will, in a context in which category B is present, turn out to be more similar to category B.
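The contrast-category point can be illustrated with a small sketch, assuming that classification simply picks the most similar prototype among whichever categories are currently in play. The categories A, B, and C, their feature weights, and the function names are all hypothetical.

```python
def similarity(weights, exemplar):
    """Weighted feature match between a prototype and an exemplar."""
    return sum(w * exemplar.get(f, 0.0) for f, w in weights.items())

def classify(exemplar, prototypes):
    """Return the name of the most similar prototype among those available."""
    return max(prototypes, key=lambda name: similarity(prototypes[name], exemplar))

# Hypothetical prototypes: feature -> weight.
category_a = {"round": 1.0, "red": 0.5}
category_b = {"round": 1.0, "red": 0.5, "small": 1.0}
category_c = {"square": 1.0}

exemplar = {"round": 1.0, "red": 1.0, "small": 1.0}

# Without B in the contrast set, the exemplar falls under A.
without_b = classify(exemplar, {"A": category_a, "C": category_c})
# Once B is available, the same exemplar is assigned to B instead.
with_b = classify(exemplar, {"A": category_a, "B": category_b, "C": category_c})
print(without_b, with_b)
```

Note that a raw weighted sum favors prototypes with more matching features; real models typically normalize or penalize mismatches, which this sketch leaves out.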

Since prototype theories provided a clear and powerful account of the phenomena discovered with natural categories, it became important to test them more thoroughly with more finely controlled stimuli. This required the construction of artificial stimuli (usually dot patterns, or some other similar structure). Using such stimuli, researchers produced several findings that are consistent with prototype theory. For instance, participants are slower to respond to, and make more errors when processing, exemplars that are far from the prototype in similarity space than exemplars that are close to it. They also classify the prototype itself as easily (meaning as rapidly and with as few errors) as exemplars, even when they had not seen the prototype during the learning phase⁶. In fact, after a long delay (days or a week), participants actually performed better when classifying the prototype (and were more likely to recognize it as having been in the learning phase, even when it hadn't been) than exemplars they had actually learned.

Even though prototype theories achieved a great deal of empirical success early on, they still suffered from important problems. As averages, prototypes tend to abstract away the variance within categories, which may be important for capturing the full structure of objects in the world. In fact, because they are only averages, while feature co-occurrence relationships across concepts will be encoded in their representation, more specific co-occurrence relationships between features may not be. Malt and Smith⁷ showed that feature correlations within a category, like that between "wooden" and "large" within the category SPOON, are not reliably captured by prototype theories. In addition, if concepts are represented by prototypes, then we would expect that the typicality gradient of a conceptual combination (e.g., pet fish) would be a function of the typicality gradients of its constituents. However, Osherson and Smith⁸ showed that the typicality of an exemplar in the two constituent categories did not reliably predict its typicality in their combination. For example, while cats and trout may be typical members of the categories PETS and FISH, goldfish (which are fairly atypical members of the category FISH) are highly typical members of the category PET FISH.
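The point about averages discarding within-category correlations can be made concrete with a toy sketch. The two sets of "spoons" below are invented for illustration (they are not Malt and Smith's data), and the function names are hypothetical. In one set "large" and "wooden" always go together; in the other they are unrelated; yet both sets average to the same prototype.

```python
def prototype(exemplars):
    """Average each feature over all exemplars of the category."""
    n = len(exemplars)
    return {f: sum(ex[f] for ex in exemplars) / n for f in exemplars[0]}

def conditional_rate(exemplars, given, target):
    """Fraction of exemplars with the `given` feature that also have `target`."""
    having = [ex for ex in exemplars if ex[given] == 1]
    return sum(ex[target] for ex in having) / len(having)

# Made-up data: large spoons are always wooden, small ones never are.
correlated = [
    {"large": 1, "wooden": 1},
    {"large": 1, "wooden": 1},
    {"large": 0, "wooden": 0},
    {"large": 0, "wooden": 0},
]
# Made-up data: size and material are unrelated.
uncorrelated = [
    {"large": 1, "wooden": 0},
    {"large": 0, "wooden": 1},
    {"large": 1, "wooden": 1},
    {"large": 0, "wooden": 0},
]

print(prototype(correlated), prototype(uncorrelated))
print(conditional_rate(correlated, "large", "wooden"),
      conditional_rate(uncorrelated, "large", "wooden"))
```

Both prototypes come out as 0.5 on each feature, so a learner storing only the average cannot tell the two categories apart, even though the chance that a large spoon is wooden differs between them.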

One of the more interesting flaws in prototype theories is their failure to account for ad hoc categories. These are categories constructed on the fly, so to speak, rather than permanently stored representations⁹. Examples of ad hoc categories include things to take out of the house in case of a fire or things to take on a picnic. These categories seem to share many of the important features of ordinary categories, such as a typicality gradient, but the members appear to be clustered around an ideal rather than a set of common features. If ad hoc categories are constructed and represented in a way similar to that of ordinary concepts, as they appear to be, then their existence is a problem for prototype theories.

Prototypes also have a problem with natural kind concepts. Bats are members of the natural category MAMMAL, but a feature comparison will lead to the classification of bats as members of the category BIRD in most cases. This is because the concept MAMMAL, and other natural kind concepts, appear to be linearly separable. In other words, all members of the category share a value on one dimension (in the case of biological kinds, this value is usually a genetic or parental one).

Finally, there are cases in which similarity and category membership are dissociated. In an excellent set of studies, Lance Rips¹⁰ showed that in some cases, people will rate an exemplar as being more similar to the prototype of one category, but as more likely to belong to another category. One set of contrast categories used by Rips involved quarters and pizzas. When presented with a circle of a size midway between the prototypical quarter and pizza sizes, participants rated the circle as more similar to quarters. However, when asked to which category, QUARTER or PIZZA, the circle was more likely to belong, they invariably answered PIZZA. This is because quarters are of a fixed size. Once again, QUARTER appears to be a linearly separable category, with all members having the same value on the size dimension, and prototype theories are unable to deal with this.
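One way to see how similarity and membership can come apart is to compare a rule that only looks at distance to the prototype with a rule that also takes category variability into account. The sketch below is not Rips's procedure or analysis: the sizes (in inches), the standard deviations, the Gaussian density model, and the function names are all assumptions for illustration, and the test circle is deliberately placed nearer the quarter prototype so that the two rules visibly disagree.

```python
import math

# Hypothetical size distributions: quarters barely vary, pizzas vary a lot.
CATEGORIES = {
    "QUARTER": {"mean": 1.0, "sd": 0.01},
    "PIZZA": {"mean": 12.0, "sd": 4.0},
}

def nearest_prototype(size):
    """Prototype-style rule: choose the category whose average size is closest."""
    return min(CATEGORIES, key=lambda c: abs(size - CATEGORIES[c]["mean"]))

def gaussian_density(size, mean, sd):
    """Density of `size` under a normal distribution (an assumed model)."""
    z = (size - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))

def most_likely_category(size):
    """Variability-aware rule: choose the category that makes the size most probable."""
    return max(CATEGORIES, key=lambda c: gaussian_density(size, **CATEGORIES[c]))

circle = 3.0  # hypothetical test circle, in inches
print(nearest_prototype(circle), most_likely_category(circle))
```

The distance rule picks QUARTER, while the variability-aware rule picks PIZZA, because a quarter of that size is essentially impossible. A representation that stores only the average has no way to express this.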

Some researchers made attempts to save prototype theory from these problems and others by proposing dual process accounts that involved a prototype and a classification procedure specific to a concept¹¹, but many others decided to develop entirely new types of similarity-based theories. The most prominent of these are the exemplar theories, of which there are several. The next post will discuss some of these.

¹ More on what this means in a subsequent post.
² Rosch, E.H. (1973). Natural categories. Cognitive Psychology, 4, 328-350.
³ Rosch, E., & Lloyd, B.B. (1978). Principles of categorization. In E. Rosch & B.B. Lloyd (eds.), Cognition and Categorization. Lawrence Erlbaum: Hillsdale, NJ.
⁴ Sloman, S.A., & Lagnado, D.A. (2005). The problem of induction. In K. Holyoak & R. Morrison (eds.), The Cambridge Handbook of Thinking and Reasoning. Cambridge University Press.
⁵ From Hampton, J.A. (1979). Polymorphous concepts in semantic memory. Journal of Verbal Learning and Verbal Behavior, 18, 441-461.
⁶ Posner, M.I., & Keele, S.W. (1968). On the genesis of abstract ideas. Journal of Experimental Psychology, 77, 353-363.
⁷ Malt, B., & Smith, E.E. (1984). Correlated properties in natural categories. Journal of Verbal Learning and Verbal Behavior, 23, 250-269.
⁸ Osherson, D.N., & Smith, E.E. (1981). On the adequacy of prototype theory as a theory of concepts. Cognition, 11, 35-58.
⁹ Barsalou, L.W. (1983). Ad hoc categories. Memory & Cognition, 11, 211-217.
¹⁰ Rips, L.J., & Collins, A. (1993). Categories and resemblance. Journal of Experimental Psychology: General, 122, 468-486.
¹¹ E.g., Smith, E.E., Medin, D., & Rips, L. (1984). A psychological approach to concepts: Comments on Rey's 'Concepts and Stereotypes'. Cognition, 17, 265-274.

7 comments:

Anonymous said...: This is quite interesting as now you are talking about the stuff I do at work. I develop a fair bit of indexing and linguistic software. The various vector space models (variations on what you call Prototypes) are quite good and powerful for certain applications, but suffer as you said from being overly broad and general. Thus you almost always have to use them in conjunction with a more explicit rule based ontology system - constructed in terms of definitions and exclusions and typically in terms of other concepts.

However what seems rarely done, and what I can't see as being easy to do, is to capture the "web-like" nature of concepts. That's what I was getting at by bringing up Saussure and semiotics. We all recognize that such matters occur, but being explicit about it seems quite difficult.

Posted by Clark; 2/08/2005 10:57 AM
Anonymous said...: Clark, you're right, it's not easy, and it's rarely done. Lakoff and Johnson try to do it by treating concepts as metaphorical, but they end up with a theory that's empirically absurd and, for all intents and purposes, largely falsified by experimental evidence.

The problem is really in how to test theories of the interconnections between concepts. Concept research with natural categories is too messy, given the techniques we have now, to really develop a sophisticated picture of the ways in which concepts are related. Most experiments utilize artificial categories, and these don't tend to have a lot of structure, and producing a set with a rich enough structure to test hypotheses about the interrelations between concepts would be tough.

We can do it with words, and there are programs like LSA that involve multidimensional spaces in which the distance between two points (each a vector representing a concept or word) represents how related they are. That's a pretty powerful way to embody some empirical phenomena (like co-occurrence), and it even has the ability to provide some insight into relationships between meanings, but ultimately it's limited. It does, however, seem to capture some of the insights of Saussure, in that concepts are defined by the differences in their position in a space.

There is one theory of concepts which is in its infancy in which some (perhaps all, I don't know) concepts are defined by their position in a relational system. To consider an oversimplified example, the concept game would be defined by its position in the relation play(x,y) where any value filling the variable x is considered a player and any value filling the variable y is considered a game (it can be read as x plays y). You can then contrast this with other concepts, such as job, which would be defined by the relation works at(x,y), where x now defines worker and y defines job. I'll talk more about this view in the last post on concepts, which may take me til Sunday to get to.

Posted by Chris; 2/08/2005 11:19 AM
Anonymous said...: I've written programs that do what you mention in multidimensional space. But as a practical tool they often just aren't as useful as you'd think. It is almost always better to have rules. The problem is that such relations are, as I mentioned before, simply too broad. I'd note, doing some armchair philosophizing, that we seem to have the ability to distinguish between broad, loose connections and more specific ones. While I'd not ascribe scientific implications to it, it seems that how we actually do things like remember based upon stimulus can't be done in this fashion.

I suspect that rather than a broad connection what is going on is something more like a Wiki encyclopedia, only with far more hyperlinks than you see in most internet encyclopedias. I think there are some indications, from what I've read, that dreaming in one's sleep helps form these links. This would then capture some of what Umberto Eco claims about thinking being more akin to an encyclopedia than to a dictionary.

Posted by clark; 2/08/2005 12:34 PM
Anonymous said...: I agree with you about the LSA programs. To be honest, I don't really like any spatial models, which is why, ultimately, I dislike connectionism (connectionist models are, at base, just spatial models).

I also agree with you about the encyclopedic nature of thought. After I deal with exemplar models (and I really dislike exemplar models), I'm going to talk about the theory theory, which argues that our concepts are structured like theories, with causal, thematic, and other types of relational information that simply aren't captured by the feature models that classical, prototype, and exemplar theories represent. Out of the theory theory came the types of categories that I described in the last comment, which are defined by their position in a relational system. The implication of that sort of theory would be that our knowledge is structured in such a way that everything is related by its role in a relational system in which all (or at least significant subsets) of our concepts participate. It's like saying you can't define game unless you know a.) what the play relation is, and how it differs from other relations (like work), and b.) how game fits into the play relation.

Posted by Chris; 2/08/2005 12:41 PM
Anonymous said...: Clark, I'd be interested in knowing what programming language you use and if you can point to any actual coding examples ....
Posted by Dave; 7/17/2007 6:40 PM

Sunday, February 06, 2005
