11.07.2015 Views

Dimension-Wide vs. Exemplar-Specific Attention in Category ...



distributed to the non-rule dimensions than exemplars representing rule-following items. Moreover, as predicted, more attention was distributed to the non-rule dimensions for the exemplar encoding category B’s exception than for the exemplar encoding category A’s exception.

There were more errors involving the category B exception than the category A exception because more rule-following items in category A were similar to the category B exception. Such errors were accentuated in ESSW-ALCOVE, and the category B exception resulted in more uniform attention. ESSW-ALCOVE predicts a memory advantage for the category B exception because it is differentiated from many similar rule-following items from category A.

Filtration and Condensation

Incorporating exemplar-specific attention and accentuated errors allowed ALCOVE to predict previously challenging data. It is crucial that a model that incorporates new mechanisms can still account for other basic psychological phenomena. Examining filtration and condensation tasks (Gottwald & Garner, 1975; Kruschke, 1993; Matsuka, in press) is important because these tasks investigate how humans allocate attention.

Humans (and ALCOVE) find it easier to learn filtration tasks, in which information from only one dimension is required for perfect classification, than condensation tasks, in which information from two (or more) dimensions is needed (e.g., Kruschke, 1993). This filtration advantage was predicted by ESSW-ALCOVE (and ES-ALCOVE).
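The contrast between dimension-wide and exemplar-specific attention can be illustrated with a minimal sketch of an ALCOVE-style exemplar activation, which decays exponentially with attention-weighted city-block distance (Kruschke, 1992). The binary stimulus coding, the attention values, the specificity constant `c`, and the function name `activation` are illustrative assumptions, not values from the simulations reported here.

```python
import math

def activation(stimulus, exemplar, attention, c=1.0):
    """ALCOVE-style activation: exp(-c * attention-weighted city-block distance)."""
    distance = sum(a * abs(h - x) for a, h, x in zip(attention, exemplar, stimulus))
    return math.exp(-c * distance)

# Assumed toy structure: dimension 0 is the rule dimension (0 -> A, 1 -> B).
rule_following_a = (0, 0, 0)   # category A item that follows the rule
exception_b = (0, 1, 1)        # category B exception that looks like A on the rule dimension

# Dimension-wide attention: every exemplar attends only to the rule dimension.
dimension_wide = [1.0, 0.0, 0.0]
# Exemplar-specific attention (assumed values): the exception's exemplar
# attends to the non-rule dimensions instead.
exception_specific = [0.0, 0.5, 0.5]

# How strongly does the exception's exemplar respond to a rule-following A item?
confusable = activation(rule_following_a, exception_b, dimension_wide)
distinct = activation(rule_following_a, exception_b, exception_specific)
```

Under these assumptions, the exception's exemplar is fully activated by the rule-following item when attention is dimension-wide (`confusable`), but responds much more weakly when it attends to its own non-rule dimensions (`distinct`), which is one way to read the claim that exemplar-specific attention differentiates exceptions from rule-following items.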
In filtration, all exemplars attend to the predictive dimension. This leads to increased psychological distances between category A and category B members. In condensation, some items are closer to the opposing category’s exemplars even with exemplar-specific attention, and ESSW-ALCOVE has fewer “confirmations” from some exemplars. ESSW-ALCOVE finds it easier to learn the filtration task, in which it receives more “evidence” for correct category membership.

Discussion

Humans can flexibly attend to different dimensions of an item depending on the values of the dimensions that are not critical for classification (Aha & Goldstone, 1992). Such flexible attention allowed ES-ALCOVE to differentiate exceptions from rule-following items and predict a memory advantage for exceptions. In ES-ALCOVE, each exemplar selected which dimensions to attend to. ES-ALCOVE attended to the non-rule dimensions of exemplars encoding exceptions but to the rule dimension of exemplars encoding rule-following items. This differential attention made exceptions distinctive in memory.

However, ES-ALCOVE was unable to account for the finding that memory for a violating item is stronger when the violated structure is stronger. This finding was predicted by ESSW-ALCOVE, which accentuated errors. With errors raised to the 10th power, ESSW-ALCOVE distinguished important errors (e.g., misclassification) from trivial ones (e.g., correct classification with a 90% confidence level). ESSW-ALCOVE learned attention more rapidly in response to larger errors and minimized the impact of smaller errors.
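The effect of raising errors to the 10th power can be seen in a small numerical sketch. The sign-preserving form, the 0-to-1 output range, and the function name `accentuate` are assumptions made for illustration; this section does not specify how ESSW-ALCOVE handles the sign or the scale of the error.

```python
import math

def accentuate(error, power=10):
    """Raise the magnitude of an error to `power`, keeping its sign (assumed form)."""
    return math.copysign(abs(error) ** power, error)

target = 1.0

# "Important" error: the item is misclassified (output at the wrong extreme).
important = accentuate(target - 0.0)

# "Trivial" error: correct classification with 90% confidence.
trivial = accentuate(target - 0.9)

# Ratio of the two error signals after accentuation (about 10 before it).
ratio = important / trivial
```

Before accentuation the two errors differ by roughly a factor of 10; after accentuation they differ by roughly ten orders of magnitude, so any learning step proportional to the accentuated error is driven almost entirely by the misclassification.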
A similar effect can be obtained by updating the attention weights multiple times on each training trial (e.g., Kruschke, 2001). ESSW-ALCOVE better remembered items that violated a stronger rule because those items were associated with “important” errors.

In addition to the exception memory findings reviewed in this paper, ESSW-ALCOVE was able to predict a filtration advantage. We should note that a version of ALCOVE that accentuated errors without exemplar-specific attention was unable to predict the exception memory findings. Thus, ALCOVE needs both exemplar-specific attention and accentuated errors to account for all of the findings described in this paper. One question to ask is whether these mechanisms are also required by other models.

Clustering Approach

SUSTAIN (Love et al., 2004), which uses dimension-wide attention, can predict the memory advantage for exceptions as well as the greater memory advantage for exceptions that violate a stronger rule. SUSTAIN clusters together similar items and recruits a new cluster in response to a prediction error. SUSTAIN develops rule-following clusters and shifts attention to the rule dimension. All clusters share the same attention along a dimension. When an exception item elicits a prediction error, SUSTAIN recruits an additional cluster to encode the item. While rule-following items tend to cluster with one another, each exception item will be isolated in its own cluster. This differential storage makes exceptions more distinctive in memory.

SUSTAIN also predicts better recognition for the category B exception that violates more rule-following items from category A.
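The recruitment idea described above can be sketched in a few lines. This is a deliberately simplified illustration, not SUSTAIN itself: attention is fixed rather than learned, there is no receptive-field tuning or cluster competition, and a "prediction error" is reduced to the nearest cluster carrying the wrong label. The item coding, the attention values, the learning rate, and the names `distance` and `train` are all assumptions.

```python
def distance(item, position, attention):
    """City-block distance weighted by attention shared across all clusters."""
    return sum(a * abs(x - p) for a, x, p in zip(attention, item, position))

def train(items, attention, learning_rate=0.5):
    """One pass of simplified cluster recruitment over (features, label) pairs."""
    clusters = []
    for features, label in items:
        # The nearest cluster "wins"; None when no cluster exists yet.
        winner = min(clusters,
                     key=lambda c: distance(features, c["position"], attention),
                     default=None)
        if winner is None or winner["label"] != label:
            # Prediction error: recruit a new cluster centered on the item.
            clusters.append({"position": list(features), "label": label, "members": 1})
        else:
            # Correct prediction: move the winning cluster toward the item.
            winner["position"] = [p + learning_rate * (x - p)
                                  for p, x in zip(winner["position"], features)]
            winner["members"] += 1
    return clusters

# Assumed toy structure: dimension 0 is the rule dimension (0 -> A, 1 -> B).
items = [
    ((0, 0, 0), "A"), ((1, 1, 1), "B"),
    ((0, 0, 1), "A"), ((1, 1, 0), "B"),
    ((0, 1, 0), "A"), ((1, 0, 1), "B"),
    ((1, 0, 0), "A"),  # exception: rule value of B, label A
    ((0, 1, 1), "B"),  # exception: rule value of A, label B
]
attention = [1.0, 0.2, 0.2]  # dimension-wide attention favoring the rule dimension
clusters = train(items, attention)
```

With this item order, the rule-following items of each category are absorbed into a single cluster, while each exception triggers a prediction error and ends up alone in its own cluster, mirroring the differential storage described in the text.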
A prediction error occurs when SUSTAIN attempts to cluster together highly similar items from competing categories. The exception clusters brought about such errors by attracting rule-following items from the opposing category. Because there were more category A rule-following items, there were more opportunities for such errors involving the category B exception to occur, and a greater number of category A rule-following clusters were recruited. These clusters formed a highly contrastive backdrop for the category B exception and enhanced recognition.

Like ESSW-ALCOVE, SUSTAIN treats “important” and “negligible” errors differently. Large discrepancies between target and predicted output values result in prediction errors. SUSTAIN recruits new clusters in response to such errors but “ignores” small discrepancies.

SUSTAIN’s success in predicting these results using dimension-wide attention suggests that attention mechanisms interact with internal representations of a model (cf. Matsuka, in press). Clustering allows SUSTAIN to be sensitive to rule violation without exemplar-specific attention. In contrast, by storing every studied item, ALCOVE needs exemplar-specific attention to capture the rule-violating nature of exceptions.

Future Direction

A more flexible model may use dimension-wide or exemplar-specific attention depending on a given task. For example, humans may initially be biased to attend to the same dimensions for all exemplars, analogous to a prior, but over time optimize learning by utilizing a separate attention for certain items. In learning about the rule-plus-exception category structures, humans may initially attend to the rule dimension for all the items. When an exception appears and a prediction error occurs, a separate attention may be used for the exception. Such processes suggest that humans have a simplicity bias (cf. Matsuka, 2004). For example, filtration tasks may result in dimension-wide attention, whereas condensation tasks may lead to exemplar-specific attention. Future work should examine when exemplar-specific or dimension-wide attention is more appropriate.

Acknowledgments

This work was supported by AFOSR Grant FA9550-04-1-0226 to B.C. Love, and by the James McDonnell Foundation and the National Science Foundation (EIA-0205178).

References

Aha, D. W., & Goldstone, R. L. (1992). Concept learning and flexible weighting. In Proceedings of the 14th Annual Conference of the Cognitive Science Society (pp. 534–539). Hillsdale, New Jersey: Lawrence Erlbaum Associates.
Allen, S. W., & Brooks, L. R. (1991). Specializing the operation of an explicit rule. Journal of Experimental Psychology: General, 120, 3–19.
Barsalou, L. W., & Medin, D. L. (1986). Concepts: Static dimensions or context-dependent representations? Cahiers de Psychologie Cognitive, 6, 187–202.
Erickson, M. A., & Kruschke, J. K. (1998). Rules and exemplars in category learning. Journal of Experimental Psychology: General, 127, 107–140.
Gottwald, R. L., & Garner, W. R. (1975). Filtering and condensation tasks with integral and separable dimensions. Perception & Psychophysics, 2, 50–55.
Kruschke, J. K. (1992). ALCOVE: An exemplar-based connectionist model of category learning. Psychological Review, 99, 22–44.
Kruschke, J. K. (1993). Three principles for models of category learning. In G. V. Nakamura, R. Taraban, & D. L. Medin (Eds.), Categorization by humans and machines: The psychology of learning and motivation (Vol. 29, pp. 57–90). San Diego, CA: Academic Press.
Kruschke, J. K. (2001). Toward a unified model of attention in associative learning. Journal of Mathematical Psychology, 45, 812–863.
Lewandowsky, S., Kalish, M., & Ngang, S. K. (2002). Simplified learning in complex situations: Knowledge partitioning in function learning. Journal of Experimental Psychology: General, 131, 163–193.
Love, B. C., Medin, D. L., & Gureckis, T. M. (2004). SUSTAIN: A Network Model of Human Category Learning. Psychological Review. In press.
Matsuka, T. (in press). Generalized exploratory model of category learning. International Journal of Computational Intelligence.
Matsuka, T. (2004). Biased stochastic learning in computational model of category learning. In Proceedings of the 26th Annual Conference of the Cognitive Science Society.
Medin, D. L., & Schaffer, M. M. (1978). Context theory of classification learning. Psychological Review, 85, 207–238.
Nosofsky, R. M. (1986). Attention, similarity, and the identification-categorization relationship. Journal of Experimental Psychology: General, 115, 39–57.
Nosofsky, R. M., Palmeri, T. J., & McKinley, S. C. (1994). Rule-plus-exception model of classification learning. Psychological Review, 101(1), 53–79.
Palmeri, T. J., & Nosofsky, R. M. (1995). Recognition memory for exceptions to the category rule. Journal of Experimental Psychology: Learning, Memory, & Cognition, 21, 548–568.
Pinker, S. (1991). Rules of language. Science, 253, 530–535.
Rojahn, K., & Pettigrew, T. F. (1992). Memory for schema-relevant information: A meta-analytic resolution. British Journal of Social Psychology, 31(2), 81–109.
Sakamoto, Y., & Love, B. C. (in press). Schematic influences on category learning and recognition memory. Journal of Experimental Psychology: General.
Smith, E. E., Langston, C., & Nisbett, R. E. (1992). The case for rules in reasoning. Cognitive Science, 16, 1–40.
