Def<strong>in</strong><strong>in</strong>g a test as "a systematic method of elicit<strong>in</strong>g per<strong>for</strong>mance which is <strong>in</strong>tendedto be the basis <strong>for</strong> some sort of decision mak<strong>in</strong>g" (Skehan, 1998, p. 153), Skehannotes the tendency of testers to place an emphasis on "care and standardization <strong>in</strong>assessment <strong>in</strong> the belief that such methods of exam<strong>in</strong><strong>in</strong>g per<strong>for</strong>mance will have moreto contribute to reliable measurement than <strong>in</strong><strong>for</strong>mal assessment by people who may bevery familiar with particular language users" (Skehan, 1998, p. 153). This attitudefollows on from the assumption that "there are knowable best ways of learn<strong>in</strong>g andthat these can be discovered us<strong>in</strong>g a scientific method which has long been discardedby contemporary philosophers (Popper), scientists (Medawar) and physicists(Heisenberg)" (Harri-Augste<strong>in</strong> & Thomas, 1991, p. 7), and has been at the heart oflanguage test<strong>in</strong>g from its "pre-scientific" stage (Spolsky, 1975, p. 148), to itspsychometric-structuralist "scientific" stage (when discrete-po<strong>in</strong>t test<strong>in</strong>grepresented the accepted behaviorist truth). Accord<strong>in</strong>g to this view, language can belearned by study<strong>in</strong>g its parts <strong>in</strong> isolation, acquisition of these parts can be tested andwill successfully predict per<strong>for</strong>mance levels, and the learner will somehow reconstructthe parts <strong>in</strong> mean<strong>in</strong>gful situations when necessary. This view cont<strong>in</strong>ued <strong>in</strong>to the"psychol<strong>in</strong>guistic-sociol<strong>in</strong>guistic" stage (the 1970's), when <strong>in</strong>tegrative test<strong>in</strong>g (e.g.cloze tests and dictation) claimed to come from a sounder theoretical base (Oller,1979) but was shown by commentators such as Alderson (1981), Morrow (1979) andCarroll (1981, p. 9) to be still concerned with usage rather than use, there<strong>for</strong>e be<strong>in</strong>gonly <strong>in</strong>direct tests of potential efficiency. Kelly (1978, pp. 245-246) also po<strong>in</strong>ted outthat it is possible to develop proficiency <strong>in</strong> the <strong>in</strong>tegrative test itself, and that <strong>in</strong>directtests cannot diagnose specific areas of difficulty <strong>in</strong> relation to the authentic task. Suchtests can only supply <strong>in</strong><strong>for</strong>mation on a candidate's l<strong>in</strong>guistic competence, and havenoth<strong>in</strong>g to offer <strong>in</strong> terms of per<strong>for</strong>mance ability (Weir, 1998).A consensus that "knowledge of the elements of a language <strong>in</strong> fact counts <strong>for</strong>noth<strong>in</strong>g unless the user is able to comb<strong>in</strong>e them <strong>in</strong> new and appropriate ways to meetthe l<strong>in</strong>guistic demands of the situation <strong>in</strong> which he wishes to use the language"(Morrow, 1979, p. 145), and an acknowledgement that the easily quantifiable, reliable,and efficient data obta<strong>in</strong>ed from discrete (and cloze) test<strong>in</strong>g implies that proficiency isneatly quantifiable <strong>in</strong> such a fashion (Oller, 1979, p. 212), led to a perception that theability to per<strong>for</strong>m should be tested <strong>in</strong> a specified socio-l<strong>in</strong>guistic sett<strong>in</strong>g. Based onwork by Hymes (1972), Canale & Swa<strong>in</strong> (1980), and Morrow (1979), the emphasisshifted from l<strong>in</strong>guistic accuracy to the ability to function effectively through language<strong>in</strong> particular contexts of situation (a demonstration of competence and of the ability touse this competence), and communicative test<strong>in</strong>g was adopted as a means ofassess<strong>in</strong>g language acquisition (though with some lack of <strong>in</strong>itial agreement or direction,cf. McClean 1995, p. 137; Benson, 1991).The components of communicative language ability to be tested were variouslydescribed at this time (Canale & Swa<strong>in</strong>, 1980; Canale, 1983), and early frameworks<strong>for</strong> test<strong>in</strong>g communicative competence were proposed. However, these were neitherpractical, systematic, nor comprehensive (cf. Cziko, 1984), and were unable toadvance prediction and generalization <strong>in</strong> any substantial way. This problem wasaddressed by Bachman (1990) through the application of categories to real contexts,and resulted <strong>in</strong> a model of oral test<strong>in</strong>g which was: i) more detailed <strong>in</strong> its specificationof component language competences; ii) more precise <strong>in</strong> the <strong>in</strong>terrelationshipsbetween the different component competences; iii) more grounded <strong>in</strong> contemporaryl<strong>in</strong>guistic theory; and iv) more empirically based, allow<strong>in</strong>g a more effective mapp<strong>in</strong>g ofcomponents of competence on to language use situations, and more pr<strong>in</strong>cipledcomparisons of those components. Despite these improvements, however, Bachman'smodel still lacked a "rationale founded <strong>in</strong> psychol<strong>in</strong>guistic mechanisms and processes(and research f<strong>in</strong>d<strong>in</strong>gs) which can enable [it to] make functional statements about the
nature of per<strong>for</strong>mance and the way it is grounded <strong>in</strong> competence" (Skehan, 1998, p.164). Skehan (1988) articulated the dilemma of communicative language test<strong>in</strong>g at theend of the 1980s:What we need is a theory which guides and predicts how an underly<strong>in</strong>gcommunicative competence is manifested <strong>in</strong> actual per<strong>for</strong>mance; how situationsare related to one another, how competence can be assessed by examples ofper<strong>for</strong>mance on actual tests; what components communicative competenceactually has; and how these <strong>in</strong>terrelate. S<strong>in</strong>ce such def<strong>in</strong>itive theories do notexist, testers have to do the best they can with such theories as are available.(Skehan, 1988, cited <strong>in</strong> Weir, 1998, p. 7)2. Task-based oral test<strong>in</strong>gBachman's (1990) model used familiar empirical research methods <strong>in</strong> which datawas perceived <strong>in</strong> terms of the underly<strong>in</strong>g structural model, to <strong>in</strong>fer abilities, via astatic picture of proficiency, based on the assumption that there arecompetence-oriented underly<strong>in</strong>g abilities made up of different <strong>in</strong>teract<strong>in</strong>g components(cf. Canale & Swa<strong>in</strong>, 1980). However, cognitive theory shows that second languageper<strong>for</strong>mers, faced with a develop<strong>in</strong>g <strong>in</strong>ter-language and per<strong>for</strong>mance pressures suchas fluency, accuracy and complexity, do not draw upon "a generalized and stableunderly<strong>in</strong>g competence", (Skehan, 1998, p. 169) but allocate limited process<strong>in</strong>gattention <strong>in</strong> appropriate ways, draw<strong>in</strong>g on parallel cod<strong>in</strong>g systems <strong>for</strong> efficiency ofreal-time communication. Skehan there<strong>for</strong>e proposed a construct of "ability <strong>for</strong> use",which would allow a process<strong>in</strong>g competence to operate and to be assessed, andadvocated the use of tasks as a central unit with<strong>in</strong> a test<strong>in</strong>g context (Skehan, 1998, p.169). In contrast to per<strong>for</strong>mance evaluations which use “reliable” analytic scales (<strong>in</strong>areas such as grammar, vocabulary, fluency, appropriateness, and pronunciation) butwhich do not allow <strong>for</strong> affect and <strong>for</strong> compet<strong>in</strong>g demands on attention, a process<strong>in</strong>gapproach <strong>in</strong> a task-based framework allows generalizations to be made on the threebasic language-sampl<strong>in</strong>g issues of: i) fluency; ii) breadth/complexity of languageused; and iii) accuracy (Skehan, 1998, p. 177), though these criteria compete <strong>for</strong>process<strong>in</strong>g resources <strong>in</strong> the per<strong>for</strong>mer, and the score may be <strong>in</strong>fluenced by whicheverprocess<strong>in</strong>g goals are emphasized by him/her.While advocat<strong>in</strong>g tasks as the basic unit of oral test<strong>in</strong>g (c.f. Lee, 1991), Skehannotes that "we need to know more about the way tasks themselves <strong>in</strong>fluence (andconstra<strong>in</strong>) per<strong>for</strong>mance" (Skehan, 1998, p. 169), and that tasks also need to be rated<strong>in</strong> terms of plann<strong>in</strong>g, time pressure, modality, stakes, opportunity <strong>for</strong> control,manufactured surprise, and degree of support, s<strong>in</strong>ce these factors will also affect theoutcome. Task per<strong>for</strong>mance conditions and the way these affect per<strong>for</strong>mancerepresent "a fertile area <strong>for</strong> research" (Skehan, 1998, p. 177).3. <strong>Authentic</strong> assessmentKohonen (1999) extended Skehan's task-based framework, propos<strong>in</strong>g "authenticassessment" as a process-oriented means of evaluat<strong>in</strong>g communicative competence,cognitive abilities and affective learn<strong>in</strong>g (Kohonen, 1999, p. 284; cf. Hart, 1994, p. 9;O'Malley & Pierce, 1996, pp. x-6), us<strong>in</strong>g reflective <strong>for</strong>ms of assessment <strong>in</strong><strong>in</strong>structionally-relevant classroom activities (communicative per<strong>for</strong>mance assessment,language portfolios and self-assessment), and focus<strong>in</strong>g on curriculum goals,enhancement of <strong>in</strong>dividual competence and <strong>in</strong>tegration of <strong>in</strong>struction and assessment.In this two-way process, "the essentially <strong>in</strong>teractive nature of learn<strong>in</strong>g is extended tothe process of assessment" (Williams & Burden, 1997, p. 42), exam<strong>in</strong><strong>in</strong>g what