Stanford-Binet Intelligence Scales, Fifth Edition ... - Faculty.utep.edu

terpretation, and do have both fall and spring norms available. Within subtests, the layout of items on the page is easy to follow, and color is used generously to make the pages and items more visually appealing.

The option of versions (Forms D/E) having Language subtests that emphasize elements of the writing process might be of interest to some educators. Though these subtests do include questions about topic selection, organizing and finding information, the "actual writing processes" are mostly manifested as micro-editing type tasks, using multiple-choice format. No direct writing samples are included in versions of the core tests. Any potential user of the Stanford 10, or any other academic achievement series, should carefully review the skills measured, as well as how they are measured, to determine the degree of fit to the local curriculum.

The vertical equating allows for the scaled scores to be used as a way of tracking performance across grades. Users of the series need to understand that median year-to-year increases in scaled scores diminish in size with increasing grade level and can, with some subtests of the TASK, show little or no increase or even a decrease.

Though scores may be disaggregated to the cluster or standard level, only the cluster level tends to have potentially sufficient items within subtests to warrant interpreting scores for individuals. There are typically too few items at the standard level to recommend making decisions about specific examinees. For class, school, or district-wide summaries, these disaggregated scores might be of interest. No reliability, error of measurement, or, beyond the general discussion of content coverage and generation, validity information is presented for cluster-level or standard-level percent correct summaries. The ersatz Thinking Skills score is also made available without reliability information, though normative scores are given.

SUMMARY.
The Stanford 10 can serve as a vehicle for comparing local students' academic attainment to that of a national norms group in traditional subject areas. The subtests are easy to administer, can be scored locally or sent to the publisher, may be used for small or large groups of examinees (with appropriate proctoring assistance), include practice items, and practice tests may be purchased. The utility of any achievement series, though, rests in large part on the extent to which the skills assessed reflect outcomes important to the user of the test scores.

Anyone looking for a nationally normed measure that features "authentic" assessment of student attainment or accomplishment beyond the multiple-choice framework of the core test will be disappointed with the Stanford 10. However, for those who judge the Stanford 10 to tap important skills in an appropriate manner, the developers have succeeded in updating an achievement battery that can help educators satisfy some of the seemingly endless assessment demands emerging at both the state and national levels.

REVIEWER'S REFERENCES

American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (1999). Standards for educational and psychological testing. Washington, DC: American Educational Research Association.
Cronbach, L. J. (1990). Essentials of psychological testing (5th ed.). New York: Harper Collins Publishing.

[ 233 ]
Stanford-Binet Intelligence Scales, Fifth Edition.
Purpose: Designed to assess "intelligence and cognitive abilities."
Population: Ages 2-0 to 89-9.
Publication Dates: 1916-2003.
Acronym: SB5.
Scores, 13: Nonverbal Fluid Reasoning, Verbal Fluid Reasoning, Nonverbal Knowledge, Verbal Knowledge, Nonverbal Quantitative Reasoning, Verbal Quantitative Reasoning, Nonverbal Visual-Spatial Processing, Verbal Visual-Spatial Processing, Nonverbal Working Memory, Verbal Working Memory, Nonverbal IQ, Verbal IQ, Full Scale IQ.
Subtests and Partial Batteries:
Abbreviated Battery (Nonverbal Fluid Reasoning and Verbal Knowledge).
Administration: Individual.
Price Data: Available from publisher.
Time: (45-75) minutes; (15-20) minutes for Abbreviated Battery.
Author: Gale H. Roid.
Publisher: Riverside Publishing.
Cross References: For information on an earlier edition, see T5:2485 (245 references) and T4:2553 (120 references); for reviews by Anne Anastasi and Lee J. Cronbach, see 10:342 (89 references); see also 9:1176 (41 references), T3:2289 (203 references), 8:229 (176 references), and T2:525 (428 references); for a review by David Freides, see 7:425 (258 references); for a review by Elizabeth D. Fraser and excerpted reviews by Benjamin Balinski, L. B. Birch, James Maxwell, Marie D. Neale, and Julian C. Stanley, see 6:536 (110 references); for reviews by Mary R. Haworth and Norman D. Sundberg of the second revision, see 5:413 (121 references); for a review by Boyd R. McCandless, see 4:358 (142 references); see also 3:292 (217 references); for excerpted reviews by Cyril Burt, Grace H. Kent, and M. Krugman, see 2:1420 (132 references); for reviews by Francis W. Maxfield, J. W. M. Rothney, and F. L. Wells, see 1:1062.

Review of the Stanford-Binet Intelligence Scales, Fifth Edition by JUDY A. JOHNSON, Assistant Professor of Psychology in the School Psychology Program at the University of Houston-Victoria, Victoria, Texas, and RIK CARL D'AMATO, Assistant Dean and Director of the Center for Collaborate Research in Education, College of Education and Behavioral Sciences, M. Lucile Harrison Professor of Excellence, School of Professional Psychology, University of Northern Colorado, Greeley, CO:

DESCRIPTION. The Stanford-Binet Intelligence Scales, Fifth Edition (SB5) is the latest version of one of our fundamental assessment instruments for which the original version was created almost 100 years ago. The SB5, which took 7 years to complete, is the long-awaited update of the Fourth Edition, which was published in 1986. The SB5, like its predecessors, is a comprehensive, norm-referenced, individually administered test of intelligence and cognitive abilities. The SB5 can be used with examinees who range from 2 years old to over 85 years old. This expansive age range has always been one of the most remarkable features of this instrument. Typical uses of the SB5 include diagnosing exceptionalities and developmental disabilities in adults, adolescents, and children.

The scales yield a Full Scale IQ, two domain scores (Nonverbal IQ and Verbal IQ), and five factor indexes (Fluid Reasoning, Knowledge, Quantitative Reasoning, Visual-Spatial Processing, and Working Memory). Unlike previous editions, the means of the IQ and factor index scores are 100 with a standard deviation of 15.
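
To make the score metric concrete, the following minimal Python sketch converts a standard score on the mean-100, SD-15 metric into a z score and an approximate percentile rank. It assumes scores are normally distributed, which is an assumption made here for illustration and not a claim about the published SB5 norm tables; the function names are hypothetical.

```python
import math

IQ_MEAN = 100.0  # mean of the IQ and factor index scores, as stated in the review
IQ_SD = 15.0     # standard deviation, as stated in the review


def z_score(standard_score, mean=IQ_MEAN, sd=IQ_SD):
    """Express a standard score in standard-deviation units."""
    return (standard_score - mean) / sd


def percentile_rank(standard_score, mean=IQ_MEAN, sd=IQ_SD):
    """Approximate percentage of the population at or below the score.

    Assumes a normal distribution (an illustrative assumption only).
    """
    z = z_score(standard_score, mean, sd)
    return 100.0 * 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


if __name__ == "__main__":
    for score in (70, 85, 100, 115, 130):
        print(score, round(z_score(score), 2), round(percentile_rank(score), 1))
```

Under the normality assumption, a score of 115 sits one standard deviation above the mean, at roughly the 84th percentile. Actual percentile ranks should be taken from the test's own norm tables.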
The Full Scale IQ is based on the administration of 10 subtests and is the global measure of cognitive ability. Each of the five factors is considered using both verbal and nonverbal measures. The SB5 has 10 subtests including 2 special routing subtests (Nonverbal-Fluid Reasoning: Object Series/Matrices and Verbal-Knowledge: Vocabulary), which are administered at the beginning of the SB5. These routing subtests determine the developmental starting point for the remaining subtests. Depending on the examinee's performance on the routing subtests, the examiner is directed to begin at Level 1, 2, 3, 4, or 5. This unique way of addressing development is a longstanding tradition with the Binet scales.

The SB5 subtest names and the activities that make up the subtests for a specific level are listed below. The SB5 Nonverbal subtests include Nonverbal Knowledge (Picture Absurdities [Levels 4-6] and Procedural Knowledge [Levels 2-3]); Nonverbal Quantitative Reasoning [Levels 2-6]; Nonverbal Visual-Spatial Processing (Form Board [Levels 1-2] and Form Patterns [Levels 3-6]); and Nonverbal Working Memory (Block Span [Levels 2-6] and Delayed Response [Level 1]). The Verbal subtests include Verbal Fluid Reasoning (Early Reasoning [Levels 2-3], Verbal Absurdities [Level 4], and Verbal Analogies [Levels 5-6]); Verbal Quantitative Reasoning [Levels 2-6]; Verbal Visual-Spatial Processing (Position and Direction [Levels 2-6]); and Verbal Working Memory (Memory for Sentences [Levels 2-3] and Last Word [Levels 4-6]). These subtests yield a scaled score with a mean of 10 and with a standard deviation of 3. The SB5 also yields an Abbreviated Battery IQ that is based on the two routing subtests and can be used to supplement another battery of tests that has been administered or when a brief measure of intelligence is sufficient.

The administration time of the SB5 varies from 15 to 75 minutes, depending on which scales are given. Most of the SB5 items are not timed and time bonuses are not admissible.
The estimated time to acquire a Full Scale IQ is 45 to 75 minutes, whereas the Abbreviated Battery IQ takes 15 to 20 minutes to administer to a client. Further, the examiner can choose to administer only the Verbal IQ (based on the five Verbal subtests) or the Nonverbal IQ (based on the five Nonverbal subtests); each of these takes about 30 minutes to complete. The technical manual states the Nonverbal IQ can be used for assessing those with hearing impairments, autism, communication disorders, limited English-language backgrounds, and other areas where verbal ability is limited. This is a unique and helpful feature of the instrument.

For ease of administration, the SB5 is organized into three item books printed on an easel format that includes all administration directions. The easel format allows examinees to easily view test materials, but also allows the examiner to easily record and score responses behind the easel. The first item book includes the two routing subtests, whereas the second item book contains Levels 1-6 of the Nonverbal subtests. The third item book contains Levels 2-6 of the Verbal subtests.

A variety of scores are derived from the raw test data. In addition to standard scores for each composite score, subtest scaled scores, percentile ranks, confidence intervals, age equivalents, and Change-Sensitive Scores may be computed. Change-Sensitive Scores (CSS), based on item response theory, provide a means to identify a client's change in scores over a period of time. The SB5 may be hand scored, but scoring can be made easier by using the SB5 Scoring Pro. The SB5 Scoring Pro is a Windows-based program that allows the examiner to enter background information, age, and raw scores. The resulting report includes an extended score report and a brief, narrative summary report that, if desired, can be exported to a word processing file for editing.

DEVELOPMENT. Several significant changes were made from the SB4 to the SB5. First, the SB5 was renormed on a large, representative sample that ranged from preschool age to mature adults. The SB5 made a significant change to the structure of the test by adding another factor, Working Memory.
Working Memory was added because it has been shown to be related to both reading and math achievement, and, in fact, is a deficient area in many of those children and adults with learning problems. It is also a novel area that is not covered on many traditional tests of intelligence. Another important change included enhanced coverage of nonverbal intelligence. The Nonverbal IQ, unlike other cognitive measures, is composed of nonverbal items that cover the five factors of Fluid Reasoning, Knowledge, Quantitative Reasoning, Visual-Spatial Processing, and Working Memory. Because the U.S. population is becoming more and more diverse, increasing the nonverbal coverage (where items require no or a minimal verbal response) was an important change.

Other changes in the new edition have included updated items that extend the scales upward and downward as well as allowing assessment of individuals who display very high or very low levels of functioning. Materials have also been "revamped" to be more appealing to both examinees and examiners. The SB5 includes "child-friendly" toys and manipulatives that are appealing to younger clients. Examiners will also welcome the easel-style item books and computerized scoring program, which can make the assessment process more user friendly. Further, the examiner-friendly record form includes directional arrows when sums are to be transferred to another area on the record form, bold print to identify correct answers for the examiner, as well as lightly printed areas, which show the total points possible on a specific subtest.

The SB5 includes comprehensive technical (Roid, 2003c) and examiner's manuals. The technical manual guides the examiner through the test development process, which is consistent with the Standards for Educational and Psychological Testing (American Educational Research Association, American Psychological Association, & National Council on Measurement in Education, 1999).
In addition, the technical manual highlights the history of the measure, theoretical foundations, evidence of reliability and validity, and provides detailed descriptions of professional and ethical use of the SB5. Further, the examiner's manual succeeds in familiarizing the examiner with test user qualifications, assessment of special populations with the SB5, as well as key administration, scoring, and interpretive guidelines. A more detailed SB5 interpretive manual is available, but it is disappointing that it must be purchased separately. [Editor's Note: The publisher advises that beginning in 2006 the manual may be purchased as part of the kit.]

TECHNICAL. The SB5 is based on the Cattell-Horn-Carroll (CHC) theory of cognitive functioning that has been investigated empirically over several decades (D'Amato, Fletcher-Janzen, & Reynolds, in press). Several other recent intelligence tests have also been based on elements of CHC theory, and studies on earlier versions of the Binet scales revealed distinguishable CHC factors. The SB5 is composed of five factors (out of 10) of the CHC model including (CHC factors in parentheses): Fluid Reasoning (Fluid Intelligence or Gf), Knowledge (Crystallized Knowledge or Gc), Quantitative Reasoning (Quantitative Knowledge or Gq), Visual-Spatial Processing (Visual Processing or Gv), and Working Memory (Short-Term Memory or Gsm).

The selection of only five of the CHC factors for the SB5 was based on research of the relations of the factors to achievement, gi[...] the Binet scales. Three of the CHC factors on the SB5 (Gf, Gc, and Gq) have the highest g loadings in the model and are seen as key factors in general reasoning ability. Further, the factors most predictive of school achievement (Gc, Gsm, Gq) are also included on the SB5. Finally, Roid has stated in the technical manual that these five factors have been identified in earlier editions of the Binet scales. Overall, these reasons, as well as practical considerations (ease of test administration, total testing time, factors measured by other tests, etc.), led to the selection of five factors after eight CHC factors were extensively researched for the SB5. The other two factors, Reading and Writing (Grw) and Decision Speed/Reaction Time (Gt), were not chosen for the SB5 because they are usually included in other batteries.

The norming of the SB5 is one of the most impressive aspects of the instrument. A sample of 4,800 participants, ranging from age 2 to over 85, were closely matched to variables in 2001 U.S. Census Bureau documents. The stratificati[...]

[...] Verbal subtests ranged from .84 to .89. Reliabilities for the IQ and Factor Index scores were computed using "the formula for a reliability of a sum of multiple tests" (technical manual, p. 63). Average reliability coefficients for the Full Scale IQ (.98), Nonverbal IQ (.95), Verbal IQ (.96), and the Abbreviated Battery IQ (.91) were extremely high. When reliability coefficients were computed for the Factor Index scores, the average values included the following: Fluid Reasoning (.90), Knowledge (.92), Quantitative Reasoning (.92), Visual-Spatial Processing (.92), and Working Memory (.91).

Four studies of test-retest stability were included in the technical manual. In the first study, 96 young children 2 to 5 years old were administered the SB5 on two different occasions. The Pearson correlations were corrected for sample variability. Test-retest correlations ranged from .22 to .95 for the IQ scores (Full Scale IQ[...]

[...] ranged from .74-.97 (median of .90) in the investigations into interscorer agreement on the SB5. These findings show adequate interscorer agreement on the SB5.

Preliminary evidence of content-related, criterion-related, concurrent, and construct-related validity is presented in the SB5 technical manual. Of course, validity of the SB5 will also be strengthened after publication as it is used in the field and in research studies. The test development of the SB5, which was a 7-year process, underwent extensive review of items and subtests, numerous pilot studies, and reviews of the tryout edition. Additionally, the SB5 was found to be highly correlated with major cognitive tests such as the Wechsler scales and previous editions of the Stanford-Binet. Studies were conducted with special populations (such as those classified as gifted or with mental retardation) and expected results were found using the SB5. Further, confirmatory factor analyses of the SB5 subtests provided evidence for a five-factor solution. Throughout the test development process, the items of the SB5 were reviewed for fairness in regard to ethnicity, culture, and religious background. Studies of item and test bias were conducted and problematic items were deleted from each successive version. Overall, the SB5 appears to have adequate validity for clients of a great variety of backgrounds.

COMMENTS AND SUMMARY. The SB5 is the long-awaited revision of the Stanford-Binet 4 and the publication of the SB5 was worth the wait. The test development process followed the Standards for Educational and Psychological Testing (AERA, APA, & NCME, 1999) and resulted in a well-designed, technically sound instrument that follows in the footsteps of earlier editions of the Binet scales but also integrated new research on intelligence into the measure. Especially impressive is the structure of the new test that now includes working memory, a neuropsychological area that will be especially useful in assessing those with learning problems. Additionally, the expanded emphasis on nonverbal intelligence will be useful in the assessment of a variety of clients in our changing world. The scoring software will also reduce clerical errors and save examiners a great deal of time.

Of course, improvements could be made on the SB5. The interpretative manual is not included with the test materials and the interpretative data in the SB5 technical manual and examiner's manual are limited. Further, it would be helpful to see appropriate evidence-based intervention recommendations generated from SB5 data included with the test materials. For the SB5 to have the most utility in educational settings, there needs to be additional research on how the results of the SB5 can be linked to successful interventions. However, the strengths of the SB5 clearly outweigh its weaknesses. The SB5 is an outstanding measure with few significant weaknesses. For those who require an individually administered, norm-referenced intelligence test, the SB5 is clearly an exceptional instrument. The SB5 may again take its place as one of the seminal and primary measures of intelligence in our field. Many of the unique features of the SB5 make it the ideal choice when selecting an instrument from the long list of currently available intelligence tests.

REVIEWERS' REFERENCES

American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (1999). Standards for educational and psychological testing. Washington, DC: American Educational Research Association.
D'Amato, R. C., Fletcher-Janzen, E., & Reynolds, C. R. (Eds.). (in press). The handbook of school neuropsychology. Hoboken, NJ: John Wiley and Sons.

Review of the Stanford-Binet Intelligence Scales, Fifth Edition by JOSEPH C. KUSH, Associate Professor, Duquesne University, Pittsburgh, PA:

DESCRIPTION. The Stanford-Binet Intelligence Scales, Fifth Edition (SB5) is an individually administered intelligence test designed for examinees between the ages of 2 and 85+ years. The test consists of five verbal and five nonverbal subtests. Verbal subtests can require individuals to read, speak, and comprehend age-appropriate English. Nonverbal subtests expect minimal receptive language and additionally require fine-motor coordination to manipulate toys and puzzle pieces, and to be able to point to correct answers. The SB5 utilizes very few time limits, and bonuses for speeded performance are not given. Unlike previous versions, however, the SB5 now utilizes a metric common to all other major tests of intelligence: a mean of 100 and a standard deviation of 15.

Administration and scoring. The test begins by presenting participants with two routing tests that are used to determine the proper starting level for the remainder of the subtests. The first routing subtest, Vocabulary, has been used in all previous editions of the SB scale. Beginning with the Fifth Edition, the nonverbal subtest Object Series/Matrices has been added as a second routing subtest. The Full Scale IQ Battery normally takes between 45 and 75 minutes to administer, the Verbal and Nonverbal IQ scales each take approximately 30 minutes to administer, and the Abbreviated Battery takes between 15 and 20 minutes to administer.

Consistent with guidelines established by the American Psychological Association (APA, 2000), the SB5 examiner's manual describes user qualifications as assuming that all test users "have the college and/or graduate-level training in general measurement and statistical concepts essential for understanding test scores" (p. 8). Additionally, "All test users should have a thorough understanding of the standardized administration procedures for the SB5 and the scoring procedures for calculating accurate raw scores, subtest scaled scores, and all other scores on the SB5 Record Form." Finally, "Supervised administration should include a sufficient number of practice cases to establish reliable, standardized testing skills. Typically, supervised testing is completed as part of training workshops or graduate-level testing courses" (examiner's manual, p. 8).

In considering changes from earlier editions of the instrument, the SB5 examiner's manual indicates that beyond "a general modernization of artwork and item content", the SB5 has: (a) added a fifth factor to the scale (Visual-Spatial Processing); (b) increased the number of toys and colored manipulatives, primarily used by young children; (c) increased the nonverbal content of the instrument; (d) added new items that measure very low and very high functioning; (e) redesigned the record form and item books for easier administration and interpretation; and (f) expanded the norms to include the measurement of elderly examinees' abilities. Additionally, the two routing tests can be combined to calculate an abbreviated IQ (ABIQ), and the SB5 is linked to the Woodcock-Johnson III Tests of Achievement, an addition designed to enhance the identification of students with learning disabilities.
The number of subtests contained on the SB5 has also been reduced from 15 (SB4) to 10. This reduction now guarantees that individuals of all ages will complete identical subtests, reflecting an important improvement over the SB4; critics of the SB4 were troubled by the fact that depending on their ages and performance on the routing test, two different individuals might produce a Full Scale IQ that was derived from a different combination of the 15 subtests.

DEVELOPMENT. The author of the SB5 indicated that the development of the instrument was heavily influenced by the theoretical work of Carroll (1993). The SB5 was constructed on a five-factor hierarchical model of human intelligence. The hierarchy flows from overall, Full Scale IQ or g, to a second level consisting of two domains (Verbal and Nonverbal) and five factors (Fluid Reasoning [FR], Knowledge [KN], Quantitative Reasoning [QR], Visual-Spatial Processing [VS], Working Memory [WM]) to a third level consisting of 10 subtests, then a fourth level consisting of five to six testlets per subtest, and finally to the individual item level. Although the Cattell-Horn-Carroll (CHC) theory identifies 8 to 10 factors believed to comprise human intelligence, the author of the SB5 selected the five factors that required no specialized timing or test apparatus and that were thought to be most heavily related to school achievement. Additionally, it appears that the content of the memory factor now places greater emphasis on working memory than on the often-criticized emphasis on short-term memory found on the SB4. One theoretical inconsistency of the SB5, however, relates to the decision to include the verbal and nonverbal domains. As presented on page 25 of the examiner's manual "subtests are combined into either one of the two domains or one of the five factor indexes. At the most general level, either the two domains or the five factor indexes combine to form the Full Scale IQ (FSIQ)."
It is not clear why an instrument so heavily influenced by CHC theory would include verbal and nonverbal domains as neither is contained as a CHC Stratum I (Narrow) or Stratum II (Broad) ability. Certainly a test that includes a verbal/nonverbal dichotomy offers clinical advantages for certain types of referral questions (e.g., motor-impairment, limited English proficiency); however, the decision to retain this terminology is theoretically inconsistent. As a result, it is not clear when users of the SB5 should attempt to make test interpretations based on the two verbal/nonverbal domains or instead on the five factor scores.

TECHNICAL.

Standardization and norms. Careful attention was given to the standardization of the SB5 and the norms will generalize to most segments of the United States population. The sample was matched to percentages of the stratification variables in the most recent United States Census (2001). The norming sample consisted of a total of 4,800 individuals who were stratified into 30 age groups, by gender, and by race/ethnicity (White or Anglo American, Black or African American, Hispanic [Latino or Spanish], American Indian and Alaskan Native, Asian, and Native Hawaiian or Other Pacific Islander). Additionally, the category of Other included 2.7% of the standardization sample, consistent with the percentage included in the U.S. Census and typically described as individuals of mixed origins.

Two final SB5 stratification variables included geographic region (Northeast, Midwest, South, and West) and socioeconomic level, defined as educational attainment. For adults, educational attainment was defined as the number of years of completed education; for children under age 18, educational attainment was defined as the number of years of education completed by parents or guardians. Although not included as stratification variables, community size, type of school attended, and special education or clinical treatment were recorded and included in the SB5 technical manual. Additionally, 11% of the SB5 normative sample attended parochial or private schools and 6.8% received special services. Approximately 2% of the sample were identified as intellectually gifted students. Finally, the technical manual reports a series of studies in which 1,365 students receiving special education or clinical service were oversampled. Although the number of students included in each of these separate studies was relatively small, their resulting score profiles were typical for their special education classification (e.g., learning disabled, attention deficit disorder), providing a promising base for future research that examines the characteristics of exceptional populations.
This future research will also extend the important, but preliminary, findings presented in the technical manual that were conducted on test items for possible gender, ethnic, religious, and socioeconomic test bias.

Like the SB4, when over- or underrepresentation of individuals occurred in particular normative categories, the imbalance was adjusted through a weighting procedure. Although this is a common practice for many current psychological tests, the practice can magnify sampling error. This approach introduces error that consists of systematic bias, in addition to naturally occurring random error; thus, the sample is less representative of the true target population. In this approach one individual's profile is given more weight to represent a larger proportion of minority profiles in the sample.

Reliability. The SB5 technical manual provides considerable evidence in support of the reliability of scores from the instrument. Internal consistency reliability coefficients ranged from .95 to .98 for IQ scores and from .90 to .92 for each of the five Factor Index scores. Spearman-Brown corrected, split-half reliabilities are also reported for each of the subtests, IQ, and Factor Index scores by age. Again, all reliability coefficients were quite high and appropriate for an instrument of this magnitude. As would be expected due to increased length, Full Scale IQ reliability exceeded Factor Index reliability, which in turn exceeded individual subtest reliability. Subtest reliabilities were also strong with an average of .84 to .89 reported for the 10 individual subtests.

Standard errors of measurement are also reported for IQs and Factor Indexes across age levels. The overall SEMs of the SB5 across age levels are a respectable 2.30 for Full Scale IQ, 3.26 for Nonverbal IQ,
and 3.05 for Verbal IQ. The technical manual also does a nice job of describing the concept of SEM and offers appropriate suggestions for interpretation, including the use of confidence intervals.

Additionally, the technical manual reports that Rasch modeling techniques, a one-parameter logistic item response theory (IRT) model, were used to estimate item difficulty, examinee ability (based on the Rasch W metric), and test precision (information and standard errors at each ability level). Using the Rasch model, items from previous editions, all previous SB4 items, and newly created SB5 items were combined to create a calibrated item bank. Items were in turn formed into the five cognitive index scales. Roid reports in the 2003 interpretive manual (p. 20) that "Each scale showed excellent fit to the one-parameter Rasch model." However, the specific criteria for fit are not reported. Although the Rasch model certainly is theoretically preferred for its model simplicity and its resultant unit-weighted items, the criterion that all items must be equally discriminating is quite stringent for ability tests. In reporting the test information curves of related change-sensitive scores (CSS), Roid writes "the SB5 CSS provide high levels of precision (high information) throughout the average age-equivalent range of the test
(CSS values of 430 to 520, ...). Also the shapes of the curves show the greatest precision in the advanced levels of performance ..., an excellent attribute for a test such as the SB5 that is widely used in gifted screening" (technical manual, p. 68). An examination of the CSS curves presented in the technical manual supports this conclusion but also reveals that corresponding SEMs are not equivalent across ability levels (CSS). The presentation of unequal precision across ability reflects a distinct advantage of IRT over classical standard errors of measurement. Although it appears that the SB5 does produce tighter SEMs and, thus, more precise ability estimates for gifted examinees, less precision is offered for examinees at the lower end of the ability continuum: children and adolescents with cognitive impairments. Such a finding is not atypical in ability measurement. However, given that the IRT findings indicate unequal SEMs across ability levels, future research might report actual SB5 SEMs across ability levels (rather than by age alone; see Figure 3.2 in the technical manual), so that psychologists and educators who work with exceptional populations will have more accurate information for making diagnostic and placement decisions.

Differential Item Functioning (DIF) was detected using the Mantel-Haenszel approach. This approach is appropriate only if the Rasch model fits the data, further justifying the need for more specific reporting of fit statistics. That is, if items do not fit the Rasch model, it does not make sense to compare poorly estimated item difficulties, even using observed score data. Furthermore, more detail is needed regarding the DIF analysis, including whether purification was used, as per Holland and Thayer (1988), the criterion for DIF determination, and the sample sizes for each analysis.

Test-retest reliability is also reported in a series of studies across four age groups (Ns = 96, 87, 81, and 92).
The amount of time between test sessions ranged from 1 to 39 days, with medians of 5, 8, 7, and 7 days, respectively. Reported test-retest coefficients (as well as coefficients corrected for range restriction) were good, with corrected test-retest coefficients ranging from .66 (Nonverbal Working Memory at ages 21 to 59) to .93 (Verbal Knowledge at ages 21 to 59). The test-retest studies reported for the SB5 are much improved over the SB4. Unfortunately, however, one critique of SB4 test-retest studies at the younger ages of the instrument (Cronbach, 1989) continued to be ignored: "Far too little was invested in retest studies .... Retests with a change of examiners should be made on 100 cases at each early age and at spaced later ages" (p. 774). Although this is a stringent requirement for any test publisher, it is hoped that future independent research studies will address this issue. For the most part, SB5 coefficients reflect credible stability for intervals between testing sessions of up to approximately 1 month; future research should examine SB5 consistency across longer intervals of time and across ethnic groups and clinical populations, with a particular focus on the stability of the instrument for young children.

Finally, the SB5 technical manual indicates that numerous interrater reliability studies were performed during the initial tryout and standardization phases of the instrument. Items that demonstrated poor interrater agreement at that time were eliminated from the final published edition of the scale. The only study included in the technical manual, following the final publication, reported that a single pair of examiners each rescored selected subtests of 120 protocols. Specifically, each of two new examiners rescored polychotomous (scored 0, 1, or 2) items and compared their scoring with the results of the original standardization examiner.
Interscorer agreement ranged from .95 to .98 across the Vocabulary routing test; Picture Absurdities correlations ranged from .90 to .97; Verbal Absurdities yielded correlations of .82 to .89; and Form Patterns test-retest correlations ranged from .87 to .94. Although these results reflect a promising beginning for the instrument, additional research should examine inter-rater agreement across all subtests.

Validity. In an initial attempt to examine the criterion validity of the SB5, the technical manual describes a study in which 104 individuals received the SB5 and the SB4 in counterbalanced order. The correlation between Full Scale scores was .90, representing good criterion-related evidence of validity. Consistent with the Flynn Effect (Flynn, 1985, 1987), the SB5 Mean Full Scale Score was lower than the SB4 Mean Composite Score (SB5 = 107.9, SB4 = 111.4). Similar results were found in a second study that compared the Full Scale Scores of the SB5 and the SB L-M (r = .85). Additional evidence of criterion-related validity was found in studies that compared the SB5 with
the Wechsler Preschool and Primary Scale of Intelligence-Revised (WPPSI-R) (r = .83); the Wechsler Intelligence Scale for Children-Third Edition (WISC-III) (r = .84); the Wechsler Adult Intelligence Scale-Third Edition (WAIS-III) (r = .82); and the Woodcock-Johnson III Tests of Cognitive Abilities (r = .78).

Next, the SB5 technical manual presents a series of studies that compare the instrument with tests of academic achievement. Two studies are presented that compare the SB5 with the Woodcock-Johnson III Tests of Achievement and the Wechsler Individual Achievement Test-II. The resulting pattern of correlations is quite varied, with coefficients ranging from .33 to .84. Depending upon the pragmatic orientation of the user, these correlations will either support or not support the utilization of the instrument. If, for example, the SB5 (and most other commercially available IQ tests) is perceived as a predictor of school success, then the SB5-Achievement correlations will be seen quite favorably. The SB5 is clearly highly correlated with tests of academic achievement, and users of the SB5 will be able to make accurate predictions about the academic performance of students who complete the test. In contrast, users of the SB5 who are instead looking for a measure of "pure" intelligence will interpret many of the correlations with achievement as "too high" (e.g., SB5-WJ III Reading Comprehension, r = .84; SB5-WJ III Math Reasoning, r = .80; SB5-WIAT-II Math, r = .79) and will argue that the SB5 is too heavily achievement-loaded. With correlations in the .80 range, approximately two-thirds of the information contained on the SB5 and tests of academic achievement reflects shared variance or overlapping content, a figure that
may be too high for instruments thought to be measuring related yet discrete constructs.

Finally, a series of studies is described in support of the construct validity of the SB5. Although the SB4 was based on a four-factor model, the SB5 was constructed on a five-factor hierarchical model. All SB5 subtests, across all ages, demonstrated average principal component loadings of greater than .70 on the g, or general, factor, indicating that each subtest was a good measure of g. The proportion of SB5 variance accounted for by the g factor ranged from 56% to 61%, depending on the factoring method. These percentages are slightly higher than those found on the SB4 but are comparable to those for other current IQ tests.

Confirmatory factor analyses (CFA) were also performed in an attempt to provide further support for the construct validity of the SB5. The technical manual indicates that a CFA, conducted on five age groups from the SB5 normative sample, examined one- through five-factor models and indicated that the five-factor model yielded the best fit when compared to other models in the analysis. In examining these results it is important to note that in order for the CFA to be performed, the test author was forced to split each subtest in half so that 20 variables could be analyzed. Without this adaptation, the test author did not have an identified model. Additionally, the CFA fit statistics presented in the technical manual are not as good as would be desired according to some measurement standards (Hu & Bentler, 1998, 1999). Finally, it is unclear why a hierarchical model was explicitly hypothesized yet a hierarchical model CFA was not performed.

To their credit, the developers of the SB5 attempted to provide an empirically supported method for interpretive analyses. The presentation of the two-stage clustering technique contained in the interpretive manual provides users with 10 core profiles identified in the SB5 standardization sample.
These patterns of subtest profiles allow users a normative comparison from which interpretive hypotheses can be generated, in contrast to a purely speculative "armchair" approach that looks at an individual's pattern of subtest strengths and weaknesses in isolation. Additional information about the cluster analysis, however, would assist users in the development of more accurate interpretations. Specifically, additional justification could be provided as to why a two-stage rather than a three-stage analysis was performed (McDermott, 1998). Additionally, it is not clear (a) why an average linkage method was utilized instead of Ward's technique; (b) why only one stopping rule (profiles identifying less than 5% of the population were dropped) rather than multiple stopping rules was employed; (c) how many cases were relocated in the second stage of the analysis; and (d) what standard deviation values correspond to the 10 identified core profiles. Future research should address these questions as well as examine the stability and utility of these profiles in both regular and exceptional populations.
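
Several of the reliability and validity figures quoted in this review can be checked with standard classical test theory formulas. The sketch below is illustrative only and is not taken from the SB5 manuals: it assumes a standard-score standard deviation of 15, uses the textbook relation SEM = SD * sqrt(1 - reliability), builds a simple symmetric band around an observed score (the manual's own confidence intervals may be constructed differently), and squares a correlation to obtain shared variance. All function names are hypothetical.

```python
import math

SD = 15.0  # assumption: standard deviation of SB5 IQ standard scores


def sem_from_reliability(reliability, sd=SD):
    """Classical standard error of measurement: SD * sqrt(1 - r_xx)."""
    return sd * math.sqrt(1.0 - reliability)


def reliability_from_sem(sem, sd=SD):
    """Invert the SEM formula to recover the reliability it implies."""
    return 1.0 - (sem / sd) ** 2


def confidence_interval(score, sem, z=1.96):
    """Symmetric band around an observed score (z = 1.96 for roughly 95%)."""
    return (score - z * sem, score + z * sem)


def shared_variance(r):
    """Proportion of variance two measures share: the squared correlation."""
    return r ** 2


if __name__ == "__main__":
    # Overall SEMs quoted in the Reliability section of this review.
    reported = [("Full Scale IQ", 2.30), ("Nonverbal IQ", 3.26), ("Verbal IQ", 3.05)]
    for label, sem in reported:
        low, high = confidence_interval(100, sem)
        print(f"{label}: SEM {sem:.2f}, "
              f"implied reliability {reliability_from_sem(sem):.3f}, "
              f"95% band around 100: {low:.1f} to {high:.1f}")

    # SB5-achievement correlations quoted in the Validity section.
    for r in (0.79, 0.80, 0.84):
        print(f"r = {r:.2f} -> shared variance {shared_variance(r):.2f}")
```

Under these assumptions, the reported SEMs of 2.30, 3.26, and 3.05 imply reliabilities of roughly .98, .95, and .96, in line with the .95 to .98 range cited for the IQ scores, and a correlation of .80 corresponds to 64% shared variance, which is the "approximately two-thirds" figure discussed above.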

SUMMARY. The publication of the newest revision of this well-established test of intelligence continues an almost 100-year-old tradition of evolution and refinement. Despite some technical and statistical limitations (e.g., lower stability for young children and individuals with low cognitive abilities, problematically high correlations with achievement, uncertain factor structure), the SB5 offers important improvements over the previous version of the scale and remains one of the premier instruments for the assessment of cognitive abilities of children, adolescents, and adults.

REVIEWER'S REFERENCES

American Psychological Association. (2000). Report
