What is the Proper Role of Assessment in Education

Are Educational Tests Inherently Evil?

Stephen G. Sireci
University of Massachusetts

Presidential Column for February 2007 Issue of the NERA Researcher
(see http://www.nera-education.org/)

Tests are given for many reasons in the educational system. Many of these reasons are hated. For example, tests are major components in accountability systems that may have undesirable consequences for teachers, schools, or districts. Tests are also sometimes used as a requirement for something, such as high school graduation, scholarship, or eligibility to participate in collegiate athletics. In these instances, tests are often seen as a hurdle to overcome or as an unnecessary roadblock to an inherent right. Tests are also commonly used to assign grades to students, particularly beyond elementary school. Given these purposes, who could possibly like tests? The answer is hardly anyone, perhaps only the relatively few high achievers who enjoy a challenge or the opportunity to show what they (we?) can do.

Why are tests so widespread if they are so hated? Is it the same reason we have intelligent design and global warming? Of course not. The reason is that educational tests, if developed carefully, used properly, and interpreted appropriately, have enormous utility.
As soon as all sides of the educational community acknowledge that fact, we can make progress toward a common goal of using assessments to improve student learning.

To properly understand educational tests, particularly their benefits and limitations, we must consider their use in specific situations. In this column, I discuss common perceptions and misconceptions of educational tests and the role tests currently play in federal and state education reform efforts. A primary goal of this discussion is to bridge the gap between proponents and opponents of standardized testing so that we can work together to improve student learning.

Why Are Tests So Ubiquitous in Education?

A popular myth is that educational tests are pushed by the extreme right of the political spectrum. This perception is simply false. Although the No Child Left Behind Act (NCLB) was proposed and signed by “he-who-must-not-be-named,” it was really an extension of Clinton’s Goals 2000: Educate America legislation, which was an extension of he-who-must-not-be-named’s father’s America 2000 legislation.
Thus, educational reform and accountability movements involving testing are one of the few bipartisan areas of legislation we have seen over the past several decades. There are, of course, strong differences in educational policy between Democrats and Republicans, such as the financing of education, but it is important to note that the NCLB Act was sponsored by Democrat Ted Kennedy and Republican Judd Gregg in the Senate and Democrat George Miller and Republican John Boehner in the House. It passed overwhelmingly in both (87-10 in the Senate and 381-41 in the House).


Why do federal legislators agree that mandated testing is an important part of education reform? There are several reasons. First, assessment is seen as a critical component in the educational process. In fact, quality education requires continuous interaction among instruction, curriculum, and assessment. Good instruction starts with good curricula, and the two influence each other. The development of curricula at the district and state levels is certainly influenced by what teachers teach in their classrooms. As the curricula are developed, teaching practices change accordingly. Assessments are needed to discover what students are learning. Based on that information, changes to instruction and curricula occur. My colleague Ron Hambleton refers to this dynamic interaction as the curriculum-assessment-instruction cycle, which is displayed in Figure 1.
Alignment of these three components of the educational process is necessary for quality instruction.

[Figure 1. The Curriculum-Instruction-Assessment Cycle (components: Curriculum, Instruction, Assessment)]

A second reason tests play a prominent role in federal and state education reform movements is that they are an effective means for quickly changing instructional practices. As McDonnell (2004) described, “although standardized tests are primarily measurement tools to obtain information about student and school performance, they are also strategies for pursuing a variety of political goals” (p. 2). McDonnell also points out that there are few alternatives available to policy makers to enforce their educational policies. As she put it, “Testing’s strong appeal is largely attributable to the lack of alternative policy strategies that fit the unique circumstances of public schooling…Standardized tests are one of the few, albeit incomplete, ways to measure outcomes of teaching” (p. 9).

A third reason mandated testing is a key component of education reform is that it forces educators to align their instruction with state curriculum frameworks. No teacher likes to be overly constrained regarding what she or he should teach.
However, no one wants teachers spending large amounts of instructional time teaching knowledge and skills that most would consider unimportant, relative to other skills. Thus, education involves consensus about what should be taught. The development of curriculum frameworks without a means for assessing how well students master the objectives within them would create a situation in which the good work done in developing the frameworks could be simply ignored.

Critics of state-mandated testing argue that these tests narrow the curriculum and force teaching to the test. Proponents counter that the tests are aligned with curriculum frameworks, which were developed through a consensus process, and so teaching to the test is teaching to the frameworks. As in most debates, the truth probably lies somewhere in the middle. Nevertheless, it is important to bear in mind that the idea behind consensus statewide curriculum frameworks and tests designed to measure them is a noble one, because its goal is to improve instruction. As a parent, I can understand this position. After all, I want to know that my sons’ teachers are teaching the important knowledge and skills they will need to succeed personally and academically. State curriculum frameworks and tests designed to measure them attempt to ensure that what is taught is important. Thus, they aim toward facilitating quality instruction, as depicted in Figure 1.

Focusing on Test Use

One of the greatest challenges we experience as educational researchers is asking the right research questions. With respect to educational testing, the questions we ask should be specific to using a test for a particular purpose. Thus, questions motivating research in this area should not be “Is the test bad?” or “Is the test fair?” but rather “Will the results of this test provide the information it is designed to produce?” In short, evaluating a test means evaluating the use of a test for a particular purpose.
Tests are not inherently “good” or inherently “bad,” but using a test for some purpose could be either, depending on what the test was designed to do versus what it is used for. This notion is clear in the definition of validity presented in the Standards for Educational and Psychological Testing (American Educational Research Association, American Psychological Association, & National Council on Measurement in Education, 1999):

    Validity refers to the degree to which evidence and theory support the interpretations of test scores entailed by proposed uses of tests. (p. 9)

As is evident in this definition, it is not a test that is validated per se, but the use of a test for a particular purpose. Defending inferences derived from test scores involves both qualitative evidence based on theories of what is being measured and quantitative evidence indicating the scores reflect the measured attribute. Thus, all educational researchers can contribute to research on testing, regardless of their particular research orientation.
I will not discuss specific means for validating inferences in this column (see Kane, 1992 or Sireci, 2005 for examples). Instead, I focus on test use in educational testing and how it can help or hurt the educational process.

How are tests used in education? Teachers use tests to measure how well students grasp the material taught (e.g., classroom tests). Counselors use tests to diagnose students’ strengths and weaknesses and make referrals for remediation, advanced courses, or other placement decisions. Policy makers use tests to evaluate teachers, schools, districts, states, countries, and various educational programs. Tests are also used as one criterion for high school graduation and for other types of certification such as an honors diploma, and for admissions into postsecondary and graduate education. As the stakes associated with educational tests increase, such as in the cases of granting a high school diploma or evaluating the performance of particular teachers, the criticisms also increase. And they should increase. If a test is used to make a “big” decision, the use of the test for that purpose should be supported by “big” evidence.
Thus, as educational researchers, our assessment research activities should be focused on asking the right questions about test use (e.g., Is there evidence to support use of this test as a high school graduation requirement?).


I am a psychometrician working in educational measurement and so it is pretty obvious that I must believe in the usefulness of educational tests. However, my strong belief in the utility of educational tests stems not from my psychometric training, but from my experience as a parent. How do I know if my sons are receiving a good education? The class work, assignments, and report cards that come home give me some indication, but the norm-referenced and criterion-referenced test score reports give me a lot more to go on. The Iowa Tests of Basic Skills that our local school district uses allow me to compare my sons’ performance to national norms. The Massachusetts Comprehensive Assessment System (MCAS) tests allow me to see how my sons are doing with respect to the performance standards established by the State. Now, when my wife and I speak with their teachers or the Principal, we can talk about these independent assessments, and how this information can be used to improve their instruction.

Looking Forward: Collaborating on Educational Assessment Research

In this column, I merely touched on a few of the important issues that concern educational assessment policy and the proper use of tests in our nation’s schools.
I know there are many people who will never acknowledge the utility of a standardized test, but I also know there are many more who hate something else: bad teaching! Teachers who are not helping our students reach their academic potential are much more dangerous to our children than any test. I do not advocate using tests to “police” teachers or using test results to provide sanctions and rewards for teachers (a very bad idea, given the different types of students taught by different teachers). However, I like the idea of measuring students’ achievement with respect to standards developed through a consensus process, and I like the idea of providing as much information as possible to parents and others about the academic achievement and progress of their children.

Are there problems with our current educational assessment policies? I think so. There are several valid criticisms about current educational tests. Personally, I am concerned about the amount our students are tested, and I am very concerned about the pressure that is put on students before they take a test. So, there is much room for improvement, which is where we, as educational researchers, come in. Let us not throw the baby out with the bathwater and simply dismiss tests as useless.
Instead, let us research what seems to be working, what seems to be harmful, and what needs to be improved. By working together, we can improve educational assessment and provide advice to educational policy makers that is based on solid research. If we can do that, we will improve curriculum development, instruction, and assessment, with the happy consequence of improving student learning.

Closing Remarks

I hope I have inspired everyone who read this column to focus on test use when thinking about educational assessments. Some of you may vehemently disagree with one or more of the points I raised. Others may agree, but have additional issues to bring to the discussion. In either case, I encourage you to write a short response for a future edition of the NERA Researcher, or write to me directly at Sireci@acad.umass.edu. I seriously believe that everyone on both sides of the testing debate needs to work together to improve the instruction and learning experiences of our children. Please also contact me if there is an assessment issue you would like me to address in a future edition of this column. And one other thing: don’t forget to get your proposals in for NERA 2007!

References

American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (1999). Standards for educational and psychological testing. Washington, D.C.: American Educational Research Association.

Kane, M. T. (1992). An argument-based approach to validity. Psychological Bulletin, 112, 527-535.

McDonnell, L. M. (2004). Politics, persuasion, and educational testing. Cambridge, MA: Harvard University Press.

Sireci, S. G. (2005). Validity theory and applications. Encyclopedia of statistics in the behavioral sciences (Volume 4, pp. 2103-2107). West Sussex, UK: John Wiley & Sons.
