07.04.2014 Views

Anonymous (XXXX) Rubric scoring and item writing.pdf


When Designing and Developing a Performance Assessment

1. Identify Outcomes & Indicators to be assessed

Identify observable/measurable indicators

Do use: Solve, Design, Write, Compare, Draw, Persuade, Investigate

Avoid: Know, Understand, Appreciate (things that can't be measured or observed)

2. Create a meaningful task CONTEXT
• Real issues
• Real problems
• Themes of work
• Trainees' job situations

3. Some important questions
• What types of tasks are implied by the learning targets?
• Should tasks be structured or unstructured (scaffolding)?
• Which parts of the tasks should be structured & to what degree?
• What do I need to communicate to students so they know what they need to do to perform?
• Is the task unbiased (cultural/socioeconomic)?

4. Identify thinking skills & processes

Thinking skills
• Comparing
• Classifying
• Induction (conclusions from data)
• Deduction (generalize rules)
• Error analysis
• Constructing support (argue a position)
• Creating abstractions
• Analyzing different perspectives

Thinking processes
• Decision making
• Problem solving (overcoming obstacles)
• Experimental inquiry (making predictions & testing them)
• Invention (create, improve)
• Investigation (historical, projective or hypothetical), definitional (new theory)

5. Identify Products & Performances

Product or performance should
• Provide evidence of proficiency
• Be related to the identified outcome or standard

Written products
Oral performances
Visual products/product performance combination

6. Identifying Evaluative Criteria

Relates to outcomes & indicators
To show to what extent the outcome has been achieved
• Understanding of content
• Proficiency of skill or process


7. Design the Scoring Methods

Constructing the Rubric or Activity-Specific key

Need to be able to discriminate among the full range of different degrees of understanding, proficiency or quality
(# points for each level or degree of proficiency)
elements included
weighting

8. Task Design Options

Considerations
• Time/timeframe
• Supplies
• Equipment
• Friends
• Feasibility
• Resource availability
• Level of support/independence
• Who will evaluate

9. Review for Bias/Sensitivity

10. Language Bias

11. Stereotyping (avoid it)

12. Sensitive topics (avoid them)

Types & Sources of Measurement Errors & Biases in Performance Assessment

Sources of error associated with raters arise when raters don't use uniform standards. They can be eliminated by using some kind of ranking procedure & through training.

Halo effect: a rater's general impressions of a person influence the ratings. Ex: ranking a student high on arithmetic skills because he is neat, handsome, has good handwriting, etc. (traits unrelated to arithmetic)

Generosity error: raters who continually favor the desirable end of the continuum

Severity error: tendency for rater to be overly harsh

Central tendency error: favors the middle positions; rates everyone about average

BIAS: tendency to rate high or low because of factors other than trait(s) being rated. Bias is usually subconscious & hard to prove. Ex: higher grade for typed than handwritten work

To improve rater behavior:
• Training session
• Avoid using raters who are overly critical or solicitous

Other sources of error in rating scales
• Ambiguity of trait: what is "aggressive"?
• Ambiguity of scale: what is "good"?

Need to have operational definitions to make rating/scoring as objective as possible
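
The rater errors described above can be screened for with simple descriptive statistics. The sketch below is only an illustration, not a validated procedure: the function name `flag_rater_tendencies`, the 1-5 scale, and both cut-off values are assumptions chosen for the example and do not come from the handout.

```python
from statistics import mean, pstdev

def flag_rater_tendencies(ratings, scale_min=1, scale_max=5,
                          shift=0.75, min_spread=0.5):
    """Return a rough label for one rater's pattern of scores.

    `shift` and `min_spread` are illustrative thresholds (assumptions),
    not values taken from any published standard.
    """
    midpoint = (scale_min + scale_max) / 2
    avg = mean(ratings)
    spread = pstdev(ratings)
    if avg >= midpoint + shift:
        return "possible generosity error"   # scores crowd the high end
    if avg <= midpoint - shift:
        return "possible severity error"     # scores crowd the low end
    if spread < min_spread:
        return "possible central tendency error"  # everyone rated about average
    return "no obvious tendency"

print(flag_rater_tendencies([5, 5, 4, 5, 4]))
print(flag_rater_tendencies([3, 3, 3, 3, 3]))
```

A flag is only a prompt for a closer look. A high average can also mean the group really did perform well, which is why operational definitions and rater training remain the main safeguards.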


General Benefits
• Can clarify meaning of complex learning targets (particularly when students understand the rubric being used to grade)
• Assess really "doing" rather than answering questions about doing
• Can involve integration of old and new knowledge & skills
• Can involve complex tasks
• Assess "process/product"
• Student involvement

General Limitations
• Difficult to craft (match learning targets)
• Scoring rubrics are difficult to create
• Assessment is time consuming
• Scoring is time consuming
• Scores may be unreliable
• Score on one task can't necessarily be generalized to make inferences about students' ability in other tasks
• Aren't always appropriate
• Can be discouraging to the less able
• Cultural bias issues
• Can also be corruptible to "teaching the test"

GOAL
(1) Make a judgement-based evaluation process as systematic and objective as it can be
(2) Focus on the important attributes of performance


Criterion-Based Scoring Tools

Holistic scoring rubric
• Rate or score the product or process as a whole without scoring parts or components separately. Easier, takes less time, appropriate for extended response essays involving synthesis of ideas where the answer can't be pre-specified
• Sources of error: paying attention to one element in one paper and another element in another
• Bias can be a problem
• Gives no feedback to students; they have no idea how to improve

Analytic scoring rubric: lists criteria for grading a piece of work
• Rate or score separate parts or characteristics of product or process first, then sum part scores to obtain total score
• More time consuming
• Helps students learn; gives details about strengths and weaknesses
• Articulates gradations of quality for each criterion (excellent to poor). Criteria in column on left; columns to right describe degrees of quality
• Makes expectations clear
• Organizes feedback & time spent evaluating student work

Example of Rubric for Essay Items

Points | Content | Answer or Opinion | Explanation or Defense
3 | Ample evidence of understanding of the content | Clearly answered or stated | Convincingly explained or defended
2 | Some evidence of understanding of the content | Answered or stated | Explained or defended
1 | Little evidence of understanding of the content | Ambiguous answer or statement | Explanation or defense is not convincing
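
An analytic rubric such as the essay example is scored by rating each criterion separately and summing the parts. The following is a minimal sketch under stated assumptions: the criterion keys mirror the three rubric columns, while the function name `analytic_score` and the optional weights are hypothetical and not part of the handout.

```python
# Criterion names follow the three columns of the essay rubric.
CRITERIA = ("content", "answer_or_opinion", "explanation_or_defense")

def analytic_score(ratings, weights=None, min_points=1, max_points=3):
    """Sum per-criterion ratings (optionally weighted) into a total score."""
    # Equal weights by default; any other weighting is an assumption
    # the instructor would have to decide on in advance.
    weights = weights or {name: 1 for name in CRITERIA}
    total = 0
    for name in CRITERIA:
        points = ratings[name]
        if not min_points <= points <= max_points:
            raise ValueError(f"{name}: {points} is outside {min_points}-{max_points}")
        total += weights[name] * points
    return total

essay = {"content": 3, "answer_or_opinion": 2, "explanation_or_defense": 2}
print(analytic_score(essay))  # unweighted total: 7
print(analytic_score(essay, weights={"content": 2,
                                     "answer_or_opinion": 1,
                                     "explanation_or_defense": 1}))  # weighted total: 10
```

Keeping the per-criterion ratings, rather than only the total, is what lets the analytic rubric give students detail about strengths and weaknesses.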



Example of Rubric for Math problems

Points | Quality of Explanation | Answer | Explanation
6 | Excellent explanation (complete, clear, unambiguous). | |
5 | Good explanation (reasonably clear and complete) | |
4 | Acceptable explanation (problem completed but may contain minor flaws in explanation). | Correct | Complete, clear, logical
3 | Needs improvement (on the right track, but may contain serious flaws, demonstrates only partial understanding) | Almost correct or partially correct | Essentially correct but incomplete or not entirely clear
2 | Incorrect or inadequate explanation (shows lack of understanding of problem). | Incorrect but reasonable attempt | Vague or unclear but with redeeming features.
1 | Incorrect without attempt at explanation | Incorrect with no relationship to the problem | Irrelevant, incorrect, or no explanation

Activity-Specific Scoring Key: identical to a rubric, but only useful for a single activity or assessment, whereas a rubric can be used for any assessment occasion where the same thing is being assessed over and over. An example of a key would be one for scoring a project on a specific topic; it would only be used once, for that particular project. An example of a rubric would be one for scoring writing to inform; the same scoring tool is used every time the students are asked to write to inform.

Checklist
• List of specific behaviors, characteristics or activities and a place to check absent or present (or yes or no)
• Used for a sequence or subtask that can be put in list form. Examples: science (set up a microscope), auto shop (change oil in a car)

Example of personal-social development checklist for primary grades

Check if "yes" | Behaviors
___ | Follows directions
___ | Seeks help when needed
___ | Works cooperatively with others
___ | Waits in turn for using materials
___ | Shares materials with others
___ | Tries new activities
___ | Completes stated tasks
___ | Returns equipment to proper place
___ | Cleans work space



Rating Scale

Consists of a set of characteristics or qualities to be judged and some type of scale for indicating the degree to which each attribute is present.

Example: Speech Rating Scale

Directions: Mark an X anywhere along the horizontal line to indicate student's speech performance

Content & organization: Opening remarks
Inappropriate. Distract from speech topic | Commonplace. No particular contribution to the speech | Arouse interest. Direct attention to speech topic

Delivery: Gestures
Movements are monotonous or distracting | Generally effective. Some distracting mannerisms | Natural, expressive movements which emphasize speech

Other examples of rating scales:

To what extent does student participate in group discussions?
Never, seldom, occasionally, frequently, always

In opinion surveys you see rating scales:
Strongly Disagree, Disagree, Unsure, Agree, Strongly Agree



Performance Assessment

Advantages & Disadvantages of performance assessments (measurement & instructional)

Classroom Assessment
Advantages
• Traditional tests encourage mastery of the wrong things
• Form of test is harmful to learning
• Useful for improving instruction
• Consistent with modern learning theory
Disadvantages
• Time consuming to craft
• Not appropriate for assessment of all learning targets

Measurement Issues
Advantages
• Validity: simulates real world activity
Disadvantages
• Subjective element in scoring decreases reliability of scores
• Subject to bias & corruption
• Validity of scores can be lowered by (see below)

• PA can assess a number of learning targets (traditional is limited to one); they involve using a variety of knowledge and skills, but they should be well planned.
• PA use doesn't ensure that mastery of learning targets will be met
• The use of both PA and traditional multiple choice (mc) tests improves the validity of assessment
• PA is referred to as authentic because it is supposed to provide a simulation of a real-life situation. There is a continuum of least authentic to most authentic. Think about activities and where they fall in the spectrum of this continuum. PA involves making an application of skills and knowledge; it is more consistent with modern learning theory than traditional assessment methods (mc tests, for example)
• PA involves making an application of knowledge/skills rather than just recognizing the right answer.
• PA is not the most appropriate assessment method for assessing all types of learning targets.
• MC and other objective methods of assessment provide an indirect way of assessing whether mastery of a learning target has been met.
• Validity can be lowered in PA by the following
- Students work together: Suzie's project doesn't reflect Suzie's knowledge and skills. The primary use of group projects is to assess social skills (working with others, etc.)
- The teacher is biased toward certain presentation formats, topics, etc.
• PA projects require more teacher supervision and management than conventional assessment methods
• Appropriate portfolio use does not involve saving ALL work. The work put into the portfolio should be selected for a purpose, usually "best work" or "growth"
• One benefit of portfolio use is that it involves self evaluation, "growth and best work" and students learn how to present their best ("best work")
• PA not necessarily better for assessing logical thinking skills/problem solving
• PA are more time consuming to craft and to score (for teachers) & are more time consuming for the student to do.
• PA scores are generally less reliable than mc scores, in terms of students' performance & teachers' scoring (2 teachers won't necessarily agree)
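
The point that two teachers won't necessarily agree can be quantified. The sketch below computes simple percent agreement and Cohen's kappa, a commonly used chance-corrected agreement statistic; kappa is not mentioned in the handout and is brought in here only as an illustration. The function names and the sample scores are hypothetical.

```python
from collections import Counter

def percent_agreement(rater_a, rater_b):
    """Share of papers on which two raters gave exactly the same score."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of papers")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)

def cohens_kappa(rater_a, rater_b):
    """Agreement corrected for the agreement expected by chance alone."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Chance agreement: product of each rater's marginal proportions per score.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1:
        return 1.0  # both raters used a single identical score throughout
    return (observed - expected) / (1 - expected)

# Hypothetical scores from two teachers on the same eight papers (1-3 scale).
teacher_1 = [3, 2, 3, 1, 2, 3, 1, 2]
teacher_2 = [3, 2, 2, 1, 2, 3, 2, 2]
print(percent_agreement(teacher_1, teacher_2))        # 0.75
print(round(cohens_kappa(teacher_1, teacher_2), 2))   # 0.61
```

Percent agreement is easier to explain, but it overstates consistency when most papers fall in one score category, which is why a chance-corrected figure is often reported alongside it.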


Alternative Assessments

Example | Benefit | Limitation
Individual projects | Require creativity, originality & integration of many skills | Be sure each employee has the same experience or access to resources
Group projects | Assess ability to work with others | Groups have an impact on individual behavior; can't assess what an individual has done or achieved
Portfolio ("best work", "growth") | Student participation | Scoring rubric has to be carefully constructed
Demonstration | Less complex than a project | Need clear learning target & scoring rubric
Experiment ("process") | Can assess whether students use proper inquiry skills & methods | Time consuming; expensive in materials
Oral presentation, debate, dramatization | Assess skills not able to be assessed with test "project" | Must decide how to weigh content vs. performance (well presented, but material was all wrong)
Structured tasks (write story, build something) | Can be linked to teaching activities | Scoring rubric
Naturally occurring tasks | | You have to wait for the event to occur

• Extended response essay item is particularly useful to assess student ability to organize ideas and include relevant information. Disadvantage is that scoring is inconsistent; can be problematic for elementary school students
• Restricted response items can be used for more than just recall and comprehension, but they need to be carefully crafted. Advantage: gives more reliable results (responses will be similar because the response format is focused and organized in a similar way for all students)
• Best item formats:
- Matching: symptoms & contaminants
- Recall facts: short-answer (don't use PA for this)
- Comprehension/basic understanding: mc
• Rater drift: results in inconsistent scoring
• Carry-over effects: to minimize, grade all #1 items of all students, then go on to the next item & do the same for all students
• Rubrics are designed to improve reliability of essay scores and PA. They define different levels of quality the instructor will use to evaluate essay responses or performance
- Can use verbal descriptions of quality at each level
- Can use examples of previous students' performance or products
- Analytic scoring rubric: gives more detailed feedback; helps students know what to expect, so they can assess their own work before turning it in; helps students identify areas which need remediation
- Holistic scoring rubric: less objective
• Process/procedure vs. product
• Problem solving vs. story or display
• PA: unstructured vs. structured. Scaffolding: directions or structure provided to students. Less structure means less "scaffolding", i.e., linking of successive task requests
• Checklist: can cover positive and negative behaviors
• Portfolio: purpose is important. What will results be used for? Promotion to next grade?


Thinking Skills, Trigger Words, and Evaluative Criteria

adapted by W. D. Schafer from the Quellmalz & Bloom et al. taxonomies and Stiggins's materials

Quellmalz proposes five categories (Q1-Q5, below) but the order is not viewed as important.
Bloom et al. propose six categories (B1-B6, below) ordered by complexity of the cognitive operation.

Recall (Q1) - verbatim repetition [knowledge (B1)] and paraphrasing [comprehension (B2)]
Trigger Words (knowledge): define, label, locate, recite, name, state
Trigger Words (comprehension): describe, restate, paraphrase, rewrite, express, identify
Evaluative Criteria: accuracy, comprehensiveness

Analysis (Q2 & B4) - division into component elements (e.g., part-whole, cause-effect)
Trigger Words: analyze, subdivide, outline, order, inventory
Evaluative Criteria: identification (sufficiency & importance), description, explanation, justification

Comparison [Q3 (& B4)] - identification of similarities and differences
Trigger Words: compare, contrast, categorize, sort, separate, classify, differentiate
Evaluative Criteria: distinctiveness, sufficiency, accuracy, explanation, clarity

Inference (Q4) - deductive reasoning [application (B3)] and inductive reasoning [synthesis (B5)]
Trigger Words (application): predict, demonstrate, apply, solve, use, sketch
Evaluative Criteria: choice of rule (generalization), plausibility of argument, accuracy of result
Trigger Words (synthesis): construct, formulate, speculate, hypothesize, theorize
Evaluative Criteria: use of data, identifying generalization, logic of argument, noting of exceptions

Evaluation [Q5 & B6 (& B4)] - judging worth, quality, credibility, or practicality
Trigger Words: prioritize, judge, evaluate, appraise, recommend, critique, weigh, [question] & why
Evaluative Criteria: data orientation, explanation of criteria, defensibility of application of criteria
(note - it is not the judgment, but its explanation and the argument presented that is rated)

From Fleming & Chambers (1983): Percent of Actual Teachers' Items by Cognitive Levels (n=8800 items)

Cognitive Level (revised from their use of the Bloom et al. categories)

Level in School | Recall | Analysis | Comparison | Inference | Evaluation
Elementary | 83% | 10% | 0% | 7% | 0%
Junior High | 97% | 3% | 0% | 0% | 0%
High School | 97% | 3% | 0% | 0% | 0%

(note: most inference items are found in mathematics and are deduction, or application of principles)

Sample table of specifications [%'s are emphases for objective (an operation using a content) classes]:

Content Area (Civil War)

Operation (Cog. Level) | Pre-War | During War | Post-War | Total
Recall | 5% | 25% | 5% | 35%
Analysis | 5% | 5% | 5% | 15%
Comparison | 5% | 10% | 0% | 15%
Inference | 5% | 10% | 5% | 20%
Evaluation | 5% | 0% | 10% | 15%
Total | 25% | 50% | 25% | 100%
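
A table of specifications becomes a test blueprint once the percentage emphases are converted into item counts. The sketch below shows one way to do that, assuming the cell values recovered from the sample table (cells left blank in the source are treated as 0%, which is what the row and column totals imply). The names `SPEC` and `items_per_cell` and the 40-item test length are hypothetical.

```python
# Percent emphasis per cell: (pre-war, during war, post-war).
# Values follow the sample Civil War table; blank cells are assumed to be 0%.
SPEC = {
    "Recall":     (5, 25, 5),
    "Analysis":   (5, 5, 5),
    "Comparison": (5, 10, 0),
    "Inference":  (5, 10, 5),
    "Evaluation": (5, 0, 10),
}

def items_per_cell(spec, test_length):
    """Convert percentage emphases into item counts for a test of the given length."""
    total_pct = sum(sum(row) for row in spec.values())
    if total_pct != 100:
        raise ValueError(f"emphases sum to {total_pct}%, expected 100%")
    return {op: tuple(round(pct * test_length / 100) for pct in row)
            for op, row in spec.items()}

counts = items_per_cell(SPEC, test_length=40)
for operation, row in counts.items():
    print(operation, row, "row total:", sum(row))
```

With 40 items every cell comes out as a whole number. For other test lengths the rounded counts may not add up exactly to the intended length, so the totals should be checked and adjusted by hand.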

Avoid the "recall trap," where test items seem to measure higher-order thinking but the specific instances have been taught in class, so the students are merely remembering material as it has been taught. To engage higher-order thinking, the student must be presented with novel material.

Beyond thinking skills are thinking processes, when skills are used in sequence or in combination. Examples are problem solving, decision making, and scientific inquiry.


Common Flaws in Multiple-choice Items and Tests

1. Weak Stem: the stem should present a clearly-defined problem, usually specific enough to stand alone as a completion item.
2. Heterogeneity of Options: the choices should all be of a similar nature, otherwise one or more will be implausible.
3. Grammatical Cues: all options should be grammatically consistent with the stem.
4. Lack of Parallelism: the logical flow of both grammatical construction and ideas should be consistent from the stem to each option.
5. Complicated Stem: stems which include several ideas can be open to different interpretations by different examinees.
6. Complicated Options: it is usually too difficult a task to make a rapid comparison among complex options.
7. Explanatory Stem: stems which "teach" material not necessary to the question serve no useful measurement purpose and waste measurement time.
8. Negatively Stated Stem: stems usually present the task of choosing the "correct" or "best" option; a change in the ground rules should be avoided and, if one is necessary, the change should be emphasized.
9. Repetition of Option Lead-ins: if all options begin with the same wording, it should be included at the end of the stem.
10. Specific Determiner: words or phrases which tend to be present in only true or only false statements should be avoided in the options.
11. Choices Not at the End of the Stem: items should be written so that the natural location of the answer is at the end of the stem.
12. More than One Acceptable Answer: only one option should be best; the others should be clearly unacceptable.
13. Use of "All of the Above" as an Option: this option will likely be selected when at least two of the others are correct, allowing the examinee to receive credit for partial knowledge.
14. Clues from Other Items: no item should be answerable through the use of information contained in other items.
15. Pattern of Keyed Answers: the position of the keyed option should not follow any discernible rule throughout the test; the keyed position should be determined randomly or by a consistent practice such as placing all options in alphabetical or numerical order for each item.
16. Over-inclusive Options: if one or more options cover all possibilities, the choice will likely be made ignoring the other options.
17. Inconsistent Lengths of Options: usually the longer option tends to be the correct one.
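
Two of the flaws above, a predictable keyed position (15) and a keyed option that stands out by length (17), lend themselves to a quick mechanical check. The sketch below is a rough illustration only; the function names and the sample item are hypothetical, and option length is measured simply in characters.

```python
import random

def shuffle_options(options, key_index, rng=random):
    """Return (shuffled_options, new_key_index) with the key at a random position."""
    order = list(range(len(options)))
    rng.shuffle(order)
    shuffled = [options[i] for i in order]
    # order.index(key_index) is where the original key ended up.
    return shuffled, order.index(key_index)

def key_is_longest(options, key_index):
    """True when the keyed option is strictly longer than every distractor."""
    key_len = len(options[key_index])
    return all(key_len > len(opt)
               for i, opt in enumerate(options) if i != key_index)

# Hypothetical item: the first option is the keyed answer.
options = ["a rater's general impression influences the rating",
           "rating everyone average",
           "rating too harshly",
           "rating too kindly"]
shuffled, new_key = shuffle_options(options, key_index=0, rng=random.Random(7))
print(shuffled[new_key] == options[0])   # True: the key is tracked through the shuffle
print(key_is_longest(options, 0))        # True: this item would need its options rebalanced
```

Random placement is only one of the two practices named in flaw 15; sorting options alphabetically or numerically is the other and needs no code beyond a sort.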


Robert W. Lissitz

First compiled by M.J. Wantman

Directions for Test Construction and Item Writing

General directions for item writing

1. Express the item as clearly as possible, observing the rules of grammar, rhetoric, and punctuation.
2. Use only items which have an answer upon which all experts agree.
3. Do not use trick or catch items: items that are phrased so that the correct answer depends on a single, obscure key word to which even good students are unlikely to give sufficient attention.
4. Avoid irrelevant clues, i.e., wording that enables the correct answer to be determined merely by intelligence and not because of knowledge which the question is designed to measure.
5. Items should not be inter-dependent.
6. One item should not furnish the answer to another one.
7. Avoid textbook wording.
8. The position of the correct alternative should be random.
9. Subject matter and phrasing should be such that no emotional antagonism will be aroused.
10. The item should call for a knowledge of concepts, reasons, and relationships rather than for mere factual information, whenever the former is appropriate.
11. Omit any part of the item which can be omitted without significantly influencing the distribution of responses.
12. If possible, the question should be stated in positive terms.

Multiple-Choice questions

13. The stem or introductory statement should be as complete as possible.
14. Distractors or options should be as short as possible.
15. Distractors should be plausible.
16. The length of a distractor should not vary consistently with the correctness of the distractor.


17. The more homogeneous the alternatives, the higher the level of understanding tested for.
18. Avoid responses that overlap or include each other.
19. If popular misconceptions in the field covered by the question exist, the distractors should be designed to attract candidates holding these misconceptions.
20. The distractors should be of about the same length and complexity as the correct answer.
21. Repetitions in the distractors should be avoided by putting the repeated thought into the stem of the item.
22. The stem of the item should be phrased to ask for the best rather than the correct answer whenever there is no clear-cut "correct" answer or where one or more of the distractors is not "wrong."
23. If there are other answers conceivably as good as the intended answer, the stem should be limited by the phrase "of the following."
24. If the candidate could distinguish the correct answer from the incorrect answers without having read the stem, the item is in need of revision.
25. There should be no possibility that a candidate may select an alternative because it is the only one containing the same words or phrases as the stem or because of other external characteristics.
26. Keep about four or five alternatives.

Matching sets

27. The items should be homogeneous.
28. The alternative responses should be homogeneous.
29. There should be more options than questions in a set, but the list of options should not be too long; possibly an upper limit of 7.
30. Allow the same option to be the correct answer for more than one question.
31. Be specific about the basis on which questions and options are to be matched.
32. Material should be briefer in options than in questions if there is a choice.


33. Arrange the options in some logical order so that students can find the correct answer quickly. For example: numerical, chronological, alphabetical order, etc.
34. Keep all of one question on the same page.

Simple recall or completion items

35. Require a short, definite answer.
36. Include in the key all possible correct answers.
37. Do not evaluate spelling unless the test is a test of spelling.
38. Make the question very specific.
39. A direct question seems to be easier than a fill-in statement.

True-false items

40. Do not put the false part of a statement in a qualifying phrase.
41. In "Why" questions, the false aspect should be in the "because" phrase.
42. Avoid specific determiners, such as "entirely," "always," "never," "generally," "often," "seldom."
43. Use quantitative rather than qualitative language where possible.

Typography and test assembly

44. List alternative responses in a vertical column if possible.
45. Arrange questions by item type and by order of difficulty within the item type.
46. Do not split an item between two pages. In a matching set, do not split a set.
47. In completion or simple recall items, the length of lines indicating blank spaces must be the same for all questions.


Robert W.<br />

Lissitz<br />

General Rules for Using, Constructing, <strong>and</strong> Evaluating Essay Exams<br />

Do<br />

1. Limit the problem which the question poses so that it will have an unequivocal meaning to most students.<br />

2. Use words which will convey clear meaning to the student.<br />

3. Use an essay question for the purposes it best serves, i.e., organization, h<strong>and</strong>ling complicated ideas <strong>and</strong> <strong>writing</strong>.<br />

4. Prepare enough questions to sample the material of the course broadly, within a reasonable time limit.<br />

5. Prepare questions which require considerable thought, but which can be answered in relatively few words.<br />

6. Determine in advance how much weight will be accorded each of the various elements expected in a complete answer.<br />

7. Score each question for all students before looking at the next question, without knowledge of students' names. Use several scorers if possible.<br />

8. Require all students to answer all questions on the test.<br />

9. Write questions about materials immediately germane to the course.<br />

10. Study past questions to determine how students performed with them.<br />

11. Make gross judgments of the relative excellence of answers as a first step in grading.<br />

12. Make the wording of a question as simple as possible in order to make clear the task imposed.<br />

13. After <strong>writing</strong> a question, leave it for a day or so, then look at it again <strong>and</strong> consider the type of range of answers which it may evoke. Revise if necessary.<br />

14. It is generally best to use "Check-list Point-score Method" in <strong>scoring</strong>.<br />

Do Not<br />

1. Judge papers on the basis of external factors unless those have been clearly stipulated.<br />

2. Make a generalized estimate of a paper's worth.<br />

3. Construct a test consisting of only one question.<br />

4. Allow students to select the particular questions they wish to answer.
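
Rules 6, 7 and 14 in the "Do" list can be read together as a simple procedure: fix the point value of each expected element before reading any paper, then score one question across all papers, identified by number rather than by name. The sketch below is one possible interpretation of a check-list point-score method, not a definition taken from the handout. The checklist entries, point values, and paper IDs are invented for illustration.

```python
# Hypothetical checklist for one essay question: element -> points,
# decided before any paper is read (rule 6).
CHECKLIST = {"states thesis": 2, "cites two causes": 4, "draws conclusion": 4}


def score_answer(elements_present: set[str], checklist: dict[str, int]) -> int:
    """Sum the points for each checklist element the scorer found in the answer."""
    unknown = elements_present - set(checklist)
    if unknown:
        raise ValueError(f"not on checklist: {sorted(unknown)}")
    return sum(points for element, points in checklist.items()
               if element in elements_present)


def score_question(papers: dict[str, set[str]], checklist: dict[str, int]) -> dict[str, int]:
    """Score one question for every paper; keys are anonymous paper IDs (rule 7)."""
    return {paper_id: score_answer(found, checklist)
            for paper_id, found in papers.items()}


papers = {
    "paper-017": {"states thesis", "draws conclusion"},
    "paper-042": {"states thesis", "cites two causes", "draws conclusion"},
}
print(score_question(papers, CHECKLIST))
```

Deciding which elements are present in an answer remains a human judgment; the code only does the bookkeeping.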


Sample True-False Items<br />

Lissitz<br />

T F 1. Good teeth depend upon diet.<br />

T F 1. A good diet is a basic factor in having good teeth.<br />

T F 2. Alfred Binet was a French abnormal psychologist <strong>and</strong> botanist who introduced the concepts of judgment, adaptation, <strong>and</strong> self-criticism into the study of intelligence.<br />

T F 2. The concepts of judgment, adaptation <strong>and</strong> self-criticism were first introduced into intelligence measurement by Alfred Binet.<br />

T F 3. Minneapolis is the largest city in the Midwest because it is a major railroad center.<br />

T F 3. Chicago is the largest city in the Midwest because many conventions are held there.<br />

T F 4. The brightest stars are always the closest.<br />

T F 4. If all the stars were the same distance from earth, the sun<br />

would still be the brightest.<br />

T F 5. No picture-no sound in a television set may indicate a bad<br />

5U4G tube.<br />

T F 5. A bad 5U4G tube in a television set will result in a no<br />

picture-no sound.<br />

T F 6. By 1861, a great many states had established universities.<br />

T F 6. By 1861, twenty states had established universities.<br />

T F 7. The Dutch were the first nationality to settle in New York<br />

<strong>and</strong> the largest single nationality in the colonies in the<br />

early 18th century.<br />

T F 7. The Dutch were the first nationality to settle in New York.<br />

T F 8. The Dutch were the largest single nationality in the colonies<br />

in the early 18th century.<br />

T F 9. Gabro is course-grained igneous rock.<br />

T F 9. Gabro is course-grained igneous rock.<br />

T F 10. Freezing water is not infrequently unknown in Southern California.<br />

T F 10. Winter temperatures dip below freezing in parts of Southern California.



11. World War II was fought in Europe <strong>and</strong> the Far East.<br />

12. A remarkable transaction occurred toward the end of the reign of Constantine the Great.<br />

R W 13. Glancing down the famous street, signs of every kind were visible. (Sentence structure)<br />

Y N 14. Does a ripsaw have larger teeth than a crosscut saw?<br />

15. The iron ore mined in Minnesota is<br />

a. less than half of the iron ore mined in the United<br />

States.<br />

b. shipped by railroad to the great iron <strong>and</strong> steel<br />

centers of the nation.<br />

*c. scooped out of open pits by huge power shovels.<br />

d. found in the southern part of the state.<br />

15. The iron ore mined in Minnesota is<br />

T F 1. about two-thirds of the iron ore mined in the<br />

United States.<br />

T F 2. shipped by railroad to the great iron <strong>and</strong> steel<br />

centers of the nation.<br />

T F 3. scooped out of open pits by huge power shovels.<br />

T F 4. found in the southern part of the state.<br />

T F 16. In a skewed distribution of test scores the median is<br />

larger than the mean.<br />

Directions: In each of the following true-false statements<br />

the crucial element is underlined. If the statement is true, circle<br />

the T on the left. If the statement is false, circle the F, cross<br />

out the underlined word, <strong>and</strong> write in the blank space the word<br />

which must be substituted for the crossed-out word in order to make<br />

the statement true.<br />

T F ______ 17. The Pleiades <strong>and</strong> the Hyades are in the constellation Orion.<br />

Directions: Each of the following statements may be stated<br />

conversely. You are to decide whether each statement is true or<br />

false, <strong>and</strong> whether its converse is true or false. Indicate your<br />

answers by encircling two of the symbols before each <strong>item</strong>, as<br />

follows:<br />

CIRCLE<br />

T if the statement is true CT if the converse is true<br />

F if the statement is false CF if the converse is false<br />

T F CT CF 18. To be reliable, a test must be valid.<br />

T F CT CF 19. All rectangles are quadrilaterals.



Directions: Each of the following <strong>item</strong>s may be true without<br />

qualification, true with qualifications, or false. If it is<br />

true without qualification, circle the T <strong>and</strong> mark a "c" in the space<br />

provided. If it is true with one of the listed qualifications,<br />

circle T <strong>and</strong> mark the letter of the appropriate qualification in the<br />

space. If the <strong>item</strong> is false, circle the F.<br />

Statements<br />

T F ___ 20. The total resistance in an electrical circuit is equal to the sum of the individual resistances.<br />

T F ___ 21. The total current in an electrical circuit is equal to the sum of the currents in the individual parts of the circuit.<br />

T F ___ 22. The total current in an electrical circuit is equal to the electromotive force in the circuit divided by the resistance of the circuit.<br />

T F ___ 23. The power supplied to a circuit is equal to the product of the total resistance <strong>and</strong> the amount of current in the circuit.<br />

Qualifications<br />

a. if the resistances are connected in parallel.<br />

b. if the resistances are connected in series.<br />

c. no qualification<br />

24. The numerical value of pi is 3.<br />

25. The numerical value of pi is 3.1416<br />

26. The numerical value of pi, correct to four decimal<br />

places, is 3.1416.<br />

27. Calcium chloride attracts a film of moisture to its<br />

surface <strong>and</strong> gradually goes into solution.<br />

28. No satisfactory explanation has ever been given for the<br />

migration of birds.<br />

29. The nourishment assimilated by the body depends upon the<br />

amount of food eaten.<br />

30. If a square <strong>and</strong> an equilateral triangle are inscribed in<br />

the same circle, the side of the square is longer than<br />

the side of the triangle.<br />

31. It is possible for an erect man to see his entire image<br />

in a vertical plane mirror one half as tall as he is.


DIRECTIONS: Column A below consists of types of medicines, medications frequently given to patients. Column B lists methods of administration of medications. In the space provided before each <strong>item</strong> in column A, write the letter or letters of methods of administration which may be used for each medication. Next encircle the one most common method of administration. A letter may be used once, more than once, or not at all.<br />

A. Types of medication<br />

Sample <strong>item</strong>: (b) c x. Morphine sulfate<br />

___ 1. Insulin<br />

___ 2. Penicillin<br />

___ 3. Demerol<br />

___ 4. Paregoric<br />

___ 5. Paraldehyde<br />

___ 6. Luminal<br />

___ 7. Dicumarol<br />

B. Methods of Administration<br />

a. Hypodermically<br />

b. Intramuscularly<br />

c. Orally


DIRECTIONS: Three lists are presented below. Famous English authors of plays are listed in the column farthest to the right, names of well-known plays are listed in the center column, and in the column farthest to the left are names of characters in some of these plays. You are to look at the name of the character listed, decide in which play this character appears, and identify the author of this play. Place the letters in the spaces provided, two letters for each character. Not all answers will be used.<br />

__ __ 1. Mildred Tresham<br />

__ __ 2. Ralph Rackstraw<br />

__ __ 3. Algernon Moncrieff<br />

__ __ 4. Elizabeth Saunders<br />

__ __ 5. Marion Whittaker<br />

__ __ 6. Bartley<br />

__ __ 7. Montague Lushington<br />

a. The Silver Box<br />

b. Riders to the Sea<br />

c. Easy Virtue<br />

d. H.M.S. Pinafore<br />

e. A Bill of Divorcement<br />

f. A Blot on the 'Scutcheon<br />

g. Our Betters<br />

h. The Masqueraders<br />

i. The Importance of Being Earnest<br />

A. John Millington Synge<br />

B. Clemence Dane<br />

C. Robert Browning<br />

D. W. Somerset Maugham<br />

E. Henry Arthur Jones<br />

F. Noel Coward<br />

G. Oscar Wilde<br />

H. W. S. Gilbert<br />

I. John Galsworthy


DIRECTIONS: Match each of the poets in the column at the left with the quotation on the right which is attributed to him. Write the letter of the correct quotation in the space corresponding to the author of the quotation.<br />

Poets<br />

__ 1. Lord Byron<br />

__ 2. Percy B. Shelley<br />

__ 3. William Wordsworth<br />

__ 4. Robert Burns<br />

__ 5. Samuel Taylor Coleridge<br />

__ 6. John Keats<br />

__ 7. Alfred Lord Tennyson<br />

Quotations<br />

a. Flow gently, sweet Afton, among thy green braes,<br />
Flow gently, I'll sing thee a song in thy praise;<br />

b. Break, break, break on thy cold gray stones, O sea<br />
And I would that my tongue could utter<br />
The thoughts that arise in me.<br />

c. Hail to thee, blithe Spirit!<br />
Bird thou never wert.<br />
That from Heaven or near it,<br />
Pourest thy full heart in profuse strains of unpremeditated art.<br />

d. She walks in beauty, like the night of cloudless climes and starry skies<br />
And all that's best of dark and bright meet in her aspect and her eyes;<br />

e. My heart leaps up when I behold a rainbow in the sky;<br />
So was it when my life began;<br />
So is it now I am a man<br />
So be it when I shall grow old,<br />
Or let me die!


DIRECTIONS: Quotations from poetry written during the Romantic Period are listed in the column at the left below. In the column at the right, names of famous poets are listed. You are to indicate the author of each of the quotations by writing in the space before the number of the quotation the letter corresponding to the name of the author in the right-h<strong>and</strong> column.<br />

Quotation (Romantic Period)<br />

__ 4. My heart leaps up when I behold<br />
a rainbow in the sky;<br />
So was it when my life began;<br />
So is it now I am a man;<br />
So be it when I shall grow old,<br />
Or let me die!<br />

__ 5. A thing of beauty is a joy for ever;<br />
Its loveliness increases; it will never<br />
Pass into nothingness; but still will keep<br />
A bower quiet for us, and a sleep<br />
Full of sweet dreams, and health, and quiet breathing.<br />

a. Robert Burns<br />

b. Lord Byron<br />

c. Samuel Taylor Coleridge<br />

d. John Keats<br />

e. Percy B. Shelley<br />

f. Alfred Lord Tennyson<br />

g. William Wordsworth<br />

DIRECTIONS: Listed below are major events in Asiatic history (left-h<strong>and</strong> column) and names of United States presidents (right-h<strong>and</strong> column). Match the event or incident with the name of the U.S. President who was in office when the event took place. Indicate your answer by placing the proper letter in the space provided. A letter may be used more than once or not at all.<br />

Historic Event<br />

__ 1. Panay incident<br />

__ 2. Philippine independence<br />

__ 3. Manchurian war (involving Russia)<br />

__ 4. Indo-China war<br />

__ 5. Korean war started<br />

__ 6. Boxer Rebellion<br />

President<br />

a. Coolidge<br />

b. Eisenhower<br />

c. Hoover<br />

d. McKinley<br />

e. F.D. Roosevelt<br />

f. Truman<br />

g. Wilson


DIRECTIONS: In each of the columns listed below are significant events in the growth of American democracy. You are to determine between which pair of events in the right-hand column each of the events in the left-hand column fits. Indicate your answer by marking the letter found between the pairs on the right in the space to the left of the number of the event at the left.<br />

__ 1. Split of the Republican party, giving birth to the Democratic-Republican party.<br />

__ 2. War of 1812<br />

__ 3. Use of the "Spoils System" for political office.<br />

__ 4. The purchase of the Louisiana Territory.<br />

__ 5. Fall of the Federalist party.<br />

First Bank established by ~<br />

a.<br />

Appointment of Chief Justice John Marshall<br />

b.<br />

Expansion of American trade due to the Napoleonic Wars.<br />

c.<br />

Treaty of Ghent<br />

d.<br />

Election of James Monroe as president<br />

e.<br />

Proclamation of the Monroe Doctrine<br />

f.<br />

Election of Andrew Jackson as president.<br />

g.<br />

Collapse of the Government Bank.<br />

h.<br />

Lissitz\Workshop
