A Brief Introduction to Evidence-Centered Design CSE Report 632

will be used for task-level feedback. In an operational assessment, evidence rules guide the Response Scoring process. It is important to note that evidence rules concern the identification and summary of evidence within tasks, in terms of observable variables.

For GRE items, the observable variable associated with each item is whether it is answered correctly or incorrectly. The rule by which its value is determined is comparing the student's response with the answer key: correct if they match, incorrect if they don't.

For DISC tasks, there are generally several observable variables evaluated from each complex task performance. In scenarios involving the initial assessment of a new patient, for example, there are five observables, including ratings of qualities of "Adapting to situational constraints" and "Adequacy of patient-history procedures." The evaluation rules are based on whether the examinee has carried out an assessment of the patient that addresses issues implied by the patient's responses, condition, and test results such as radiographs and probing depths.

• The Measurement Model part of the evidence model provides information about the connection between student model variables and observable variables. Psychometric models are often used for this purpose, including the familiar classical test theory and IRT, and the less familiar latent class models and cognitive diagnosis models. In an operational assessment, measurement models guide the Summary Scoring process. Measurement models concern the accumulation and synthesis of evidence across tasks, in terms of student model variables.

Looking ahead again, a graphical model containing both the student model variables and the observable variables is the machinery that effects probability-based accumulation and synthesis of evidence over task performances. For our GRE example, the measurement model is IRT. Figure 5 shows the measurement model used in the GRE CAT. It gives the probability of a correct or incorrect response to a particular Item j, as a function of a student's IRT proficiency variable, θ. When it comes time to update belief about a student's θ based on a response to this item, this fragment is joined with the student model discussed above, and the updating procedures discussed in, for example, Mislevy (1995) enter into play. Figure 6 depicts a measurement model for a more complex DISC task, in which five aspects of performance are captured as observable variables, and two aspects of proficiency are updated in terms of probability distributions for student model variables.
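The pipeline described above — an evidence rule that turns a raw response into an observable variable, and a measurement model that accumulates that evidence into belief about θ — can be sketched in a few lines of code. This is not taken from the report: the 2PL item response function, the item parameters, and the grid-based Bayesian update are illustrative assumptions chosen to make the machinery concrete.

```python
import numpy as np

def score_response(response, key):
    """Evidence rule for a GRE-style item: the observable variable is
    simply whether the response matches the key (1 = correct, 0 = incorrect)."""
    return 1 if response == key else 0

def irt_2pl(theta, a, b):
    """Illustrative 2PL item response function: probability of a correct
    response given proficiency theta, discrimination a, and difficulty b."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def update_theta(prior, grid, observed, a, b):
    """Accumulate evidence: multiply the prior over theta by the likelihood
    of the observed score and renormalize (Bayes' rule on a discrete grid)."""
    p = irt_2pl(grid, a, b)
    likelihood = p if observed == 1 else 1.0 - p
    posterior = prior * likelihood
    return posterior / posterior.sum()

# Illustration: a uniform prior over theta, updated after one item.
grid = np.linspace(-4.0, 4.0, 81)
prior = np.ones_like(grid) / grid.size
x = score_response("C", "C")                    # observable variable: 1 (correct)
post = update_theta(prior, grid, x, a=1.2, b=0.5)  # hypothetical item parameters
```

After a correct response, the posterior distribution over θ shifts toward higher proficiency; scoring further items repeats the same update, which is the grid-based analogue of the graphical-model accumulation the text describes.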
