Chapter 6: Assessment of Student Learning ... - scienceinquirer
<strong>Chapter</strong> 6: <strong>Assessment</strong> <strong>of</strong> <strong>Student</strong> <strong>Learning</strong><br />
Introduction<br />
This chapter will address the nature <strong>of</strong> assessment and the purposes <strong>of</strong> assessment at different<br />
levels in the educational system from the classroom, to the district and state, to the national and<br />
international levels.<br />
The big idea <strong>of</strong> assessment is that assessments are cyclical in nature: teachers and students use<br />
assessment to monitor student progress which in turn informs instructional decisions that support<br />
learning. This chapter will discuss the role <strong>of</strong> the teacher and the role <strong>of</strong> the student in the<br />
assessment instruction cycle. It will also address a variety <strong>of</strong> assessments designed to test student<br />
mastery <strong>of</strong> higher-order thinking and the integral role <strong>of</strong> the investigation and experimentation<br />
standards as part <strong>of</strong> assessment.<br />
Results from classroom assessments provide quality feedback to teachers. This chapter will<br />
address using data and results that will allow teachers to improve student learning, inform and<br />
guide instruction, and research their teaching practices. This chapter will also cover strategies<br />
for the assessment <strong>of</strong> English learners and special needs students. Information regarding the<br />
current statewide assessment system in science will also be covered.<br />
I. The Nature <strong>of</strong> <strong>Assessment</strong><br />
The nature <strong>of</strong> assessment is the essence <strong>of</strong> knowing what students should know, do, and<br />
understand. In science, assessments should provide students the opportunity to demonstrate their<br />
understanding <strong>of</strong> important and meaningful science content, to use scientific tools and processes,<br />
and to apply their knowledge and understanding to real-life situations.<br />
<strong>Assessment</strong> does not exist in isolation, but must be closely aligned with the goals <strong>of</strong> state<br />
standards, curriculum, and instruction to support learning. Current research calls for “balanced<br />
assessment systems” that align and restore:<br />
• Comprehensiveness—the use <strong>of</strong> multiple sources <strong>of</strong> evidence to draw inferences<br />
about an individual student’s pr<strong>of</strong>iciency,<br />
• Coherence—a shared model <strong>of</strong> learning that links curriculum, instruction and<br />
assessment within the classroom and links the classroom with large-scale assessments<br />
and<br />
• Continuity between classroom, district and state assessments calling for a longitudinal<br />
assessment <strong>of</strong> learning progress over time. i<br />
<strong>Assessment</strong>, testing, and educational measurement are <strong>of</strong>ten used interchangeably to refer to a<br />
process by which educators use students’ responses to stimuli in order to draw inferences about<br />
students’ knowledge and skills. ii While testing usually refers to standardized multiple-choice<br />
instruments, assessment per se denotes a more comprehensive view <strong>of</strong> student performance. iii<br />
The terms assessment and evaluation are also used interchangeably in many contexts.<br />
<strong>Assessment</strong> refers to the judgment <strong>of</strong> student performance and evaluation refers to the judgment<br />
<strong>of</strong> programs or organizational effectiveness. iv<br />
II. Purpose <strong>of</strong> <strong>Assessment</strong><br />
<strong>Assessment</strong> is a systemic, multi-step process involving the collection and interpretation <strong>of</strong><br />
educational data. As the primary feedback mechanisms in the educational system, assessments<br />
provide information to students about how well they are performing; to teachers about how well<br />
their students are learning; to districts about the effectiveness <strong>of</strong> their teachers and programs; and<br />
to policymakers about the effects <strong>of</strong> their policies. The intent <strong>of</strong> this feedback is to allow<br />
stakeholders within the educational system to make informed decisions regarding improved<br />
student learning, teacher development, program modifications, and changes in policy. v<br />
The purposes <strong>of</strong> assessment can be categorized into three main areas: (1) support <strong>of</strong> student<br />
learning; (2) certification, which includes reporting individual student achievement, placement<br />
and/or promotion; and (3) accountability, which is designed to evaluate programs and inform<br />
policy. The first purpose focuses primarily on formative and summative classroom assessments<br />
while the second and third are geared more toward large-scale assessments including district,<br />
state, and national tests.<br />
Formative and Summative Classroom <strong>Assessment</strong>s<br />
Formative assessment is defined as assessment carried out during the instructional process for<br />
improving teaching or learning. vi <strong>Assessment</strong> becomes formative only when either the teacher or<br />
the student uses that information to inform teaching and/or to influence learning. vii<br />
Formative assessments are informal, ongoing assessments that provide continuous opportunities<br />
for teachers to observe, question, listen, and provide immediate feedback to students. Formative<br />
assessment also provides opportunities for students to become more involved in the assessment<br />
process and to become self-reflective about their own learning.<br />
The line between instruction and assessment is blurred in classrooms where formative<br />
assessment is used to support learning. “Everything students do—from conversing in groups,<br />
completing seatwork, answering and asking questions, to sitting quietly and looking confused—<br />
is a potential source <strong>of</strong> information about what they do and do not understand. The teacher who<br />
consciously uses assessment to support learning takes in this information, analyzes it, and makes<br />
instructional decisions that address the understandings and misunderstandings that these<br />
assessments reveal.” viii<br />
While formative assessments occur minute-by-minute and day-by-day, summative assessments<br />
are cumulative assessments, usually occurring at the end <strong>of</strong> a unit <strong>of</strong> instruction. Designed to<br />
measure what a student has learned after a certain period <strong>of</strong> time, summative assessments are<br />
administered less frequently than formative assessments. Teachers also use summative<br />
assessments as pretests to see what students understand before they teach a unit <strong>of</strong> instruction<br />
and as posttests afterwards to see what students learned as a result <strong>of</strong> their instruction.<br />
Summative assessments are also used for reporting grades at the end <strong>of</strong> a semester.<br />
Summative classroom assessments should (1) enable students to draw on what they have learned<br />
to explain new phenomena, think critically and make informed decisions ix , and (2) consist <strong>of</strong><br />
multiple measures including hands-on performance tasks, constructed response investigations,<br />
long and short essays, portfolios, interactive computer tasks, and well-constructed multiple-choice<br />
tests.<br />
District Summative <strong>Assessment</strong>s<br />
School districts administer summative tests to students throughout the year to determine if<br />
students are learning the grade level science content recommended in the state standards and to<br />
evaluate the district science program. Examples <strong>of</strong> the types <strong>of</strong> summative science assessments<br />
used in school districts in California include benchmark and interim assessments and end-<strong>of</strong>-course<br />
tests.<br />
Benchmark and interim assessments are used to monitor progress during the school year toward<br />
meeting state standards and NCLB performance goals. x These assessments usually consist <strong>of</strong><br />
multiple-choice questions and are administered at the end <strong>of</strong> every quarter. These types <strong>of</strong><br />
assessments focus on program evaluation and provide teachers with information about which<br />
science standards students have mastered. Current research does not show that benchmark or<br />
interim assessments help to improve student learning or achievement in science. xi<br />
End-<strong>of</strong>-course tests are used by districts at the high school level to determine the content learned<br />
by students as a result <strong>of</strong> taking a specific course <strong>of</strong> study. Districts also implement end-<strong>of</strong>-course<br />
tests to: establish the effectiveness <strong>of</strong> the curriculum in each science domain; ensure that<br />
course content is focused on state standards; establish a common level <strong>of</strong> expected student<br />
performance; ensure that evaluation <strong>of</strong> student performance is consistent across classrooms and<br />
schools in the district; and to help identify students who need additional help to meet graduation<br />
requirements.<br />
Districts participating in state-funded projects also administer summative assessments; for<br />
example, districts participating in the California Mathematics Science Partnerships (CaMSP) are<br />
required to administer a standards-based assessment as a pretest and posttest to students in both<br />
treatment and control groups at the beginning and end <strong>of</strong> the school year. The analyses <strong>of</strong> the<br />
pretest and posttest results are used to determine if the treatment teachers’ training makes a<br />
difference in student learning and achievement <strong>of</strong> science. Districts are using a variety <strong>of</strong><br />
multiple-choice tests aligned to enduring grade level standards.<br />
State Summative <strong>Assessment</strong>s<br />
The California Standards Tests (CSTs) are summative assessments that measure student<br />
achievement <strong>of</strong> the Science Content Standards for California Public Schools. The CSTs are a<br />
battery <strong>of</strong> standardized tests that comprise the state’s STAR (Standardized Testing and<br />
Reporting) Program. All students in grades two through eleven participate in the STAR Program<br />
including students with disabilities and students who are English learners. Section VII <strong>of</strong> this<br />
chapter addresses the science portion <strong>of</strong> the CST in more depth.<br />
National and International <strong>Assessment</strong>s<br />
The National <strong>Assessment</strong> <strong>of</strong> Educational Progress (NAEP)—also known as the Nation’s Report<br />
Card—measures fourth-, eighth-, and twelfth-grade students’ performance in science with<br />
assessments designed specifically for national and state information needs. The NAEP<br />
assessment contains multiple measures including: multiple-choice items, constructed response<br />
questions, hands-on performance tasks, and interactive computer tasks. All <strong>of</strong> the components <strong>of</strong><br />
NAEP are aligned to the content recommendations in the 2009 NAEP Science Framework.<br />
Participation in NAEP allows states to compare student achievement to the achievement <strong>of</strong><br />
students in other states.<br />
The international assessments, the Trends in International Mathematics and Science Study<br />
(TIMSS) and the Program for International <strong>Student</strong> <strong>Assessment</strong> (PISA), enable the United States<br />
to benchmark its performance—in fourth-grade and eighth-grade mathematics and science in<br />
TIMSS and in 15-year-old students’ mathematics, science, and reading literacy in PISA—to that<br />
<strong>of</strong> other countries.<br />
Each test was designed to serve a different purpose and each is based on a separate and unique<br />
framework and set <strong>of</strong> assessment questions although content areas assessed and ages and grade<br />
levels <strong>of</strong> students are significantly similar. The definitions <strong>of</strong> science differ among the three<br />
assessments: The NAEP framework defines science as physical science, life science and earth<br />
science. TIMSS also includes life science and earth science, but with regard to physical science,<br />
TIMSS splits it into separate domains for physics and chemistry. NAEP identifies three<br />
categories <strong>of</strong> “knowing and doing science” as conceptual understanding, scientific investigation,<br />
and practical reasoning. PISA takes a broader approach than both NAEP and TIMSS in<br />
addressing important competencies required for scientific literacy: identifying scientific issues,<br />
explaining scientific phenomena, and using scientific evidence. PISA’s content can be divided<br />
into knowledge <strong>of</strong> the natural world (in the fields <strong>of</strong> life systems, physical systems, Earth and<br />
space systems, and technology systems) and knowledge about science itself (scientific inquiry<br />
and scientific explanations). All three assessments are conducted regularly to allow the<br />
monitoring <strong>of</strong> student outcomes over time.<br />
III. <strong>Assessment</strong> <strong>of</strong> <strong>Student</strong> <strong>Learning</strong><br />
Quality classroom assessment informs instruction and improves student understanding, learning<br />
and achievement. Both formative and summative assessments make up quality classroom<br />
assessments. Formative assessment is defined as a planned, ongoing dynamic process in which<br />
teachers and students use evidence to adjust teaching and learning. Measurements <strong>of</strong> student<br />
learning, such as scores from a summative test, are just one component <strong>of</strong> the formative<br />
assessment process. While informal or formal assessments play a role in this process, they are<br />
not the process itself. xii<br />
In this chapter, assessment <strong>of</strong> student learning in science is defined as a process <strong>of</strong> formative<br />
assessment that integrates instruction with multiple measures <strong>of</strong> student ability including a<br />
variety <strong>of</strong> techniques for various learning styles and levels <strong>of</strong> readiness.<br />
The research base clearly supports the process <strong>of</strong> formative assessment in improved student<br />
learning and achievement. A synthesis <strong>of</strong> more than 4,000 research studies undertaken during the<br />
last 40 years consistently shows that, when implemented well, formative assessment can<br />
significantly improve student learning and achievement more than any other educational<br />
outcome. xiii<br />
“The big idea <strong>of</strong> formative assessment is that evidence about student learning is used to adjust<br />
instruction to better meet student needs; in other words, teaching is adaptive to the student’s<br />
learning needs and assessment is done in real time.” xiv When teachers engage in formative<br />
assessment, the purpose <strong>of</strong> assessment changes from just measuring what students know to<br />
enhancing student learning. xv In this new role, assessment is a shared responsibility between<br />
teachers and students.<br />
The Teacher’s Role in Classroom <strong>Assessment</strong><br />
The teacher’s role in ongoing assessment is to facilitate student growth, understanding, and<br />
learning. Teachers use continuous assessments to: improve classroom practice, plan curricula,<br />
develop self-directed learners, report student progress, and investigate their own teaching<br />
practices. xvi<br />
While numerous strategies can be used for formative assessment, current research shows that<br />
higher-level questioning, descriptive feedback, student self-assessment and reflection, and<br />
student self-regulated learning all have a positive effect on student achievement and on students’ ability<br />
to transfer their learning to new situations. xvii<br />
Good questioning is at the heart <strong>of</strong> classroom practice. Teachers spend at least 80% <strong>of</strong> their time<br />
engaged in questioning on any given day. Research shows that questioning can improve student<br />
learning when teachers: (1) structure questions around information that is critical to the topic, not<br />
around information that might be interesting or unusual; (2) ask questions that are higher-level—<br />
the questions require students to analyze, synthesize, and apply information instead <strong>of</strong> recalling<br />
facts; (3) provide students with “wait time” after a question so that students have time to think<br />
about their response; and (4) help students establish a mental map to process their learning<br />
experiences. xviii<br />
Teacher feedback is central to formative assessment. The effectiveness <strong>of</strong> formative assessment<br />
is dependent on both the quality <strong>of</strong> the information gathered and the quality <strong>of</strong> the feedback<br />
provided. In a study on teacher grading and feedback, researchers investigated the effectiveness<br />
<strong>of</strong> different kinds <strong>of</strong> feedback over a series <strong>of</strong> lessons. The students were randomly assigned to<br />
one <strong>of</strong> three groups: Group A received written feedback clearly describing what they did<br />
correctly, what was incorrect, and what was needed to improve their work; Group B received<br />
only grades derived from scoring; and Group C received grades and general comments. The<br />
scores for the students in Group A that received constructive descriptive feedback increased<br />
significantly from the first to the second lesson, while the scores for the students in Groups B and<br />
C declined between the first and the second lesson. xix<br />
Research shows that for feedback to be effective it should be: (1) corrective in nature,<br />
clearly describing what the student is doing correctly, what is incorrect, and what needs to be done<br />
to improve the work; (2) provided in a timely manner; and (3) specific to a criterion, referencing<br />
a specific level <strong>of</strong> skill or knowledge. xx<br />
The <strong>Student</strong>’s Role in Classroom <strong>Assessment</strong><br />
<strong>Student</strong>s are ultimately responsible for taking action to bridge the gap between where they are<br />
and where they need to go in their learning. xxi Research shows that when students have insight<br />
into their own strengths and weaknesses and develop their own repertoires <strong>of</strong> strategies for<br />
learning, their learning improves. Self-assessment, peer-assessment, and self-regulation are<br />
metacognitive strategies that assist students in improving their own learning. xxii<br />
Peer-assessment is a powerful complement to formative assessment. <strong>Student</strong> discourse during<br />
peer-assessment is valuable because it allows students to assume the role <strong>of</strong> the teacher. In the<br />
role <strong>of</strong> teacher, students have to make sure that they understand the content so that they can<br />
evaluate the understanding <strong>of</strong> their peers. xxiii<br />
As students become self-regulated learners, they are able to describe their strengths, analyze<br />
learning tasks to consider their options, explain their choices in completing their learning tasks,<br />
and regularly set goals for future learning. xxiv<br />
In order for self-assessment, peer-assessment and self-regulated learning to become effective<br />
components <strong>of</strong> student learning, students must understand the criteria used to evaluate their work<br />
and the difference between quality work and substandard work. <strong>Student</strong>s should also be taught<br />
the habits and skills <strong>of</strong> the collaborative process used in peer-assessment, requiring them to see<br />
their work objectively. To become self-directed learners, students set their sights on their own<br />
learning goals and understand the steps they must go through in order to get there. xxv<br />
Strategies and Techniques for Formative <strong>Assessment</strong><br />
Research maintains that the process <strong>of</strong> effective formative assessment consists <strong>of</strong> five key<br />
strategies. xxvi Figure 1 below outlines the five key strategies and one suggested technique for<br />
implementing each strategy. xxvii<br />
Figure 1: Key Strategies and Techniques for Effective Formative <strong>Assessment</strong><br />
Strategy 1: Clarifying <strong>Learning</strong> Intentions and Sharing Criteria for Success<br />
Technique: Sharing Exemplars<br />
Description: Before asking 11th grade students to write a lab report, the teacher<br />
gives each student four sample lab reports representing varying<br />
degrees <strong>of</strong> quality. The lab reports are teacher-generated or from<br />
a previous class. <strong>Student</strong>s are asked to analyze the reports and<br />
identify why certain reports are <strong>of</strong> a higher quality than others.<br />
Strategy 2: Engineering Effective Classroom Discussions, Questions, and <strong>Learning</strong> Tasks that Elicit Evidence <strong>of</strong> <strong>Learning</strong><br />
Technique: White Boards<br />
Description: During a 4th grade lesson on magnetism, the teacher asks the class<br />
what would happen if two like poles <strong>of</strong> two magnets were placed<br />
together. He asks the class to write their answers on their white<br />
boards and hold them up on the count <strong>of</strong> three. Using this kind <strong>of</strong><br />
“all student response system” helps the teacher to get a sense <strong>of</strong><br />
what students understand while requiring all students to engage in<br />
the task. If all answers are correct, the teacher moves on. If none<br />
are correct, the teacher may choose to re-teach the concept in<br />
another way. If there are a variety <strong>of</strong> answers, the teacher can use<br />
the information from the student answers to direct a class<br />
discussion.<br />
Strategy 3: Providing Feedback that Moves Learners Forward<br />
Technique: Find it and Fix it<br />
Description: <strong>Student</strong>s in a 7th grade classroom just completed a task on plant<br />
and animal cells. Rather than checking all correct answers and<br />
putting a check next to incorrect ones, the teacher tells a student<br />
“three <strong>of</strong> your answers are incorrect; find them and fix them.” This<br />
requires the student to engage cognitively in response to the<br />
feedback rather than reacting emotionally to a letter grade.<br />
Strategy 4: Activating <strong>Student</strong>s as Learners <strong>of</strong> Their Own <strong>Learning</strong><br />
Technique: Traffic Lighting<br />
Description: After students in a 3rd grade class complete a lesson on energy and<br />
matter, they review the learning goal their teacher provided at the<br />
beginning <strong>of</strong> the lesson and hold up a colored circle to indicate their<br />
level <strong>of</strong> understanding. Green means I understand; yellow, I’m not<br />
sure; and red, I do not understand. At regular intervals, the teacher<br />
provides time in class for students to move their learning forward by<br />
turning their reds to yellows and their yellows to greens.<br />
Strategy 5: Activating <strong>Student</strong>s as Instructional Resources for One Another<br />
Technique: Pre-Flight Check List<br />
Description: For homework, students in a 9th grade class write a paper on a<br />
science-based societal issue. Before turning in their work, students<br />
trade papers with a peer. Each student completes a “pre-flight<br />
checklist” by comparing the peer’s document against a list <strong>of</strong><br />
required elements, e.g., identify a science-based societal issue, cite<br />
research studies, analyze data, and communicate findings.<br />
As teachers utilize these key strategies and techniques for formative assessment and integrate<br />
them into their practice, they view their own practice in new ways.<br />
Implementing and Sustaining Formative <strong>Assessment</strong> with Teacher <strong>Learning</strong> Communities<br />
Formative assessments are not common practices in most teachers’ classrooms and changes in<br />
teacher practice are not always easy to implement. Furthermore, pr<strong>of</strong>essional development in<br />
almost any aspect <strong>of</strong> assessment is sparse. By working with practicing classroom teachers in real<br />
time, researchers have identified practical suggestions for setting up Teacher <strong>Learning</strong><br />
Communities (TLC) to implement and sustain formative assessment. xxviii Figure 2 below outlines<br />
a strategy found to be successful in establishing a TLC. xxix<br />
Figure 2: Strategies for Implementing a Teacher <strong>Learning</strong> Community Around Classroom<br />
<strong>Assessment</strong><br />
Suggestion: Plan for the TLC to run for at least two years<br />
Rationale: Formative assessment is not a quick fix. It takes time to learn,<br />
practice, and refine your strategies.<br />
Suggestion: Start with volunteers<br />
Rationale: Formative assessment cuts across many established practices in<br />
schools and volunteers are more likely to find ways around obstacles.<br />
Suggestion: Meet monthly for at least 75 minutes<br />
Rationale: Monthly meetings are more suited to teachers’ schedules and time.<br />
To ensure that all individuals have adequate time to report and share,<br />
the meeting should last 75 minutes or longer.<br />
Suggestion: Aim for a group <strong>of</strong> 8-10 teachers<br />
Rationale: When the group is too small, there are not enough differences <strong>of</strong><br />
opinion to provide for good teacher learning. When the group is too<br />
large, all members may not have time to talk.<br />
Suggestion: Try to group teachers with similar assignments<br />
Rationale: Teachers should work and share in small grade level groups.<br />
Suggestion: Establish building-based groups<br />
Rationale: While cross-building meetings can be productive, it is best to work<br />
within sites so that support can be maintained with a group <strong>of</strong> trusted<br />
colleagues.<br />
Suggestion: Require teachers to make detailed, modest, individual action plans<br />
Rationale: At the first meeting, each teacher should make a specific plan about<br />
what they want to change. Teachers should focus on a small number<br />
<strong>of</strong> changes they can integrate into their practice.<br />
Suggestion: Creating an Action Plan<br />
Rationale: The following questions are intended to help teachers format their<br />
own action plans:<br />
1. What is one thing that you will find easy to change? What<br />
difference do you expect it to make to your practice?<br />
2. What is one thing that you would like to change that will require<br />
support? What help would you need?<br />
3. What other changes would you like to make later on in the year?<br />
What help might you need?<br />
4. What will you do differently or stop doing to implement these<br />
changes?<br />
Suggestion: Teacher Leader to organize and coordinate meetings<br />
Rationale: Someone needs to make sure the meetings happen, e.g., secure a<br />
room, send out the agenda, secure refreshments and so on. This<br />
person should not be an expert. The idea <strong>of</strong> a TLC is that each<br />
person comes with a clear idea about what they want help with and<br />
the group helps that person with the task.<br />
The following five-part process xxx was also found to be successful in implementing and<br />
sustaining teacher learning community meetings:<br />
1. Introduction (5-10 minutes): Participating teachers agree on the goals <strong>of</strong> the meeting and<br />
agenda.<br />
2. How’s it going? (30-50 minutes): Each teacher provides a summary <strong>of</strong> what they did in<br />
relation to their action plan during the previous month. The other teachers listen and<br />
provide support for that teacher in moving their plans forward.<br />
3. New learning about formative assessment (25-40 minutes): The teacher leader or a small<br />
group <strong>of</strong> other teachers research and introduce new ideas in formative assessment to the<br />
group. The teachers engage in shared activities intended to improve their understanding<br />
<strong>of</strong> formative and summative assessment.<br />
4. Individual teacher planning (10-15 minutes): Based on the group discussion, feedback<br />
and new learning, teachers may want to revise their action plans. Teachers need time to<br />
think through what they are planning to do in the next month. They may also want to<br />
discuss new ideas with their colleagues.<br />
5. Review <strong>of</strong> the meeting (5 minutes): The lead teacher redirects the group to the original<br />
goals and objectives for the meeting and checks to see if they were achieved.<br />
Teacher learning communities have the potential to support the implementation <strong>of</strong> formative<br />
assessments while instilling ownership in teachers for their own pr<strong>of</strong>essional development. The<br />
strategies mentioned above provide the foundation for a practical and workable model that will<br />
enable schools to initiate and sustain teacher pr<strong>of</strong>essional development based on formative<br />
assessment.<br />
The <strong>Assessment</strong> Instruction Cycle<br />
During the assessment instruction cycle, teachers continuously observe student behavior, collect<br />
evidence, and make reasonable inferences about what students know. <strong>Assessment</strong> is central to<br />
teaching and to instruction—an invisible thread connects assessment, curriculum and teaching<br />
together in the service <strong>of</strong> learning. xxxi<br />
There are four major components to the assessment instruction cycle: (1) achievement<br />
expectations; (2) the cyclical nature <strong>of</strong> assessment and instruction; (3) multiple forms <strong>of</strong><br />
assessment; and (4) evidence and feedback. xxxii<br />
The bases for state, district and classroom assessment, as well as curriculum and instruction in<br />
California are the State Science Content Standards. Achievement expectations start with the state<br />
standards and there is strong alignment among the state standards, the state adopted science<br />
curricula, the teachers’ instructional practices and the students’ learning goals. <strong>Student</strong> learning<br />
goals are clearly translated into plain language that all students can understand. Teachers guide<br />
students through well-defined learning progressions and students understand where they need to<br />
go next to accomplish their goals. Teachers also provide students with criteria for how their work<br />
will be judged and exemplars or models <strong>of</strong> quality student work. xxxiii<br />
<strong>Assessment</strong> and instruction are cyclical in nature. Teachers and students use assessment to<br />
monitor student progress, which in turn informs instructional decisions that support learning.<br />
Teachers assess, determine needs, provide descriptive feedback, set goals, provide guided<br />
practice, and keep the cycle in continuous motion. <strong>Student</strong>s work with their teacher to know<br />
where they are in their learning continuum. With their teacher’s guidance, students track and<br />
manage their progress, assess and reflect on their learning, set goals, learn, and keep the cycle in<br />
continuous motion. xxxiv<br />
Teachers use multiple forms <strong>of</strong> assessment that yield accurate information about students to<br />
support their learning and achievement. Teachers are continuously collecting evidence, analyzing<br />
it, and providing timely descriptive feedback to students. The evidence and feedback are: directly<br />
related to the standards and to the students’ learning goals; communicated and understood by<br />
students to encourage self-reflection and goal setting; and used to show growth and improvement<br />
over time for students, teachers, and parents. xxxv<br />
IV. Examples <strong>of</strong> Quality Formative and Summative Science <strong>Assessment</strong>s<br />
<strong>Assessment</strong>s should provide students the opportunity to demonstrate their understanding <strong>of</strong><br />
important and meaningful science content, to use scientific tools and processes, to apply their<br />
understandings to solve new problems, and to draw on what they have learned to explain new<br />
phenomena, think critically, and make informed decisions. xxxvi All assessments should have clear<br />
expectations for students, be valid, reliable, and free <strong>of</strong> bias.<br />
Validity<br />
Three types <strong>of</strong> validity are central to assessment: content validity; construct validity; and<br />
instructional validity. Content validity addresses the degree to which an assessment measures the<br />
intended content <strong>of</strong> the standards. Construct validity refers to the degree to which an assessment<br />
measures a “construct” or ability. The Investigation and Experimentation standards, for example,<br />
outline the skills or constructs necessary to engage in scientific inquiry. To make a valid claim<br />
about a student’s ability to conduct inquiry, the assessment would need to assess the range <strong>of</strong><br />
skills in the Investigation and Experimentation standards. Finally, an assessment has<br />
instructional validity if the content <strong>of</strong> the test matches what is actually being taught during<br />
instruction.<br />
Reliability<br />
When assessments are reliable, they consistently measure what they are intended to measure.<br />
There are three kinds <strong>of</strong> consistency in classroom assessments: stability—the consistency <strong>of</strong><br />
student scores over time; alternate test forms—consistency <strong>of</strong> results among two or more<br />
different forms <strong>of</strong> a test; and internal consistency—consistency in the way items on an<br />
assessment work. xxxvii<br />
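Internal consistency is often summarized with a single statistic. This chapter does not name one, so the Python sketch below is an assumption on our part: it computes the Kuder-Richardson Formula 20 (KR-20), a standard statistic for items scored 1 (correct) or 0 (incorrect). The function name, the sample data, and the use of population variance for the total scores are illustrative choices.<br />

```python
from statistics import pvariance


def kr20(score_matrix):
    """Kuder-Richardson Formula 20 for dichotomous (1/0) item scores.

    score_matrix: one list per student, one 1/0 entry per item.
    Values near 1 suggest the items behave consistently with one another.
    """
    n_students = len(score_matrix)
    if n_students < 2:
        raise ValueError("need at least two students")
    k = len(score_matrix[0])
    if k < 2:
        raise ValueError("need at least two items")
    if any(len(row) != k for row in score_matrix):
        raise ValueError("every student needs a score for every item")

    # Variance of students' total scores (population variance is assumed here).
    totals = [sum(row) for row in score_matrix]
    var_total = pvariance(totals)
    if var_total == 0:
        raise ValueError("total scores do not vary; KR-20 is undefined")

    # Sum over items of p * (1 - p), where p is the proportion answering correctly.
    pq_sum = 0.0
    for i in range(k):
        p = sum(row[i] for row in score_matrix) / n_students
        pq_sum += p * (1 - p)

    return (k / (k - 1)) * (1 - pq_sum / var_total)


# Hypothetical scores for four students on three items (illustration only).
sample_scores = [
    [1, 1, 1],
    [0, 0, 0],
    [1, 1, 1],
    [0, 0, 0],
]
print(round(kr20(sample_scores), 3))
```

In this invented sample every item separates the same students, so the statistic reaches its maximum; real classroom data will usually fall lower.<br />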
Bias<br />
Sometimes assessments can be biased against particular groups <strong>of</strong> students. When an assessment<br />
is biased, the constructs <strong>of</strong> the test cause students to perform poorly. All assessments should be<br />
free <strong>of</strong> bias—they should not penalize students because <strong>of</strong> their gender, ethnicity, socioeconomic<br />
status, religion, or other defining characteristics. <strong>Assessment</strong>s should also not be <strong>of</strong>fensive to<br />
students. xxxviii Different forms <strong>of</strong> bias include: xxxix<br />
• Content Bias: Does the assessment contain content that is different or unfamiliar to<br />
different groups? Example: asking girls to compare the mass <strong>of</strong> different footballs when<br />
they have not had experience with footballs.<br />
• Language Bias: Does the assessment contain words that have different or unfamiliar<br />
meanings for different groups? Example: asking urban students about farming techniques<br />
such as forage pits.<br />
• Item Structure and Format Bias: Does the nature <strong>of</strong> the task confuse members <strong>of</strong> different<br />
groups? Example: requiring English learners to write a long essay in English.<br />
• Stereotyping: Does the assessment give a positive representation <strong>of</strong> different groups?<br />
<strong>Assessment</strong>s should be free <strong>of</strong> material that may be <strong>of</strong>fensive, demeaning, or emotionally<br />
charged.<br />
• Fairness: Is the assessment balanced in terms <strong>of</strong> being equally familiar to every group?<br />
Tests should be free <strong>of</strong> words or phrases that are generally associated with elitism-- polo,<br />
yacht, regatta; finances--venture capital, stock options; regionalisms--grinder, hoagie,<br />
parish; military topics--rapier, mortar, breech; political topics--alderman, pork barrel;<br />
legal topics--tort, docket; and farm topics--combine, thresher.<br />
Assessing the Science Content Standards for California Public Schools<br />
<strong>Assessment</strong>s should cover the content <strong>of</strong> the standards at each grade level including the standards<br />
for Investigation and Experimentation. The Investigation and Experimentation standards are<br />
central to the role <strong>of</strong> assessment in the teaching <strong>of</strong> science. Involving students in scientific<br />
inquiry helps them develop pr<strong>of</strong>iciency in: 1) understanding scientific concepts; 2) appreciating<br />
how and what we know in the realm <strong>of</strong> science; 3) understanding the nature <strong>of</strong> science; 4) the<br />
ability to inquire about the natural world; and 5) the ability to use the skills and attitudes<br />
associated with science. xl<br />
The Investigation and Experimentation standards are multifaceted—they call for students to<br />
make observations, pose questions, make predictions, plan and conduct investigations, use tools<br />
to gather, analyze and use data, generate and evaluate evidence and explanations, use critical and<br />
logical thinking, examine information, consider alternative explanations, and communicate their<br />
results.<br />
<strong>Student</strong> understanding <strong>of</strong> this rich array <strong>of</strong> skills cannot be captured in a simple set <strong>of</strong> multiple-choice<br />
questions. <strong>Assessment</strong>s should consist <strong>of</strong> different strategies ranging from formative<br />
assessments which include teacher observations and feedback to challenge statements, to<br />
summative assessments which include hands-on performance tasks, constructed response<br />
investigations, open-ended questions, portfolios, and well-constructed multiple-choice tests.<br />
Multiple-Measures <strong>of</strong> <strong>Student</strong> Achievement<br />
<strong>Assessment</strong>s should be based on multiple measures <strong>of</strong> student ability and include a variety <strong>of</strong><br />
techniques for various learning styles and levels <strong>of</strong> readiness. Figure 4 below outlines examples<br />
<strong>of</strong> formative and summative assessments.<br />
Figure 4: Examples <strong>of</strong> Formative and Summative <strong>Assessment</strong>s<br />
Formative:<br />
• Teacher Observation, Listening, Questioning and Feedback<br />
• Self-reflection and Self-assessment<br />
• Peer <strong>Assessment</strong> and Reflection<br />
• Science Notebooks<br />
• White Boards<br />
• Graphic Organizers: Concept Maps, Concept Webs, Venn Diagrams, Flowcharts<br />
• Challenge Statements<br />
Summative:<br />
• Hands-On Performance Tasks<br />
• Constructed Response<br />
• Open-ended Questions<br />
• Multiple-choice Questions<br />
• Portfolios<br />
• Extended Research Projects<br />
• <strong>Student</strong> Presentations<br />
• Interviews<br />
• Homework Assignments<br />
• Interactive Computer <strong>Assessment</strong>s<br />
Constructed Response Items<br />
Constructed response items require students to write their own answers. <strong>Student</strong> responses are<br />
scored with a scoring rubric tailored specifically to each task. Scoring rubrics can be holistic<br />
(where a single score is assigned to the entire task) or analytical (where each question on a task<br />
receives an individual score). Analytical rubrics are more diagnostic in nature and provide more<br />
detailed information regarding student understanding <strong>of</strong> science content and inquiry constructs in<br />
the task.<br />
Hands-on Performance Tasks<br />
Hands-on performance tasks integrate standards for life, earth and/or physical science with<br />
Investigation and Experimentation constructs. During a hands-on task, students are presented<br />
with a scenario identifying a problem that needs to be solved. <strong>Student</strong>s are provided hands-on<br />
materials organized on a placemat, and asked to: make predictions; set up and conduct an<br />
investigation; record data and observations; organize data (graphs, charts, tables, etc.); explain if<br />
and how the results <strong>of</strong> their investigation either support or refute their prediction; analyze their<br />
results and use their own data and findings to explain their answers; use what they’ve learned in<br />
the task to make an application beyond the task; and/or think <strong>of</strong> another (new) question to<br />
investigate and briefly describe the steps <strong>of</strong> a plan for a new investigation. <strong>Student</strong>s work with a<br />
partner to conduct their investigation and to collect their data. They work individually to record<br />
their answers in their test booklet.<br />
Examples <strong>of</strong> performance tasks are in Appendix A.<br />
Constructed Response Investigations<br />
Constructed Response Investigations are extended paper/pencil tasks that integrate science<br />
concepts with inquiry and investigation. <strong>Student</strong>s are presented with a problem that hypothetical<br />
students in another school are trying to solve. They are provided a set <strong>of</strong> authentic data and<br />
a set <strong>of</strong> questions and required to: analyze the problem and the data; graph and interpret data;<br />
interpret relationships on graphs; construct models, questions, predictions and/or hypotheses;<br />
recommend solutions; and/or design new investigations to further explore the problem in the<br />
task. Although students usually work individually, these tasks can be designed to include<br />
information that students would discuss with a partner before writing their individual responses.<br />
Examples <strong>of</strong> constructed response tasks are in Appendix A.<br />
Open-ended Questions<br />
Open-ended questions are short paper/pencil tasks that focus on evaluating understanding and<br />
reasoning. They are designed to explore students’ abilities to: communicate scientific<br />
understandings; use inquiry; reason scientifically; express positions on societal issues; and<br />
design an experiment. <strong>Student</strong>s are presented with a prompt, usually in the form <strong>of</strong> a problem or<br />
scenario, and asked to communicate their understandings <strong>of</strong> scientific concepts and processes.<br />
<strong>Student</strong>s work individually to record their responses in their test booklet.<br />
Examples <strong>of</strong> open-ended questions are in Appendix A.<br />
Challenge Statements<br />
Challenge Statements are assessment probes designed to investigate students’ thinking about<br />
important science concepts. The assessment probe consists <strong>of</strong> a deliberately provocative or<br />
ambiguous statement about a science concept such as—“As electrical current passes through<br />
devices such as light bulbs and motors, some <strong>of</strong> it gets used up.” The learner is asked to agree or<br />
disagree with the statement and to explain their reasoning. <strong>Student</strong>s are expected to explain their<br />
thinking using everyday language and not use academic vocabulary. Academic vocabulary can<br />
be used as a screen for not revealing misconceptions. The goal <strong>of</strong> Challenge Statements is to<br />
make student thinking visible and not hide their misconceptions behind their science vocabulary.<br />
Challenge Statements are used before and after a unit <strong>of</strong> instruction. <strong>Student</strong>s start by thinking<br />
about the Challenge Statement and writing their thoughts individually. They discuss their ideas<br />
with their peers and then have an opportunity to revise their statement based on input from their<br />
group. Challenge Statements demand deeper thinking and investigation. They set the stage for<br />
meaningful discussion as part <strong>of</strong> learning.<br />
Challenge Statements are evaluated using a 5-point rubric modeled after the five levels <strong>of</strong><br />
pr<strong>of</strong>iciency measured in the California Standards Tests. In evaluating responses, valid<br />
conceptions and sophistication <strong>of</strong> reasoning are considered.<br />
<strong>Student</strong> Science Notebooks<br />
<strong>Student</strong> Science Notebooks engage students in scientific thinking as they explore questions,<br />
make predictions, plan and conduct investigations, collect, organize and use data, apply their<br />
learning, and communicate their understanding <strong>of</strong> science. As an assessment tool, science<br />
notebooks have been found to: help students construct their conceptual thinking; inform and<br />
guide instruction; enhance literacy skills; support differentiated learning; and foster teacher<br />
collaboration.<br />
White Boards<br />
White Boards are powerful tools for allowing students to make their thinking visible. The use <strong>of</strong><br />
white boards at the beginning <strong>of</strong> an instructional unit is an effective way to elicit students’ prior<br />
knowledge <strong>of</strong> the content to be taught. Before teaching a fourth grade lesson on circuits, a<br />
teacher may ask the class to quickly draw a complete circuit on their white boards and hold them<br />
up. The teacher can easily find out which students understand circuits and use this information to<br />
teach the lesson. During the lesson, the teacher may ask expert students to use their white boards<br />
to explain their thinking. This provides novice learners an opportunity to learn from expert<br />
thinking, which is usually hidden. xli At the end <strong>of</strong> the lesson, the teacher may have the students<br />
use the white boards to show what they learned and use this information to prepare for the next<br />
lesson.<br />
Graphic Organizers: Concept Maps, Venn Diagrams, Flowcharts<br />
Graphic organizers, such as concept maps, Venn diagrams, and flowcharts are mental maps <strong>of</strong><br />
student thinking and understanding. Concept maps help students see the connections between<br />
concepts and the differences among concepts. Venn diagrams help students see the relationships<br />
between ideas, and flowcharts can help students to sequence events. Like white boards, they can<br />
be used as assessment strategies for making student thinking visible, helping teachers assess<br />
what students do and do not understand.<br />
Portfolios<br />
Portfolios are collections <strong>of</strong> student work designed to provide the best evidence <strong>of</strong> a student’s<br />
scientific literacy. They are used to measure student growth over time, showing achievement <strong>of</strong><br />
science concepts, the deepening <strong>of</strong> understanding <strong>of</strong> the scientific method, and the growth <strong>of</strong><br />
both communication and problem solving skills. Through portfolios, students can become<br />
actively engaged in their own learning, gaining a sense <strong>of</strong> pride and ownership <strong>of</strong> their work. As<br />
an assessment tool, portfolios provide opportunities for students to: reflect on and self-evaluate<br />
their learning and work; select a variety <strong>of</strong> different types <strong>of</strong> work they think best represent their<br />
understanding <strong>of</strong> science; and learn how to score and evaluate the work <strong>of</strong> peers. Teachers use<br />
student portfolios to evaluate the progress <strong>of</strong> the student, the class, the curriculum, and their<br />
instruction.<br />
Interactive Computer Tasks<br />
Computer simulations can present students with rich, interactive assessments that model systems<br />
in the natural world. Science simulations can model authentic environments and make visible concepts<br />
that are difficult to represent in a graphic format, such as convection currents, the movement <strong>of</strong><br />
molecules in solids, liquids and gases, and/or plate tectonics. In an interactive computer<br />
task, students have the opportunity to manipulate stimuli that they would not be able to<br />
manipulate in real time. In an assessment <strong>of</strong> plate tectonics and Earth’s structure, for example,<br />
students can investigate the results <strong>of</strong> different plate movements or how wind, water, and ice<br />
shape and reshape Earth’s surface. Interactive computer simulations allow students to<br />
demonstrate their understandings <strong>of</strong> science content and inquiry in an active manner. Moreover,<br />
computer technology associated with simulations can provide automatic feedback to students and<br />
teachers and can help to inform and guide instruction.<br />
Select Response Items<br />
Select response items are commonly called multiple-choice items. In responding to a multiple-choice<br />
item, students select one <strong>of</strong> four possible answer choices and record their responses on a<br />
separate answer sheet. Each multiple-choice item: is aligned to only one content standard;<br />
contains a stem with either a question or a completion format; and has four different answer choices<br />
with only one correct answer. The four answer choices should be approximately the same length,<br />
have the same format, and have parallel syntax and semantic structures. At least 10 items are<br />
needed for each standard to reliably report student achievement for that standard. Ten items are<br />
also needed to reliably report student achievement for each domain level <strong>of</strong> life science, earth<br />
science, physical science, and investigation and experimentation. Two examples <strong>of</strong> multiple-choice<br />
items follow.<br />
Regular Multiple-choice Items<br />
A well-constructed multiple-choice item may be a valuable component <strong>of</strong> an assessment system<br />
because it can provide broad coverage <strong>of</strong> important topics and allow students to demonstrate a<br />
variety <strong>of</strong> skills and knowledge. Many “regular” multiple-choice items focus on lower-level<br />
recall, assessing small, topical pieces <strong>of</strong> information such as the parts <strong>of</strong> a cell<br />
or the year in which helium was discovered. Multiple-choice items that require higher-level thinking focus<br />
more on important skills and can probe analytical reasoning.<br />
While any incorrect student answer can qualify as a misconception, there is a relatively large<br />
research base <strong>of</strong> documented student misconceptions in science. Documented misconceptions<br />
have been studied and confirmed by researchers through thorough investigations. Documented<br />
common student misconceptions in science can be built into the answer choices. If documented<br />
misconceptions are used in the answer choices, it is recommended that only one <strong>of</strong> the four<br />
answer choices contain the documented misconception.<br />
Justified Multiple-choice Items<br />
A modified multiple-choice question is called a justified multiple-choice question. <strong>Student</strong>s<br />
select an answer choice and then explain why they think the answer is correct. <strong>Student</strong>s are<br />
directed to use their understanding <strong>of</strong> specific science content and inquiry to explain why their<br />
answer is correct. Teachers use scoring rubrics specific to each question to score student work.<br />
Examples <strong>of</strong> justified multiple-choice questions are in Appendix A.<br />
Graphic Organizers for Monitoring and Tracking Formative and Summative <strong>Assessment</strong>s<br />
aligned to the California Science Content Standards<br />
Teachers can use various methods to monitor and track different classroom assessments aligned<br />
to the California Science Content Standards. The matrix shown in Figure 5 below shows general<br />
headings for formative and summative assessments. Enduring California science standards for<br />
grade 4 are listed down the left side <strong>of</strong> the matrix. Teachers can monitor and track specific<br />
assessments for formative and summative categories in the cells.<br />
Figure 5: Graphic Organizer for Monitoring Formative and Summative <strong>Assessment</strong>s<br />
aligned to the California Science Content Standards<br />
By using a variety <strong>of</strong> assessments that have clear expectations for students and are closely linked<br />
to the standards and to learning goals, teachers can capture the full range <strong>of</strong> student<br />
understanding and progress. They can also use the resulting data in thoughtful and powerful<br />
ways to improve student learning and achievement and to inform and guide their instruction.<br />
V. Analyzing and Using Data and Results<br />
Results from classroom assessments provide quality feedback to teachers allowing them to:<br />
improve student learning and achievement; inform and modify instruction; plan curriculum;<br />
target teaching; and research teaching practices.<br />
Once teachers collect data and results, they need to make sense <strong>of</strong> their findings before they can<br />
apply them to improved learning and instruction. Analyzing data involves: looking for patterns<br />
or trends in both individual student work and for similar patterns in the work <strong>of</strong> all students in<br />
the class; reflecting on inferences and plausible explanations for findings; making sense out <strong>of</strong><br />
clusters <strong>of</strong> information that go together; and making informed decisions for using the results with<br />
students and with their instruction.<br />
Tally Sheets<br />
Tally Sheets can be designed to record and analyze student results for multiple-choice tests. The<br />
Tally Sheet is a matrix with the item numbers and the codes for the standards assessed identified<br />
across the top <strong>of</strong> the matrix and the names <strong>of</strong> the students listed down the left side <strong>of</strong> the matrix.<br />
The teacher could enter (+) for a correct answer and (–) for an incorrect answer and then tally the<br />
number correct for each student and for each standard. By reading across the matrix from the left<br />
side to the right side, teachers can quickly determine how many items each student responded to<br />
correctly. By reading from the top <strong>of</strong> the matrix to the bottom <strong>of</strong> the matrix for each item,<br />
teachers can quickly determine which standards on this particular test were difficult for students<br />
and which were not. In order to make a reliable inference about student understanding <strong>of</strong> a single<br />
standard, there must be at least ten items for each standard. Figure 6 below shows a tally sheet<br />
made in Excel for recording student responses to a multiple-choice test. Several Tally Sheets can<br />
be made in Excel to keep track <strong>of</strong> student results and progress.<br />
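The same tally can be kept outside a spreadsheet. The Python sketch below is one possible layout, not the chapter's own tool: the student names, item numbers, and standard codes ("STD-A", "STD-B") are hypothetical placeholders, and the data are invented for illustration. It reads across the rows to count correct answers per student, reads down the columns to count correct answers per item and per standard, and flags standards with fewer than the ten items the text calls for.<br />

```python
from collections import defaultdict

# Hypothetical mapping: item number -> standard code (placeholder codes).
ITEM_STANDARDS = {1: "STD-A", 2: "STD-A", 3: "STD-B", 4: "STD-B"}

# Hypothetical tally sheet: "+" for a correct answer, "-" for an incorrect one.
RESPONSES = {
    "Student 1": {1: "+", 2: "+", 3: "-", 4: "+"},
    "Student 2": {1: "+", 2: "-", 3: "-", 4: "-"},
    "Student 3": {1: "-", 2: "+", 3: "-", 4: "+"},
}

MIN_ITEMS_PER_STANDARD = 10  # minimum stated in the text for a reliable inference


def tally_by_student(responses):
    """Read across each row: how many items each student answered correctly."""
    return {
        student: sum(1 for mark in marks.values() if mark == "+")
        for student, marks in responses.items()
    }


def tally_by_item(responses):
    """Read down each column: how many students answered each item correctly."""
    counts = defaultdict(int)
    for marks in responses.values():
        for item, mark in marks.items():
            counts[item] += 1 if mark == "+" else 0
    return dict(counts)


def tally_by_standard(responses, item_standards):
    """Total correct answers for each standard, summed over all students."""
    totals = defaultdict(int)
    for item, n_correct in tally_by_item(responses).items():
        totals[item_standards[item]] += n_correct
    return dict(totals)


def standards_with_too_few_items(item_standards, minimum=MIN_ITEMS_PER_STANDARD):
    """Standards covered by fewer items than the stated minimum."""
    counts = defaultdict(int)
    for standard in item_standards.values():
        counts[standard] += 1
    return sorted(std for std, n in counts.items() if n < minimum)


print(tally_by_student(RESPONSES))
print(tally_by_standard(RESPONSES, ITEM_STANDARDS))
print(standards_with_too_few_items(ITEM_STANDARDS))
```

With only two items per standard, this small example is flagged by the last function, which is a reminder that a short quiz can guide instruction but cannot support a reliable claim about a single standard.<br />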
Figure 6: Tally Sheet for Multiple-choice Answers<br />
Tally Sheets can also be used to capture and analyze information from a hands-on performance<br />
task. A hands-on performance task was administered to eighth grade students in a large urban<br />
school district. The students investigated variables related to force and motion. After the students<br />
took the test, each question in their booklets was scored with an analytical rubric and<br />
summarized in the Tally Sheet in Figure 7 below.<br />
The parts <strong>of</strong> the performance task and associated questions are listed at the top <strong>of</strong> the matrix. The<br />
score points (1 for a correct response, 0 for an incorrect response, and B for blank) are listed<br />
down the left side <strong>of</strong> the matrix. The data, reported in percentages for the 4,500 students tested,<br />
are recorded in each cell in the table. The data for question 3B, for example, show that 76% <strong>of</strong><br />
the 4,500 eighth grade students correctly recorded data from their investigation in a data table<br />
while 23% <strong>of</strong> the students did not record data in a table correctly. The matrix also shows that 1%<br />
<strong>of</strong> the students left the question blank. In contrast, the data for question 4 show that only 34% <strong>of</strong><br />
the 4,500 students were able to organize their results correctly on a graph while 62% <strong>of</strong> the<br />
students did not graph their data correctly. The matrix also shows that 4% <strong>of</strong> the students did<br />
not attempt to graph the data from their investigation.<br />
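The percentages in such a tally sheet come from counting how many responses fall at each score point. As a minimal sketch, the Python below performs that count for one question. The function name is hypothetical, and the list of 100 scores is invented so that it reproduces the percentages quoted above for question 3B; it is not the district's actual data.<br />

```python
from collections import Counter


def score_distribution(scores):
    """Percentage of responses at each score point: 1, 0, and "B" (blank)."""
    total = len(scores)
    if total == 0:
        raise ValueError("no scores to summarize")
    counts = Counter(scores)
    return {point: 100 * counts[point] / total for point in (1, 0, "B")}


# Invented sample of 100 scored responses mirroring the 76% / 23% / 1% split.
question_3b = [1] * 76 + [0] * 23 + ["B"] * 1

for point, percent in score_distribution(question_3b).items():
    print(f"score {point}: {percent:.0f}%")
```

Running the same function on each question's scores fills one column of the matrix at a time.<br />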
Figure 7: Tally Sheet Showing <strong>Student</strong> Results for an Eighth Grade Performance Task<br />
The information in Figures 6 and 7 allows teachers to use data from a summative test to inform<br />
instruction and improve student learning. Teachers can identify specific areas where students are<br />
experiencing difficulty and target their instruction to address these areas. This allows teachers to<br />
use results from a summative test in a formative manner. Furthermore, research shows that when<br />
teachers identify specific student weaknesses and target their instruction using metacognitive<br />
teaching strategies to address those weaknesses, student achievement improves significantly. xlii<br />
<strong>Assessment</strong> data should be drawn from multiple sources and triangulated. Triangulation is a<br />
technique <strong>of</strong> using data from three different sources to determine student achievement <strong>of</strong> specific<br />
content. Three different sources <strong>of</strong> data provide teachers three different perspectives <strong>of</strong> student<br />
work and understanding <strong>of</strong> that content, making their inferences about student understanding<br />
more reliable.<br />
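The chapter does not prescribe a rule for combining the three sources, so the Python sketch below rests on assumptions of our own: each source is reduced to a percent score on the same content, and the sources are treated as agreeing when their scores fall within a chosen band (15 percentage points here, an arbitrary illustrative value). The function name, source labels, and numbers are hypothetical.<br />

```python
from statistics import mean


def triangulate(sources, agreement_band=15.0):
    """Compare three percent scores (0-100) for the same content.

    sources: dict mapping a source name to a percent score.
    agreement_band: assumed maximum spread for the sources to count as agreeing.
    """
    if len(sources) != 3:
        raise ValueError("triangulation as described here uses three sources")
    values = list(sources.values())
    spread = max(values) - min(values)
    return {
        "mean_percent": mean(values),
        "spread": spread,
        "sources_agree": spread <= agreement_band,
    }


# Invented scores for one student on one content area (illustration only).
evidence = {
    "pre/posttest": 70.0,
    "classroom assessments": 80.0,
    "state test": 75.0,
}
print(triangulate(evidence))
```

When the sources disagree, the useful next step is usually to look at the student work behind the outlying score instead of averaging the difference away.<br />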
The Logic Model for <strong>Assessment</strong> in Figure 8 shows a graphic representation for triangulating<br />
data from pre- and posttests, formative and summative assessments, and the state Content Standards<br />
Test.<br />
In this model, the grey box in the middle represents the formative and summative assessments<br />
that take place during the course <strong>of</strong> standards-based instruction throughout the school year. At<br />
the start <strong>of</strong> instruction in the fall, the teacher administers a pretest to determine students’ prior<br />
knowledge <strong>of</strong> the science concepts for that particular grade level. In this scenario, the school is<br />
participating in a CaMSP and is required to pre- and posttest students. Throughout the course <strong>of</strong> the<br />
year, the teacher engages in continuous formative and summative assessment. In the spring, the<br />
teacher administers the California Standards Test for science and at the end <strong>of</strong> the year, the<br />
posttest is administered.<br />
The model shows that the intent <strong>of</strong> the data from the pre- and posttest and the CST is to: see how<br />
well students are achieving the Science Content Standards; determine if the school is meeting its<br />
state performance targets in science; investigate program effects between the schools<br />
participating in the CaMSP; determine program impact; and inform local and state<br />
evaluators.<br />
The model also shows that data from all assessments are triangulated to form a culminating body<br />
<strong>of</strong> evidence. At a larger grain size, the results <strong>of</strong> the culminating body <strong>of</strong> evidence are used to<br />
inform and guide instruction, inform and guide pr<strong>of</strong>essional development, plan instruction,<br />
allocate resources, and disseminate findings <strong>of</strong> what worked to the larger learning network.<br />
Figure 8: Logic Model for <strong>Assessment</strong><br />
VI. Assessing English Learners and Special Needs Students<br />
Inclusiveness <strong>of</strong> <strong>Assessment</strong>s<br />
The principles <strong>of</strong> universal design help to make assessments accessible to all students. The<br />
application <strong>of</strong> universal design principles to the development <strong>of</strong> classroom assessments will: xliii<br />
• Allow for the widest range <strong>of</strong> student participation, including students with<br />
disabilities and English Language Learners (ELL)<br />
• Ensure that the assessments themselves are not obstacles to improved learning<br />
• Provide valid inferences about the performance <strong>of</strong> all students<br />
• Provide each student a comparable opportunity to demonstrate their understanding <strong>of</strong><br />
the content tested<br />
The seven elements <strong>of</strong> universally designed assessments include:<br />
1. Inclusive assessment population—addresses the context <strong>of</strong> the entire student population<br />
to be assessed. California classrooms include students with different cognitive, cultural,<br />
and linguistic backgrounds. These students represent a wide range <strong>of</strong> skills, abilities, and<br />
diverse learning needs.<br />
2. Precisely designed construct—recommends that all assessments are designed to measure<br />
what they intend to measure. Formative and summative assessments at all grade levels<br />
need to closely align to the intent and content <strong>of</strong> the standards.<br />
3. Accessible, non-biased items—maintains that all items used in classroom assessment are<br />
not biased against any groups <strong>of</strong> students.<br />
4. Amenable to accommodations—addresses the use <strong>of</strong> appropriate accommodations during<br />
testing. While experts maintain that universally designed assessments will be accessible<br />
to most students, some students will still require accommodations. These<br />
accommodations can include: alternate settings (alternate rooms, non-school settings,<br />
special lighting, furniture, and/or acoustics, other school personnel); scheduling and<br />
timing (to correspond with medical or learning needs, short breaks, extended time);<br />
presentation formats (Braille, large print, signing directions, translation, underlining<br />
words/phrases, visual magnification or reduction, acetate shields); and response formats<br />
(use <strong>of</strong> word processor, typewriter, computer, adult transcription, Brailler, student<br />
dictation).<br />
5. Simple, clear, and intuitive instructions and procedures—maintains that students should<br />
respond to a task in the manner that the test developer intended. Regardless <strong>of</strong> a student’s<br />
ability, language skills, knowledge, or experience, test directions and instructions need to<br />
be simple, clear, consistent, and easy to understand.<br />
6. Maximum readability and comprehension—focuses on the use <strong>of</strong> vocabulary and<br />
sentence complexity appropriate for an intended grade level. Research is showing that<br />
linguistic simplification <strong>of</strong> vocabulary—the use <strong>of</strong> plain language—can benefit all<br />
students, including students with limited English pr<strong>of</strong>iciency. Plain language strategies<br />
include: reducing wordiness and removing irrelevant material; eliminating unusual or<br />
low-frequency words; avoiding ambiguous and irregularly spelled words; avoiding proper<br />
names; avoiding inconsistent naming and graphic conventions; and marking all questions.<br />
7. Maximum legibility—refers to clear, uncomplicated, and legible text, graphs, tables, and<br />
graphics, and response formats.<br />
English Language Learners<br />
Science teachers who assess English learners will need to ensure that these learners have a<br />
reasonable way to communicate what they are learning. Language barriers in the testing process<br />
need to be reduced so that the focus <strong>of</strong> the assessment is on science learning, not on the mastery<br />
<strong>of</strong> English. xliv<br />
A variety <strong>of</strong> accommodations can make assessments fair for English<br />
learners. These accommodations should address the same content standards for all students<br />
while, at the same time, <strong>of</strong>fering students different ways <strong>of</strong> performing that respect their<br />
differences and yield accurate results. Accommodations are intended to elicit the most accurate<br />
information about what students know and can do without giving the students who receive them<br />
an unfair advantage over students who do not. xlv<br />
The table in Figure 9 below describes common testing accommodations that teachers may use in<br />
their classrooms with English learners. These accommodations can be used with formative and<br />
summative assessments. xlvi<br />
Figure 9: <strong>Assessment</strong> Accommodations for English Learners<br />
Test Accommodations | Purpose or Use<br />
Extra Time | Extra time is required to read and understand test questions. English learners need to engage in extra thinking to respond to questions in English.<br />
Word Walls, Glossaries, Dictionaries | Word walls created during instruction provide reference during assessment so English learners can communicate understanding more easily. Use English and/or bilingual dictionaries when appropriate.<br />
Notes in Primary Language | <strong>Student</strong> notes from instruction in their primary language help them to produce answers they know in their primary language.<br />
Models & Rubrics | Provide models <strong>of</strong> expected work for students who have not experienced the type <strong>of</strong> assessment before. Preview the rubric that will be used to score student work. Previewing models and rubrics before an assessment helps students understand assessment objectives.<br />
Enhanced Test Directions, Checklists | Read directions aloud and rephrase them so that students know what is expected. Simplify test directions as much as possible, one step at a time, allowing students to respond in between steps. Use checklists for directions.<br />
Oral Responses | Test anxiety can make communication in English more difficult. Allow English learners to give oral responses. Prompt students individually and scaffold the conversation to elicit meaningful responses. Provide support for constructed response items with sentence frames for written answers.<br />
Illustrations, Graphic Organizers | Allow students to express ideas with labeled drawings, diagrams, or graphic organizers. Ask students to follow up with oral explanations or demonstrations.<br />
Hands-on Activities | Have students perform an activity or experiment and tell what they are doing and thinking. Orally prompt students as needed.<br />
Language Conventions | Focus on student understanding <strong>of</strong> science content during a science assessment and ignore language conventions. Address language conventions during instruction.<br />
Small Groups | Administer assessments to small groups <strong>of</strong> English learners using prompts and scaffolds and allowing for oral responses.<br />
Special Needs <strong>Student</strong>s<br />
<strong>Student</strong>s with special needs should have access to the same content standards curriculum and<br />
high quality instruction as students without disabilities. This can be accomplished through: a)<br />
adaptations in delivery <strong>of</strong> content to make it accessible to students’ level <strong>of</strong> understanding, and<br />
b) differentiation in level <strong>of</strong> expectation for student achievement to focus on prioritized target<br />
skills within that content that are both meaningful to students and build growth in academic<br />
learning.<br />
VII. The California Standards Test<br />
The purpose <strong>of</strong> the California Standards Test (CST) is to determine students’ achievement <strong>of</strong> the<br />
California content standards for each grade or course in science. <strong>Student</strong>s’ scores are compared<br />
to preset criteria to determine whether the students’ performance on the test is advanced,<br />
pr<strong>of</strong>icient, basic, below basic, or far below basic. The state target is for all students to score at the<br />
pr<strong>of</strong>icient and advanced levels. CST scores are used for calculating each school’s Academic<br />
Performance Index (API) and Adequate Yearly Progress (AYP).<br />
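The comparison <strong>of</strong> a score against preset criteria can be sketched in a few lines <strong>of</strong> Python.<br />
This is only an illustration: the cut scores below are hypothetical placeholders, not actual CST<br />
criteria, and the function names are assumptions made for this example.<br />

```python
from bisect import bisect_right

# Performance levels named in the text, ordered from lowest to highest.
LEVELS = ["far below basic", "below basic", "basic", "proficient", "advanced"]

# HYPOTHETICAL cut scores: the minimum score needed to reach each level
# above the lowest one. Placeholders for illustration, not real CST criteria.
HYPOTHETICAL_CUTS = [20, 30, 40, 50]


def performance_level(score, cuts=HYPOTHETICAL_CUTS):
    """Return the performance level whose score band contains `score`."""
    # bisect_right counts how many cut scores the student has met or passed.
    return LEVELS[bisect_right(cuts, score)]


def meets_state_target(score, cuts=HYPOTHETICAL_CUTS):
    """True when the score falls at the proficient or advanced level."""
    return performance_level(score, cuts) in ("proficient", "advanced")
```

With these placeholder values, a call such as performance_level(45) falls in the proficient band,<br />
so meets_state_target(45) is true.<br />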
The California Science Standards Tests are multiple-choice and administered annually to<br />
students in grades five, eight, and ten. The following tables provide information about the<br />
content and test blueprints for each grade level test.<br />
Grade 5<br />
Content Area | Grade Level Standards | Number <strong>of</strong> Items | Percentage on Test | Reference Sheets<br />
Physical Science | Grade 5 / Grade 4 | 8 / 6 | 29 | Periodic Table <strong>of</strong> Elements; Mineral Information<br />
Life Science | Grade 5 / Grade 4 | 7 / 7 | 29 |<br />
Earth Science | Grade 5 / Grade 4 | 8 / 6 | 29 |<br />
Investigation and Experimentation | Grade 5 / Grade 4 | 4 / 2 | 13 |<br />
Total | | 48 | 100 |<br />
Grade 8<br />
Content Area | Content Standards | Number <strong>of</strong> Items | Percentage on Test | Reference Sheets<br />
Physical Science | Motion | 8 | 13 | Periodic table <strong>of</strong> the elements, formulas, and conversions<br />
| Forces | 8 | 13 |<br />
| Structure <strong>of</strong> Matter | 9 | 15 |<br />
| Earth in the Solar System (Earth Science) | 7 | 12 |<br />
| Reactions | 7 | 12 |<br />
| Chemistry <strong>of</strong> Living Systems (Life Science) | 3 | 5 |<br />
| Periodic Table | 7 | 12 |<br />
| Density and Buoyancy | 5 | 8 |<br />
Investigation and Experimentation | | 6 | 10 |<br />
Total | | 60 | 100 |<br />
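The percentages in a blueprint follow from the item counts: each row's items are divided by the<br />
test total and rounded. A minimal Python sketch using the Grade 8 item counts above shows the<br />
arithmetic; the function and variable names are assumptions made for this example.<br />

```python
# Item counts copied from the Grade 8 blueprint above.
GRADE_8_ITEMS = {
    "Motion": 8,
    "Forces": 8,
    "Structure of Matter": 9,
    "Earth in the Solar System": 7,
    "Reactions": 7,
    "Chemistry of Living Systems": 3,
    "Periodic Table": 7,
    "Density and Buoyancy": 5,
    "Investigation and Experimentation": 6,
}


def blueprint_percentages(items):
    """Return each row's share of the test, rounded to a whole percent."""
    total = sum(items.values())
    return {name: round(100 * count / total) for name, count in items.items()}


percentages = blueprint_percentages(GRADE_8_ITEMS)
for name, pct in percentages.items():
    print(f"{name}: {GRADE_8_ITEMS[name]} items, {pct}%")
```

The same function can be applied to the item counts <strong>of</strong> any other blueprint to check that<br />
its rows add up to the stated total.<br />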
Grade 10<br />
Content Area | Content Standards | Number <strong>of</strong> Items | Percentage on Test | Reference Sheets<br />
Life Science | Cell Biology | 10 | 17 |<br />
| Genetics | 12 | 20 |<br />
| Ecology | 11 | 18 |<br />
| Evolution | 11 | 18 |<br />
| Physiology | 10 | 17 |<br />
Investigation and Experimentation | | 6 | 10 |<br />
Total | | 60 | 100 |<br />
<strong>Student</strong>s in grade 10 who completed a standards-based science course take one <strong>of</strong> the tests listed<br />
below in addition to taking the Grade 10 Life Science Test. <strong>Student</strong>s in grades 9 through 11 who<br />
completed a standards-based science course take one <strong>of</strong> the following CSTs.<br />
Biology/Life Science<br />
Content Area | Content Standards | Number <strong>of</strong> Items | Percentage on Test | Reference Sheets<br />
Life Science | Cell Biology | 9 | 15.0 |<br />
| Genetics | 19 | 31.6 |<br />
| Ecology | 7 | 11.7 |<br />
| Evolution | 9 | 15.0 |<br />
| Physiology | 10 | 16.7 |<br />
Investigation and Experimentation | | 6 | 10.0 |<br />
Total | | 60 | 100 |<br />
Chemistry<br />
Content Area | Content Standards | Number <strong>of</strong> Items | Percentage on Test | Reference Sheets<br />
Physical Science | Atomic & Molecular Structure | 6 | 10.0 | Chemistry Formulas, Units & Constants; Chemistry Periodic Table <strong>of</strong> Elements<br />
| Chemical Bonds | 7 | 11.7 |<br />
| Conservation <strong>of</strong> Matter and Stoichiometry | 10 | 16.7 |<br />
| Gases and Their Properties | 6 | 10.0 |<br />
| Acids and Bases | 5 | 8.3 |<br />
| Solutions | 3 | 5.0 |<br />
| Chemical Thermodynamics | 5 | 8.3 |<br />
| Reaction Rates | 4 | 6.7 |<br />
| Chemical Equilibrium | 4 | 6.7 |<br />
| Organic Chemistry and Biochemistry | 2 | 3.3 |<br />
| Nuclear Processes | 2 | 3.3 |<br />
Investigation and Experimentation | | 6 | 10.0 |<br />
Total | | 60 | 100 |<br />
Earth Sciences<br />
Content Area | Content Standards | Number <strong>of</strong> Items | Percentage on Test | Reference Sheets<br />
Earth Science | Earth’s Place in the Universe | 12 | 20.0 |<br />
| Dynamic Earth Processes | 9 | 15.0 |<br />
| Energy in the Earth System | 18 | 30.0 |<br />
| Biogeochemical Cycles | 5 | 8.3 |<br />
| Structure and Composition <strong>of</strong> the Atmosphere | 5 | 8.3 |<br />
| California Geology | 5 | 8.3 |<br />
Investigation and Experimentation | | 6 | 10.0 |<br />
Total | | 60 | 100 |<br />
Physics<br />
Content Area | Content Standards | Number <strong>of</strong> Items | Percentage on Test | Reference Sheets<br />
Physical Science | Motion and Forces | 12 | 20.0 |<br />
| Conservation <strong>of</strong> Energy and Momentum | 12 | 20.0 |<br />
| Heat and Thermodynamics | 9 | 15.0 |<br />
| Waves | 10 | 16.7 |<br />
| Electric and Magnetic Phenomena | 11 | 18.3 |<br />
Investigation and Experimentation | | 6 | 10.0 |<br />
Total | | 60 | 100 |<br />
The California Standards Tests for Science also include four additional tests that students can<br />
take in conjunction with the tenth grade test. These four tests are designed to integrate and coordinate<br />
concepts from life science, earth science, physical science, and investigation and<br />
experimentation.<br />
VIII. Shifting Classroom <strong>Assessment</strong>: “More <strong>of</strong>/Less <strong>of</strong>” Chart<br />
More <strong>of</strong> | Less <strong>of</strong><br />
The process <strong>of</strong> continuous formative and summative assessment | <strong>Assessment</strong> only for grading<br />
<strong>Assessment</strong> data informs and guides instruction | <strong>Assessment</strong> data not used for instruction<br />
<strong>Student</strong>s have clarity <strong>of</strong> learning goal(s) | <strong>Student</strong>s have limited to no knowledge <strong>of</strong> learning goal(s)<br />
<strong>Student</strong>s receive descriptive feedback | <strong>Student</strong>s receive a grade or non-descriptive feedback<br />
Teacher selects unbiased and fair assessment tools for a purpose | Teacher uses assessment tools without consideration <strong>of</strong> bias, fairness, or purpose<br />
Teachers use multiple measures to assess student understanding | Teachers only use multiple-choice questions<br />
IX. Conclusion<br />
The ultimate goal <strong>of</strong> assessment is to improve student understanding and achievement <strong>of</strong><br />
important and meaningful science. It is also for students to develop inquiry skills and habits <strong>of</strong><br />
mind that will enable them to become fully pr<strong>of</strong>icient in science.<br />
References<br />
i Shepard, L.A. (2005). Formative assessment: Caveat emptor. Paper presented at ETS Invitational<br />
Conference, The Future <strong>of</strong> <strong>Assessment</strong>: Shaping Teaching and <strong>Learning</strong>, New York.<br />
ii Popham, J.W. (2000). Modern Educational Measurement. Practical Guidelines for Educational Leaders.<br />
Needham, MA: Allyn & Bacon.<br />
iii National Research Council. (2001). Classroom <strong>Assessment</strong> and the National Science Education<br />
Standards. Committee on Classroom <strong>Assessment</strong> and the National Science Education Standards.<br />
Washington, DC: National Academy Press.<br />
iv Shepard, L.A. (2005). Formative assessment: Caveat emptor. Paper presented at ETS Invitational<br />
Conference, The Future <strong>of</strong> <strong>Assessment</strong>: Shaping Teaching and <strong>Learning</strong>, New York.<br />
v National Research Council. (1996). National Science Education Standards. National Committee on<br />
Science Education Standards and <strong>Assessment</strong>. Washington, DC: National Academy Press.<br />
vi Shepard, L.A. (2005). Formative assessment: Caveat emptor. Paper presented at ETS Invitational<br />
Conference, The Future <strong>of</strong> <strong>Assessment</strong>: Shaping Teaching and <strong>Learning</strong>, New York.<br />
vii National Research Council. (2001). Classroom <strong>Assessment</strong> and the National Science Education<br />
Standards. Committee on Classroom <strong>Assessment</strong> and the National Science Education Standards.<br />
Washington, DC: National Academy Press.<br />
viii Leahy, S., Lyon, C., Thompson, M., & Wiliam, D. (2005). Classroom <strong>Assessment</strong>: Minute by Minute, Day<br />
by Day. Educational Leadership, 63(3).<br />
ix National Research Council. (1996). National Science Education Standards. National Committee on<br />
Science Education Standards and <strong>Assessment</strong>. Washington, DC: National Academy Press.<br />
x Shepard, L.A. (2005). Formative assessment: Caveat emptor. Paper presented at ETS Invitational<br />
Conference, The Future <strong>of</strong> <strong>Assessment</strong>: Shaping Teaching and <strong>Learning</strong>, New York.<br />
xi Shepard, L.A. (2005). Formative assessment: Caveat emptor. Paper presented at ETS Invitational<br />
Conference, The Future <strong>of</strong> <strong>Assessment</strong>: Shaping Teaching and <strong>Learning</strong>, New York.<br />
xii Popham, J.W. (2008). Transformative <strong>Assessment</strong>. Alexandria, VA: Association for Supervision and<br />
Curriculum Development.<br />
xiii Wiliam, D. (2007). Changing Classroom Practice. Educational Leadership, 65(4), 36-42.<br />
xiv Wiliam, D. (2007). <strong>Chapter</strong> 9, Content Then Process: Teacher <strong>Learning</strong> Communities in the Service <strong>of</strong><br />
Formative <strong>Assessment</strong>. Solution Tree. P.191.<br />
xv Wiliam, D. (2007). <strong>Chapter</strong> 9, Content Then Process: Teacher <strong>Learning</strong> Communities in the Service <strong>of</strong><br />
Formative <strong>Assessment</strong>. Solution Tree.<br />
xvi National Research Council. (2001). Classroom <strong>Assessment</strong> and the National Science Education<br />
Standards. Committee on Classroom <strong>Assessment</strong> and the National Science Education Standards.<br />
Washington, DC: National Academy Press.<br />
xvii Black, P. (2004). The Nature and Value <strong>of</strong> Formative <strong>Assessment</strong> for <strong>Learning</strong>. (Draft paper). King’s<br />
College, London.<br />
xviii Marzano, R. J., Pickering, D.J., Pollock, J.E. (2001). Classroom Instruction that Works. Alexandria, VA:<br />
Association for Supervision and Curriculum Development.<br />
xix Butler, R. (1987). Task-involving and ego-involving properties <strong>of</strong> evaluation: Effects <strong>of</strong> different feedback<br />
conditions on motivational perceptions, interests and performance. Journal <strong>of</strong> Educational Psychology,<br />
79(4), 474-482.<br />
xx Marzano, R. J., Pickering, D.J., Pollock, J.E. (2001). Classroom Instruction that Works. Alexandria, VA:<br />
Association for Supervision and Curriculum Development.<br />
xxi Sadler, R. (1989). Formative assessment and the design <strong>of</strong> instructional systems. Instructional Science,<br />
18, 119-144.<br />
xxii Black, P. (2004). The Nature and Value <strong>of</strong> Formative <strong>Assessment</strong> for <strong>Learning</strong>. (Draft paper). King’s<br />
College, London.<br />
xxiii Sadler, R. (1989). Formative assessment and the design <strong>of</strong> instructional systems. Instructional Science,<br />
18, 119-144.<br />
Black, P. (2004). The Nature and Value <strong>of</strong> Formative <strong>Assessment</strong> for <strong>Learning</strong>. (Draft paper). King’s<br />
College, London.<br />
xxiv Foster, G., Sawicki, E., Schaeffer, H., Zelinski, V. (2002). I Think, Therefore I learn! Ontario, Canada:<br />
Pembroke.<br />
xxv Black, P. (2004). The Nature and Value <strong>of</strong> Formative <strong>Assessment</strong> for <strong>Learning</strong>. (Draft paper). King’s<br />
College, London.<br />
xxvi Wiliam, D. (2007). <strong>Chapter</strong> 9, Content Then Process: Teacher <strong>Learning</strong> Communities in the Service <strong>of</strong><br />
Formative <strong>Assessment</strong>. Solution Tree. P.192-194.<br />
xxvii Wiliam, D. (2007). <strong>Chapter</strong> 9, Content Then Process: Teacher <strong>Learning</strong> Communities in the Service <strong>of</strong><br />
Formative <strong>Assessment</strong>. Solution Tree. P.192-194.<br />
xxviii Wiliam, D. (2007). Changing Classroom Practice. Educational Leadership, 65(4), 36-42.<br />
xxix Wiliam, D. (2007). Changing Classroom Practice. Educational Leadership, 65(4), 36-42.<br />
xxx Wiliam, D. (2007). Changing Classroom Practice. Educational Leadership, 65(4), 36-42.<br />
xxxi National Research Council. (1996). National Science Education Standards. National Committee on<br />
Science Education Standards and <strong>Assessment</strong>. Washington, DC: National Academy Press.<br />
xxxii Stiggins, R. (2005). Measuring Up. PL October 2005. [Need to find complete reference.]<br />
xxxiii Stiggins, R. (2005). Measuring Up. PL October 2005. [Need to find complete reference.]<br />
xxxiv Stiggins, R. (2005). Measuring Up. PL October 2005. [Need to find complete reference.]<br />
xxxv Stiggins, R. (2005). Measuring Up. PL October 2005. [Need to find complete reference.]<br />
xxxvi National Research Council. (1996). National Science Education Standards. National Committee on<br />
Science Education Standards and <strong>Assessment</strong>. Washington, DC: National Academy Press.<br />
xxxvii Popham, J.W. (2002). Classroom <strong>Assessment</strong>s. What Teachers Need to Know. Boston, MA: Allyn &<br />
Bacon.<br />
xxxviii Popham, J.W. (2002). Classroom <strong>Assessment</strong>s. What Teachers Need to Know. Boston, MA: Allyn &<br />
Bacon.<br />
xxxix ETS [Need to find reference.]<br />
xl National Research Council. (1996). National Science Education Standards. National Committee on<br />
Science Education Standards and <strong>Assessment</strong>. Washington, DC: National Academy Press.<br />
xli Georghiades, P. (2004). From the general to the situated: Three decades <strong>of</strong> metacognition.<br />
International Journal <strong>of</strong> Science Education, 26(3), 365 – 383.<br />
xlii Comfort, K. B., Klein, S., Bolus, R. (2005). Research in standards-based science assessment:<br />
Investigating teacher understanding and use <strong>of</strong> science assessment data. Unpublished<br />
manuscript.<br />
xliii Thompson, S. J., Johnstone, C. J., & Thurlow, M. L. (2002). Universal design applied to large-scale<br />
assessments (Synthesis Report 44). Minneapolis, MN: University <strong>of</strong> Minnesota, National Center on<br />
Educational Outcomes.<br />
xliv Carr, J., Sexton, U., & Lagun<strong>of</strong>f, R. (2007). Making Science accessible to English Learners. San<br />
Francisco, CA: WestEd<br />
xlv Carr, J., Sexton, U., & Lagun<strong>of</strong>f, R. (2007). Making Science accessible to English Learners. San<br />
Francisco, CA: WestEd<br />
xlvi Carr, J., Sexton, U., & Lagun<strong>of</strong>f, R. (2007). Making Science accessible to English Learners. San<br />
Francisco, CA: WestEd<br />