
Chapter 6: Assessment of Student Learning 

Introduction 

This chapter will address the nature of assessment and the purposes of assessment at different 

levels in the educational system from the classroom, to the district and state, to the national and 

international levels. 

The big idea of assessment is that assessments are cyclical in nature: teachers and students use 

assessment to monitor student progress which in turn informs instructional decisions that support 

learning. This chapter will discuss the role of the teacher and the role of the student in the 

assessment instruction cycle. It will also address a variety of assessments designed to test student 

mastery of higher-order thinking and the integral role of the investigation and experimentation 

standards as part of assessment. 

Results from classroom assessments provide quality feedback to teachers. This chapter will 

address using data and results that will allow teachers to improve student learning, inform and 

guide instruction, and research their teaching practices. This chapter will also cover strategies 

for the assessment of English learners and special needs students. Information regarding the 

current statewide assessment system in science will also be covered. 

I. The Nature of Assessment 

The nature of assessment is the essence of knowing what students should know, do, and 

understand. In science, assessments should provide students the opportunity to demonstrate their 

understanding of important and meaningful science content, to use scientific tools and processes, 

and to apply their knowledge and understanding to real-life situations. 

Assessment does not exist in isolation, but must be closely aligned with the goals of state 

standards, curriculum, and instruction to support learning. Current research calls for “balanced 

assessment systems” that align and restore: 

• Comprehensiveness—the use of multiple sources of evidence to draw inferences 

about an individual student’s proficiency, 

• Coherence—a shared model of learning that links curriculum, instruction and 

assessment within the classroom and links the classroom with large-scale assessments 

and 

• Continuity between classroom, district and state assessments calling for a longitudinal 

assessment of learning progress over time. i 

Assessment, testing, and educational measurement are often used interchangeably to refer to a 

process by which educators use students’ responses to stimuli in order to draw inferences about 

students’ knowledge and skills. ii While testing usually refers to standardized multiple-choice 

instruments, assessment per se denotes a more comprehensive view of student performance. iii 


The terms assessment and evaluation are also used interchangeably in many contexts. Strictly 

speaking, assessment refers to the judgment of student performance, while evaluation refers to the 

judgment of programs or organizational effectiveness. iv 


II. Purpose of Assessment 

Assessment is a systemic, multi-step process involving the collection and interpretation of 

educational data. As the primary feedback mechanisms in the educational system, assessments 

provide information to students about how well they are performing; to teachers about how well 

their students are learning; to districts about the effectiveness of their teachers and programs; and 

to policymakers about the effects of their policies. The intent of this feedback is to allow 

stakeholders within the educational system to make informed decisions regarding improved 

student learning, teacher development, program modifications, and changes in policy. v 

The purposes of assessment can be categorized into three main areas: (1) support of student 

learning; (2) certification, which includes reporting individual student achievement, placement 

and/or promotion; and (3) accountability, which is designed to evaluate programs and inform 

policy. The first purpose focuses primarily on formative and summative classroom assessments 

while the second and third are geared more toward large-scale assessments including district, 

state, and national tests. 

Formative and Summative Classroom Assessments 

Formative assessment is defined as assessment carried out during the instructional process for 

improving teaching or learning. vi Assessment becomes formative only when either the teacher or 

the student uses that information to inform teaching and/or to influence learning. vii 

Formative assessments are informal, ongoing assessments that provide continuous opportunities 

for teachers to observe, question, listen, and provide immediate feedback to students. Formative 

assessment also provides opportunities for students to become more involved in the assessment 

process and to become self-reflective about their own learning. 

The line between instruction and assessment is blurred in classrooms where formative 

assessment is used to support learning. “Everything students do—from conversing in groups, 

completing seatwork, answering and asking questions, to sitting quietly and looking confused— 

is a potential source of information about what they do and do not understand. The teacher who 

consciously uses assessment to support learning takes in this information, analyzes it, and makes 

instructional decisions that address the understandings and misunderstandings that these 

assessments reveal.” viii 

While formative assessments occur minute-by-minute and day-by-day, summative assessments 

are cumulative assessments, usually occurring at the end of a unit of instruction. Designed to 

measure what a student has learned after a certain period of time, summative assessments are 

administered less frequently than formative assessments. Teachers also use summative 

assessments as pretests to see what students understand before they teach a unit of instruction 

and as posttests afterwards to see what students learned as a result of their instruction. 

Summative assessments are also used for reporting grades at the end of a semester. 

Summative classroom assessments should (1) enable students to draw on what they have learned 

to explain new phenomena, think critically and make informed decisions ix , and (2) consist of 

multiple measures including hands-on performance tasks, constructed response investigations, 


long and short essays, portfolios, interactive computer tasks, and well-constructed multiple-choice 

tests. 

District Summative Assessments 

School districts administer summative tests to students throughout the year to determine if 

students are learning the grade level science content recommended in the state standards and to 

evaluate the district science program. Examples of the types of summative science assessments 

used in school districts in California include benchmark and interim assessments and end-of-course 

tests. 

Benchmark and interim assessments are used to monitor progress during the school year toward 

meeting state standards and NCLB performance goals. x These assessments usually consist of 

multiple-choice questions and are administered at the end of every quarter. These types of 

assessments focus on program evaluation and provide teachers with information about which 

science standards students have mastered. Current research does not show that benchmark or 

interim assessments help to improve student learning or achievement in science. xi 

End-of-course tests are used by districts at the high school level to determine the content learned 

by students as a result of taking a specific course of study. Districts also implement end-of-course 

tests to: establish the effectiveness of the curriculum in each science domain; ensure that 

course content is focused on state standards; establish a common level of expected student 

performance; ensure that evaluation of student performance is consistent across classrooms and 

schools in the district; and help identify students who need additional support to meet graduation 

requirements. 

Districts participating in state-funded projects also administer summative assessments; for 

example, districts participating in the California Mathematics Science Partnerships (CaMSP) are 

required to administer a standards-based assessment as a pretest and posttest to students in both 

treatment and control groups at the beginning and end of the school year. The analyses of the 

pretest and posttest results are used to determine if the treatment teachers’ training makes a 

difference in student learning and achievement of science. Districts are using a variety of 

multiple-choice tests aligned to enduring grade level standards. 
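
As a rough illustration of the pretest/posttest comparison described above, the sketch below computes each group's mean gain and the difference between the two. The scores are invented for illustration, the function names are hypothetical, and an actual project evaluation would rely on a proper statistical test rather than a simple difference of means.

```python
from statistics import mean


def mean_gain(pretest, posttest):
    """Average per-student gain (posttest minus pretest)."""
    if len(pretest) != len(posttest):
        raise ValueError("each student needs both a pretest and a posttest score")
    return mean(post - pre for pre, post in zip(pretest, posttest))


def treatment_effect(treat_pre, treat_post, ctrl_pre, ctrl_post):
    """Treatment group's mean gain minus the control group's mean gain."""
    return mean_gain(treat_pre, treat_post) - mean_gain(ctrl_pre, ctrl_post)


# Hypothetical percent-correct scores, not real data.
treatment_pre = [40, 55, 60, 45]
treatment_post = [60, 70, 75, 65]
control_pre = [42, 50, 58, 50]
control_post = [50, 55, 66, 57]

effect = treatment_effect(treatment_pre, treatment_post, control_pre, control_post)
print(f"Treatment gain exceeds control gain by {effect} points")
```

A positive difference suggests, but does not by itself establish, that students of the trained teachers gained more than students in the control group.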

State Summative Assessments 

The California Standards Tests (CSTs) are summative assessments that measure student 

achievement of the Science Content Standards for California Public Schools. The CSTs are a 

battery of standardized tests that comprise the state's STAR (Standardized Testing and 

Reporting) Program. All students in grades two through eleven participate in the STAR Program 

including students with disabilities and students who are English learners. Section VII of this 

chapter addresses the science portion of the CST in more depth. 

National and International Assessments 

The National Assessment of Educational Progress (NAEP)—also known as the Nation’s Report 

Card—measures fourth-, eighth-, and twelfth-grade students’ performance in science with 

assessments designed specifically for national and state information needs. The NAEP 


assessment contains multiple measures including: multiple-choice items, constructed response 

questions, hands-on performance tasks, and interactive computer tasks. All of the components of 

NAEP are aligned to the content recommendations in the 2009 NAEP Science Framework. 

Participation in NAEP allows states to compare student achievement to the achievement of 

students in other states. 

The international assessments, the Trends in International Mathematics and Science Study 

(TIMSS) and the Program for International Student Assessment (PISA), enable the United States 

to benchmark its performance—in fourth-grade and eighth-grade mathematics and science in 

TIMSS and in 15-year-old students’ mathematics, science, and reading literacy in PISA—to that 

of other countries. 

Each test was designed to serve a different purpose, and each is based on its own framework and 

set of assessment questions, although the content areas assessed and the ages and grade levels of 

the students are broadly similar. The definitions of science differ among the three 

assessments: The NAEP framework defines science as physical science, life science and earth 

science. TIMSS also includes life science and earth science, but with regard to physical science, 

TIMSS splits it into separate domains for physics and chemistry. NAEP identifies three 

categories of “knowing and doing science” as conceptual understanding, scientific investigation, 

and practical reasoning. PISA takes a broader approach than both NAEP and TIMSS in 

addressing important competencies required for scientific literacy: identifying scientific issues, 

explaining scientific phenomena, and using scientific evidence. PISA’s content can be divided 

into knowledge of the natural world (in the fields of life systems, physical systems, Earth and 

space systems, and technology systems) and knowledge about science itself (scientific inquiry 

and scientific explanations). All three assessments are conducted regularly to allow the 

monitoring of student outcomes over time. 


III. Assessment of Student Learning 

Quality classroom assessment informs instruction and improves student understanding, learning 

and achievement. Both formative and summative assessments make up quality classroom 

assessments. Formative assessment is defined as a planned, ongoing dynamic process in which 

teachers and students use evidence to adjust teaching and learning. Measurements of student 

learning, such as scores from a summative test, are just one component of the formative 

assessment process. While informal or formal assessments play a role in this process, they are 

not the process itself. xii 

In this chapter, assessment of student learning in science is defined as a process of formative 

assessment that integrates instruction with multiple measures of student ability including a 

variety of techniques for various learning styles and levels of readiness. 

The research base clearly supports the process of formative assessment in improved student 

learning and achievement. A synthesis of more than 4,000 research studies undertaken during the 

last 40 years consistently shows that, when implemented well, formative assessment can 

significantly improve student learning and achievement more than any other educational 

outcome. xiii 

“The big idea of formative assessment is that evidence about student learning is used to adjust 

instruction to better meet student needs; in other words, teaching is adaptive to the student’s 

learning needs and assessment is done in real time.” xiv When teachers engage in formative 

assessment, the purpose of assessment changes from just measuring what students know to 

enhancing student learning. xv In this new role, assessment is a shared responsibility between 

teachers and students. 

The Teacher’s Role in Classroom Assessment 

The teacher’s role in ongoing assessment is to facilitate student growth, understanding, and 

learning. Teachers use continuous assessments to: improve classroom practice, plan curricula, 

develop self-directed learners, report student progress, and investigate their own teaching 

practices. xvi 

While numerous strategies can be used for formative assessment, current research shows that 

higher-level questioning, descriptive feedback, student self-assessment and reflection, and 

student self-regulated learning all have a positive effect on student achievement and the ability 

for students to transfer their learning to new situations. xvii 

Good questioning is at the heart of classroom practice. Teachers spend at least 80% of their time 

engaged in questioning on any given day. Research shows that questioning can improve student 

learning when teachers: (1) structure questions around information that is critical to the topic, not 

around information that might be interesting or unusual; (2) ask questions that are higher-level— 

the questions require students to analyze, synthesize, and apply information instead of recalling 

facts; (3) provide students with “wait time” after a question so that students have time to think 

about their response; and (4) help students establish a mental map to process their learning 

experiences. xviii 


Teacher feedback is central to formative assessment. The effectiveness of formative assessment 

is dependent on both the quality of the information gathered and the quality of the feedback 

provided. In a study on teacher grading and feedback, researchers investigated the effectiveness 

of different kinds of feedback over a series of lessons. The students were randomly assigned to 

one of three groups: Group A received written feedback clearly describing what they did 

correctly, what was incorrect, and what was needed to improve their work; Group B received 

only grades derived from scoring; and Group C received grades and general comments. The 

scores for the students in Group A that received constructive descriptive feedback increased 

significantly from the first to the second lesson, while the scores for the students in Groups B and 

C declined between the first and the second lesson. xix 

Research shows that in order for feedback to be effective it should be: (1) corrective in nature, 

clearly describing what the student is doing correctly, what is incorrect, and what needs to be done 

to improve the work; (2) provided in a timely manner; and (3) specific to a criterion, referencing 

a specific level of skill or knowledge. xx 

The Student’s Role in Classroom Assessment 

Students are ultimately responsible for taking action to bridge the gap between where they are 

and where they need to go in their learning. xxi Research shows that when students have insight 

into their own strengths and weaknesses and develop their own repertoires of strategies for 

learning, their learning improves. Self-assessment, peer-assessment, and self-regulation are 

metacognitive strategies that assist students in improving their own learning. xxii 

Peer-assessment is a powerful complement to formative assessment. Student discourse during 

peer-assessment is valuable because it allows students to assume the role of the teacher. In the 

role of teacher, students have to make sure that they understand the content so that they can 

evaluate the understanding of their peers. xxiii 

As students become self-regulated learners, they are able to describe their strengths, analyze 

learning tasks to consider their options, explain their choices in completing their learning tasks, 

and regularly set goals for future learning. xxiv 

In order for self-assessment, peer-assessment and self-regulated learning to become effective 

components of student learning, students must understand the criteria used to evaluate their work 

and the difference between quality work and substandard work. Students should also be taught 

the habits and skills of the collaborative process used in peer-assessment, requiring them to see 

their work objectively. To become self-directed learners, students set their sights on their own 

learning goals and understand the steps they must go through in order to get there. xxv 

Strategies and Techniques for Formative Assessment 

Research maintains that the process of effective formative assessment consists of five key 

strategies. xxvi Figure 1 below outlines the five key strategies and one suggested technique for 

implementing each strategy. xxvii 

Figure 1: Key Strategies and Techniques for Effective Formative Assessment 


Strategy 1: Clarifying Learning Intentions and Sharing Criteria for Success 

Technique: Sharing Exemplars 

Description: Before asking 11th grade students to write a lab report, the teacher gives each student four sample lab reports representing varying degrees of quality. The lab reports are teacher-generated or from a previous class. Students are asked to analyze the reports and identify why certain reports are of a higher quality than others. 

Strategy 2: Engineering Effective Classroom Discussions, Questions, and Learning Tasks that Elicit Evidence of Learning 

Technique: White Boards 

Description: During a 4th grade lesson on magnetism, the teacher asks the class what would happen if two like poles of two magnets were placed together. He asks the class to write their answers on their white boards and hold them up on the count of three. Using this kind of “all student response system” helps the teacher to get a sense of what students understand while requiring all students to engage in the task. If all answers are correct, the teacher moves on. If none are correct, the teacher may choose to re-teach the concept in another way. If there are a variety of answers, the teacher can use the information from the student answers to direct a class discussion. 

Strategy 3: Providing Feedback that Moves Learners Forward 

Technique: Find it and Fix it 

Description: Students in a 7th grade classroom just completed a task on plant and animal cells. Rather than checking all correct answers and putting a check next to incorrect ones, the teacher tells a student “three of your answers are incorrect; find them and fix them.” This requires the student to engage cognitively in response to the feedback rather than reacting emotionally to a letter grade. 

Strategy 4: Activating Students as Learners of Their Own Learning 

Technique: Traffic Lighting 

Description: After students in a 3rd grade class complete a lesson on energy and matter, they review the learning goal their teacher provided at the beginning of the lesson and hold up a colored circle to indicate their level of understanding. Green means I understand; yellow, I’m not sure; and red, I do not understand. At regular intervals, the teacher provides time in class for students to move their learning forward by turning their reds to yellows and their yellows to green. 

Strategy 5: Activating Students as Instructional Resources for One Another 

Technique: Pre-Flight Checklist 

Description: For homework, students in a 9th grade class write a paper on a science-based societal issue. Before turning in their work, students trade papers with a peer. Each student completes a “pre-flight checklist” by comparing the peer’s document against a list of required elements, e.g., identify a science-based societal issue, cite research studies, analyze data, and communicate findings. 

As teachers utilize these key strategies and techniques for formative assessment and integrate 

them into their practice, they view their own practice in new ways. 
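
The decision rule in the white-board example from Figure 1 can be written out as a small sketch. This is only an illustration of the three cases described there; the function name and the exact wording of the returned suggestions are hypothetical, and the sketch assumes each answer can be compared directly with a single correct answer.

```python
def next_step(answers, correct_answer):
    """Suggest a teacher move from one round of white-board answers."""
    if not answers:
        raise ValueError("no answers were held up")
    num_correct = sum(1 for answer in answers if answer == correct_answer)
    if num_correct == len(answers):
        # Every student answered correctly.
        return "move on"
    if num_correct == 0:
        # No student answered correctly.
        return "re-teach the concept in another way"
    # A mix of correct and incorrect answers.
    return "use the student answers to direct a class discussion"


# Hypothetical responses to the magnetism question (like poles placed together).
print(next_step(["repel", "repel", "attract", "repel"], "repel"))
```

In practice the teacher's judgment, not a fixed rule, determines the next move; the sketch only makes the three branches explicit.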

Implementing and Sustaining Formative Assessment with Teacher Learning Communities 

Formative assessments are not common practices in most teachers’ classrooms and changes in 

teacher practice are not always easy to implement. Furthermore, professional development in 

almost any aspect of assessment is sparse. By working with practicing classroom teachers in real 

time, researchers have identified practical suggestions for setting up Teacher Learning 

Communities (TLC) to implement and sustain formative assessment. xxviii Figure 2 below outlines 

a strategy found to be successful in establishing a TLC. xxix 


Figure 2: Strategies for Implementing a Teacher Learning Community Around Classroom 

Assessment 

Suggestion: Plan for the TLC to run for at least two years 

Rationale: Formative assessment is not a quick fix. It takes time to learn, practice, and refine your strategies. 

Suggestion: Start with volunteers 

Rationale: Formative assessment cuts across many established practices in schools and volunteers are more likely to find ways around obstacles. 

Suggestion: Meet monthly for at least 75 minutes 

Rationale: Monthly meetings are more suited to teachers’ schedules and time. To ensure that all individuals have adequate time to report and share, the meeting should last 75 minutes or longer. 

Suggestion: Aim for a group of 8-10 teachers 

Rationale: When the group is too small, there are not enough differences of opinion to provide for good teacher learning. When the group is too large, all members may not have time to talk. 

Suggestion: Try to group teachers with similar assignments 

Rationale: Teachers should work and share in small grade level groups. 

Suggestion: Establish building-based groups 

Rationale: While cross-building meetings can be productive, it is best to work within sites so that support can be maintained with a group of trusted colleagues. 

Suggestion: Require teachers to make detailed, modest, individual action plans 

Rationale: At the first meeting, each teacher should make a specific plan about what they want to change. Teachers should focus on a small number of changes they can integrate into their practice. 

Suggestion: Creating an Action Plan 

Rationale: The following questions are intended to help teachers format their own action plans: 

1. What is one thing that you will find easy to change? What difference do you expect it to make to your practice? 

2. What is one thing that you would like to change that will require support? What help would you need? 

3. What other changes would you like to make later on in the year? What help might you need? 

4. What will you do differently or stop doing to implement these changes? 

Suggestion: Teacher Leader to organize and coordinate meetings 

Rationale: Someone needs to make sure the meetings happen, e.g., secure a room, send out the agenda, secure refreshments and so on. This person should not be an expert. The idea of a TLC is that each person comes with a clear idea about what they want help with and the group helps that person with the task. 

The following five-part process xxx was also found to be successful in implementing and 

sustaining teacher learning community meetings: 

1. Introduction (5-10 minutes): Participating teachers agree on the goals of the meeting and 

agenda. 

2. How’s it going? (30-50 minutes): Each teacher provides a summary of what they did in 

relation to their action plan during the previous month. The other teachers listen and 

provide support for that teacher in moving their plans forward. 


3. New learning about formative assessment (25-40 minutes): The teacher leader or a small 

group of other teachers research and introduce new ideas in formative assessment to the 

group. The teachers engage in shared activities intended to improve their understanding 

of formative and summative assessment. 

4. Individual teacher planning (10-15 minutes): Based on the group discussion, feedback 

and new learning, teachers may want to revise their action plans. Teachers need time to 

think through what they are planning to do in the next month. They may also want to 

discuss new ideas with their colleagues. 

5. Review of the meeting (5 minutes): The lead teacher redirects the group to the original 

goals and objectives for the meeting and checks to see if they were achieved. 
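
The time ranges in the five-part process above can be added up to check how long such a meeting runs. The short sketch below does the arithmetic; the variable and function names are hypothetical, and the minutes are copied from the list above.

```python
# (agenda item, minimum minutes, maximum minutes) from the five-part process
AGENDA = [
    ("Introduction", 5, 10),
    ("How's it going?", 30, 50),
    ("New learning about formative assessment", 25, 40),
    ("Individual teacher planning", 10, 15),
    ("Review of the meeting", 5, 5),
]


def meeting_length(agenda):
    """Return the shortest and longest total meeting time in minutes."""
    shortest = sum(low for _, low, _ in agenda)
    longest = sum(high for _, _, high in agenda)
    return shortest, longest


shortest, longest = meeting_length(AGENDA)
print(f"Meeting length: {shortest} to {longest} minutes")
```

The shortest total, 75 minutes, matches the minimum meeting length suggested in Figure 2; using the upper end of every range would take two hours.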

Teacher learning communities have the potential to support the implementation of formative 

assessments while instilling ownership in teachers for their own professional development. The 

strategies mentioned above provide the foundation for a practical and workable model that will 

enable schools to initiate and sustain teacher professional development based on formative 

assessment. 

The Assessment Instruction Cycle 

During the assessment instruction cycle, teachers continuously observe student behavior, collect 

evidence, and make reasonable inferences about what students know. Assessment is central to 

teaching and to instruction—an invisible thread connects assessment, curriculum and teaching 

together in the service of learning. xxxi 

There are four major components to the assessment instruction cycle: (1) achievement 

expectations; (2) the cyclical nature of assessment and instruction; (3) multiple forms of 

assessment; and (4) evidence and feedback. xxxii 

The bases for state, district and classroom assessment, as well as curriculum and instruction in 

California are the State Science Content Standards. Achievement expectations start with the state 

standards and there is strong alignment among the state standards, the state adopted science 

curricula, the teachers’ instructional practices and the students’ learning goals. Student learning 

goals are clearly translated into plain language that all students can understand. Teachers guide 

students through well-defined learning progressions and students understand where they need to 

go next to accomplish their goals. Teachers also provide students with criteria for how their work 

will be judged and exemplars or models of quality student work. xxxiii 

Assessment and instruction are cyclical in nature. Teachers and students use assessment to 

monitor student progress, which in turn informs instructional decisions that support learning. 

Teachers assess, determine needs, provide descriptive feedback, set goals, provide guided 

practice, and keep the cycle in continuous motion. Students work with their teacher to know 

where they are in their learning continuum. With their teacher’s guidance, students track and 

manage their progress, assess and reflect on their learning, set goals, learn, and keep the cycle in 

continuous motion. xxxiv 


Teachers use multiple forms of assessment that yield accurate information about students to 

support their learning and achievement. Teachers are continuously collecting evidence, analyzing 

it, and providing timely descriptive feedback to students. The evidence and feedback are: directly 

related to the standards and to the students’ learning goals; communicated and understood by 

students to encourage self-reflection and goal setting; and used to show growth and improvement 

over time for students, teachers, and parents. xxxv 


IV. Examples of Quality Formative and Summative Science Assessments 

Assessments should provide students the opportunity to demonstrate their understanding of 

important and meaningful science content, to use scientific tools and processes, to apply their 

understandings to solve new problems, and to draw on what they have learned to explain new 

phenomena, think critically, and make informed decisions. xxxvi All assessments should have clear 

expectations for students, be valid, reliable, and free of bias. 

Validity 

Three types of validity are central to assessment: content validity; construct validity; and 

instructional validity. Content validity addresses the degree to which an assessment measures the 

intended content of the standards. Construct validity refers to the degree to which an assessment 

measures a “construct” or ability. The Investigation and Experimentation standards, for example, 

outline the skills or constructs necessary to engage in scientific inquiry. To make a valid claim 

about a student’s ability to conduct inquiry, the assessment would need to assess the range of 

skills in the Investigation and Experimentation standards. Finally, an assessment has 

instructional validity if the content of the test matches what is actually being taught during 

instruction. 

Reliability 

When assessments are reliable, they consistently measure what they are intended to measure. 

There are three kinds of consistency in classroom assessments: stability—the consistency of 

student scores over time; alternate test forms—consistency of results among two or more 

different forms of a test; and internal consistency—consistency in the way items on an 

assessment work. xxxvii 
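
Internal consistency is often summarized with a statistic such as Cronbach's alpha. The chapter does not name a specific statistic, so the sketch below is offered only as one common way to quantify the idea; the function name and the item scores are hypothetical.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Cronbach's alpha for a table of scores (one row per student, one column per item)."""
    num_items = len(scores[0])
    if num_items < 2:
        raise ValueError("alpha needs at least two items")
    item_columns = list(zip(*scores))            # regroup scores by item
    totals = [sum(row) for row in scores]        # each student's total score
    total_variance = pvariance(totals)
    if total_variance == 0:
        raise ValueError("total scores do not vary, so alpha is undefined")
    item_variance_sum = sum(pvariance(column) for column in item_columns)
    return num_items / (num_items - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical right/wrong (1/0) scores for four students on three items.
scores = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
alpha = cronbach_alpha(scores)
print(f"Cronbach's alpha: {alpha:.2f}")
```

Values closer to 1 indicate that the items tend to rank students in the same way; a real analysis would use far more students and items than this toy table.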

Bias 

Sometimes assessments can be biased against particular groups of students. When an assessment 

is biased, the constructs of the test cause students to perform poorly. All assessments should be 

free of bias—they should not penalize students because of their gender, ethnicity, socioeconomic 

status, religion, or other defining characteristics. Assessments should also not be offensive to 

students. xxxviii Different forms of bias include: xxxix 

• Content Bias: Does the assessment contain content that is different or unfamiliar to 

different groups? Example: asking girls to compare the mass of different footballs when 

they have not had experience with footballs. 

• Language Bias: Does the assessment contain words that have different or unfamiliar 

meanings for different groups? Example: asking urban students about farming techniques 

such as forage pits. 

• Item Structure and Format Bias: Does the nature of the task confuse members of different 

groups? Example: requiring English learners to write a long essay in English. 


• Stereotyping: Does the assessment give a positive representation of different groups? 

Assessments should be free of material that may be offensive, demeaning, or emotionally 

charged. 

• Fairness: Is the assessment balanced in terms of being equally familiar to every group? 

Tests should be free of words or phrases that are generally associated with elitism (polo, 

yacht, regatta); finances (venture capital, stock options); regionalisms (grinder, hoagie, 

parish); military topics (rapier, mortar, breech); political topics (alderman, pork barrel); 

legal topics (tort, docket); and farm topics (combine, thresher). 

Assessing the Science Content Standards for California Public Schools 

Assessments should cover the content of the standards at each grade level including the standards 

for Investigation and Experimentation. The Investigation and Experimentation standards are 

central to the role of assessment in the teaching of science. Involving students in scientific 

inquiry helps them develop proficiency in: 1) understanding scientific concepts; 2) appreciating how we know what we know in science; 3) understanding the nature of science; 4) inquiring about the natural world; and 5) using the skills and attitudes associated with science. xl 

The Investigation and Experimentation standards are multifaceted—they call for students to 

make observations, pose questions, make predictions, plan and conduct investigations, use tools 

to gather, analyze and use data, generate and evaluate evidence and explanations, use critical and 

logical thinking, examine information, consider alternative explanations, and communicate their 

results. 

Student understanding of this rich array of skills cannot be captured in a simple set of multiple-choice questions. Assessments should draw on different strategies, ranging from formative assessments, which include teacher observations, feedback, and challenge statements, to summative assessments, which include hands-on performance tasks, constructed response investigations, open-ended questions, portfolios, and well-constructed multiple-choice tests. 

Multiple Measures of Student Achievement 

Assessments should be based on multiple measures of student ability and include a variety of 

techniques for various learning styles and levels of readiness. Figure 4 below outlines examples 

of formative and summative assessments. 

Figure 4: Examples of Formative and Summative Assessments 

Formative 

Teacher Observation, Listening, Questioning and Feedback 

Self-reflection and Self-assessment 

Peer Assessment and Reflection 

Science Notebooks 

White Boards 

Summative 

Hands-On Performance Tasks 

Constructed Response 

Open-ended Questions 

Multiple-choice Questions 

Portfolios 


Graphic Organizers: Concept Maps, Concept Webs, Venn Diagrams, Flowcharts 

Challenge Statements 

Extended Research Projects 

Student Presentations 

Interviews 

Homework Assignments 

Interactive Computer Assessments 

Constructed Response Items 

Constructed response items require students to write their own answers. Student responses are 

scored with a scoring rubric tailored specifically to each task. Scoring rubrics can be holistic 

(where a single score is assigned to the entire task) or analytical (where each question on a task 

receives an individual score). Analytical rubrics are more diagnostic in nature and provide more 

detailed information regarding student understanding of science content and inquiry constructs in 

the task. 
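The difference between the two rubric types can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a prescribed scoring procedure: the question labels, point values, and function names are hypothetical.

```python
def holistic_result(score, max_score):
    """Holistic rubric: a single score is assigned to the entire task."""
    return {"total": score, "max": max_score}


def analytical_result(question_scores, max_per_question):
    """Analytical rubric: each question on the task is scored separately.

    Besides the total, the per-question scores show where a student lost
    points, which is what makes this rubric type more diagnostic.
    """
    total = sum(question_scores.values())
    max_total = sum(max_per_question.values())
    # Questions scored below full credit point to specific gaps.
    needs_review = [q for q, s in question_scores.items()
                    if s < max_per_question[q]]
    return {"total": total, "max": max_total, "needs_review": needs_review}


# Hypothetical scores for one student's four-question task (2 points each).
scores = {"Q1 prediction": 2, "Q2 data table": 2,
          "Q3 graph": 0, "Q4 conclusion": 1}
maximums = {"Q1 prediction": 2, "Q2 data table": 2,
            "Q3 graph": 2, "Q4 conclusion": 2}

print(holistic_result(5, 8))
print(analytical_result(scores, maximums))
```

Both calls report 5 of 8 points, but only the analytical result identifies graphing and the conclusion as the areas to revisit.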

Hands-on Performance Tasks 

Hands-on performance tasks integrate standards for life, earth and/or physical science with 

Investigation and Experimentation constructs. During a hands-on task, students are presented 

with a scenario identifying a problem that needs to be solved. Students are provided hands-on 

materials organized on a placemat and are asked to: make predictions; set up and conduct an 

investigation; record data and observations; organize data (graphs, charts, tables, etc.); explain if 

and how the results of their investigation either support or refute their prediction; analyze their 

results and use their own data and findings to explain their answers; use what they’ve learned in 

the task to make an application beyond the task; and/or think of another (new) question to 

investigate and briefly describe the steps of a plan for a new investigation. Students work with a 

partner to conduct their investigation and to collect their data. They work individually to record 

their answers in their test booklet. 

Examples of performance tasks are in Appendix A. 

Constructed Response Investigations 

Constructed Response Investigations are extended paper/pencil tasks that integrate science 

concepts with inquiry and investigation. Students are presented with a problem that hypothetical students in another school are trying to solve. They are provided a set of authentic data and a set of questions and are required to: analyze the problem and the data; graph and interpret data; interpret relationships on graphs; construct models, questions, predictions, and/or hypotheses; 

recommend solutions; and/or design new investigations to further explore the problem in the 

task. Although students usually work individually, these tasks can be designed to include 

information that students would discuss with a partner before writing their individual responses. 

Examples of constructed response tasks are in Appendix A. 


Open-ended Questions 

Open-ended questions are short paper/pencil tasks that focus on evaluating understanding and 

reasoning. They are designed to explore students’ abilities to: communicate scientific 

understandings; use inquiry; reason scientifically; express positions on societal issues; and 

design an experiment. Students are presented with a prompt, usually in the form of a problem or 

scenario, and asked to communicate their understandings of scientific concepts and processes. 

Students work individually to record their responses in their test booklet. 

Examples of open-ended questions are in Appendix A. 

Challenge Statements 

Challenge Statements are assessment probes designed to investigate students’ thinking about 

important science concepts. The assessment probe consists of a deliberately provocative or 

ambiguous statement about a science concept such as—“As electrical current passes through 

devices such as light bulbs and motors, some of it gets used up.” The learner is asked to agree or 

disagree with the statement and to explain their reasoning. Students are expected to explain their 

thinking using everyday language rather than academic vocabulary, because academic vocabulary can act as a screen that hides misconceptions. The goal of Challenge Statements is to make student thinking visible rather than let misconceptions hide behind science vocabulary. 

Challenge Statements are used before and after a unit of instruction. Students start by thinking 

about the Challenge Statement and writing their thoughts individually. They discuss their ideas 

with their peers and then have an opportunity to revise their statement based on input from their 

group. Challenge Statements demand deeper thinking and investigation. They set the stage for 

meaningful discussion as part of learning. 

Challenge Statements are evaluated using a 5-point rubric modeled after the five levels of 

proficiency measured in the California Standards Tests. In evaluating responses, valid 

conceptions and sophistication of reasoning are considered. 

Student Science Notebooks 

Student Science Notebooks engage students in scientific thinking as they explore questions, 

make predictions, plan and conduct investigations, collect, organize and use data, apply their 

learning, and communicate their understanding of science. As an assessment tool, science 

notebooks have been found to: help students construct their conceptual thinking; inform and 

guide instruction; enhance literacy skills; support differentiated learning; and foster teacher 

collaboration. 

White Boards 

White Boards are powerful tools for allowing students to make their thinking visible. The use of 

white boards at the beginning of an instructional unit is an effective way to elicit students’ prior 

knowledge of the content to be taught. Before teaching a fourth grade lesson on circuits, a 

teacher may ask the class to quickly draw a complete circuit on their white boards and hold them 

up. The teacher can easily find out which students understand circuits and use this information to 


teach the lesson. During the lesson, the teacher may ask expert students to use their white boards 

to explain their thinking. This provides novice learners an opportunity to learn from expert 

thinking, which is usually hidden. xli At the end of the lesson, the teacher may have the students 

use the white boards to show what they learned and use this information to prepare for the next 

lesson. 

Graphic Organizers: Concept Maps, Venn Diagrams, Flowcharts 

Graphic organizers, such as concept maps, Venn diagrams, and flowcharts, are mental maps of 

student thinking and understanding. Concept maps help students see the connections between 

concepts and the differences among concepts. Venn diagrams help students see the relationships 

between ideas, and flowcharts can help students to sequence events. Like white boards, they can 

be used as assessment strategies for making student thinking visible, helping teachers assess 

what students do and do not understand. 

Portfolios 

Portfolios are collections of student work designed to provide the best evidence of a student’s 

scientific literacy. They are used to measure student growth over time, showing achievement of 

science concepts, the deepening of understanding of the scientific method, and the growth of 

both communication and problem solving skills. Through portfolios, students can become 

actively engaged in their own learning, gaining a sense of pride and ownership of their work. As 

an assessment tool, portfolios provide opportunities for students to: reflect on and self-evaluate 

their learning and work; select a variety of different types of work they think best represent their 

understanding of science; and learn how to score and evaluate the work of peers. Teachers use 

student portfolios to evaluate the progress of the student, the class, the curriculum, and their 

instruction. 

Interactive Computer Tasks 

Computer simulations can present students with rich, interactive assessments that model systems 

in the natural world. Science simulations can model authentic environments and make visible concepts that are difficult to represent in a graphic format, such as convection currents, the movement of molecules in solids, liquids, and gases, and plate tectonics. In an interactive computer 

task, students have the opportunity to manipulate stimuli that they would not be able to 

manipulate in real time. In an assessment of plate tectonics and Earth’s structure, for example, 

students can investigate the results of different plate movements or how wind, water, and ice 

shape and reshape Earth’s surface. Interactive computer simulations allow students to 

demonstrate their understandings of science content and inquiry in an active manner. Moreover, 

computer technology associated with simulations can provide automatic feedback to students and 

teachers and can help to inform and guide instruction. 

Select Response Items 

Select response items are commonly called multiple-choice items. In responding to a multiple-choice item, students select one of four possible answer choices and record their responses on a separate answer sheet. Each multiple-choice item is aligned to only one content standard and contains a stem, in either a question or a completion format, and four different answer choices with only one correct answer. The four answer choices should be approximately the same length, 


have the same format, and have parallel syntax and semantic structures. At least 10 items are 

needed for each standard to reliably report student achievement for that standard. Ten items are 

also needed to reliably report student achievement for each domain level of life science, earth 

science, physical science, and investigation and experimentation. Two types of multiple-choice items follow. 
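The ten-item guideline can be checked against a draft test before it is given. The sketch below assumes each item is tagged with exactly one standard code. The codes shown are placeholders rather than actual California standard identifiers, and the function name is hypothetical.

```python
from collections import Counter

MIN_ITEMS = 10  # minimum number of items per standard stated in the text


def reportable_standards(item_standards, min_items=MIN_ITEMS):
    """Count the items aligned to each standard and flag whether there are
    enough of them to report achievement for that standard reliably."""
    counts = Counter(item_standards)
    return {std: (n, n >= min_items) for std, n in counts.items()}


# Hypothetical 24-item draft test; one standard code per item.
draft_test = ["PS1"] * 12 + ["LS2"] * 10 + ["ES3"] * 2

for std, (n, ok) in reportable_standards(draft_test).items():
    status = "enough items" if ok else "too few items to report"
    print(f"{std}: {n} items, {status}")
```

In this made-up draft, the two items on the third standard could still inform instruction, but they would not support a reliable score for that standard.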

Regular Multiple-choice Items 

A well-constructed multiple-choice item may be a valuable component of an assessment system 

because it can provide broad coverage of important topics and allow students to demonstrate a 

variety of skills and knowledge. Many “regular” multiple-choice items focus on lower-level recall, assessing small, topical pieces of information such as the parts of a cell or the year in which helium was discovered. Multiple-choice items that require higher-level thinking focus more on important skills and can probe analytical reasoning. 

While any incorrect student answer can qualify as a misconception, there is a relatively large 

research base of documented student misconceptions in science. Documented misconceptions 

have been studied and confirmed by researchers through thorough investigations. Documented 

common student misconceptions in science can be built into the answer choices. If documented 

misconceptions are used in the answer choices, it is recommended that only one of the four 

answer choices contain the documented misconception. 

Justified Multiple-choice Items 

A modified multiple-choice question is called a justified multiple-choice question. Students 

select an answer choice and then explain why they think the answer is correct. Students are 

directed to use their understanding of specific science content and inquiry to explain why their 

answer is correct. Teachers use scoring rubrics specific to each question to score student work. 

Examples of justified multiple-choice questions are in Appendix A. 

Graphic Organizers for Monitoring and Tracking Formative and Summative Assessments 

aligned to the California Science Content Standards 

Teachers can use various methods to monitor and track different classroom assessments aligned 

to the California Science Content Standards. The matrix shown in Figure 5 below shows general 

headings for formative and summative assessments. Enduring California science standards for 

grade 4 are listed down the left side of the matrix. Teachers can monitor and track specific 

assessments for formative and summative categories in the cells. 


Figure 5: Graphic Organizer for Monitoring Formative and Summative Assessments 

aligned to the California Science Content Standards 


By using a variety of assessments that have clear expectations for students and are closely linked 

to the standards and to learning goals, teachers can capture the full range of student 

understanding and progress. They can also use the resulting data in thoughtful and powerful 

ways to improve student learning and achievement and to inform and guide their instruction. 


V. Analyzing and Using Data and Results 

Results from classroom assessments provide quality feedback to teachers, allowing them to: 

improve student learning and achievement; inform and modify instruction; plan curriculum; 

target teaching; and research teaching practices. 

Once teachers collect data and results, they need to make sense of their findings before they can 

apply them to improve learning and instruction. Analyzing data involves: looking for patterns or trends in individual student work and for similar patterns in the work of all students in 

the class; reflecting on inferences and plausible explanations for findings; making sense out of 

clusters of information that go together; and making informed decisions for using the results with 

students and with their instruction. 

Tally Sheets 

Tally Sheets can be designed to record and analyze student results for multiple-choice tests. The 

Tally Sheet is a matrix with the item numbers and the codes for the standards assessed identified 

across the top of the matrix and the names of the students listed down the left side of the matrix. 

The teacher could enter (+) for a correct answer and (–) for an incorrect answer and then tally the 

number correct for each student and for each standard. By reading across the matrix from the left 

side to the right side, teachers can quickly determine how many items each student responded to 

correctly. By reading from the top of the matrix to the bottom of the matrix for each item, 

teachers can quickly determine which standards on this particular test were difficult for students 

and which were not. In order to make a reliable inference about student understanding of a single 

standard, there must be at least ten items for each standard. Figure 6 below shows a tally sheet 

made in Excel for recording student responses to a multiple-choice test. Several Tally Sheets can 

be made in Excel to keep track of student results and progress. 

Figure 6: Tally Sheet for Multiple-choice Answers 
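For teachers who prefer a script to a spreadsheet, the same tallies can be sketched in Python. The student names, item numbers, and standard codes below are hypothetical, and the sample is far smaller than the ten items per standard needed for a reliable inference.

```python
# Item number -> code of the standard the item assesses (placeholder codes).
item_standards = {1: "PS1", 2: "PS1", 3: "LS2", 4: "LS2"}

# Each student's marks: "+" for a correct answer, "-" for an incorrect one.
responses = {
    "Student A": {1: "+", 2: "+", 3: "-", 4: "+"},
    "Student B": {1: "+", 2: "-", 3: "-", 4: "+"},
    "Student C": {1: "-", 2: "+", 3: "-", 4: "-"},
}


def tally_by_student(responses):
    """Read across each row: number of items each student answered correctly."""
    return {student: sum(1 for mark in marks.values() if mark == "+")
            for student, marks in responses.items()}


def tally_by_item(responses, item_standards):
    """Read down each column: number of students who answered each item correctly."""
    return {item: sum(1 for marks in responses.values() if marks[item] == "+")
            for item in item_standards}


def tally_by_standard(responses, item_standards):
    """Add the column tallies for all items aligned to the same standard."""
    totals = {}
    for item, correct in tally_by_item(responses, item_standards).items():
        std = item_standards[item]
        totals[std] = totals.get(std, 0) + correct
    return totals


print(tally_by_student(responses))
print(tally_by_item(responses, item_standards))
print(tally_by_standard(responses, item_standards))
```

In this made-up class, no student answered item 3 correctly, which is the kind of column pattern the Tally Sheet is meant to surface.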





Tally Sheets can also be used to capture and analyze information from a hands-on performance 

task. A hands-on performance task was administered to eighth grade students in a large urban 

school district. The students investigated variables related to force and motion. After the students 

took the test, each question in their booklets was scored with an analytical rubric and 

summarized in the Tally Sheet in Figure 7 below. 

The parts of the performance task and associated questions are listed at the top of the matrix. The score points (1 for a correct response, 0 for an incorrect response, and B for blank) are listed down the left side of the matrix. The data, reported as percentages of the 4,500 students tested, are recorded in each cell of the table. The data for question 3B, for example, show that 76% of the 4,500 eighth grade students correctly recorded data from their investigation in a data table, while 23% did not record data in a table correctly and 1% left the question blank. In contrast, the data for question 4 show that only 34% of the 4,500 students were able to organize their results correctly on a graph, while 62% did not graph their data correctly and 4% did not attempt to graph the data from their investigation. 

Figure 7: Tally Sheet Showing Student Results for an Eighth Grade Performance Task 
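A short Python sketch shows how percentages like these are produced from scored responses and what they mean in head counts. The 20-response sample and the function names are hypothetical. The percentages for questions 3B and 4 are the ones reported in the text, and because they are rounded to whole numbers, the counts derived from them are approximate.

```python
from collections import Counter


def score_distribution(scores):
    """Percentage of responses at each score point: 1, 0, or "B" (blank)."""
    counts = Counter(scores)
    total = len(scores)
    return {code: round(100 * counts[code] / total) for code in (1, 0, "B")}


def approximate_count(percent, n_students):
    """Convert a reported whole-number percentage into a head count."""
    return percent * n_students // 100


# Hypothetical class: 15 correct, 4 incorrect, 1 blank on one question.
sample = [1] * 15 + [0] * 4 + ["B"]
print(score_distribution(sample))

# Percentages reported in the text for the 4,500 students tested.
reported = {
    "3B": {"correct": 76, "incorrect": 23, "blank": 1},
    "4": {"correct": 34, "incorrect": 62, "blank": 4},
}
counts = {question: {k: approximate_count(p, 4500) for k, p in row.items()}
          for question, row in reported.items()}
print(counts)
```

Read as head counts, roughly 2,790 of the 4,500 students graphed their data incorrectly on question 4 and about 180 did not attempt the graph.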




The information in Figures 6 and 7 allows teachers to use data from a summative test to inform 

instruction and improve student learning. Teachers can identify specific areas where students are 

experiencing difficulty and target their instruction to address these areas. This allows teachers to 

use results from a summative test in a formative manner. Furthermore, research shows that when 

teachers identify specific student weaknesses and target their instruction using metacognitive 

teaching strategies to address those weaknesses, student achievement improves significantly. xlii 


Assessment data should be drawn from multiple sources and triangulated. Triangulation is a 

technique of using data from three different sources to determine student achievement of specific 

content. Three different sources of data provide teachers three different perspectives of student 

work and understanding of that content, making their inferences about student understanding 

more reliable. 
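One way to picture triangulation is as a comparison of three independent judgments about the same content. The sketch below is an assumption made for illustration, not a procedure given in this chapter: the source names, the yes/no judgments, and the wording of the outcomes are all hypothetical.

```python
def triangulate(judgments):
    """Combine three sources of evidence about one student and one concept.

    `judgments` maps a source name to True (the work shows understanding)
    or False (it does not). The summary rule used here is an illustrative
    assumption: full agreement is treated as consistent evidence, and
    anything else prompts a closer look at the student's work.
    """
    if len(judgments) != 3:
        raise ValueError("triangulation uses exactly three sources of data")
    positive = sum(1 for shows_understanding in judgments.values()
                   if shows_understanding)
    if positive == 3:
        return "consistent evidence of understanding"
    if positive == 0:
        return "consistent evidence of difficulty"
    return "mixed evidence: review the student's work"


# Hypothetical evidence for one student on one concept.
evidence = {
    "science notebook": True,
    "hands-on performance task": True,
    "multiple-choice test": False,
}
print(triangulate(evidence))
```

When the sources disagree, as in this example, the disagreement is itself useful information, because it tells the teacher where to look more closely.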

The Logic Model for Assessment in Figure 8 shows a graphic representation for triangulating data from pre- and posttests, formative and summative assessments, and the state Content Standards Test. 

In this model, the grey box in the middle represents the formative and summative assessments 

that take place during the course of standards-based instruction throughout the school year. At 

the start of instruction in the fall, the teacher administers a pretest to determine students’ prior 

knowledge of the science concepts for that particular grade level. In this scenario, the school is 

participating in a CaMSP and is required to pre- and posttest students. Throughout the course of the 

year, the teacher engages in continuous formative and summative assessment. In the spring, the 

teacher administers the California Standards Test for science and at the end of the year, the 

posttest is administered. 

The model shows that the intent of the data from the pre- and posttest and the CST is to: see how 

well students are achieving the Science Content Standards; determine if the school is meeting its 

state performance targets in science; investigate program effects among the schools participating in the CaMSP; determine program impact; and inform local and state evaluators. 

The model also shows that data from all assessments are triangulated to form a culminating body 

of evidence. At a larger grain size, the results of the culminating body of evidence are used to 

inform and guide instruction, inform and guide professional development, plan instruction, allocate resources, and disseminate findings of what worked to the larger learning network. 


Figure 8: Logic Model for Assessment 





VI. Assessing English Learners and Special Needs Students 

Inclusiveness of Assessments 

The principles of universal design help to make assessments accessible to all students. The 

application of universal design principles to the development of classroom assessments will: xliii 

• Allow for the widest range of student participation, including students with 

disabilities and English Language Learners (ELL) 

• Ensure that the assessments themselves are not obstacles to improved learning 

• Provide valid inferences about the performance of all students 

• Provide each student a comparable opportunity to demonstrate their understanding of 

the content tested 

The seven elements of universally designed assessments include: 

1. Inclusive assessment population—addresses the context of the entire student population 

to be assessed. California classrooms include students with different cognitive, cultural, 

and linguistic backgrounds. These students represent a wide range of skills, abilities, and 

diverse learning needs. 

2. Precisely designed construct—recommends that all assessments are designed to measure 

what they intend to measure. Formative and summative assessments at all grade levels 

need to closely align to the intent and content of the standards. 

3. Accessible, non-biased items—maintains that all items used in classroom assessment are 

not biased against any groups of students. 

4. Amenable to accommodations—addresses the use of appropriate accommodations during 

testing. While experts maintain that universally designed assessments will be accessible 

to most students, some students will still require accommodations. These 

accommodations can include: alternate settings (alternate rooms, non-school settings, 

special lighting, furniture, and/or acoustics, other school personnel); scheduling and 

timing (to correspond with medical or learning needs, short breaks, extended time); 

presentation formats (Braille, large print, signing directions, translation, underlining 

words/phrases, visual magnification or reduction, acetate shields); and response formats 

(use of word processor, typewriter, computer, adult transcription, Brailler, student 

dictation). 

5. Simple, clear, and intuitive instructions and procedures—maintains that students should 

respond to a task in the manner that the test developer intended. Regardless of a student’s 

ability, language skills, knowledge, or experience, test directions and instructions need to 

be simple, clear, consistent, and easy to understand. 

6. Maximum readability and comprehension—focuses on the use of vocabulary and 

sentence complexity appropriate for an intended grade level. Research shows that 

linguistic simplification of vocabulary—the use of plain language—can benefit all 

students, including students with limited English proficiency. Plain language strategies 


include: reducing wordiness and removing irrelevant material; eliminating unusual or low 

frequency words; avoiding ambiguous and irregularly spelled words; avoiding proper 

names; avoiding inconsistent naming and graphic conventions; and marking all questions. 

7. Maximum legibility—refers to clear, uncomplicated, and legible text, graphs, tables, and 

graphics, and response formats. 

English Language Learners 

Science teachers who assess English learners will need to ensure that these learners have a reasonable way to communicate what they are learning. Language barriers in the testing process need to be reduced so that the focus of the assessment is on science learning, not on the mastery of English. xliv 

A variety of accommodations can be implemented that can make assessments fair for English 

learners. These accommodations should address the same content standards for all students while, at the same time, offering students different ways of performing that respect their differences and yield accurate results. Accommodations are intended to elicit the most accurate information about what students know and can do without giving students who receive an accommodation an unfair advantage over students who do not. xlv 

The table in Figure 9 below describes common testing accommodations that teachers may use in 

their classrooms with English learners. These accommodations can be used with formative and 

summative assessments. xlvi 

Figure 9: Assessment Accommodations for English Learners 

Test Accommodation: Purpose or Use 

Extra Time: Extra time is required to read and understand test questions. English learners need to engage in extra thinking to respond to questions in English. 

Word Walls, Glossaries, Dictionaries: Word walls created during instruction provide reference during assessment so English learners can communicate understanding more easily. Use English and/or bilingual dictionaries when appropriate. 

Notes in Primary Language: Student notes from instruction in their primary language help them to produce answers they know in their primary language. 

Models & Rubrics: Provide models of expected work for students who have not experienced the type of assessment before. Preview the rubric that will be used to score student work. Previewing models and rubrics before an assessment helps students understand assessment objectives. 

Enhanced Test Directions: Read directions aloud and rephrase them so that students know what is expected. Simplify test directions as much as possible, one step at a time, allowing students to respond in between steps. 

Checklists: Use checklists for directions. 

Oral Responses: Test anxiety can make communication in English more difficult. Allow English learners to give oral responses. Prompt students individually and scaffold the conversation to elicit meaningful responses. Provide support for constructed response items with sentence frames for written answers. 

Illustrations, Graphic Organizers: Allow students to express ideas with labeled drawings, diagrams or graphic organizers. Ask students to follow up with oral explanations or demonstrations. 

Hands-on Activities: Have students perform an activity or experiment and tell what they are doing and thinking. Orally prompt students as needed. 

Language Conventions: Focus on student understanding of science content during a science assessment and ignore language conventions. Address language conventions during instruction. 

Small Groups: Administer assessments to small groups of English learners using prompts and scaffolds and allowing for oral responses. 

Special Needs Students 

Students with special needs should have access to the same standards-based curriculum and high-quality instruction as students without disabilities. This can be accomplished through: a) 

adaptations in delivery of content to make it accessible to students’ level of understanding, and 

b) differentiation in level of expectation for student achievement to focus on prioritized target 

skills within that content that are both meaningful to students and build growth in academic 

learning. 


VII. The California Standards Test 

The purpose of the California Standards Test (CST) is to determine students’ achievement of the 

California content standards for each grade or course in science. Students’ scores are compared 

to preset criteria to determine whether the students’ performance on the test is advanced, 

proficient, basic, below basic, or far below basic. The state target is for all students to score at the 

proficient and advanced levels. CST scores are used for calculating each school’s Academic 

Performance Index (API) and Adequate Yearly Progress (AYP). 

The California Science Standards Tests are multiple-choice and administered annually to 

students in grades five, eight, and ten. The following tables provide information about the 

content and test blueprints for each grade level test. 

Grade 5 

Content Area                       Grade Level Standards   Number of Items  Percentage on Test
Physical Science                   5                       8                29
                                   4                       6
Life Science                       5                       7                29
                                   4                       7
Earth Science                      5                       8                29
                                   4                       6
Investigation and Experimentation  5                       4                13
                                   4                       2
Total                                                      48               100

Reference Sheets: Periodic Table of Elements; Mineral Information 

Grade 8 

Content Area                       Content Standards                           Number of Items  Percentage on Test
Physical Science                   Motion                                      8                13
                                   Forces                                      8                13
                                   Structure of Matter                         9                15
                                   Earth in the Solar System (Earth Science)   7                12
                                   Reactions                                   7                12
                                   Chemistry of Living Systems (Life Science)  3                5
                                   Periodic Table                              7                12
                                   Density and Buoyancy                        5                8
Investigation and Experimentation                                              6                10
Total                                                                          60               100

Reference Sheets: Periodic table of the elements, formulas, and conversions 


Grade 10 

Content Area                       Content Standards                           Number of Items  Percentage on Test
Life Science                       Cell Biology                                10               17
                                   Genetics                                    12               20
                                   Ecology                                     11               18
                                   Evolution                                   11               18
                                   Physiology                                  10               17
Investigation and Experimentation                                              6                10
Total                                                                          60               100


Students in grade 10 who completed a standards-based science course take one of the tests listed below in addition to taking the Grade 10 Life Science Test. Students in grades 9 through 11 who completed a standards-based science course take one of the following CSTs. 

Biology/Life Science 

Content Area                       Content Standards                           Number of Items  Percentage on Test
Life Science                       Cell Biology                                9                15.0
                                   Genetics                                    19               31.6
                                   Ecology                                     7                11.7
                                   Evolution                                   9                15.0
                                   Physiology                                  10               16.7
Investigation and Experimentation                                              6                10.0
Total                                                                          60               100



Chemistry 

Content Area                       Content Standards                           Number of Items  Percentage on Test
Physical Science                   Atomic & Molecular Structure                6                10.0
                                   Chemical Bonds                              7                11.7
                                   Conservation of Matter and Stoichiometry    10               16.7
                                   Gases and Their Properties                  6                10.0
                                   Acids and Bases                             5                8.3
                                   Solutions                                   3                5.0
                                   Chemical Thermodynamics                     5                8.3
                                   Reaction Rates                              4                6.7
                                   Chemical Equilibrium                        4                6.7
                                   Organic Chemistry and Biochemistry          2                3.3
                                   Nuclear Processes                           2                3.3
Investigation and Experimentation                                              6                10.0
Total                                                                          60               100

Reference Sheets: Chemistry Formulas, Units & Constants; Chemistry Periodic Table of Elements 

Earth Sciences 

Content Area                       Content Standards                           Number of Items  Percentage on Test
Earth Science                      Earth’s Place in the Universe               12               20.0
                                   Dynamic Earth Processes                     9                15.0
                                   Energy in the Earth System                  18               30.0
                                   Biogeochemical Cycles                       5                8.3
                                   Structure and Composition of the Atmosphere 5                8.3
                                   California Geology                          5                8.3
Investigation and Experimentation                                              6                10.0
Total                                                                          60               100



Physics 

Content Area                       Content Standards                           Number of Items  Percentage on Test
Physical Science                   Motion and Forces                           12               20.0
                                   Conservation of Energy and Momentum         12               20.0
                                   Heat and Thermodynamics                     9                15.0
                                   Waves                                       10               16.7
                                   Electric and Magnetic Phenomena             11               18.3
Investigation and Experimentation                                              6                10.0
Total                                                                          60               100
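The percentage column in these blueprints is each strand's item count divided by the total number of items on the test. The Python sketch below recomputes the Physics percentages from the item counts given above. Labeling the unlabeled six-item row as Investigation and Experimentation is an assumption, and the variable names are hypothetical.

```python
# Item counts from the Physics blueprint. The label of the final
# six-item row is assumed; the source table leaves it blank.
physics_items = {
    "Motion and Forces": 12,
    "Conservation of Energy and Momentum": 12,
    "Heat and Thermodynamics": 9,
    "Waves": 10,
    "Electric and Magnetic Phenomena": 11,
    "Investigation and Experimentation (assumed label)": 6,
}

total_items = sum(physics_items.values())

# Percentage on test = items for the strand / total items, to one decimal place.
percentages = {strand: round(100 * n / total_items, 1)
               for strand, n in physics_items.items()}

print(total_items)
for strand, pct in percentages.items():
    print(f"{strand}: {pct}")
```

The same calculation can be used to check any of the other blueprints, or a teacher-made test, against its intended emphasis.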


The California Standards Tests for Science also contain four additional tests that students can 

take in conjunction with the tenth grade test. These four tests are designed to integrate/coordinate 

concepts from life science, earth science, physical science, and investigation and experimentation 

together. 


VIII. Shifting Classroom Assessment “More of/ Less of Chart” 

More of 

The process of continuous formative and summative assessment 

Assessment data informs and guides instruction 

Students have clarity of learning goal(s) 

Students receive descriptive feedback 

Teacher selects unbiased and fair assessment tools for a purpose 

Teachers use multiple measures to assess student understanding 

Less of 

Assessment only for grading 

Assessment data not used for instruction 

Students have limited to no knowledge of learning goal(s) 

Students receive a grade or non-descriptive feedback 

Teacher uses assessment tools without consideration of bias, fairness, or purpose 

Teachers only use multiple-choice questions 

IX. Conclusion 

The ultimate goal of assessment is to improve student understanding and achievement of important and meaningful science. A further goal is for students to develop inquiry skills and habits of mind that will enable them to become fully proficient in science.


References 

i Shepard, L.A. (2005). Formative assessment: Caveat emptor. Paper presented at ETS Invitational Conference, The Future of Assessment: Shaping Teaching and Learning, New York.

ii Popham, J.W. (2000). Modern Educational Measurement. Practical Guidelines for Educational Leaders. Needham, MA: Allyn & Bacon.

iii National Research Council. (2001). Classroom Assessment and the National Science Education Standards. Committee on Classroom Assessment and the National Science Education Standards. Washington, DC: National Academy Press.

iv Shepard, L.A. (2005). Formative assessment: Caveat emptor. Paper presented at ETS Invitational Conference, The Future of Assessment: Shaping Teaching and Learning, New York.

v National Research Council. (1996). National Science Education Standards. National Committee on Science Education Standards and Assessment. Washington, DC: National Academy Press.

vi Shepard, L.A. (2005). Formative assessment: Caveat emptor. Paper presented at ETS Invitational Conference, The Future of Assessment: Shaping Teaching and Learning, New York.

vii National Research Council. (2001). Classroom Assessment and the National Science Education Standards. Committee on Classroom Assessment and the National Science Education Standards. Washington, DC: National Academy Press.

viii Leahy, S., Lyon, C., Thompson, M., & Wiliam, D. (2005). Classroom Assessment: Minute by Minute, Day by Day. Educational Leadership, 63(3).

ix National Research Council. (1996). National Science Education Standards. National Committee on Science Education Standards and Assessment. Washington, DC: National Academy Press.

x Shepard, L.A. (2005). Formative assessment: Caveat emptor. Paper presented at ETS Invitational Conference, The Future of Assessment: Shaping Teaching and Learning, New York.

xi Shepard, L.A. (2005). Formative assessment: Caveat emptor. Paper presented at ETS Invitational Conference, The Future of Assessment: Shaping Teaching and Learning, New York.

xii Popham, J.W. (2008). Transformative Assessment. Alexandria, VA: Association for Supervision and Curriculum Development.

xiii Wiliam, D. (2007). Changing Classroom Practice. Educational Leadership, 65(4), 36-42.

xiv Wiliam, D. (2007). Chapter 9, Content Then Process: Teacher Learning Communities in the Service of Formative Assessment. Solution Tree. p. 191.

xv Wiliam, D. (2007). Chapter 9, Content Then Process: Teacher Learning Communities in the Service of Formative Assessment. Solution Tree.


xvi National Research Council. (2001). Classroom Assessment and the National Science Education Standards. Committee on Classroom Assessment and the National Science Education Standards. Washington, DC: National Academy Press.

xvii Black, P. (2004). The Nature and Value of Formative Assessment for Learning. (Draft paper). King's College, London.

xviii Marzano, R. J., Pickering, D.J., & Pollock, J.E. (2001). Classroom Instruction that Works. Alexandria, VA: Association for Supervision and Curriculum Development.

xix Butler, R. (1987). Task-involving and ego-involving properties of evaluation: Effects of different feedback conditions on motivational perceptions, interests and performance. Journal of Educational Psychology, 79(4), 474-482.

xx Marzano, R. J., Pickering, D.J., & Pollock, J.E. (2001). Classroom Instruction that Works. Alexandria, VA: Association for Supervision and Curriculum Development.

xxi Sadler, R. (1989). Formative assessment and the design of instructional systems. Instructional Science, 18, 119-144.

xxii Black, P. (2004). The Nature and Value of Formative Assessment for Learning. (Draft paper). King's College, London.

xxiii Sadler, R. (1989). Formative assessment and the design of instructional systems. Instructional Science, 18, 119-144. Black, P. (2004). The Nature and Value of Formative Assessment for Learning. (Draft paper). King's College, London.

xxiv Foster, G., Sawicki, E., Schaeffer, H., & Zelinski, V. (2002). I Think, Therefore I Learn! Ontario, Canada: Pembroke.

xxv Black, P. (2004). The Nature and Value of Formative Assessment for Learning. (Draft paper). King's College, London.

xxvi Wiliam, D. (2007). Chapter 9, Content Then Process: Teacher Learning Communities in the Service of Formative Assessment. Solution Tree. pp. 192-194.

xxvii Wiliam, D. (2007). Chapter 9, Content Then Process: Teacher Learning Communities in the Service of Formative Assessment. Solution Tree. pp. 192-194.

xxviii Wiliam, D. (2007). Changing Classroom Practice. Educational Leadership, 65(4), 36-42.

xxix Wiliam, D. (2007). Changing Classroom Practice. Educational Leadership, 65(4), 36-42.

xxx Wiliam, D. (2007). Changing Classroom Practice. Educational Leadership, 65(4), 36-42.

xxxi National Research Council. (1996). National Science Education Standards. National Committee on Science Education Standards and Assessment. Washington, DC: National Academy Press.


xxxii Stiggins, R. (2005). Measuring Up. PL October 2005. [Need to find complete reference.]

xxxiii Stiggins, R. (2005). Measuring Up. PL October 2005. [Need to find complete reference.]

xxxiv Stiggins, R. (2005). Measuring Up. PL October 2005. [Need to find complete reference.]

xxxv Stiggins, R. (2005). Measuring Up. PL October 2005. [Need to find complete reference.]

xxxvi National Research Council. (1996). National Science Education Standards. National Committee on Science Education Standards and Assessment. Washington, DC: National Academy Press.

xxxvii Popham, J.W. (2002). Classroom Assessments. What Teachers Need to Know. Boston, MA: Allyn & Bacon.

xxxviii Popham, J.W. (2002). Classroom Assessments. What Teachers Need to Know. Boston, MA: Allyn & Bacon.

xxxix ETS [Need to find reference.]

xl National Research Council. (1996). National Science Education Standards. National Committee on Science Education Standards and Assessment. Washington, DC: National Academy Press.

xli Georghiades, P. (2004). From the general to the situated: Three decades of metacognition. International Journal of Science Education, 26(3), 365-383.

xlii Comfort, K. B., Klein, S., & Bolus, R. (2005). Research in standards-based science assessment: Investigating teacher understanding and use of science assessment data. Unpublished manuscript.

xliii Thompson, S. J., Johnstone, C. J., & Thurlow, M. L. (2002). Universal design applied to large-scale assessments (Synthesis Report 44). Minneapolis, MN: University of Minnesota, National Center on Educational Outcomes.

xliv Carr, J., Sexton, U., & Lagunoff, R. (2007). Making Science Accessible to English Learners. San Francisco, CA: WestEd.

xlv Carr, J., Sexton, U., & Lagunoff, R. (2007). Making Science Accessible to English Learners. San Francisco, CA: WestEd.

xlvi Carr, J., Sexton, U., & Lagunoff, R. (2007). Making Science Accessible to English Learners. San Francisco, CA: WestEd.


