Reliability in Classroom Observations - Harvard Graduate School of ...
Reliability in Classroom Observations
National Institute for Excellence in Teaching
October 27, 2011
Kristan Van Hook
Senior Vice President, Public Policy and Development
Anissa Rodriguez
Senior Program Specialist
About NIET
The National Institute for Excellence in Teaching (NIET) is a 501(c)(3) nonprofit organization that pursues its mission to increase educator effectiveness through two signature initiatives:
– TAP: The System for Teacher and Student Advancement, and
– the NIET Best Practices Center (BPC)
TAP Reaches Teachers and Students Nationwide
For more than a decade, TAP has pioneered a comprehensive school reform focused on the quality of teaching and the advancement of effective teachers.
In the 2011-12 school year, TAP will reach:
• 13 states
• 500 schools
• 20,000 teachers
• 200,000 students
The NIET Best Practices Center
• Based on more than a decade of experience in schools across the country, the Best Practices Center (BPC) works with its partners to redesign educator evaluation systems to more effectively measure performance and support improvements in instructional practice.
• The BPC also provides support for performance-based compensation systems and creating teacher leadership roles in schools.
• With proven results and leadership in educator quality and reform, BPC works to engage schools, districts and states through:
Service, Support, Solutions
The TAP System’s Lessons Learned for Designing Better Teacher Evaluation Systems
TAP Elements of Success
• Multiple Career Paths
• Ongoing Applied Professional Growth
• Instructionally Focused Accountability
• Performance-Based Compensation
Classroom Observation Component of Teacher Evaluation
Fair evaluations based on clearly defined, research-based standards
• Multiple evaluations of classroom practice
• By multiple trained and certified evaluators
• Post-conferences following each evaluation; pre-conferences before announced observations
• Follow-up support through PD, individual coaching and access to online resources
TAP Teacher Evaluations versus Traditional Teacher Evaluations
[Figure: two panels, each with a vertical axis from 0% to 70%. Left panel: Observational Ratings in Urban Districts with Non-Binary Scales Reported in The Widget Effect (Lowest Ratings, Middle Ratings, Highest Ratings). Right panel: Observational (SKR) Ratings in TAP Schools Before Value-Added Scores are Calculated (horizontal axis from 1 to 5 in steps of 0.5).]
Similar Scale for Classroom Observations and Value-Added Scores
Classroom Observations: 1 = Unsatisfactory, 3 = Proficient, 5 = Exemplary
Value-Added Scores:
1 = Much less than a year’s growth
2 = Less than a year’s growth
3 = One year’s growth
4 = More than a year’s growth
5 = Much more than a year’s growth
Correlation Between TAP’s Measures of Teacher Performance
[Figure: Teacher Value Added Score (vertical axis, 1 to 5) plotted against Teacher Skills, Knowledge and Responsibilities (SKR) Score (horizontal axis, 1 to 5), with separate series for high-performing, medium-performing and low-performing schools.]
TAP’s Teaching Standards are Research-Based
The TAP Teaching Standards are based on educational psychology research focusing on learning and instruction, and continue to be validated by more recent research. In addition, their development was influenced by focus groups with outstanding educators, including many Milken Educators.
The work was informed by materials from numerous sources, including:
• Interstate New Teacher Assessment and Support Consortium (INTASC)
• National Board for Professional Teaching Standards
• Massachusetts’ Principles for Effective Teaching
• California’s Standards for the Teaching Profession
• Connecticut’s Beginning Educator Support Program
• New Teacher Center’s Developmental Continuum of Teacher Abilities
• Danielson's Framework for Teaching
TAP Teaching Standards: Skills, Knowledge & Responsibilities
Example of One Indicator in the TAP Teaching Standards
Indicator: Academic Feedback
Exemplary (5)
‣ Oral and written feedback is consistently academically focused, frequent, and high-quality.
‣ Feedback is frequently given during guided practice and homework review.
‣ The teacher circulates to prompt student thinking, assess each student’s progress, and provide individual feedback.
‣ Feedback from students is regularly used to monitor and adjust instruction.
‣ Teacher engages students in giving specific and high-quality feedback to one another.
Proficient (3)
‣ Oral and written feedback is mostly academically focused, frequent, and mostly high-quality.
‣ Feedback is sometimes given during guided practice and homework review.
‣ The teacher circulates during instructional activities to support engagement and monitor student work.
‣ Feedback from students is sometimes used to monitor and adjust instruction.
Unsatisfactory (1)
‣ The quality and timeliness of feedback is inconsistent.
‣ Feedback is rarely given during guided practice and homework review.
‣ The teacher circulates during instructional activities, but monitors mostly behavior.
‣ Feedback from students is rarely used to monitor or adjust instruction.
Inter-rater Reliability
Consistency between the scores assigned by evaluators, resulting from the process of coming to consensus on collected evidence and assigned scores based on the TAP Teaching Standards.
Initial Training and Ongoing Support of Inter-rater Reliability
Know it
The first step to creating inter-rater reliability is truly understanding the standard (rubric) being used to evaluate.
Assess it
To measure this understanding, you need to assess evaluators’ application of the rubric in a controlled environment.
Monitor/Address it
Once this baseline has been set, you need to provide ongoing support and training toward applying the rubric successfully.
What are Effective Ways to Monitor and Address Inter-rater Reliability?
To Monitor Inter-rater Reliability
To Address Inter-rater Reliability
Monitoring Inter-Rater Reliability: A Case of Inconsistent Scoring Across Evaluators
[Figure: scores (0 to 5) assigned by Master A and Master B on each rubric indicator: Problem Solving, Thinking, Teacher Knowledge of Students, Teacher Content Knowledge, Grouping Students, Academic Feedback, Questioning, Activities and Materials, Lesson Structure and Pacing, Presenting Instructional Content, Motivating Students, Standards and Objectives, Respectful Culture, Environment, Managing Student Behavior, Expectations, Assessment, Student Work, Instructional Plans.]
Monitoring Inter-Rater Reliability: A Case of Inconsistent Scoring of One Rubric Indicator
Example of average observer vs. teacher’s self score
Overall Average by Rubric Indicator
Ways to Build Inter-rater Reliability Up Front
• Evaluators are trained together as a team, so they build a common language and common understanding of each indicator
• Each evaluator must pass a certification test, and be recertified annually, coming within one point of national raters
• Initial training is reinforced through online resources and training
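The "within one point of national raters" criterion above can be sketched as a simple check. This is a minimal illustration under stated assumptions, not NIET's actual certification procedure: the function name `certification_gaps`, the sample scores, and the choice to compare every indicator individually are all hypothetical. Only the indicator names and the one-point tolerance come from the slides.

```python
def certification_gaps(evaluator_scores, reference_scores, tolerance=1):
    """Return indicators where the evaluator is more than `tolerance`
    points away from the reference (national rater) score.

    Both arguments map indicator name -> score on the 1-5 rubric scale.
    The per-indicator comparison is an assumption made for illustration.
    """
    missing = set(reference_scores) - set(evaluator_scores)
    if missing:
        raise ValueError(f"Evaluator did not score: {sorted(missing)}")
    return {
        indicator: (evaluator_scores[indicator], reference)
        for indicator, reference in reference_scores.items()
        if abs(evaluator_scores[indicator] - reference) > tolerance
    }


# Hypothetical scores for one lesson video (not real data).
national = {"Academic Feedback": 3, "Questioning": 4, "Grouping Students": 2}
candidate = {"Academic Feedback": 4, "Questioning": 2, "Grouping Students": 2}

gaps = certification_gaps(candidate, national)
print(gaps)  # {'Questioning': (2, 4)}
```

An empty result means every score fell within the tolerance. A non-empty result names the indicators to revisit in follow-up training.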
Ways to Address Inter-rater Reliability Over Time
• Using the CODE system, leadership or evaluation teams can examine consistency amongst raters to ensure that each rater is scoring within one point of each evaluator on their team
• The examination of evaluator data through CODE reduces the possibility of score inflation
• We recommend that evaluators schedule activities to monitor inter-rater reliability within their team at least once per month
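The team-level check described above (each rater scoring within one point of every other evaluator on the team) can be sketched as a pairwise comparison. This is an illustrative sketch only and does not reproduce the CODE system: the function name `flag_inconsistent_pairs`, the rater "Principal", and all scores are hypothetical.

```python
from itertools import combinations


def flag_inconsistent_pairs(team_scores, tolerance=1):
    """List (rater_a, rater_b, indicator, gap) for every pair of raters
    whose scores on a shared indicator differ by more than `tolerance`.

    `team_scores` maps rater name -> {indicator name: score on the 1-5 scale}.
    """
    flags = []
    # Sort raters by name so the output order is deterministic.
    for (rater_a, scores_a), (rater_b, scores_b) in combinations(
        sorted(team_scores.items()), 2
    ):
        # Compare only indicators that both raters scored.
        for indicator in sorted(set(scores_a) & set(scores_b)):
            gap = abs(scores_a[indicator] - scores_b[indicator])
            if gap > tolerance:
                flags.append((rater_a, rater_b, indicator, gap))
    return flags


# Hypothetical scores for the same observed lesson (not real data).
team = {
    "Master A": {"Questioning": 4, "Academic Feedback": 3},
    "Master B": {"Questioning": 2, "Academic Feedback": 3},
    "Principal": {"Questioning": 3, "Academic Feedback": 4},
}

for flag in flag_inconsistent_pairs(team):
    print(flag)  # ('Master A', 'Master B', 'Questioning', 2)
```

A team could run a check like this after each jointly scored lesson and use the flagged indicators as the agenda for its monthly inter-rater reliability activity.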
Inter-rater Reliability in Practice: A Process Based upon Continuous Improvement
[Diagram with four elements: Leadership Team: Classroom Observations; Inter-rater Reliability; Leadership Team: Support and Mentoring; Teacher Growth in Classroom Instruction.]
Tennessee Education Acceleration Model (TEAM): Educator Observation
TEAM will shed light on educator practices and relevant student outcomes, while also facilitating a process for analysis and continuous improvement. This new system will include multiple measures for looking at performance and will provide a way to individualize both support and recognition for educators.
Tennessee First to the Top – Evaluation System
• Field test of different observation systems across the state
• TAP Teaching Standards selected for the classroom observation portion of new statewide teacher evaluation model
• NIET supports the state in training 5,000 evaluators over the course of the summer, as well as designated TNDOE staff to provide ongoing support
• 90,000 educators registered on NIET Portal, accessing information on the evaluation system, certification for evaluators, monitoring of results, as well as support for improvement for individual teachers
Tennessee
NIET supported the state in training 5,000 evaluators in 100 trainings over the course of the summer 2011
Location and Number of Evaluation Trainings
Bristol 1
Clarksville 1
Clarksville Montgomery 3
Cleveland 4
Columbia 2
Columbia 2
Cookeville 8
Fayetteville 2
Greeneville 5
Harriman 2
Henderson 2
Jackson 1
Jackson/Madison County School System 2
Jefferson City 1
Johnson City 4
Knox County Schools 6
Knoxville 5
Lebanon 4
Martin 6
McKenzie 1
Memphis/Shelby County Schools 3
Metro Nashville Public Schools 6
Morristown 1
Mountain City 1
Murfreesboro 3
Nashville 9
Ripley 1
Robertson County Schools 1
Rutherford County Schools 3
Savannah 3
Sevier County Schools 2
Sumner County Schools 3
Tipton County Schools 1
Williamson County Schools 3
Grand Total 102
School Districts in Tennessee Using NIET Classroom Observation Instrument
TEAM Annual Observation Cycle
Tennessee Teachers Access NIET’s Online Training Portal
Evaluation Process Resources
Evaluator Certification
Certification and Recertification:
Another integral feature of the training portal is online certification and recertification for evaluators. This online experience will include the opportunity to watch a lesson video, gather evidence, and then evaluate the lesson and assign scores for it. Certification also requires demonstrating the ability to plan an effective post-conference. Once the exam is passed, the evaluator is officially certified to evaluate.
Video Library
Tra<strong>in</strong><strong>in</strong>g Modules
NIET Data Systems
NIET Data System Reports
Support provided to school leaders to train teachers
• After the initial training that we provided to all 5,000 evaluators in the state of Tennessee, the state department hired 9 trained consultants to serve as a support system for the schools in the new evaluation process
• After our initial training, each administrator was required to train their teachers and staff on the evaluation process using materials that we provided on our portal
• To assist and support administrators in training their teachers, we provided a PowerPoint presentation, a training manual, a participant guide and accompanying videos for three levels of training: elementary, middle and high school
• We currently have 90,000 users in our NIET Best Practices Center Portal from the state of Tennessee
TN Adm<strong>in</strong>istrators Brief their Faculties
Kristan Van Hook
kvanhook@niet.org
Anissa Rodriguez
arodriguez@niet.org