The intense debate around teacher evaluation has been fueled in recent years by the federal government’s efforts to spur the creation of more sophisticated evaluation systems at the state level, in large part through incentives embedded in the Race to the Top grant competition and No Child Left Behind waiver process. As a result, a majority of states have passed laws requiring the adoption of teacher evaluation systems that are based in part on student achievement data.
Based on the furor that this requirement has elicited from teachers’ unions, one might assume that students’ test scores feature prominently in these new evaluation systems. That is hardly the case. In our new study, published today in Education Next, my colleagues and I found that only 22 percent of teachers were evaluated based on test score gains in the four urban school districts we studied. This is largely because most teachers lead classrooms in grades and subjects that are not covered by standardized tests.
Most of the action and nearly all the opportunities for improving teacher evaluations lie in the area of classroom observations. This component makes up between 50 and 75 percent of the overall evaluation scores in the districts we studied, and much less is known about observation-based measures of teacher performance than about value-added measures based on test scores. Our study yielded a number of new findings that point to potential improvements in the design of teacher evaluation systems.
Most importantly, we discovered that classroom observation scores are biased by student ability. Teachers whose students have higher incoming achievement levels receive classroom observation scores that are higher on average than those received by teachers with lower-achieving students. Specifically, a teacher assigned the highest-achieving students is four times as likely to get a very high observation score as a teacher assigned the lowest-achieving students.
Fortunately, there is a straightforward fix to this problem. Observation scores should be adjusted for student demographics, just as measures based on test scores already are. Our study confirmed that this statistical adjustment is successful in producing a pattern of teacher ratings that is much less strongly correlated with the incoming achievement level of students.
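To make the idea of adjustment concrete, here is a minimal sketch in Python. It is an illustration only: this post does not spell out the model used in the study, so the sketch assumes the simplest possible version, a one-variable linear regression of observation scores on students' average incoming achievement. The function name `adjust_scores`, the variable names, and the numbers are all hypothetical and made up for the example.

```python
from statistics import mean


def adjust_scores(obs_scores, prior_achievement):
    """Remove the linear association between observation scores and
    students' incoming achievement (assumed one-covariate model).

    obs_scores: raw classroom observation score for each teacher.
    prior_achievement: average incoming achievement of each teacher's
        students (for example, in standard deviation units).
    """
    x_bar = mean(prior_achievement)
    y_bar = mean(obs_scores)

    # Ordinary least squares slope of score on prior achievement.
    sxx = sum((x - x_bar) ** 2 for x in prior_achievement)
    sxy = sum(
        (x - x_bar) * (y - y_bar)
        for x, y in zip(prior_achievement, obs_scores)
    )
    slope = sxy / sxx

    # Subtract the part of each score predicted by prior achievement,
    # centered so that the overall average score is unchanged.
    return [
        y - slope * (x - x_bar)
        for x, y in zip(prior_achievement, obs_scores)
    ]


# Made-up data for five hypothetical teachers (not from the study).
prior = [-1.0, -0.5, 0.0, 0.5, 1.0]
raw = [2.4, 2.9, 3.0, 3.3, 3.4]

for x, before, after in zip(prior, raw, adjust_scores(raw, prior)):
    print(f"prior={x:+.1f}  raw={before:.2f}  adjusted={after:.2f}")
```

In this toy version, teachers with lower-achieving students see their scores move up and teachers with higher-achieving students see theirs move down, so the adjusted scores no longer track incoming achievement. A real system would likely adjust for several student characteristics at once rather than a single measure.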
Our study also offers practical lessons about how many observations to conduct each year and who should conduct them. On average, outside observers are better than in-building administrators at identifying teachers who are effective at boosting student achievement. But both types of observations are costly in terms of staff time, and additional observations beyond two or three do not add much in terms of identifying teacher effectiveness. Consequently, we recommend that each teacher be evaluated two or three times annually, with at least one of the observations conducted by a trained observer from outside the teacher’s school.
New teacher evaluation systems represent a significant improvement over the bad old days of every teacher getting a satisfactory rating based on a cursory observation by their principal. Several studies, including our own, clearly demonstrate that teacher evaluation systems that are based on a number of components, such as classroom observation scores and test-score gains, are already much more effective at predicting future teacher performance than paper credentials and years of experience. Continuing to improve these evaluation systems by addressing the design flaws we have identified will bring districts closer to achieving the primary goal of meaningful teacher evaluation: assuring greater equity in students’ access to good teachers.
– Matthew M. Chingos
Matthew M. Chingos is a senior fellow of the Brown Center on Education Policy at the Brookings Institution. The study on which this blog entry is based, “Getting Classroom Observations Right: Lessons on how from four pioneering districts,” by Grover J. “Russ” Whitehurst, Matthew M. Chingos and Katharine M. Lindquist, was published by Education Next on September 16, 2014.