The Inconsistent Implementation of Teacher Evaluation Reforms

Contrary to claims that recent teacher evaluation reforms are leading to strict, one-size-fits-all policies, state-level data actually suggests local districts are implementing state-based teacher evaluation reforms inconsistently.

In an effort to improve teacher quality and student performance, states began making sweeping changes to teacher evaluation systems in 2009. Previous teacher evaluation policies didn’t differentiate poor teachers from average or great ones, much less reward or discipline teachers based on their performance. Common reform policies included the use of objective student data to evaluate teacher performance, more frequent classroom observations, and the rollout of performance-based incentives (or disciplinary action). These policy changes, however, were met with sharp criticism from opponents who felt that new state-level reforms would undermine local control.

Data released by states in recent years, however, suggests districts actually have considerable control over reform implementation. They have so much control, in fact, that they are implementing new state policies in widely varying and unpredictable ways. At Bellwether Education Partners, Chad Aldeman and I analyzed teacher evaluation data from 17 states and Washington, D.C., and published a report of our key findings. Among them:

Colorado: Elementary school teachers have a significant advantage over their high school counterparts

After Colorado passed its new evaluation law, two dozen districts participated in the pilot implementation by rating teachers on their professional practice. (Colorado’s new evaluation system also calls for half of a teacher’s rating to be based on student academic growth, but the state is still finalizing this process.) A teacher’s final rating during the 2012-13 pilot year fell into one of five categories: Exemplary, Accomplished, Proficient, Partially Proficient, and Not Evident.

Under this new system, male and high school teachers were more likely to receive unfavorable ratings. During the pilot year, 14 percent of male teachers received the two lowest ratings of “Partially Proficient” or “Not Evident,” compared to 7 percent of female teachers. And 17 percent of high school teachers (or 1 out of 6) fell into these low-ranking categories, compared to only 5 percent of elementary school teachers.

Taken at face value, this data implies that elementary school teachers are performing at a higher level than high school teachers, and that female teachers are more effective than male ones. More likely, districts in Colorado are not implementing classroom observations and teacher evaluation policies in a fair, objective, and consistent manner.

Florida: Do neighboring districts have a vastly different teaching workforce?

Florida implemented a four-tiered system after passing a new school personnel evaluation law in 2011. Under this new system, teachers are evaluated on student academic growth and classroom practice and receive a rating of Highly Effective, Effective, Needs Improvement (called “Developing” if they are in their first three years of teaching), or Unsatisfactory.

Data from three neighboring counties in western Florida indicates that districts are implementing the new state law very differently. In Hillsborough County, 38 percent of teachers received the best rating of “Highly Effective” in SY 2012-13. A few miles away, however, Pasco County gave this rating to only 5 percent of its teachers.

In Pasco, where very few teachers received the highest rating, 94 percent of teachers received the next-best rating of “Effective”—they were considered good, but not amazing. In contrast, 43 percent of teachers in nearby Manatee County were labeled “Effective.”

One could argue that these varied teacher ratings are the result of other distinct differences across the three districts. But in fact, student demographics and academic performance in Hillsborough, Pasco, and Manatee are remarkably similar. All three counties have nearly identical percentages of economically disadvantaged students and produce relatively similar graduation rates. In 2013, all three districts received a letter grade of “C” based on student performance on statewide assessments. Despite these similarities among districts that share borders, the distribution of teacher effectiveness ratings is strikingly inconsistent, which makes one wonder whether Florida is pushing districts to implement any “statewide” standards at all.

Our findings suggest that local districts are still figuring out how to implement their own evaluation systems, and they’re doing so in divergent ways. Those divergences are likely more an artifact of poor implementation than true differences in teacher quality. The data that some states have released can hardly be considered useful if their systems favor certain school districts, female teachers over male teachers, or specific grade levels. The truth is that districts are implementing statewide evaluation systems in an unreliable and ineffectual manner, and that means we haven’t made as much progress as we need to when it comes to teacher evaluation.

—Carolyn Chuong

Carolyn Chuong is an analyst at Bellwether Education Partners.
