False Claim on Drill and Kill

By Jay P. Greene | 12/13/2010


The Gates Foundation is funding a $45 million project to improve measures of teacher effectiveness.  As part of that project, researchers are collecting information from two standardized tests as well as surveys administered to students and classroom observations captured by video cameras in the classrooms.  It’s a big project.

The initial round of results was reported last week, with information from the student survey and standardized tests.  In particular, the report described the relationship between classroom practices, as observed by students, and value-added on the standardized tests.

The New York Times reported on these findings Friday and repeated the following strong claim:

But now some 20 states are overhauling their evaluation systems, and many policymakers involved in those efforts have been asking the Gates Foundation for suggestions on what measures of teacher effectiveness to use, said Vicki L. Phillips, a director of education at the foundation.

One notable early finding, Ms. Phillips said, is that teachers who incessantly drill their students to prepare for standardized tests tend to have lower value-added learning gains than those who simply work their way methodically through the key concepts of literacy and mathematics. (emphasis added)

I looked through the report for evidence that supported this claim and could not find it.  Instead, the report actually shows a positive correlation between student reports of “test prep” and value added on standardized tests, not a negative correlation as the statement above suggests.  (See for example Appendix 1 on p. 34.)

The statement “We spend a lot of time in this class practicing for [the state test]” has a correlation of 0.195 with the value-added math results.  That is about the same relationship as “My teacher asks questions to be sure we are following along when s/he is teaching,” which is 0.198.  And both are positive.

It’s true that the correlation for “Getting ready for [the state test] takes a lot of time in our class” is weaker (0.103) than other items, but it is still positive.  That just means that test prep may contribute less to value added than other practices, but it does not support the claim that “teachers who incessantly drill their students to prepare for standardized tests tend to have lower value-added learning gains…”

In fact, on page 24, the report clearly says that the relationship between test prep and value-added on standardized tests is weaker than other observed practices, but does not claim that the relationship is negative:

The five questions with the strongest pair-wise correlation with teacher value-added were: “Students in this class treat the teacher with respect.” (ρ=0.317), “My classmates behave the way my teacher wants them to.” (ρ=0.286), “Our class stays busy and doesn’t waste time.” (ρ=0.284), “In this class, we learn a lot almost every day.” (ρ=0.273), “In this class, we learn to correct our mistakes.” (ρ=0.264). These questions were part of the “control” and “challenge” indices. We also asked students about the amount of test preparation they did in the class. Ironically, reported test preparation was among the weakest predictors of gains on the state tests: “We spend a lot of time in this class practicing for the state test.” (ρ=0.195), “I have learned a lot this year about the state test.” (ρ=0.143), “Getting ready for the state test takes a lot of time in our class.” (ρ=0.103)
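
The sign question can be checked directly.  The short Python sketch below simply tabulates the coefficients as the report lists them on p. 24 and confirms that every one, test prep included, is greater than zero (the numbers are the report’s; the code itself is just an illustration):

```python
# Pairwise correlations with teacher value-added on the state math
# test, as listed on p. 24 of the Gates "Learning About Teaching"
# report (student survey items).
correlations = {
    "Students in this class treat the teacher with respect.": 0.317,
    "My classmates behave the way my teacher wants them to.": 0.286,
    "Our class stays busy and doesn't waste time.": 0.284,
    "In this class, we learn a lot almost every day.": 0.273,
    "In this class, we learn to correct our mistakes.": 0.264,
    "We spend a lot of time in this class practicing for the state test.": 0.195,
    "I have learned a lot this year about the state test.": 0.143,
    "Getting ready for the state test takes a lot of time in our class.": 0.103,
}

# The test-prep items rank near the bottom, but every coefficient
# is positive -- a weaker positive predictor, not a negative one.
assert all(r > 0 for r in correlations.values())
weakest = min(correlations, key=correlations.get)
print(weakest, correlations[weakest])
```

The weakest item is the 0.103 test-prep question, and even it sits on the positive side of zero.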

I don’t know whether something got lost in the translation between the researchers and Gates education chief, Vicki Phillips, or between her and Sam Dillon at the New York Times, but the article contains a false claim that needs to be corrected before it is used to push changes in education policy and practice.

UPDATE – The LA Times coverage of the report contains a similar misinterpretation: “But the study found that teachers whose students said they ‘taught to the test’ were, on average, lower performers on value-added measures than their peers, not higher.”

Try this thought experiment with another observed practice to illustrate my point about how the results are being misreported…  The correlation between student observations that “My teacher seems to know if something is bothering me” and value added was 0.153, which is less than the 0.195 correlation for “We spend a lot of time in this class practicing for [the state test].”  According to the interpretation in the NYT and LA Times, it would be correct to say “teachers who care about student problems tend to have lower value-added learning gains than those who spend a lot of time on test prep.”

Of course, that’s not true.  Teachers caring about what is bothering students is positively associated with value added just as test prep is.  It is just that teachers caring is a little less strongly related than test prep.  Caring does not have a negative effect just because the correlation is lower than other observed behaviors.
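
To see why a smaller positive coefficient does not mean a negative effect, here is a quick simulation.  The data are hypothetical, not the MET sample, and the 0.20 and 0.15 weights are made up to roughly mirror the two correlations above; the point is only that both practices come out positively correlated with the outcome, one a bit more weakly than the other:

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

random.seed(1)
n = 20000
# Two hypothetical practices, BOTH with a genuinely positive effect
# on value added; "test_prep" is given the slightly larger weight.
test_prep = [random.gauss(0, 1) for _ in range(n)]
caring = [random.gauss(0, 1) for _ in range(n)]
value_added = [0.20 * t + 0.15 * c + random.gauss(0, 1)
               for t, c in zip(test_prep, caring)]

r_prep = pearson_r(test_prep, value_added)
r_care = pearson_r(caring, value_added)
# r_care comes out below r_prep, yet both are positive: the weaker
# practice is not associated with LOWER gains, only with smaller
# positive ones.
assert 0 < r_care < r_prep
```

Reading the smaller of two positive correlations as evidence of a negative relationship is exactly the error in the NYT and LA Times framing.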

-Jay P. Greene

Comment on this article
  • jacob says:

    they’re probably responding to this quote (pg. 9):

    Many are concerned that high value-added teachers are simply coaching children to do well on state tests. In the long run, it would do students little good to score well on state tests if they fail to understand key concepts. However, in our analysis so far, that does not seem to be the case. Indeed, the teachers who are producing gains on the state tests are generally also promoting deeper conceptual understanding among their students. In mathematics, for instance, after adjusting for measurement error, the correlation between teacher effects on the state math test and on the Balanced Assessment in Mathematics was moderately large, .54.

  • Jay P. Greene says:

    I suspect that the reporters are honestly repeating what they are getting from Gates officials, but the results of the report do not support the claim that drill and kill hurts value-added on standardized tests.

  • eric says:

This sort of misinterpretation of reported data is widespread. It seems no one in these institutions has the reading comprehension skills required to understand precisely what a set of data says. I suppose it is a self-fulfilling prophecy – a faltering education system leading to inadequate, inaccurate coverage of the news. In any case, great reporting, and I hope the NY and LA Times post corrections, but I am not holding my breath.

  • stats geek says:


Your assumption that a positive correlation is significant is unfounded; that is, practically significant, not statistically significant. You are focused on one piece of data. The tiny effect size for these items would suggest these are ineffective practices.

A second strategy is to group the items into latent variables through factor analysis (exploratory or confirmatory). This may be what the Gates Foundation spokesperson was referring to with such a general statement.

  • Jay P. Greene says:

    I agree, stats geek, that all of the correlations between the student survey items and value added are weak. My point was just that the direction of the correlation coefficient is the opposite of what was claimed.

    You may also be right that Vicki Phillips is referencing some factor analysis, but I can’t find it in the report Gates just released. People don’t normally cite as one of their findings an analysis that they didn’t report.

    I encourage you to read the report and see if you can find it.

  • John Thompson says:

What about the obvious falsehoods from Jason Felch, who reported that a new Gates-funded report had found that “classroom effectiveness can be reliably estimated by gauging the students’ progress on standardized tests,” though in reality the report claimed no such thing? Then there was the obvious editorializing disguised as explanatory text, including lines such as “‘Value-added’ is thought to bring objectivity to the process and, because it compares students to themselves over time, largely controls for influences outside teachers’ control, such as poverty and parental involvement.” The Gates web site explicitly repudiated that statement, and even said that this was why they were doing the MET project.

The evidence in the Gates Foundation’s “Learning About Teaching,” as opposed to their PR spin, supports Gordon MacInnes’ vision of school reform, and his explanation of why “the value-added method is not ready for prime time.” MacInnes has shown that high-quality pre-school, and diagnostic testing to achieve reading for comprehension, is the key to closing the achievement gap. The Gates study also showed that before 3rd grade, when students are “learning to read,” children’s reading improves more during the months when they are in school. But that pattern is reversed in 4th grade. The study admits that “the above pattern implies that schooling itself may have little impact on standard reading assessments after 3rd grade.” The other headline-worthy finding was documentation of the powerful effect of classroom behavior on increasing test scores. MacInnes, like James Heckman and others, has demonstrated the importance of providing socio-emotional supports. The Gates authors should reread the Consortium on Chicago School Research’s explanations of why we need systemic approaches to absenteeism and student behavior before teachers can effectively teach a rigorous curriculum.
