Last month, Education Next published a study that I conducted with Amelie Wuppermann examining the impact on student achievement of teaching time devoted to lecturing versus problem-solving activities. As our study concluded,
We find that teaching style matters for student achievement, but in the opposite direction than anticipated by conventional wisdom: an emphasis on lecture-style presentations (rather than problem-solving activities) is associated with an increase—not a decrease—in student achievement. This result implies that a shift to problem-solving instruction is more likely to adversely affect student learning than to improve it.
Our study, as well as an article about it by Education Next editor Paul Peterson, generated a lot of comments and discussion, to which I would like to respond here.
In general, the question we try to answer in our study is: What is the effect on student achievement if the average teacher shifts class time from lecturing to problem-solving activities (as these teaching practices are currently implemented)? This is an important question, as many stakeholders in education today strongly advocate problem-solving activities, and teachers might feel pressure to change the way they teach and to lecture less.
From a policy point of view, this is also a particularly relevant question. Most education reforms explicitly attempt to increase student achievement but are usually quite expensive (e.g., reducing class size). The idea of achieving the same goal of boosting student achievement simply by inducing the average teacher to lecture less (which is what many stakeholders advocate) therefore seems very intriguing. Unfortunately, however, it is not that simple.
Our results imply that simply inducing teachers to shift class time from lecture-style presentations to problem-solving activities, without concern for how this is implemented, holds little potential to increase student achievement and, in fact, might even have adverse effects on student learning.
On implementation: Implementation matters! I completely agree on this point. If correctly implemented, hands-on learning techniques may be highly effective. In our study, we (deliberately) do not control for implementation. Thus, our findings do not answer the question: What is the better teaching practice if teaching practices are implemented in the ideal way?
On limitations and implications: We look at average effects of shifts in the amount of time spent lecturing, but we do not suggest that lecturing vs. problem-solving activities is an “either/or” choice. Clearly, effects need not be linear, and there might well be an optimal mix of these teaching practices. Thus, our findings do not call for more lecture-style teaching in general. Moreover, our findings are limited to students in the middle grades, and only the benefits of the relative emphasis on one or the other teaching style are being assessed.
On the outcome measure: TIMSS test scores certainly do not measure every dimension of student learning that is relevant. For example, non-cognitive skills are not measured in TIMSS. Moreover, it might be that effects fade out in the long run. At present, the lack of large-scale longitudinal data that include information on teaching practices does not allow us to test this hypothesis.
However, TIMSS test scores are nevertheless something we should care about. First, TIMSS test scores are presumably positively correlated with many other dimensions that matter (e.g., non-cognitive skills) and with future achievement levels. At the very least, there is little reason to believe that the correlation is negative. Second, various test score measures have been shown to be correlated with other measures of educational success (high school dropout, college completion, etc.) and with labor market outcomes (employment probabilities, earnings, etc.). Test scores from international student achievement studies have also been shown to explain a substantial part of the differences in long-run economic growth between countries.
Ignoring education research that uses available test score measures as a proxy for student learning, on the grounds that test scores might be an imperfect measure of everything that matters, is very short-sighted. Education researchers have made tremendous efforts over the past decades to improve student achievement tests and to collect better data. These were definitely steps in the right direction and have substantially advanced our understanding of educational production. If we ignore these developments, we are back in a world where the debate over education is mainly a subjective one, based on personal beliefs and experiences.
On the effect size: There seems to be some confusion here. We normalized TIMSS scores to have a mean of zero and a standard deviation of one. This is standard practice in education research and makes results comparable across studies. For example, the Project STAR class-size experiment suggests that math achievement increases by 22% of a standard deviation in small classes (13-17 students) compared to regular classes (22-25 students). A recent study by Hill et al. (2008) provides empirical benchmarks for interpreting effect sizes in educational research more generally. The rough benchmark of “one to two months of extra learning” for the effect that we find (3.6% of a standard deviation) is based on this study. You might think of this as a small effect, but it is estimated precisely and is statistically significant, with a t-statistic well above two.
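To make the “months of learning” translation concrete, here is a back-of-the-envelope sketch in Python. The assumed annual learning gain of 0.30 SD is purely illustrative (the actual benchmarks in Hill et al. vary by grade and subject); only the 3.6% figure comes from our study.

```python
# Back-of-the-envelope conversion of an effect in SD units into
# "months of learning". The annual gain is a hypothetical
# illustrative value, not a figure from Hill et al. (2008).
effect_sd = 0.036        # our estimated effect: 3.6% of a standard deviation
annual_gain_sd = 0.30    # assumed typical gain per school year, in SD units
school_months = 9        # months of instruction per school year

extra_months = effect_sd / annual_gain_sd * school_months
print(f"Roughly {extra_months:.1f} extra months of learning")  # about 1.1
```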
On the in-class time use measure: Because we wanted to look at the effect of teaching practices on a large scale, we had to rely on self-reported measures of in-class time use. These self-reports certainly contain measurement error. However, classical measurement theory tells us that such error attenuates estimates toward zero. Imagine a situation in which teachers give completely random answers to the question on in-class time use: you would then not be able to detect any correlation between test scores and in-class time use, even if the true effects were strong. It is all the more interesting that we find a significant positive effect of time spent lecturing despite the presumably noisy measure of in-class time use. Our results are also not driven by the grouping of different time use categories. For example, if we simply compare time spent lecturing to problem-solving with teacher guidance, we find the same effects.
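The attenuation argument can be illustrated with a small simulation, sketched below in Python with made-up parameters (this is not our estimation code). When the regressor is measured with classical error, the OLS slope shrinks toward zero; with purely random reports, it would vanish entirely.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# True in-class time use and its (made-up) true effect on test scores
true_time = rng.normal(0.0, 1.0, n)
beta_true = 0.5
scores = beta_true * true_time + rng.normal(0.0, 1.0, n)

# Self-reported time use = true value plus classical measurement error
reported_time = true_time + rng.normal(0.0, 1.0, n)

# Simple-regression OLS slope: cov(y, x) / var(x)
beta_hat = np.cov(scores, reported_time)[0, 1] / reported_time.var()

# With equal signal and noise variances the slope is attenuated by
# var(true) / (var(true) + var(error)) = 0.5, so beta_hat is near 0.25
print(f"true beta: {beta_true}, estimated beta: {beta_hat:.3f}")
```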
On potential mechanisms: There are a couple of potential explanations consistent with our findings. First, lecture-style practices might simply not be as ineffective as often believed, so shifting class time to problem-solving is not necessarily going to help students learn more. Second, teaching based on problem-solving might be highly effective when done well, but methods that teachers in general cannot execute may do more harm than good. In other words, simply requiring teachers to shift to problem-solving approaches is not likely to have positive effects if the average teacher is not skilled in implementing them. Third, the optimal teaching style might be a mix of both techniques: lecturing (to provide background or basic content) complemented by problem-solving activities. On average, teachers might already be teaching with a nearly optimal mix (with a little too much problem-solving), so that further reducing the time spent lecturing has negative effects on student achievement.
– Guido Schwerdt