6 Questions Hattie Didn’t Ask But Could Have
by Terry Heick
If you’re familiar with John Hattie’s meta-analyses and they haven’t given you fits, they may be worth a closer look.
If you missed it, in 2009, John Hattie released the results of a massive amount of work (work he updated in 2011). After poring over thousands of studies, Hattie sought to separate the wheat from the chaff–what works and what doesn’t–similar to Marzano’s work, but on a much (computationally) larger scale.
Hattie (numerically) figures that .4 is an “average” effect–a hinge point that marks performance: anything higher is “not bad,” and anything lower is “not good.” More precisely, Grant Wiggins aggregated all of the strategies that resulted in a .7 or better–what is considered “effective.” The top 10?
- Student self-assessment/self-grading
- Response to intervention
- Teacher credibility
- Providing formative assessments
- Classroom discussion
- Teacher clarity
- Reciprocal teaching
- Teacher-student relationships fostered
- Spaced vs. mass practice
So what’s below these top 10? Questioning, student motivation, quality of teaching, class size, homework, problem-based learning, mentoring, and dozens of other practices educators cherish. Guess what ranks below direct instruction and the esoteric “study skills”? Socioeconomic status. Of course, it’s not that simple.
While I leave it up to Hattie and those left-brain folks way smarter than I am to make sense of the numbers, I continue to wonder how the effect of one strategy–problem-based learning, for example–can be measured independently of other factors (assessment design, teacher feedback, family structure, and so on). It can also be difficult to untangle one strategy (inquiry-based learning) from another (inductive teaching).
Hattie’s research is stunning from a research perspective, and noble from an educational one, but there are too many vague–or downright baffling–ideas for it to be used the way so many schools and districts will be tempted to use it. Teacher Content Knowledge has an effect size of .09, which actually is worse than if they did nothing at all? Really? So how does it make sense to respond?
As always, start with some questions–and you may be left with one troubling implication.
What Should You Be Asking?
Recently, we shared a list of these effect sizes, shown in ascending order. Included in Grant’s original post is a well thought-out critique of Hattie’s work (which you can read here), where the author questions first Hattie’s mathematical practice of averaging, and then brings up other issues, including comparing apples and oranges (which another educator does here). Both are much more in-depth criticisms than I have any intention of offering here.
There are multiple languages going on in Hattie’s work–statistical, pedagogical, educational, and otherwise. The point of this post is to ask some questions out loud about what the takeaways should be for an “average teacher.” How should teachers respond? What kinds of questions should they be asking to make sense of it all?
1. What’s the goal of education?
Beyond any “fringe benefits” we “hope for,” what exactly are we doing here? That, to me, is the problem with so many new ideas, trends, educational technologies, research, and more–what’s the goal of education? We can’t claim to be making or lacking progress until we know what we’re progressing towards.
The standards-based, outcomes-based, data-driven model of education has given us bravely narrow goals for student performance in a very careful-what-you-wish-for fashion.
2. How were the effect sizes measured exactly?
How are we measuring performance here so that we can establish “effect”? Tests? If so, is that ideal? We need to be clear here. If we’re saying this and this and this “work,” we should be on the same page about what that means. And what if a strategy improves test scores but stifles creativity and ambition? Is that still a “win”?
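To make the question concrete, here is a minimal sketch of how an effect size is commonly computed as a standardized mean difference (Cohen's d). This assumes that general approach; the post does not say how each underlying study measured its outcome, and the test scores and the function name `cohens_d` below are hypothetical, invented only for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def cohens_d(treatment, control):
    """Standardized mean difference between two groups (Cohen's d).

    Uses the pooled sample standard deviation of the two groups.
    """
    n1, n2 = len(treatment), len(control)
    s1, s2 = stdev(treatment), stdev(control)
    pooled_sd = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    return (mean(treatment) - mean(control)) / pooled_sd


# Hypothetical test scores, not taken from any real study.
with_strategy = [72, 75, 78, 81, 84]
without_strategy = [70, 73, 76, 79, 82]

effect = cohens_d(with_strategy, without_strategy)
print(f"effect size: {effect:.2f}")  # about 0.42, just above the .4 hinge point
```

Notice that everything in the calculation depends on what the scores are scores of. Swap the test for a measure of creativity or ambition and the same strategy could produce a very different number.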
3. What do the terms mean exactly?
Some of the language is either vague or difficult to understand. I am unsure what “Piagetian programs” are (though I can imagine), or what “Quality Teaching” (.44 ES) means. “Drugs”? “Open vs Traditional”? This is not a small problem.
4. How were those strategies locally applied?
Also, while the “meta” function of the analysis is what makes it powerful, it also makes me wonder–how can Individualized Instruction only demonstrate a .22 ES? There must be “degrees” of individualization, so that saying “Individualized Instruction” is like saying “pizza”: what kind? With 1185 listed effects, the sample size seems large enough that you’d think an honest picture of what Individualized Instruction looked like would emerge, but it just doesn’t happen.
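A small sketch of why a single averaged number can hide those "degrees." The per-study effect sizes below are hypothetical, chosen only so that they average to .22; they are not Hattie's data, and a plain mean is a simplifying assumption rather than a claim about how his figures were actually combined.

```python
from statistics import mean

# Hypothetical effect sizes for five imagined studies that all carry the
# label "Individualized Instruction". Invented for illustration only.
study_effects = {
    "self-paced worksheets": 0.05,
    "software-driven drills": 0.10,
    "teacher-led conferencing": 0.60,
    "ability grouping": 0.15,
    "learning contracts": 0.20,
}


def summarize(effects):
    """Return the mean, minimum, and maximum of a collection of effect sizes."""
    values = list(effects.values())
    return {"mean": mean(values), "min": min(values), "max": max(values)}


summary = summarize(study_effects)
print(f"mean ES: {summary['mean']:.2f}")
print(f"range:   {summary['min']:.2f} to {summary['max']:.2f}")
```

In this made-up case the label averages out to a forgettable .22 even though one version of it sits well above the .4 hinge point. The average alone cannot tell you which kind of pizza you ordered.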
5. How should we use these results?
Despite any problems, this much data has to be useful. Right? Maybe. But it might be that so much effort is required to localize and recalibrate it for a specific context that it’s just not–especially when it keeps schools and districts from becoming “researchers” on their own terms, leaning instead on Hattie’s list. Imagine “PDs” where this book has been tossed down in the middle of every table in the library and teachers are told to “come up with lessons” that use those strategies that appear in the “top 10.” Then, on walk-throughs for the next month, teachers are constantly asked about “reciprocal teaching” (.74 ES after all), while project-based and inquiry-based learning with diverse assessment forms and constant meta-cognitive support is met with silence (as said administrator flips through Hattie’s book to “check the effect size” of these strategies).
If you consider the analogy of a restaurant, Hattie’s book is like a big book of cooking practices that have been shown to be effective within certain contexts: Use of Microwave (.11 ES), Chef’s Academic Training (.23 ES), Use of Fresh Ingredients (.98 ES). The problem is, without the macro-picture of instructional design, they are simply context-less, singular items. If they are used by teachers as a starting point to consider while planning instruction, that’s great, but that’s not how I’ve typically seen them used. Instead, they often become items to check, along with learning target, essential question, and evidence of data use.
Which brings me to the most troubling question of all…
6. Why does innovation seem unnecessary?
Scroll back up and look at the top 10. Nothing “innovative” at all. A clear, credible teacher who uses formative assessment to intervene and give learning feedback should be off the charts. But off the charts how? Really good at mastering standards? If we take these results at face value, innovation in education is unnecessary. Nothing blended, mobile, connected, self-directed, or user-generated about it. Just good old-fashioned solid pedagogy. Clear, attentive teaching that responds to data and provides feedback. That’s it.
Unless the research is miles off and offers flat-out incorrect data, that’s the path to proficiency in an outcomes-based learning environment. The only way we need innovation, then, is if we want something different.
Education: No Innovation Required; image attribution flickr user usarmycorpofengineerssavannahdistrict