How Student Work Models Make Rubrics More Effective

Without models of student work to validate and ground them, rubrics are often too vague to be useful to students.

by Grant Wiggins, Authentic Education

Not that long ago, I ran a workshop at which the staff from the Dodge Foundation (who were funding my work at the time) took me aside at the break because they were concerned about my constant use of a term that they had never heard of – rubric. Those of us who have promoted rubrics over the past 20 years can smile and take satisfaction in the fact that the term is now familiar and the use of rubrics is commonplace world-wide.

Alas, as with other good ideas, there has been some stupidification of this tool. I have seen unwise use of rubrics and countless poorly-written ones: invalid criteria, unclear descriptors, lack of parallelism across scores, etc. But the most basic error is the use of rubrics without models. Without models to validate and ground them, rubrics are too vague and nowhere near as helpful to students as they might be.

Consider how a valid rubric is born. It summarizes what a range of concrete works looks like as reflections of a complex performance goal. Note two key words: complex and summarizes. All complex performance evaluation requires a judgment of quality in terms of one or more criteria, whether we are considering essays, diving, or wine. The rubric is a summary that generalizes from lots and lots of samples (sometimes called models, exemplars, or anchors) across the range of quality, in response to a performance demand. The rubric thus serves as a quick reminder of what all the specific samples of work look like across a range of quality.

Cast as a process, then, the rubric is not the first thing generated; it is one of the last things generated in the original anchoring process. Once the task has been given and the work is collected, one or more judges sort the work into piles while working from some general criteria. In an essay, we care about such criteria as valid reasoning, appropriate facts, clarity, etc. So the judges sort each sample into growing piles that reflect a continuum of quality: this pile has the best essays in it; that pile contains work that does not meet the criteria quite as well as the top pile; and so on.

Once all the papers have been scored, the judge(s) then ask: OK, how do we describe each pile in summary form, to explain to students and other interested parties the difference in work quality across the piles, and how each pile differs from the other piles? The answer is the rubric.

Huh? Grant, are you saying the assessment is made before there are rubrics? Isn’t that backward?

No, not in the first assessment. Otherwise, how would there ever be a first assessment? It’s like the famous line from Justice Potter Stewart: I can’t define pornography, but I know it when I see it. That’s how it works in any judgment. The judgments come first; then we turn our somewhat inchoate judgments into fleshed-out descriptors – rules that rationalize judgment – and so into a more general and valid system. Helpful rubrics offer rich descriptors that clarify for learners the qualities sought; poor rubrics amount to no more than saying that Excellent is better than Good, etc.

Once we have the rubrics, of course, we can use them in future assessments of the same or similar performance. But here is where the trouble starts. A teacher borrows a rubric from a teacher who borrowed the rubric, etc. Neither the current teacher nor students know what the language of the rubric really means in the concrete because the rubric has become unmoored from the models that anchor and validate it. In a very real sense, then, neither teacher nor students can use the rubric to calibrate their work if there are no models to refer to.

Look at it from the kids’ point of view. How helpful are the following descriptors in letting me know exactly what I have to do to get the highest score? And how does excellent work differ from merely adequate work? (These two descriptors actually come from a state writing assessment):

5. This is an excellent piece of writing. The prompt is directly addressed, and the response is clearly adapted to audience and purpose. It is very well-developed, containing strong ideas, examples and details. The response, using a clearly evident organizational plan, engages the reader with a unified and coherent sequence and structure of ideas. The response consistently uses a variety of sentence structures, effective word choices and an engaging style.

3. This is an adequate piece of writing. Although the prompt is generally addressed and the response shows an awareness of audience and purpose, the response’s overall plan shows inconsistencies. Although the response contains ideas, examples and details, they are repetitive, unevenly developed and occasionally inappropriate. The response, using an acceptable organizational plan, presents the reader with a generally unified and coherent sequence and structure of ideas. The response occasionally uses a variety of sentence structures, appropriate word choices and an effective style.

Do you see the problem more clearly? Without the models I cannot be sure what, precisely and specifically, each of the key criteria – well-developed, strong ideas, clearly evident organizational plan, engages the reader, etc. – really means. I may now know the criteria, but without the models I don’t really know the performance standard; I don’t know how ‘strong’ is strong enough, nor do I know if my ideas are ‘inappropriate.’ There is no way I can know without examples of strong vs. not strong and appropriate vs. inappropriate (with similar contrasts needed for each key criterion).

In fact, without the models, you might say that this paper is ‘well-developed’ while I might say it is ‘unevenly developed.’ Settling that kind of disagreement is the role of models; we call them ‘anchors’ because they anchor the criteria in terms of a specific performance standard.

Knowing the criteria is better than nothing, for sure, but it is nowhere near as helpful as having both rubric and models. This same argument applies to the Common Core Standards: we don’t know what they mean until we see work samples that meet vs. don’t meet the standards. It is thus a serious error that the existing samples for Writing are placed in the Appendix to the Standards, where far too few teachers are likely to find them.

This explains why the AP program, the IB program, and state writing assessments show samples of student work – and often also provide commentary. That’s really the only way the inherently-general language of the rubric can be made fully transparent and understood by a user – and such transparency of goals is the true aim of rubrics.

This is why the most effective teachers not only purvey models but ask students to study and contrast them so as to better understand the performance standards and criteria in the concrete. In effect, by studying the models, the student simulates the original anchoring process and stands a far better chance of internalizing and thus independently meeting the standard.

But doesn’t the use of models inhibit creativity and foster drearily formulaic performance?

This is a very common question in workshops. Indeed, it was posed in a recent workshop we ran in Prince George’s County (and it spawned the idea for this post). The answer? Not if you choose the right models! Even some fairly smart people in education seem confused on this point. As long as the models are varied, of genuine quality, and, in their variety, communicate that the goal is original thought, not formula, then there is no reason why students should respond formulaically except out of fear or habit.

If you don’t want 5-paragraph essays, don’t ask for them! If you don’t want to read yet another boring paper, specify via the examples and rubric descriptors that fresh thinking gets higher scores!

Bottom line: never give kids the excuse that “I didn’t really know what you wanted!” Always purvey models to make goals and supporting rubrics intelligible and to make good performance more likely. Thus, make sure that the variety of models and the rubrics reflect exactly what you are looking for (and what you are warning students to avoid).

In my next post, I’ll provide some helpful tips on how to design, critique, and refine rubrics; how to avoid creativity-killing rubrics; and other tips on how to implement them to optimize student performance. Meanwhile, if you have questions or concerns about rubric design and use, post a reply with your query and I’ll respond to it in the following post.

This article first appeared on Grant’s personal blog.

On May 26, 2015, Grant Wiggins passed away. Grant was tremendously influential on TeachThought’s approach to education, and we were lucky enough for him to contribute his content to our site. Occasionally, we are going to go back and re-share his most memorable posts. This is one of those posts. Thankfully his company, Authentic Education, is carrying on and extending the work that Grant developed.