What Are The Different Ways To Measure Understanding?
by Terry Heick
How do you measure what a student understands?
Not give them an assessment, score it, then use that score to imply understanding. Rather, how do you truly ‘uncover’ what they ‘know’–and how ‘well’ they know it?
The Challenge Of Outcomes & Standards-Based Assessment
First a preface: itemizing ways to measure understanding is functionally different than students choosing a way to demonstrate what they know—mainly because in a backward-design approach where the learning target is identified first, that learning target dictates everything else downstream.
If, for example, a student was given a topic and an audience and was allowed to ‘do’ something and then asked to create something that demonstrated what they learned, the result would be wildly different across students. Put another way, students would learn different things in different ways.
By dictating exactly what every student will ‘understand’ ahead of time, certain assessment forms become ideal. It also becomes much more likely that students will fail. If students can learn anything, then they only fail if they fail to learn anything at all or fail to demonstrate learning anything at all. By deciding exactly what a student will learn and exactly how they will show you they learned it, three outcomes, among others, are possible:
1. The student learned a lot but not what you wanted them to learn
2. The student learned exactly what you wanted them to learn but failed to demonstrate it in the assessment
3. The student failed to learn
But what if you, as a teacher, have been misled?
Consider The Purpose Of Assessment
For an analogy, consider a car. If we want the car running at its best—to be safe to operate, to start when we want to go somewhere, to attain the fuel efficiency that we want, to look and smell the way we want, etc., we need data about those conditions. The look and smell are easy enough, and the fuel efficiency is a matter of math. The reliability is a little more abstract—cut and dry but the product of many other maintenance factors and so requires more abstraction to predict. And the safety bit? Equally abstract—and subjective, to boot.
Imagine being concerned about the reliability of the car (as anyone would be), so you developed tests that could be used to predict the future likelihood that the car would start. (It’s at this point that I’m realizing I could’ve chosen a much better analogy, but I’m sticking with the car/starting thing. Sorry.) So imagine giving a bunch of tests to predict today whether the car would start tomorrow. It makes sense on paper because we want to monitor that idea, but it seems a little wasteful, right?
And it gets wasteful fast when you consider the possibility that the car may fail test after test after test and still continue to start. This means that the tests we developed to predict whether or not the car would start likely measured something but not what we wanted them to measure. The tests were bad, the data misleading, and any conclusions drawn from them invalid.
In an outcomes-based and data-driven circumstance, the data and the decisions made using that data are everything. If that data is misleading, it’s not hard to realize that we will be misled–as teachers and learners–too.
First, Determine The Purpose Of The Assessment
If students are all going to have their height and weight measured, a common standard makes sense; if students are all going to have their attractiveness measured, any kind of ‘standard’ is creepy.
Measuring knowledge and mastery of competencies and skills isn’t quite as subjective as ‘beauty,’ but isn’t anywhere close to as cut and dry as height and weight. We can give the same test that measures the same thing in the same way for all students and do no damage really—provided we are all on the same page that we’re not measuring understanding but rather measuring performance on a test.
In a perfect world, we’d have countless ways to measure that understanding—all valid, universally understood, engaging to students, etc. In pursuit, I thought it’d make sense to brainstorm different ways to measure understanding. Some will be more or less useful depending on content areas, grade levels, student motivation, etc., not to mention what the purpose of the assessment is.
Do you need a snapshot?
Do you need to measure mastery or growth?
Do you want it to be flexible for a variety of learners or more binary—you either pass or you fail?
Do you want students to be able to return to the assessment periodically or is this a one-shot kind of deal?
Is the assessment for the teacher or the student?
If you’re not clear about why you’re assessing (and what you’re going to do with the data the assessment provides) you’re wasting a lot of time, energy, and resources–your own and that of the students.
With that in mind, see below for 50 ways to measure understanding. Some are assessment forms (e.g., exit slips), some are models (e.g., Bloom’s Taxonomy) and some are more frequently thought of as teaching strategies (e.g., Socratic Discussion).
50 Ways To Measure Understanding
Assessment Forms For Measuring Understanding
These can be thought of as reasons to test
In 6 Types Of Assessment, we offered up exactly that–six ‘kinds’ of tests that imply the purpose of and standard for the assessment.
1. Norm-Referenced Assessments
Norm-referenced assessments are assessments used to compare students to one another.
2. Criterion-Based Assessments
A criterion-based assessment assesses a student’s performance against a clear and published goal or objective. This is in contrast, for example, to a test that students hope to ‘do well on’ but without a clear and concise objective and/or without clear performance standards for that objective.
3. Standardized Assessments
A standardized assessment is any assessment containing elements that are the same for all students universally. The perceived benefit is that standardization ensures all students are being weighed equally and that there is a common ‘bar’ for students to be measured with.
4. Standards-Based Assessment
A form of standardized assessment, a standards-based assessment is one that is based on an academic content standard (e.g., ‘Determining an author’s purpose...’).
5. Personalized Assessments
While these aren’t necessarily ‘different’ kinds of assessments, they do reflect different reasons to assess.
6. Pre-Assessment
Pre-assessment is any kind of evaluation, analysis, or measurement of student understanding that occurs before the teaching/learning process begins.
The purpose of pre-assessment varies–it can be to help plan lessons and activities, revise curriculum maps, create personalized learning pathways for individual students, help inform grouping strategies, plan future assessments, etc.
7. Formative Assessment
Formative assessment generally occurs during the teaching and learning, though it’s not that simple. A better way to think about formative assessment is that it provides data to form and inform the teaching and learning on an ongoing basis. A common example of a formative assessment is a quiz. Types of quizzes? Pop quizzes, planned/scheduled quizzes, timed quizzes, and so on.
This can also be thought of as ‘diagnostic assessment,’ and is ideally the most common form of assessment in K-12 learning environments (because the purpose is to measure understanding to better create future learning experiences).
8. Summative Assessment
A summative assessment is any assessment done when the ‘teaching is done.’ This makes the process of ‘summative assessment’ a curious thing unless there are no more opportunities to teach and learn (like the end of a school year).
9. Timed Assessment
This one is self-explanatory–any assessment that is time-bound is a timed assessment (though technically that timing can be minutes or even years, depending on the nature, purpose, and scale of the assessment).
Timed assessments can also be combined with other forms–a timed project or timed essay, for example. The idea is that the constraint of time somehow shapes the scope of the test and the performance of the student.
11. Untimed Assessment
Untimed assessments are less common than timed assessments, if for no other reason than that the scheduled nature of modern education necessitates time limits.
12. Open-Ended Assessment
In contrast to a timed, standards-based and standardized assessment, an open-ended assessment is generally designed to provide a proving ground for the students to demonstrate knowledge, skills, and competencies. Through open-ended assessment, student autonomy, creativity, and self-efficacy play a larger role in their performance.
Due to the nature of this approach, the mindset of the learner is crucial. Without confidence, ownership, and a clear sense of how they might demonstrate what they know, learners can feel uncertain–and worse, may fail to ‘show what they know’ and misinform future planning of learning experiences because of this ‘failure.’
A learning blend is an example of an open-ended assessment.
13. Game-Based Assessment
A game-based assessment is often technology-based (e.g., video games), but an athletic contest can also be considered a game-based assessment, as it’s the performance within a given set of rules that determines what the learner knows and can ‘do.’
14. Benchmark Assessment
Benchmark assessments evaluate student performance at periodic intervals, frequently at the end of a grading period. They can help predict student performance on end-of-the-year summative assessments.
15. Group Assessment
Group assessment is what it sounds like it might be–assessment done in a group with (at times) varying roles and responsibilities.
Obviously, a design challenge in group assessment is knowing exactly what you’re assessing, as social dynamics and individual roles and responsibilities can obscure the analysis of student learning.
Different Assessment Forms For Measuring Understanding
These can be thought of as ‘types of tests’
16. Short-Response Tests
Example: Short, written or verbal responses to questions or prompts
17. Extended Responses (On-Demand, Essays, etc.)
Example: Like the above but longer–anywhere from a few paragraphs to entire research essays
18. Multiple-Choice Tests
Not sure this needs explaining–multiple-choice assessments are great for providing data but are highly dependent on the quality of the questions and responses–and even then favor highly literate and motivated students over others.
19. True-False Tests
If you’re good at creating very nuanced True/False assessment items, they can challenge students with a strong grasp of content by forcing them to closely consider whether something is ‘true’ or not. True/False assessments can also be useful for struggling or ‘hesitant’ students because the barrier to answering is so low (like a multiple-choice assessment) but there are only two ‘answers’ to choose between.
Tip: You can allow students to revise the true or false statement until it seems true to them based on what they know. The changes they make can go a long way in helping to diagnose what they’re misunderstanding.
20. Matching Items
The strength of matching items is that they’re simple to create, complete, and score–and, when well designed, can be surprisingly effective in uncovering what students know. The challenge with these kinds of assessments is that they do very little to demonstrate depth of understanding and are only useful with certain kinds of content.
21. Performance & Demonstration (i.e., watching the student attempt to demonstrate understanding/competency/skill in real-time)
Example: Watching a student try to hit a free throw in basketball or make a specific pass in soccer, etc. It doesn’t have to be athletic-based, however. Students can also demonstrate the effect of gravity on planetary orbits or how propaganda works, etc.
22. A Visual Representation
Example: Students can create a visual representation of the water cycle–how it works, all the forms it takes, its benefits, the physics of the process, etc. What is visualized is obviously part of the assessment.
They could also create one for the use of transitional phrases in writing–what they do, when they’re used, what their effects are, etc.
23. Analogies
Analogies are underrated assessment tools; students can ‘answer’ analogies you create, modify them to create new meanings, explain why an analogy you create is wrong, or create their own analogies to demonstrate understanding.
Example: If you wanted to assess a student’s understanding of thesis statements, you could have students ‘answer’ an analogy you create by completing the analogy.
Thesis Statement: Essay:: Company:________ (mission or slogan)
24. Concept Maps
25. Graphic Organizers (like analogies, these are also very underrated ways to measure understanding)
26. A Physical Artifact
27. A Question (i.e., a student asking/revising a question as a form of assessment)
28. A Debate
29. A Conversation/Group Discussion/Socratic Discussion
30. Question Stems
See here for examples of question stems for critical thinking. You can also let students create their own stems and quiz/test one another.
31. Role-Playing (e.g., role-playing historical figures to assess biographical knowledge–this is similar to #21)
32. QFT Session
33. Observable Metacognition
This non-traditional assessment form is akin to (somehow) ‘watching’ the student think about their own thinking and using it to ‘measure’ understanding.
34. Self-Assessment (where the student evaluates their own understanding with or without the help of the teacher)
36. Expert Assessment
This one is obviously better suited to more skilled and knowledgeable learners (in high school and college, for example). An example of expert assessment is a talent show like American Idol.
Any kind of panel evaluation (where a third party chooses one or more of the above forms) that seeks to evaluate and measure understanding, and where the assessment and feedback depend on the specific and often narrow expertise of the panel itself, is an ‘expert assessment.’
37. RAFT Assignments
RAFT is a common ELA activity that stands for Role Audience Format Topic (or Theme/Thesis/Tone). I hesitated to put this on the list because it’s best suited to English-Language Arts/Literature/Writing/Literacy, and it’s hard to explain its utility as an assessment tool even in that narrow domain.
The idea here is to alter elements of an activity or assignment to force students to think critically to complete it, and it doesn’t have to be ‘RAFT’–you can frame anything in any number of ways. (If you’ve never used RAFT before, you’re probably best off skipping this one until you’re more familiar with it. Email me if you have any questions.)
Example: Students studying the Declaration of Independence can revise it for a specific audience or communicate the same ideas in a different format or tone (rather than the letter/political tone of the original).
38. A Challenge
Creating a challenge for students to complete in order to demonstrate understanding is another non-traditional form of assessment but can be useful to engage hesitant learners or bring out the best in gifted students. Gamification is useful in challenge-based assessment.
39. Teacher-Designed Projects
Create a project that produces a ‘thing’ whose quality will or will not demonstrate student understanding.
40. Student-Designed Projects
The same as above but the student designs the project (likely with the teacher). This will confuse a lot of students–you’ll know fast if this is going to be useful for your purpose for assessment or not.
41. Self-Directed Learning
By supporting the student to reflect on, prioritize, plan, and complete their own learning experiences on their own, students will inherently have their understanding measured.
Self-Directed Learning is a form of open-ended assessment–you can see one of the self-directed learning models I created here.
Frameworks & Assessment Models For Measuring Understanding
These can be thought of as ways to frame the thinking about the content being assessed in the test
42. Bloom’s Taxonomy
43. The TeachThought Learning Taxonomy
44. UbD’s 6 Facets Of Understanding
45. Marzano’s New Taxonomy
46. Assessing ‘Transferability’
It is common for assessments to be standardized, universal, and practiced in those standardized and universal forms (for example, a multiple-choice assessment with a set number of items and a standardized amount of time to complete).
This is great for norm-referencing, but a terrible way to truly measure what a student understands–and the strength and depth of that understanding. That’s where transfer of understanding comes in. You can read more about different kinds of learning transfer.
47. A Graded Assessment
This is the most common form of formal assessment–one in which the work is scored, feedback is given, and data is shared (even if only via a letter grade).
48. An Ungraded Assessment
This is less common than graded assessments–which is strange because the most useful purpose of assessment is to provide data to revise planned instruction. Feedback is also useful, but the scoring, grading, and communication of a grade can take time, distract learners emotionally, and most critically obscure data about that understanding.
That’s not to say that assessments should never be graded, but scoring, documenting a score, then communicating that score to students, parents, colleges, etc., significantly alters the tone, scale, and reach of that assessment. It becomes more like a public performance than a way to measure understanding.
49. Feedback-Only Assessment
Feedback-only assessment is similar to ungraded assessment but is focused intently on providing detailed feedback to individual learners to help them master the standard, competency, or skill.
50. Pass/Fail Assessment
While feedback can be given, time limits can be imposed, and various taxonomies can be used, the primary characteristic of pass-fail assessment is that letter grades and points are generally not given and the standard for performance is binary–that is, the standard either was or was not met.
An athlete trying to vault over a 6-meter bar is participating in a kind of ‘pass/fail’ assessment in that they either will or will not clear the bar.
51. Ongoing ‘Climate of Assessment’
A ‘Climate of Assessment’ is a personal favorite of mine.
In this approach, critical and complex ideas and skills are constantly revisited and iterated upon in various ways, forms, and contexts while supplemented by less complex/quicker-to-master content. Assessment is frequent, playful, clear, compelling, and always formative.
52. Snapshot of Assessment
This is an assessment of what the student seems to know at that moment based on that given assessment form. These can be misleading but useful if they’re preceded and followed by additional snapshots in the aforementioned ‘climate of assessment.’
53. A Measurement Of Growth Over Time
In this kind of assessment, mastery isn’t measured–nor are students compared to one another (as they are in norm-referenced assessments).
Instead, the focus is on how far the student has (or has not) ‘come.’ One strategy here is grading backward. This approach, while complex, is naturally differentiated for each student, has a positive tone, and makes it much more difficult to ‘fail’ unless the student literally atrophies in terms of skills and understanding.
54. Concept Mastery
This kind of assessment is mainly distinguished by what’s being assessed–the focus here is a grasp of concepts and ideas rather than skills and competencies (like the item below).
55. Competency & Skill Mastery
Many academic standards combine concepts and skills–which is fine because that’s often what the ‘real world’ is like.
But when you’re trying to troubleshoot student achievement and evaluate what they actually, truly understand (versus how they did on the test), being able to separate what they ‘know’ and what they can ‘do’ can make remediation more efficient. It can also make the student feel better about their own ‘lack of mastery’ because they are able to see what they do and don’t know, what they can and cannot do, etc., which is more precise and comforting than ‘I missed the question’ or ‘I failed the test.’