These kinds of assessments (multiple-choice exams, PowerPoint presentations, essays) are so routine and embedded in higher education that it’s hard to imagine anyone successfully graduating from college without the associated skills. Likewise, instructors often default to these common forms of assessment. After all, unless they have reason to do otherwise, teachers will teach as they’ve been taught. Or, in the words of the old country adage, “If it ain’t broke, don’t fix it!”
But what if our assessment model is broken? Is it the best model for all students, in all situations? Or is there enough evidence to make revisiting assessment in higher education worthwhile?
To evaluate whether there is a need to change the default assessment model for a specific higher education course, ask yourself: What’s the purpose of assessment? Is my way of assessing effective toward this end?
Perhaps the most natural and functional use of assessment is to measure progress toward—or mastery of—a desired outcome. Seemingly, this is the primary function in curricular education, as we use assessments to measure student performance in relation to stated learning outcomes. All other motivations for assessment, such as “to have something to put in the gradebook” or “because it’s expected of me,” are real, but should be understood as distant secondary reasons for assessing learners. In their “Backward Design” instructional design model, Wiggins and McTighe (2005) make this point explicitly, arguing that instructional design should begin with clear outcomes and then move into the development of assessments that align with those outcomes. This approach views assessments as a psychometric means of measuring student achievement of learning outcomes.
Understanding assessments as measures in relation to learning outcomes provides a highly effective criterion for evaluating them. If a student aces an exam, delivers an effective PowerPoint presentation, or submits a polished, high-quality essay, can I understand that to mean they have mastered the learning outcome? Out of context and without knowledge of the outcomes, how would I know?
For example, imagine that you teach a 300-level Environmental Systems course. In this course, one course learning outcome is: Students will be able to evaluate the potential impacts of invasive species, including considering factors that may lead to positive or negative prognosis.
As a summative assessment, you give a 100-question multiple-choice exam in which students are to…
Is this a good match? No. And why not?
To answer this, we should look at the key verb in the outcome — in this case, evaluate. In Bloom’s Taxonomy, evaluation is a “higher order thinking skill” (HOTS), along with analysis and creation. Typically, HOTS require an open-ended demonstration that isn’t viable within the constraints of a multiple-choice exam. The exam may effectively measure learners’ ability to remember, identify, or understand important concepts related to the subject matter, and there is no doubt that those abilities are prerequisites to the type of evaluation we are seeking. But the fact remains that a student may score 100% on the exam, and I still have no proof that they are able to evaluate environmental scenarios. I simply haven’t seen them try.
This is a curricular issue. When designing assessments, we need to be explicit about what our learning outcomes are, and then we need to design assessments that enable learners to demonstrate progress toward—or mastery of—those outcomes.
As noted, designing assessments to match learning outcomes is good pedagogical design. But to fully answer our key question (is our way of assessing effective at measuring student progress toward learning outcomes?), there’s another aspect we need to address: individual variability among students.
To isolate this variable, imagine that I have thoughtfully designed an assessment that would allow learners to demonstrate that they can “evaluate the potential impacts of invasive species, including considering factors that may lead to positive or negative prognosis.” In this case, my choice of assessment is an applied project in which learners evaluate hypothetical ecosystems receiving hypothetical invasive species and explicate their evaluative thinking through a PowerPoint presentation for the class. This certainly allows learners to demonstrate evaluation in the context of the subject matter.
But it also bottlenecks students through a means of expression (public speaking) and a technological skill (PowerPoint development), neither of which has anything to do with the learning outcome. So, if I have some students who are terrible public speakers, who have social anxiety or speech-related disabilities, or who never learned to make a good PowerPoint, I should recognize that I have introduced extraneous variables into my measure[1]. If a student fails to deliver a strong presentation, does that mean they cannot “evaluate the potential impacts of invasive species…”? No, not necessarily. In this context, the means of assessment was structurally sound, but otherwise arbitrary. Surely there are many ways that learners could show mastery of this learning outcome, but I happened to choose this one. Some of them, therefore, will show me significantly less than what they are, in fact, capable of.
If the first point was curricular, the second point is personal: When designing assessments, we need to be aware that the form of assessment may restrict the measurement of students’ accomplishments. Ignoring the barriers that irrelevant forms of assessment may pose for students means that our interpretation of the results for some learners is invalid. The consequence may be that we are failing or discouraging students who really were successful, and we simply never knew.
Just as there were two levels of design in assessment, assessment as part of backward curriculum design (curricular) and assessment that empowers variable learners (personal), so our solution needs to address both of these levels. At the same time, we need to be cognizant of what instructors have time for and what’s realistic. ♦