debitage: Changing tests mid-stream

Janet Stemwedel poses a question about fairness in multi-section exams. If you discover a problem with the exam -- say, a poorly written question -- after one section of a class has taken the exam, but before the second section takes it, is it fair to fix the question before the second group takes it? I think this question highlights a tension between two ways of looking at grading: as measurement and as compensation.

Grading as measurement is the view I try to adhere to. This view holds that a grade is a measurement of a characteristic of a student -- mastery of certain skills and knowledge -- just like you might measure the student's height or temperature. It's admittedly a harder measurement to interpret: we all know what the freezing and boiling points of water are (the reference points for the temperature scale), but it's much less clear what "the stuff Dr. X wants you to get out of this class" is, even if we know from your B+ that you got 87-89% of it*. Under the grading as measurement view, the grader's overriding responsibility is to accurately measure each student's mastery of the material. If the grader has good reason to believe the measuring instrument (the test) is flawed, there is then a duty to fix that flaw for all future measure-ees, even if some people have already been measured with the flawed instrument. To measure future students with the flawed instrument does nothing to make one's treatment of the first group fairer. Rather, the duty owed to the first group is either to adjust the measurement to account for the instrument's flaws (e.g., granting credit for answers based on understandable misreadings of a poorly-written question), or in extreme cases to throw out that measurement and either re-measure with a less flawed instrument or calculate the final figure without using that incorrect measurement. But note that this duty to those students is the same even if there is no second section with the option of taking a corrected test.

Consider the analogy of a doctor taking patients' temperatures. If the doctor discovers that the thermometer is miscalibrated, the obvious course of action is to get a working thermometer ASAP, then either adjust the earlier patients' records (if we know that, say, the broken thermometer was reading consistently 5 degrees too high), or throw out their temperature records altogether. It wouldn't make sense for those earlier patients, walking around wrongly thinking they had a fever, to say that fairness demands that later patients also be misdiagnosed with fevers.

However, there is a competing model of grading that is prevalent among students and has some pull on teachers as well: grading as compensation. Here, a grade is a reward given to a student in return for doing certain work in the course. A grade is then more a valuable good, like money, than a measurement of a characteristic. So if instead of a doctor we analogize the teacher to a boss, there seem to be some grounds for concern that fixing the test for the second class is unfair to the first class. Imagine that the manager of a McDonald's tells the employees on the first shift to cook up some fries, with the understanding that the employees' wages will be docked if they screw it up. After noting the difficulty the first shift had with the task, the manager poses the same task to the employees on the second shift, but with clearer instructions on using the fryer. The first shift employees could reasonably claim that their wages were docked unfairly. And it's at least conceivable -- though obviously people of different political persuasions may disagree about whether it's right -- that justice to the first shift could be established by giving equally unclear directions to the second shift, so that everyone is earning their wages on an equal footing. It does seem unfair that the boss would give some people an easier way to earn money than others, in a way that it's not unfair to fix a broken thermometer.

Under the measurement view, the difficulty of a test is a function of the standards of the professor, the student's mastery level, and extraneous bias in the measurement instrument. Each individual has a right to as small a contribution from that third term as possible (i.e. a right to be measured accurately) regardless of whether others have secured that right as well. Under the compensation view, the difficulty of a test is a function of the demands of the professor (i.e. what the student must hand over as their side of the bargain) and the abilities of the student to meet those demands. Here the student has a right to the same offer, on the same terms, as every other student.

On a philosophical level, the measurement view seems much easier to defend, and there's clearly a lot of pernicious behavior, such as grade-grubbing, that is rooted in a compensation view. But the compensation view is hard to escape, and it's common to use grades as punishments and incentives. And grades are often turned into quasi-goods because they can be in a sense exchanged for goods, as when you use your grades to convince an employer to hire you. In the case of the poorly-worded test question I think the measurement view gets the right answer, but there's a good reason the idea of fixing a test feels unfair at first glance.

*Here we're assuming each individual is graded against an independent measuring stick, rather than the class being forced into a normal distribution with a pre-defined shape. The latter would raise obvious issues. But I have yet to hear of any good defense of such grading practices, though I admit I haven't looked that hard.

17.12.09
