A new study and some
efforts from private publishers have put the issue of computerized grading of essays in the news. The most common response from academics has been to insist that computers can never do the job we do in grading. I don't agree with this -- the technological advances I've seen in my own lifetime make me extremely hesitant to ever declare "computers will never be able to do this." (Heck, at one point I thought OCR was a pipe dream!) I also find some of the academic responses self-serving. Certainly I don't like the idea of being made obsolete. But I think our energies are better spent ensuring that the time and money saved through automation flow into creating new opportunities than trying to save existing jobs by insisting we continue to do things inefficiently. (And in any event, the upshot of insisting that grading must be done by humans rather than computers is likely to be the outsourcing of grading to low-wage countries -- this is already happening in a few places, with grading for American universities being done by people in India. The effect on professors' job expectations will be the same.)
That being said, I do have two concerns about the implications of computerized essay grading -- one sociological, and one ethical.
On the sociological level, my concern is with how administrators are likely to react to the availability of essay grading software. It's a perfect storm for the current neoliberal trend in higher (and lower) education toward cost-cutting and efficiency at the expense of quality.
Academic responses to essay-grading software usually focus on the great care and detail that a human grader can put into grading an essay. That's true -- in the best-case scenario. The best teachers will be better than computers, especially at grading complex assignments, for some time. But it's not universal. Various pressures (deadlines, research projects that take priority, fatigue, a perception of student disinterest) frequently lead teachers to grade in a more perfunctory way. In doing so, teachers essentially hand-implement a simplified grading algorithm, looking for a few basic things (grammar mistakes, wrong facts, no thesis statement, etc.) that they have prepared responses to. In this way, human grading comes to resemble the output of crude grading software. When you have 300 students to evaluate and no TAs, it's inevitable that this will happen sometimes.
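To make that concrete, here's a minimal sketch in Python of what such a hand-implemented checklist looks like once it's written down as code. Every pattern, threshold, and name here is invented for illustration -- this isn't drawn from any actual grading product, just the kind of shallow rubric I'm describing:

    import re

    # A few canned error patterns, standing in for a grader's mental checklist.
    # (Hypothetical examples; a real checklist would be longer but no deeper.)
    COMMON_ERRORS = [r"\btheir is\b", r"\bcould of\b", r"\balot\b"]

    def crude_grade(essay: str) -> int:
        """Score an essay out of 10 by shallow pattern-matching, not real reading."""
        score = 10
        first_paragraph = essay.split("\n\n")[0].lower()
        if "thesis" not in first_paragraph and "argue" not in first_paragraph:
            score -= 3  # no obvious thesis statement up front
        for pattern in COMMON_ERRORS:
            score -= len(re.findall(pattern, essay.lower()))  # one point per slip
        if len(essay.split()) < 500:
            score -= 2  # too short
        return max(score, 0)

The point of the sketch is that nothing in it reads the essay: it checks surface features and applies prepared deductions, which is exactly what a rushed human grader ends up doing.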
Those pressures to lower the quality of human grading are only getting stronger. College and university administrators are trying to process more students for less cost by raising class sizes, increasing teaching loads, and hiring more adjuncts. All of this will ultimately lower the bar for how good essay grading software has to be before it can be "as good" as actually existing human grading.
Meanwhile, administrators will be tempted to jump
prematurely to requiring the use of essay grading software, before it's really up to snuff. The temptation to save money this way will simply be too great. In some cases this may be done by mandate, especially for big placement tests or standardized intro-level classes where individual faculty have less creative control to begin with. (And the trend is toward more such standardization and less faculty control, in the name of efficiency.) Or working conditions may simply be made so strenuous -- basing workload estimates on the assumption of software use -- that faculty will have little choice but to adopt grading software before it's ready, just to keep up.
So much for the sociological concern. The ethical concern has to do with what it means to have software intelligent enough to do high-level essay grading. Current approaches focus on basic writing tasks, but there's no reason development won't keep pushing toward grading such things as senior theses. If the human brain follows a reliable process in doing such grading, there's no conceptual reason a computer couldn't imitate that process. But the more complex the task, the more the computer will have to actually think. After all, at the highest levels of writing, the goal is to be comprehensible, persuasive, and moving to human readers. So the only way to reliably get a computer to evaluate those qualities in a sophisticated way is to make it actually imitate the thought processes of a human reader.
This then raises a real-life version of the
philosophical zombie problem. In brief, a philosophical zombie is a being that is indistinguishable in its outward behavior from an intelligent, sentient being like a human. However, unlike a real human, a philosophical zombie has no inner experience of feeling or consciousness (though it will talk
as if it did). The big question is whether such a being is conceptually possible. Sophisticated essay grading software would be a candidate for a real-life example of a philosophical zombie -- able to read essays and make complex determinations about their strengths and weaknesses, indistinguishable from those of a human grader. While I am not a master of the zombie literature, my own sense is that philosophical zombies are an impossibility. Consciousness and sentience are emergent properties of complex processing patterns, not some non-material soul tacked onto them, and thus you can't have processing patterns above a certain level of complexity without producing consciousness.
But if essay grading software becomes conscious, that starts to undercut the reason for using it in the first place. The goal of automation is to take functions once done by conscious humans (with their needs and rights) and replace them with machines that can be exploited with no compunction. The Scantron machine doesn't care how many tests are run through it or whether it's simply switched off when we're done. But a conscious essay grading program might. While computer graders may have many advantages over human ones (less fatigue, for example), once they become sentient we begin to have ethical obligations to them. Continuing to use them like lesser machines turns them into
slaves, with all the ethical dangers that implies.