Would you trust a computer to critique your composition? Would you allow a robot to determine if you’re truly worthy of that spot in a top-tier college?
Chances are that your standardized test essays are already graded, at least in part, by automated essay scoring (AES) software, which means that your GRE writing score is analyzed by a computer. It doesn’t really matter what you write – as long as you use the right buzzwords.
Earlier this April, InsideHigherEd.com published “A Win for the Robo-Readers,” which references a recent study completed at the University of Akron:
The differences, across a number of different brands of automated essay scoring software (AES) and essay types, were minute. “The results demonstrated that over all, automated essay scoring was capable of producing scores similar to human scores for extended-response writing items,” the Akron researchers write, “with equal performance for both source-based and traditional writing genre.”
“In terms of being able to replicate the mean [ratings] and standard deviation of human readers, the automated scoring engines did remarkably well,” Mark D. Shermis, the dean of the college of education at Akron and the study’s lead author, said in an interview.
The New York Times puts the issue a little more succinctly, in an article entitled “Facing a Robo-Grader? Just Keep Obfuscating Mellifluously”:
Mr. Perelman found that e-Rater prefers long essays. A 716-word essay he wrote that was padded with more than a dozen nonsensical sentences received a top score of 6; a well-argued, well-written essay of 567 words was scored a 5.
Gargantuan words are indemnified because e-Rater interprets them as a sign of lexical complexity. “Whenever possible,” Mr. Perelman advises, “use a big word. ‘Egregious’ is better than ‘bad.’”
These stories are especially disheartening given the recent New York Times opinion piece “Teach the Books, Touch the Heart,” which argues against spending more time trying to quantify the results of English instruction. Acting under pressure, NYC public school teachers have cut their supplementary classes (like reading groups) to add more test preparation programs.
This shift from English instruction to test prep instruction is detrimental for everyone. Students aren’t taught how to read, write, and think critically. They are taught to write for the test. Teaching students to write for robotic readers is even worse.
It’s true that the current human graders only spend 2 to 3 minutes on each SAT writing essay, but at least those readers are able to distinguish between actual arguments and bullshit. Test preparation programs already take advantage of the short time that readers spend on each essay, recommending simplistic writing styles and basic frameworks (like the five-paragraph essay) that most students abandon in middle school. If test-takers are graded by AES, the quality of the essays will degrade even further, and our standards for high school “writing” will be dumbed down.
Of course, ETS has an answer for this too:
E.T.S. officials say that Mr. Perelman’s test prep advice is too complex for most students to absorb; if they can, they’re using the higher level of thinking the test seeks to reward anyway. In other words, if they’re smart enough to master such sophisticated test prep, they deserve a 6.
Let’s see what this says about our education system:
- Students who are able to afford complicated test prep and tutors will be rewarded.
- “Cheating” is not really cheating, but rather shows that the student is intelligent enough to get around the rules of an arbitrarily scored exam.
- Quality doesn’t matter if you can make your work look like it might be quality.
Is it any wonder that high school students believe school is for standardized testing, and that we should adjust our curriculums accordingly?
When the standard of intelligence is gaming the system, there is a serious problem. We’re taking the education out of schools and teaching our students that results are important, not methods. The conversation that we need to have is not “Should Robo-Readers be used to score standardized tests?” but “Why do we place so much importance on standardized test results at all? How can we change the paradigm?”