Wednesday, February 23, 2011

How essay tests are scored

My impression is that a lot of people think standardized testing is a good way to keep pressure on schools and teachers to improve performance.

Standardized testing has its place -- but not when it's performed like this. Jessica Lussenhop's article for City Pages is a sobering look at how essay tests from around the country are graded in a kind of sweatshop atmosphere.

Actually, it's not "sobering" so much as "saddening." Then it becomes "angering."

For the familiar SATs and other multiple-choice tests, the scoring process is completely automated. There's one right answer, and ignoring issues like stray marks, a machine can tell whether the right circle was filled in. There's no ambiguity and no drama. (Mostly: Lussenhop's piece notes a few catastrophes in which automated scoring screwed up, costing the company NCS Pearson millions.)

Machines can't judge essays, though. How, then, are essay tests scored?

It would be reassuring to imagine rooms full of sharp, well-trained educators taking the time to evaluate each essay the way your teachers do (or did, or should have done). It would be reassuring, but it's not so.

Today, tens of thousands of temporary scorers are employed to correct essay questions. This year, Maple Grove-based Data Recognition Corporation will take on 4,000 temporary scorers, Questar Assessment will hire 1,000, and Pearson will take on thousands more. From March through May, hundreds of thousands of standardized test essays will pour into the Twin Cities to be scored by summer.

Since the essays are responses to a standardized test, it shouldn't surprise anyone that the scoring companies try to standardize how the essays are scored. Scorers are trained to assess each essay according to a numeric scale in which each number corresponds to a qualitative assessment of the essay's grammar, organization, vocabulary, etc. The scorer's job, then, is to reduce the essay's complexity to a single number representing how well it matched the putative ideal determined by the scoring company.

Reducing essays to scores is exactly what a teacher does, of course, but no teacher spends all day doing nothing but grading essays. One scorer, though, said "she was being asked to crank through 200 real essays in a day."

The scanned papers popped up on the screen and her eyes flitted as fast as they could down the lines. The difference between "excellent" and "good" and "adequate" was decided in a matter of seconds, to say nothing of the responses that were simply off the reservation. How do you score a kid who rails that his town sucks? What about an exceptionally well-written essay on why the student was refusing to answer the question?

How would you feel, knowing the essay over which you sweated for an hour was scored in a matter of seconds by a temp worker concerned more with filling a quota than giving your essay the consideration it deserved?

Yet that's the reality of standardized testing.

The testing industry doesn't admit its assembly-line process is problematic.

Pearson spokesman Adam Gaber warns against taking the opinions of former scorers too seriously.

In an email, he characterized their concerns as "one-sided stories based upon people who have a very limited exposure and narrow point of view on what is truly a science."

That might be the most telling and most disturbing quotation from the entire article. This guy thinks what his company does is a science.

What Pearson and its competitors do in the area of essay scoring is not a science. It's not even an art. It's a brutal reduction of thought to numbers. The principles of industrial production that gave us hot dogs now give us essay scores.

As with any standardized testing, the scoring of essays rewards kids who know how to take the test. Stick to the topic, keep your paragraphs and sentences to a certain length, use the expected vocabulary, and all will be well. Kids who don't stay within the lines suffer. It doesn't matter that some of them are bright and creative. They score badly in our industrialized testing regime, so they are problems, and not the kind that can be solved by filling in the right circle.

(Thanks to Longreads for the link.)