Teachers using computers to grade, critique student writing

Tuesday, May 10, 2005
University of Missouri professor Ed Brent shows off his essay grading software in his office in Columbia, Mo.

COLUMBIA, Mo. -- Stacks of student essays riddled with flaws -- the thought alone was enough to make Ed Brent look to pass on the work of his Introduction to Sociology class.

Students in Brent's course at the University of Missouri-Columbia now submit drafts of their papers through an online interface he developed. It identifies how many of the points Brent wanted included actually appear, and how well concepts are explained.

Within seconds, students have a score.

Final papers are still handled by Brent and his assistants, but students are encouraged to use the professor's SAGrader program to give them a better shot at earning an A.

"I don't think we want to replace humans," Brent said. "But we want to do the fun stuff, the challenging stuff. And the computer can do the tedious but necessary stuff."

Brent's software -- developed with National Science Foundation funding -- is part of a global movement to digitize a facet of academia long reserved for a teacher and a red pen. SAGrader is just the latest offering entering a market responding to burgeoning classroom use, from routine assignments in high school English classes to an essay on the GMAT, the standardized test for business school admission.

SAGrader is not yet being used outside Brent's classroom, but he hopes to change that. If it finds a commercial base, it would join a field of others.

Educational Testing Service offers Criterion, a program that includes "e-Rater," the essay-scoring software used for the GMAT. Vantage Learning has IntelliMetric, which uses artificial intelligence to score open-ended questions. Maplesoft sells Maple T.A., while numerous other programmers, educators and researchers have created programs used on a smaller scale.

Observers all agree use is growing, though most companies involved are private and offer no sales figures.

But it's tough to tout a product that tinkers with something many educators believe only a human can do.

"That's the biggest obstacle for this technology," said Frank Catalano, a senior vice president for Pearson Assessments and Testing, whose "Intelligent Essay Assessor" is in use in places ranging from middle schools to the military. "It's not its accuracy. It's not its suitability. It's the believability that it can do the things it already can do."

The technology has met both admiration and hatred, sometimes in the same places.

Take South Dakota. The state tested essay-grading software a couple years back, but decided against using it widely, saying feedback was negative.

Not everywhere, though. Students in the Watertown, S.D., school district, for example, now have their writing assessment tests scored by computer.

Lesli Hanson, an assistant superintendent in Watertown, said students enjoy the new testing and teachers are relieved to end an annual ritual in which two dozen people holed up for three days to score 1,500 tests.

"It almost got to be torture," she said.

Elsewhere, Oregon is testing software it hopes eventually will be used for writing assessments, a change that could save the state $1 million a year. The University of Phoenix now processes roughly 80,000 papers each month through its WritePoint program, which doesn't grade papers but offers detailed mechanical tutoring for students. And Indiana has gone further than any other state in the use of such technology in its schools.

Some 80 percent of Indiana's 60,000 11th-graders have their English assessment scored by computer. Another 10,000 ninth-graders are taking part in a pilot program in which some of their routine written assignments are assessed by computer.

Stan Jones, Indiana's commissioner of higher education, said the technology pales in comparison to a teacher, but cuts turnaround time, trims costs and allows overworked teachers to give written assignments without fearing the workload.

"This requires them to require more essays, more writing, and have it graded very painlessly," Jones said.

The software is not flawless, even its most ardent supporters admit.

When the University of California at Davis tried out such technology a couple years back, lecturer Andy Jones decided to try to trick "e-Rater."

Prompted to write on workplace injuries, Jones instead input a letter of recommendation, substituting "risk of personal injury" for the student's name.

"My thinking was, 'This is ridiculous, I'm sure it will get a zero," he said.

He got a five out of six.

A second time around, Jones scattered "chimpanzee" throughout the essay, guessing unusual words would yield him a higher score.

He got a six.

In Brent's class, sophomore Brady Didion submitted drafts of his papers numerous times to ensure his final version included everything the computer wanted.

"What you're learning, really, is how to cheat the program," he said.

Work to automate analysis of the written word dates back to the 1950s, when such technology was used largely to adjust the grade level of textbooks, said Henry Lieberman, a research scientist at the Massachusetts Institute of Technology. Before long, researchers aimed to use such applications to evaluate student writing.

SAGrader, like other programs, involves a significant amount of input from a teacher before it's able to work. For each of the four papers Brent assigns during his semester-long course, he must essentially enter all the components he wants an assignment to include and take into account the hundreds of ways a student might say them.

What a writer gets back is quite detailed. A criminology paper resulted in a nuanced evaluation offering feedback such as this: "This paper does not do a good job of relating white-collar crime to various concepts in labeling theory of deviance."
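
The article gives no detail on SAGrader's internals, so the following is only a minimal sketch of the general idea described above: a teacher lists the concepts an assignment should cover, along with some of the ways a student might phrase each one, and a program reports which concepts appear in a draft. The rubric contents and the names RUBRIC, normalize and grade are hypothetical, chosen for illustration; they are not drawn from Brent's software.

```python
import re

# Hypothetical rubric: each required concept maps to phrasings a student
# might use. A real rubric would need far more variants per concept.
RUBRIC = {
    "labeling theory": ["labeling theory", "labelling theory", "labeled as deviant"],
    "white-collar crime": ["white-collar crime", "white collar crime", "corporate fraud"],
    "stigma": ["stigma", "stigmatized", "stigmatised"],
}


def normalize(text):
    """Lowercase the text and collapse punctuation and whitespace to single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def grade(essay, rubric=RUBRIC):
    """Report which rubric concepts appear in the essay as whole phrases."""
    padded = f" {normalize(essay)} "  # padding forces whole-word matches
    found = [
        concept
        for concept, phrasings in rubric.items()
        if any(f" {normalize(p)} " in padded for p in phrasings)
    ]
    missing = [concept for concept in rubric if concept not in found]
    return {
        "score": len(found),
        "possible": len(rubric),
        "found": found,
        "missing": missing,
    }


if __name__ == "__main__":
    draft = "White collar crime often escapes stigma."
    print(grade(draft))
```

A matcher this simple checks only whether a phrase is present, not whether it is used sensibly, which is one reason a draft can be tuned to satisfy a program without improving the argument.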

Brent says that despite its limitations, such software allows teachers to do things they weren't able to do before. Before he implemented SAGrader, he gave students only multiple-choice tests.

"Now we can focus more," he said. "Are they making a good argument? Do they seem to understand? Are they being creative?"
