By MOTOKO RICH, New York Times | http://nyti.ms/1CqFwBs
Photo: Rose Rodriguez-Rabin, left, and Valerie Gomm scored Common Core exams at an office in San Antonio. Ms. Rodriguez-Rabin has worked for the testing company Pearson on various projects since 2009. Ms. Gomm described the scoring process as challenging, saying that “you go into analyzing every trait.” Credit Ilana Panich-Linsman for The New York Times
JUNE 22, 2015 :: SAN ANTONIO — The new academic standards known as the Common Core emphasize critical thinking, complex problem-solving and writing skills, and put less stock in rote learning and memorization. So the standardized tests given in most states this year required fewer multiple choice questions and far more writing on topics like this one posed to elementary school students: Read a passage from a novel written in the first person, and a poem written in the third person, and describe how the poem might change if it were written in the first person.
But the results are not necessarily judged by teachers.
On Friday, in an unobtrusive office park northeast of downtown here, about 100 temporary employees of the testing giant Pearson worked in diligent silence scoring thousands of short essays written by third- and fifth-grade students from across the country.
There was a onetime wedding planner, a retired medical technologist and a former Pearson saleswoman with a master’s degree in marital counseling. To get the job, like other scorers nationwide, they needed a four-year college degree with relevant coursework, but no teaching experience. They earned $12 to $14 an hour, with the possibility of small bonuses if they hit daily quality and volume targets.
Photo: Susan Mitchell, center, has been teaching middle school for 30 years, and this is her first summer as a scorer for Pearson. There is no data on how many of the scorers currently work as classroom teachers. Credit Ilana Panich-Linsman for The New York Times
Officials from Pearson and Parcc, a nonprofit consortium that has coordinated development of new Common Core tests, say strict training and scoring protocols are intended to ensure consistency, no matter who is marking the tests.
At times, the scoring process can evoke the way a restaurant chain monitors the work of its employees and the quality of its products.
“From the standpoint of comparing us to a Starbucks or McDonald’s, where you go into those places you know exactly what you’re going to get,” said Bob Sanders, vice president of content and scoring management at Pearson North America, when asked whether such an analogy was apt.
“McDonald’s has a process in place to make sure they put two patties on that Big Mac,” he continued. “We do that exact same thing. We have processes to oversee our processes, and to make sure they are being followed.”
Still, educators like Lindsey Siemens, a special-education teacher at Edgebrook Elementary School in Chicago, see a problem if the tests are not primarily scored by teachers.
“Even as teachers, we’re still learning what the Common Core state standards are asking,” Ms. Siemens said. “So to take somebody who is not in the field and ask them to assess student progress or success seems a little iffy.”
About 12 million students nationwide from third grade through high school took the new tests this year. Parcc, formally known as the Partnership for Assessment of Readiness for College and Careers, and the Smarter Balanced Assessment Consortium, another test development group, along with contractors like Pearson, worked with current classroom teachers and state education officials to develop the questions and set detailed criteria for grading student responses. Some states, including New York, separately developed Common Core tests without either consortium’s involvement.
Pearson, which operates 21 scoring centers around the country, hired 14,500 temporary scorers throughout the scoring season, which began in April and will continue through July. About three-quarters of the scorers work from home. Pearson recruited them through its own website, personal referrals, job fairs, Internet job search engines, local newspaper classified ads and even Craigslist and Facebook. About half of those who go through training do not ultimately get the job.
Parcc said that more than three-quarters of the scorers have at least one year of teaching experience, but that it does not have data on how many are currently working as classroom teachers. Some are retired teachers with extensive classroom experience, but one scorer in San Antonio, for example, had one year of teaching experience, 45 years ago.
For exams like the Advanced Placement tests given by the College Board, scorers must be current college professors or high school teachers who have at least three years of experience teaching the subject they are scoring.
“Having classroom teachers engaged in scoring is a tremendous opportunity,” said Tony Alpert, executive director of Smarter Balanced. “But we don’t want to do it at the expense of their real work, which is teaching kids.”
Photo: Most Common Core scorers do have some teaching experience, according to a test developer. Credit Ilana Panich-Linsman for The New York Times
The most important factor in scoring, testing experts say, is to set guidelines that are clear enough so that two different scorers consistently arrive at the same score.
During training sessions of two to five days for the Parcc tests, prospective scorers study examples of student essays that have been graded by teachers and professors as well as the scoring criteria.
To monitor workers as they score, Pearson regularly slips previously scored responses into the computer queues of scorers to see if their numbers match those already given by senior supervisors. Scorers who repeatedly fail to match these so-called validity papers are let go.
At the San Antonio center on Friday, the scorers worked on the Parcc test, which was given in 11 states and Washington, D.C. As Valerie Gomm read several paragraphs from a fifth-grade essay on a laptop screen, she consulted a heavily highlighted sheaf of papers that prescribed the criteria for evaluating reading comprehension, written expression and conventions like spelling and punctuation. For each of those traits, she clicked on a numeric score from 0 to 3.
“The first thing we do is just holistically read it,” explained Ms. Gomm, who immigrated from France with her American husband five years ago and previously managed a wedding business. “Then you get a feel for what they did, and you go into analyzing every trait, and then you really go deep and you can see if either your first feeling is right or wrong.”
She acknowledged that scoring was challenging. “Only after all these weeks being here,” Ms. Gomm said, “I am finally getting it.”
Some teachers question whether scorers can grade fairly without knowing whether a student has struggled with learning difficulties or speaks English as a second language. Pearson said some math tests are graded in Spanish. Scorers do not see any identifying characteristics of the students.
Experienced teachers also say that some students express themselves in ways that might be difficult for noneducators to decipher.
“Sometimes students say things as a student that as a teacher you have to interpret what they are actually saying,” said Meghann Seril, a third-grade teacher at Broadway Elementary School in Venice, Calif., whose students took the Smarter Balanced test this year. “That’s a skill that a teacher needs to develop over time, and as a grader, I think you need to have that as well.”
But testing experts say standardized tests are designed to evaluate student work by adults who do not know the child.
“They don’t know how your kid behaved in class yesterday,” said Catherine McClellan, a former research director at the Educational Testing Service and now an independent consultant to school districts, state education agencies and foundations. “And that is in fact a good thing because they will make a neutral, impartial judgment” based on scoring guidelines and training.
Still, the new tests are much more complicated and nuanced than previous exams and require more from the scorers, said James W. Pellegrino, a professor of psychology at the University of Illinois at Chicago who serves on advisory boards for Parcc and Smarter Balanced.
“You’re asking people still, even with the best of rubrics and evidence and training, to make judgments about complex forms of cognition,” Mr. Pellegrino said. “The more we go towards the kinds of interesting thinking and problems and situations that tend to be more about open-ended answers, the harder it is to get objective agreement in scoring.”