A better way to grade teachers
Effective evaluation requires rigorous, ongoing assessment by experts who review teachers' instruction, looking at classroom practice and evidence of student learning.
A smarter way to grade schools
Unlike in the rest of the U.S., California's SB 1458 rightly counts student test results as just a portion of the API school rating formula. How the rest will be determined is the question.
LA Times Op-Ed by Linda Darling-Hammond and Edward Haertel | http://lat.ms/Py1cri
L.A. school district's Academic Growth over Time system uses complex statistical metrics to try to sort out the effects of student characteristics (such as socioeconomic status) from the effects of teachers on test scores. (Illustration by Peter Hoey / For The Times / November 5, 2012)
November 5, 2012 :: It's becoming a familiar story: Great teachers get low scores from "value-added" teacher evaluation models. Newspapers across the country have published accounts of extraordinary teachers whose evaluations, based on their students' state test scores, seem completely out of sync with the reality of their practice. Los Angeles teachers have figured prominently in these reports.
Researchers are not surprised by these stories, because dozens of studies have documented the serious flaws in these ratings, which are increasingly used to evaluate teachers' effectiveness. The ratings are based on value-added models such as the L.A. school district's Academic Growth over Time system, which uses complex statistical metrics to try to sort out the effects of student characteristics (such as socioeconomic status) from the effects of teachers on test scores. A study we conducted at Stanford University showed what these teachers are experiencing.
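To make concrete what a value-added model computes, here is a toy sketch with invented data. It is not the district's actual (and far more complex) AGT formula; it simply shows the two-step logic the article describes: regress each student's score on student-level controls, then treat a teacher's average residual as that teacher's value-added estimate.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical students: prior test score, a low-income flag, a teacher id.
n = 300
prior = rng.normal(50, 10, n)
low_income = rng.integers(0, 2, n)
teacher = rng.integers(0, 10, n)

# Invented "true" scores: prior achievement and poverty matter, plus a
# per-teacher effect and classroom noise.
true_teacher_effect = rng.normal(0, 2, 10)
score = (5 + 0.9 * prior - 3 * low_income
         + true_teacher_effect[teacher] + rng.normal(0, 8, n))

# Step 1: regress current score on the student controls only.
X = np.column_stack([np.ones(n), prior, low_income])
beta, *_ = np.linalg.lstsq(X, score, rcond=None)
residual = score - X @ beta

# Step 2: a teacher's value-added estimate is the mean residual
# of that teacher's students.
value_added = np.array([residual[teacher == t].mean() for t in range(10)])
print(value_added.round(2))
```

The critique in the article is that step 1 never fully removes the student-composition effects, so the step-2 averages still partly reflect who was in the room rather than how well the teacher taught.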
First, we found that value-added models of teacher effectiveness are highly unstable. Teachers' ratings differ substantially from class to class and from year to year, as well as from one test to the next. For example, teachers who rank at the bottom one year are more likely to rank above average the following year than to rank poorly again. The same kind of wild swings hold true for teachers at the top. If the scores were trustworthy measures of a teacher's ability, this would not occur.
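One way to see how such swings can arise: when year-to-year noise (small class samples, peer effects, test measurement error) is large relative to real differences between teachers, a bottom-ranked teacher's next score regresses toward the middle. A toy simulation with invented numbers, not a model of any real district's data:

```python
import numpy as np

rng = np.random.default_rng(1)

# 2,000 hypothetical teachers whose underlying quality is perfectly stable,
# but whose yearly rating adds independent class-level noise.
n = 2000
true_effect = rng.normal(0, 1, n)          # stable underlying quality
year1 = true_effect + rng.normal(0, 2, n)  # noisy year-1 rating
year2 = true_effect + rng.normal(0, 2, n)  # independent year-2 noise

bottom = np.argsort(year1)[: n // 5]       # year-1 bottom fifth
above_avg = (year2[bottom] > np.median(year2)).mean()
repeat_bottom = (year2[bottom] < np.quantile(year2, 0.2)).mean()

print(f"above the median next year: {above_avg:.0%}; "
      f"bottom fifth again: {repeat_bottom:.0%}")
```

With noise twice as large as the spread of real quality, a teacher in the bottom fifth is more likely to land above the median the next year than to land in the bottom fifth again, even though nothing about any teacher changed.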
Second, teachers' value-added ratings are significantly affected by differences in the students who are assigned to them. Even when statistical models try to control for student-demographic variables, teachers are advantaged or disadvantaged based on the students they teach. Contrary to proponents' claims, these models reward or penalize teachers according to where they teach and what students they teach, not just how well they teach.
We found that a teacher receives a higher value-added score when teaching students who are already higher-achieving, more affluent and more versed in English than when assigned large numbers of new English learners and students with fewer educational advantages. In fact, when we looked at high school teachers who teach different classes, the student composition of the class was a much stronger predictor of the teacher's value-added score than the teacher was. This makes sense: With a classroom full of more-advantaged students, teachers can move faster and cover more material, something the statistical models used for value-added ratings fail to capture.
Finally, value-added ratings cannot disentangle the many home, school and student factors that influence learning gains. These matter more than the individual teacher in explaining changes in scores.
These findings have been replicated in many studies. As a result, most researchers have concluded that value-added scores should not be used in high-stakes evaluations of individual teachers. As the National Research Council, the country's leading research organization, concluded: "VAM estimates of teacher effectiveness … should not be used to make operational decisions because such estimates are far too unstable to be considered fair or reliable."
What is the alternative? Certainly we need teacher evaluation systems that identify both excellent and struggling teachers based on what they do and how their students learn. And we need systems that help teachers improve, target assistance where needed and remove teachers who cannot, with help, succeed in the classroom.
California's Educator Excellence Task Force recently released a report that outlines the most successful practices internationally. It illustrates that, as in other professions, good evaluation starts with rigorous, ongoing assessment by experts who review teachers' instruction based on professional standards. Evaluators look at classroom practice, plus evidence of student learning from a range of classroom work that includes (but is not limited to) school or district tests that directly connect with the curriculum and students.
Studies show that feedback from this kind of evaluation improves student achievement because it helps teachers get better at what they do. Systems that sponsor the effective Peer Assistance and Review program also identify poor teachers, provide them intensive help and remove them if they don't improve.
If we really want to improve teaching, we should look to develop such models of effective evaluation rather than pursuing problematic schemes that mis-measure teachers, create disincentives for teaching high-need students, offer no useful feedback on how to improve teaching practice and risk driving some of the best educators out of the profession.
LA Times Editorial | http://lat.ms/WLD2Or
California's SB 1458 counts just a portion of student test results toward the API school rating formula. (Anthony Russo / For The Times / November 8, 2012)
November 9, 2012 :: While the Obama administration is putting increased emphasis on standardized tests to measure teachers and schools, California is moving in the other direction. A new law will limit how heavily the annual standards exams can count toward a school's score on the state's Academic Performance Index.
We think California has the better approach.
Though the tests, which measure whether students are at grade level in various academic subjects, have value as an objective measurement of student progress, they were never intended to become the sole criterion by which good education is measured, and they shouldn't be. In too many classrooms, the result has been a creativity-stifling tendency to drill students for the multiple-choice tests.
Up to now, state officials have aggregated each school's test results into a simplified API number that can range from 200 to 1,000, with a score of 800 meaning that the school has met the state's target for students' proficiency. But under the law signed this year by Gov. Jerry Brown, the test results will count for only 60% of the API score starting in the 2015-16 school year.
That's fine, but what about the other 40%? The problem with SB 1458, by Senate President Pro Tem Darrell Steinberg (D-Sacramento), is that it leaves it to the California schools superintendent and the Board of Education to figure that out. The law is expected to accomplish one important goal: Finally, the state will have to include dropout rates in the API calculation, something it was supposed to do all along but ignored.
As for the rest, no one knows. The governor is responsible for the vagueness; he vetoed a better, more detailed bill last year that called for rating schools based on whether they offered an enriched curriculum and prepared students for college and employment. At the time, he raised valid concerns about how a love of learning, as well as deeper analytical and writing skills, had been dropped from the education equation. Perhaps panels of experts could measure these qualities through school visits, he suggested.
Perhaps so. But a worthwhile, on-the-ground examination of a school's quality is a time-consuming, very expensive proposition, and schools are struggling as it is. There already are too many well-intentioned reforms that have become meaningless because of the shortage of money to do them right.
As education officials ponder the API scores of the future, they should aim for a balanced approach to measuring schools that has real meaning and as much objectivity as possible — and can be carried out in the real, budget-challenged world.
smf: Does this mean that The Times may abandon the Sue the School District for the Test Scores under the Freedom of Information Act and Crunch the Data to Sell Newspapers and Advertising Methodology for reviewing teacher performance?