Connecticut’s Performance Evaluation Advisory Council (PEAC) met last week to discuss a response to data showing that teacher evaluation systems identify very few teachers for dismissal and assign high ratings to most, a pattern that has been reported in many states across the country over the last five years. This shouldn’t be a surprise, because many states are using similar tools for teacher evaluation: a state-specific version of Danielson’s Framework for Teaching (here dubbed the Common Core of Teaching, CCT) or another generic teaching rubric applied to teachers regardless of grade or subject area.
When we use the same blunt tools, we can expect the same nonspecific results.
As other states confront this “problem” of too many high-scoring teachers, some blame the evaluators and invest thousands of dollars in tedious retraining meant to increase the accuracy of ratings. Others blame teachers for gaming the system: learning the rubrics so well that they can go through the motions to earn high scores without actually improving their teaching. These states often want to revise their rubrics or fiddle with the formulas for calculating ratings, as we may do in Connecticut.
Instead of treating this as a problem with people or formulas, I see it as evidence that it is time for the next generation of tools for observing teaching. Over the past few years, scores on the CCT have told us two things. First, we do not have a great mass of “ineffectives” lurking in our schools without our knowledge; instead, we have a lot of teachers doing mostly good, generic things when we go looking for them. Second, we need more specific, more ambitious tools to differentiate and support our mostly good teachers.
Ratings are undifferentiated, and Connecticut teachers report little confidence that evaluations will improve teaching, because the tools used to evaluate classroom instruction are neither specific nor meaningful. In the interest of fairness, all teachers are judged by the same indicators within the CCT. In the interest of sameness, those indicators are not very specific. With such a blunt instrument for identifying effective instruction, we should expect most people to be rated about the same, no matter how well we train evaluators or how familiar teachers become with the tool.
The MET Project, which hurled the Framework for Teaching into the spotlight as the most obvious choice for state policies, also examined two subject-specific tools: the Protocol for Language Arts Teaching Observation (PLATO) and the Mathematical Quality of Instruction instrument (MQI). It found that, although most teachers in the large-scale national project scored reasonably well on the Framework, fewer than 1 percent earned the highest rating on the MQI and no one earned the highest ratings across all indicators on PLATO, though about 25 percent of teachers earned a high rating on at least one PLATO indicator.
These tools are not perfect, but they illustrate something important: when we use subject-specific tools that capture aspects of instruction unique to each discipline, we see a wider distribution of performance and may be able to generate feedback aimed at developing the specific craft, not the general appearance, of ambitious teaching across grades and content areas.
Now that we know we have mostly good teachers, we need to focus on supporting truly ambitious teaching. Not teaching that merely produces functional, orderly classrooms like those outlined in the CCT, but teaching that engages the student who came in with no confidence or interest in math; teaching that supports the language development of students who need to learn English while they also learn chemistry; teaching that sparks dramatic reading growth even among students who carry long histories of difficulty learning to read.
The CCT was a good place to start, given what was available and given the growing suspicion that a few profoundly ineffective teachers were out there, quietly ruining averages. When PEAC first began meeting, most subject-specific tools had not been validated and were not commercially available. Even now, we have tools only for Reading/Language Arts, Math, and Science, and still need to develop tools for the remaining grades and subject areas.
Rather than retraining evaluators, or changing formulas for what counts as effective, we might invest in developing the kinds of tools that can identify and support ambitious teaching across grades and content areas. This work will not be quick, easy or inexpensive, but it is the right work to be doing if we are serious about using evaluations to support excellent teaching in Connecticut schools.