The Early Grades are Different: A Look at Classroom Observations

This is part two of a four-part blog series on teacher evaluation in the early grades.

Sep 28, 2016

Reed DesRosiers, Abbie Lieberman

Picture yourself as an observer
conducting a teacher evaluation, tasked with deciding where teachers fall on a
scale of “ineffective” to “highly-effective,” potentially affecting their pay
or job security. You walk into a science lesson on the conservation of mass in
a fifth grade classroom. The desks are in rows and the students are listening
to their teacher at the front of the room. She asks them to predict whether the
ice cube on her desk will maintain its mass when it melts. Using what they
learned in the previous night’s reading, she asks them to explain their
predictions in their journals. After 10 minutes, she asks the students to share
with the person next to them and then selects one student to read his answer
aloud. By this time, the ice has melted and the students can see that the mass
has remained the same.

Next you observe a science lesson
in a kindergarten classroom, where the students are learning about the
properties of different materials. With the children seated in a circle on the
floor, the teacher reads Captain Kidd’s Crew Experiments with Sinking and Floating. Afterwards, she asks the students why they think
some objects sink and others float. She writes their ideas down on the
whiteboard. Then she pulls out a few everyday items from a bag and asks the
class to predict whether they will sink or float. After guessing together, the
students return to their tables, which have been equipped with similar items
and tubs of water. Working in small groups, they test their ideas out for
themselves. Using a graphic organizer, they draw pictures of the items that
sink under the “Sink” heading and the items that float under the “Float”
heading. The teacher walks around checking in with
each small group, asking probing questions.

Instruction in these two classrooms
looks very different. As the observer, do you know whether it was good practice
to have fifth graders write in journals rather than share with the whole class?
Or whether there was the right balance between whole group instruction and
student-centered learning in the kindergarten classroom?

Classroom
observation allows principals or external observers to see teachers in
action and offer feedback that can help them improve their practice. But
high-quality teaching should look different from one grade to the next,
especially in the early years. Notice how the lesson plans, classroom
environments, and role of the teacher differ in these grades. To effectively
promote high-quality teaching across all grade levels, evaluators need a keen
understanding of these differences.

As teacher evaluation systems have
been changing in recent years, many states and districts have updated their
frameworks for observing teachers. However, many states continue to use one
general framework across all grades, even though observation tools are often
created with a certain age group in mind, and using them to evaluate teachers
instructing different grades can be confusing or even unfair. For instance,
some rubrics used for observing teachers in K–12 might be inconsistent with
best practices in the early grades, or fail to clarify how to identify certain
measures in classrooms where instruction looks different.

Lisa Guernsey and Susan Ochshorn’s
2011 paper, Watching Teachers Work: Using Observation Tools to
Promote Effective Teaching in the Early Years and Early Grades, examines
the importance of classroom observation as a tool to identify, promote and
reward good teaching. While observations are increasingly likely to inform
personnel decisions, they should also play a prominent role in helping teachers
understand the parts of their practice that are most beneficial to children,
and the parts that they can change to be more effective. As Guernsey and
Ochshorn explain, “professional development and formal evaluations will need to
go hand-in-hand, with data from observations bridging the two.”

In the early grades, a high-quality
observation tool should emphasize the importance of certain types of
interactions and teaching strategies that help students to gain academic skills
in areas like language, literacy, and math, and to develop social-emotional
skills. Teaching in these years should be hands-on: young children should be
engaged, teachers should be responsive and encourage children to build on their
interests, and adults in the classroom should demonstrate an understanding of
child development and learning.

Observation tools designed
specifically for pre-K classrooms usually acknowledge this. For instance, Head Start and many other pre-K programs use tools like the Classroom Assessment Scoring System (CLASS) to observe teachers. CLASS measures interactions related to emotional climate, classroom
organization, and instructional support. Most state Quality
Rating and Improvement Systems require
the use of an observation tool like CLASS. However, in pre-K and child care
centers, especially those outside of the public school system, these tools are
usually used to measure overall program quality rather than to formally
evaluate teachers. These types of tools are appropriate for measuring quality
teaching, but they rarely meet state requirements for teacher evaluation. As
pre-K is more and more commonly folded into the public school system, states
and school districts need to ensure that their observation tools can accurately
evaluate quality instruction in pre-K and early grade classrooms.

Several states and districts, such
as Illinois and Washington, DC, recognize that using one classroom observation
model for all grades and subject areas may be an ineffective or unfair way to
evaluate teachers of younger children, specifically those in kindergarten
through third grade, and pre-K when included. As such, they have developed
separate rubrics, guidelines, or methods to better evaluate early educators.
The lessons they have learned may help states that have not yet acknowledged
the differences between evaluating teachers of young children and older
students.

Illinois

Illinois has taken significant
steps to ensure that early education teachers are evaluated on the practices
that are best for young children. The state encourages districts to select one
evaluation rubric for all staff, but acknowledges that teaching and learning in
the early grades may require a different kind of tool.

The state is one of many that have approved the use of the Charlotte
Danielson Framework
for Teaching for teacher observations across grade levels, including
pre-K. This framework, like many others, was created for use beginning in the
upper grades of elementary school, raising concerns about how well the tool adapts
to the early grades. To figure this out, researchers at the Center for the
Study of Education Policy at Illinois State University (CSEP) conducted a
validation study of the Danielson framework to determine if it is valid and
reliable in the early grades.

CSEP spent the first year taking an
in-depth look at the content of the Danielson framework to determine if it was
aligned with what research says is important for children in pre-K through
third grade. When comparing it to NAEYC’s Standards for Professional Preparation Programs,
CLASS, and the Head Start standards, they found that overall,
it aligns with developmentally appropriate practice. Unsurprisingly, Danielson
is more academic than the early childhood-specific frameworks and has less
emphasis on family engagement. According to Lisa Hood, director of the study,
“this doesn’t mean Danielson can’t be used to evaluate social-emotional
interactions and family engagement, it just needs to be more intentional.”

Twenty-six teachers (14 pre-K
teachers and 12 K–3rd grade teachers) in seven districts with a
total of 620 students (50 percent in pre-K) participated in CSEP’s validation study. To test the framework’s
inter-rater reliability, the researchers paired internal observers
(principals/center directors) with trained external observers and compared
their classroom observation ratings on 17 components. A comparison of the
ratings showed an inter-rater reliability average of 67 percent, with agreement
between the internal and external observers ranging as low as 42 percent in one
component to as high as 92 percent in another. Internal observers tended to
rate teachers higher than external observers on several components.
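To make the arithmetic behind these figures concrete, here is a minimal sketch of one common way to compute inter-rater agreement: the share of paired observations in which both observers gave the same rating, calculated per component and then averaged. The post does not say which formula CSEP used, so exact-match percent agreement is an assumption here, and the function names and sample ratings are hypothetical, not data from the study.

```python
def percent_agreement(paired_ratings):
    """Return {component: percent of pairs where both observers gave the same rating}.

    paired_ratings is a list of (component, internal_rating, external_rating)
    tuples. Exact-match agreement is an assumption, not CSEP's stated method.
    """
    totals, matches = {}, {}
    for component, internal, external in paired_ratings:
        totals[component] = totals.get(component, 0) + 1
        matches[component] = matches.get(component, 0) + (internal == external)
    return {c: 100.0 * matches[c] / totals[c] for c in totals}


def average_agreement(by_component):
    """Unweighted mean of the per-component agreement percentages."""
    return sum(by_component.values()) / len(by_component)


# Hypothetical ratings, invented for illustration only.
sample = [
    ("respect and rapport", "proficient", "basic"),
    ("respect and rapport", "proficient", "proficient"),
    ("using assessment", "distinguished", "proficient"),
    ("using assessment", "basic", "basic"),
    ("using assessment", "proficient", "proficient"),
    ("using assessment", "proficient", "proficient"),
]

by_component = percent_agreement(sample)
overall = average_agreement(by_component)
print(by_component)  # {'respect and rapport': 50.0, 'using assessment': 75.0}
print(overall)       # 62.5
```

Simple percent agreement does not correct for agreement that would occur by chance, which is one reason some studies report other statistics instead.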

Based on these findings, CSEP is
developing resources for the areas of the framework where inter-rater
reliability was weakest, such as on using assessment, setting instructional
outcomes, and more abstract concepts, like “developing respect and rapport” or
creating a “culture of learning.” In June, its team embarked on a three-year
project to develop videos showing best practices for how pre-K and kindergarten
teachers and their principals can navigate the observation tool and evaluation
process.

Early grade teachers and evaluators
have access to extensive
documents created by a group of early childhood stakeholders
that outline multiple examples of what each component of the Danielson
framework might look like in the early years. Hood says CSEP has received
positive anecdotal feedback about the examples, but it has not collected
systematic feedback on whether principals actually go back to their schools and
use the tool. Principals also have access to trainings provided by the Illinois Principals
Association and guidance created by the Illinois State Board
of Education’s Performance Evaluation Advisory Council (PEAC) around PreK–3rd
grade evaluation.

While CSEP found that most teachers
earn a “proficient” rating (the performance levels are unsatisfactory, basic, proficient, and distinguished) under Danielson, it is possible for the tool to
differentiate early childhood educator performance. According to Hood, most of
the challenges with Danielson in the early grades are “user-oriented issues,
instead of with the framework itself. When used well and when people have
strong understanding of early childhood practice, Danielson works well. When
they don’t have this background, that’s when there’s an issue,” she says.

District of Columbia Public Schools

District of Columbia Public Schools
(DCPS), which has been an often controversial pioneer in teacher evaluation
reform, uses IMPACT, a self-created teacher evaluation
system. IMPACT has been around since 2009, but a separate rubric for
pre-K and kindergarten teachers, one that more accurately reflects developmentally informed
practice, was created in 2011. The preK–K
rubric was updated this year through a collaborative process that involved
content area experts weighing in to ensure that it is appropriate for the
youngest learners. The rubric includes the same broad practices as those of the
older grades, but differs in the way that it describes their implementation.

The rubric for grades
1–12 focuses on what observers should see students doing, whereas the preK–K
rubric is more focused on whether the teacher is creating the conditions to
make learning possible. Accordingly, the early childhood rubric looks more at
teacher actions instead of independent student actions. As depicted in the
examples below, the preK–K rubric evaluates teachers based on how well they encourage students to take certain
actions or behave in certain ways, whereas the 1–12 rubric rates teachers for
how well students take certain actions independently.
The early childhood rubric mentions the importance of learning environments and
how teachers can encourage meaningful work and play, rather than just work. The
grades 1–12 rubric makes no mention of learning environments or play. Furthermore,
it gives specific guidance that observers should “consider students’
developmental age when assessing” certain practices.

According to Stephanie Shultz, who
works on IMPACT, the preK–K rubric aligns with the “context and structures you
are most likely to see with young learners, such as station-driven learning,
play, morning meeting, etc.” It also emphasizes language development, which is
a crucial component of learning at this age.

The stakes are high for DCPS
teachers: the observations make up a majority of evaluation scores for all preK–12
teachers. This year, for the first time, school principals will be
the only ones using the rubrics to evaluate teachers; in the past, external
observers (“master educators”) have played a prominent role. Hiring principals
who are instructional leaders is a priority for the district and all principals
receive extensive training and support from the IMPACT team to become familiar
with the tool. It’s important that this training enables principals to recognize how high-quality
instruction in a kindergarten class differs from that in a fourth-grade class.

Shultz says DCPS has “received a
lot of appreciation from early childhood teachers, who say they ‘see themselves
in the rubric’ and appreciate the distinction between this rubric and the one
used with other students.” The district should consider extending the preK–K
rubric into the first through third grades to reflect the full continuum of
early childhood.

Teacher evaluation systems should reward good teaching and promote
improvements in practice. Teaching young children requires different skills and
strategies than teaching older children, and the best observation tools
acknowledge those differences. It can be difficult for a single tool to meet
the needs of a teacher who is reading stories about floating boats and another
who is teaching the law of conservation of mass, but specific guidance for
teachers and observers on how standards and rubrics can be tailored for the
early grades is one way to help ensure that evaluations accurately capture the
quality of teaching.

More About the Authors

Reed DesRosiers

Intern

Abbie Lieberman

Senior Policy Analyst, Early & Elementary Education
