Growing Pains

Adelina Alegria and Candace Kelly-Hodge assess how the new Race to the Top District (RTT-D) competition over-relies on student growth measures

Recently, the U.S. Department of Education began inviting applications for the Race to the Top District (RTT-D) competition. Many educators are aware that RTTT requires evaluations of teachers and principals and, in this new district competition, local education authorities (LEAs) must design an evaluation of the superintendent. It must be “...a rigorous, transparent, and fair annual evaluation... that provides an assessment of performance and encourages professional growth. This evaluation must reflect: (1) the feedback of many stakeholders, including but not limited to educators, principals, and parents; and (2) student outcomes.” The evaluation of the school board, on the other hand, was defined in the executive notice but did not appear in the final notice.

As with the other RTTT competitions, the most hotly contested issue is the use of student test scores in the evaluation of teachers and principals. Districts that enroll a large percentage of language minority students face particular hurdles in competing for their share of the 383 million RTT-D dollars.

Since February 2009, Washington has spent $90.5 billion on education. According to Track the Money, the three largest areas of spending (in billions) have been student aid ($16.4), special education ($12.1), and the Elementary and Secondary Education Act (ESEA) Compensation for the Disadvantaged ($11).

The lion’s share of ESEA funding was allotted to Race to the Top (RTTT) Phase One (2009-10) and Phase Two (2010-11). Together, they received $4.8 billion. Phase Three (2011-12) also added early childhood education program options. By definition, RTTT funds are for five-year discretionary and competitive grant programs used to encourage and reward innovation and reform. They were first offered to State Educational Agencies (SEAs) across the nation over three consecutive years between 2009 and 2011. In fall 2012, districts will have the opportunity to apply for new RTT grants. This is the first time the grants are offered to individual districts, and the awards will be up to $40 million, with a total of $383 million available. Districts with 2,000 to 5,000 participating students may apply for $5-10 million, and districts with more than 25,000 participating students may apply for $30-40 million.

There are now 19 states implementing RTT programs, which are required to address four educational reform areas: (a) adopting rigorous college- and career-ready standards and high-quality assessments; (b) establishing data systems and using data for improvement; (c) increasing teacher effectiveness and equitable distribution of effective teachers; and (d) turning around the lowest-performing schools (Federal Register, Vol. 75, No. 71, April 2010).

In addition, the RTTT grant criteria specifically require value-added models (VAMs), using student growth measures as a significant factor in the evaluation of teacher and principal effectiveness. The term value-added refers to the measure of teacher and school effects. “Because these methods attempt to estimate how much teachers or schools add to the achievement of entering students, they are generally called ‘value-added’ methods, using a term from the economic production function literature” (McCaffrey et al., 2003, p.2).

A number of school districts in each of the 19 states are now committed to using VAMs as part of the RTTT grant application procedure. States have been charged with implementing a statewide longitudinal data system specifically defined by the America COMPETES Act. The section concerning P-16 data systems required (among other items) that they produce yearly test records on all students; information on students not tested; and a teacher identifier system with the ability to match teachers to students. In this way, RTTT funding in 19 states has increased the reliance on VAMs in teacher and principal evaluation systems.

RTTT grant application criteria specify that student growth measures be a significant portion of the total evaluation. Student growth is defined as “the change in student achievement for an individual student between two or more points in time. A state may also include other measures that are rigorous and comparable across classrooms” (National Archives and Records Administration, 2010). Furthermore, the Department of Education defined the use of the evaluation system more narrowly within the Teacher Incentive Fund grant application criteria. This grant requires that the school district develop and implement a performance-based compensation system for teachers, principals, and other personnel in high-needs schools. Further statements included, “The district wide educator evaluator system is for all teachers and principals and is based on student growth, multiple observations, and other factors. The system provides overall evaluation ratings with at least three performance levels that is used to inform human capitol [sic] decisions and professional development” (“General Teacher Incentive Grant Competition Webinar,” 2012). Of the three performance levels, the mid-level is defined as:

“Effective teacher means a teacher whose students achieve acceptable rates (e.g., at least one grade level in an academic year) of student growth (as defined in this notice). States, LEAs, or schools must include multiple measures, provided that teacher effectiveness is evaluated, in significant part, by student growth (as defined in this notice). Supplemental measures may include, for example, multiple observation-based assessments of teacher performance.” (National Archives and Records Administration, Federal Register, p. 19499)

Although RTTT put into place teacher and principal evaluation systems with nearly $5 billion in funding, there is strong and ever-mounting evidence from educational researchers that value-added models do not represent the entirety of teaching practices that affect student performance (Hill, Charalambous, & Kraft, 2012). This is especially relevant to classrooms where a high proportion of students are learning English as a new language, are raised in poverty, or are students with special needs (McCaffrey et al., 2003; Newton, Darling-Hammond, Haertel & Thomas, 2010). VAMs are notoriously “noisy” measures of teacher or principal effectiveness at best, and are not recommended for use in high-stakes decisions about employment such as retention, promotion, or tenure (Papay, 2012).

Furthermore, recent research in urban schools found that teachers believe that VAM systems are neither reliable nor valid because these systems do not take into account the students’ background (poverty, urban school settings, transiency, truancy, non-schooled parents) and characteristics (English learners, emotional and psychological issues and concerns) (Alegria & Kelly-Hodge, in press). The available literature fully supports these teachers’ beliefs. For example, according to Strauss (2011), most of these student factors are not actually measured in value-added models, and the teacher’s effort and skill, while important, constitute a relatively small part of this complex equation. In addition, an article co-authored by Darling-Hammond states that:

“A teacher who teaches less advantaged students in a given course or year typically receives lower effectiveness ratings than the same teacher teaching more advantaged students in a different course or year. Models that fail to take student demographics into account further disadvantage teachers serving large numbers of low-income, limited English proficient, or lower-tracked students.” (Newton et al., 2010, p. 1)

Despite the clear caution against the use of VAMs in teacher and principal evaluation, 19 states are currently jump-starting school districts to use student growth measures as a significant proportion of their evaluation system for all credentialed educators, including teachers, teacher specialists, and principals. However, there is one notable caveat. The RTTT grant application criteria specify that student growth measures be used as a significant portion or as a significant part of the evaluation of teachers and principals but do not specify what “significant” means.

The term “significant part” requires a rational and prudent interpretation. However, the fact is that a number of RTTT grant applicants were implicitly given authority to set the definition standard by virtue of winning the grant competition. For example, the winning RTTT proposals submitted by Rhode Island and Massachusetts set the definition standard at 51%. Many applicants, therefore, interpret the criteria of being “a significant part of the evaluation system” as being 51% or greater.

A majority measure of 51% is far too simplistic. A significant part or a significant factor should be considered within its context and purpose. In teacher/principal evaluation, a fair and prudent definition of a significant part is one that is large enough to impact the outcome yet judicious and reasonable considering its use in high-stakes decisions about teacher/principal effectiveness. Therefore, it may be advisable to reduce the proportion of the student growth measure to as low as 20% of the total evaluation system. This reduced proportion will meet the government criteria of being significant but will reduce the stakes around the use of test scores in high-stakes personnel decisions. The smaller proportion is necessary based on the review of the literature on VAMs.
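The arithmetic behind this recommendation can be sketched in a few lines. The example below is illustrative only: it assumes the overall rating is a simple weighted average of a growth score and all other measures on a 0-100 scale, and the function names and the 20-point swing are hypothetical values chosen for the illustration, not figures from any state's evaluation system.

```python
def composite_score(growth_score, other_score, growth_weight):
    """Weighted average of a growth score and all other measures (0-100 scale).

    Assumption: the evaluation combines its parts as a plain weighted average.
    """
    return growth_weight * growth_score + (1.0 - growth_weight) * other_score


def composite_swing(growth_swing, growth_weight):
    """Change in the overall rating caused by a change in the growth score alone."""
    return growth_weight * growth_swing


if __name__ == "__main__":
    # Hypothetical: the growth score moves 20 points from one year to the next
    # for reasons unrelated to the teacher (the "noise" described above).
    hypothetical_swing = 20.0
    for weight in (0.51, 0.20):
        moved = composite_swing(hypothetical_swing, weight)
        print(f"growth weight {weight:.0%}: overall rating moves {moved:.1f} points")
```

Under these assumptions, the same 20-point fluctuation in the growth score shifts the overall rating by about 10 points when growth counts for 51% and by 4 points when it counts for 20%, which is the sense in which a smaller proportion lowers the stakes attached to a noisy measure.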

In addition to these problems related to VAMs, letters to Arne Duncan from the National Education Association and from James Crawford, a contributor to Language Magazine and a leading expert on English Learners (Zehr, 2010), have opposed RTTT measures because of the unfair use of test scores and a greater need to advocate for the needs of ELLs. Other education leaders have expressed concern that many states were not very specific about how the needs of English Learners would be addressed (Zehr, 2010). Furthermore, it was noted that in RTTT Phases One and Two, the winning states (and DC) “...have a total of nearly 873,000 English-language learners according to their reports to the federal government. Mr. Rice [from the Multicultural Education, Training and Advocacy group] estimates that’s only 16% of all the ELLs in the country” (Zehr, 2010).

Inadequate attention to English learners is a common complaint. In addition, advocates of schooling for democracy and social justice pedagogy decry the discourse of the initiative.

Thus we argue that “college for all” (just like “no child left behind” and the “race to the top”) functions as an ideological velvet to soften educational policy talk, talk that actually carries big sticks that punish the very students proclaimed to be the beneficiaries of the proposed changes in schooling (Glass & Nygreen, 2011).

In addition to carrying a big stick, RTTT conveys a race where there are more losers than winners, and where the rush to the top is only for a few (Glass & Nygreen, 2011). In sum, RTT is criticized for relying on tenuous student test scores in high-stakes personnel decisions, for offering little direction in meeting the needs of ELLs, and for lacking support for our most vulnerable students. Yet despite these glaring criticisms, as a nation we face an educational crisis, and funding is sorely needed. Therefore, in order to embrace the upcoming RTT-D grant competition, districts should:

1. Determine that they can do better with $40 million than without it;

2. Hold steadfast to the needs of English Learners in the program design by consulting with experts in bilingual education;

3. Consider that RTTT is a competition for districts and not between districts; and

4. Make a paradigm shift to a smaller percentage of student growth measures in VAM systems.

These recommendations are intended to encourage a clearer focus on what is meaningful reform for our most vulnerable students during the most desperate of times. A more thoughtful approach in the use of student growth measures is necessary. RTT-D is definitely a race, and an opportunity to design a personalized program by LEAs or by a group of LEAs. Districts have the ability to design the programs to meet English Learners’ academic needs. This is not a top-down, state-mandated reform approach but a means to fund, win, or better yet, earn a place at the top.

Alegria, A. & Kelly-Hodge, C. (in press). Value added models in the evaluation of teachers: perspectives from urban middle school teachers.

“General Teacher Incentive Grant Competition Webinar.” (2012). CFDA Number 84.374A. Retrieved 6-12.

Glass, R. D. & Nygreen, K. (2011). “Class, race, and the discourse of ‘College for All.’” A response to “Schooling for Democracy” in Democracy & Education Vol. 19 (1). Retrieved August 2012.

Hill, H. C., Charalambous, C. Y. & Kraft, M. A. (2012). “When rater reliability is not enough: Teacher observation systems and a case for the generalizability study.” Educational Researcher 41(2), pp. 56-64.

McCaffrey, D. F., Lockwood, J.R., Koretz, D. M. & Hamilton, L. S. (2003). Evaluating value-added models for teacher accountability. Santa Monica, CA: RAND Corporation.

Newton, X. A., Darling-Hammond, L., Haertel, E. & Thomas, E. (2010). Value-added modeling of teacher effectiveness: An exploration of stability across models and contexts. Education Policy Analysis Archives Vol. 18 (23), Arizona State University e-journal, ISSN 1068-2341.

National Archives and Records Administration. Federal Register, Race to the Top Funding Notice; Notice Inviting Applications for the New Awards for Fiscal Year (FY) 2010 Vol. 75(71) April 14, 2010 Notices.

Papay, J. P. (2012). “Different tests, different answers: The stability of teacher value-added estimates across outcome measures.” American Educational Research Journal 48(1), pp. 163-193.

Track The Money, American Recovery and Reinvestment Act (ARRA). Retrieved 7-12.

Sargent, J. F. Jr. (2010). America COMPETES Act and the FY2010 Budget. Congressional Research Service, Report for Congress. Retrieved 7-12.

Strauss, V. (2011). “Getting teacher evaluation right.” Washington Post, posted at 03:35 PM ET, 09/15/2011. Retrieved 7-26.

Zehr, M. A. (2010). “Letter to Arne Duncan: Race to the Top Is Unfair to Teachers of ELLs” and “Groups say ELLs got Short Shrift in Race to the Top.” EdWeek. Retrieved 8-12.

Candace Kelly-Hodge, Ph.D., is an Assistant Professor, Adjunct at the University of Southern California. She served as a Peer Reviewer for RTT Phase Two.

Adelina Alegria, Ph.D., is an Assistant Professor at Occidental College. She has authored NSF and GEAR UP federal grants.