Teachers, Schools, and Student Performance

NBER Reporter 2016 Number 4: Research Summary

Brian A. Jacob

    Brian A. Jacob is the Walter H. Annenberg Professor of Education Policy and professor of economics in the University of Michigan's Gerald R. Ford School of Public Policy. His primary fields of interest are labor economics, program evaluation, and the economics of education.

    Jacob's research on education covers a wide variety of topics, from school choice to teacher labor markets to standards and accountability. His work has appeared in leading economics journals, including the American Economic Review, the Quarterly Journal of Economics, and the Review of Economics and Statistics. Earlier in his career, he served as a policy analyst in the Office of the Mayor of New York City and taught middle school in East Harlem.

    Jacob is a research associate in the NBER's Program on Children and Program on Education, and a member of the editorial boards of the American Economic Journal: Applied Economics, Education Finance and Policy, and the Review of Economics and Statistics. He received his B.A. from Harvard College and his Ph.D. from the University of Chicago. In 2008 he was awarded the Association for Public Policy Analysis & Management's David N. Kershaw Prize for distinguished contributions to public policy and management by an individual under the age of 40.

    Economists have long realized the importance of education for the well-being of individuals and the productivity of society. Over the past few decades, the economic returns to education have risen dramatically, increasing the importance of this issue. Yet researchers have made only limited progress in understanding how various policies can influence educational outcomes. My research in education economics has focused on three areas: standards and accountability, teacher policies, and measurement of individual ability.

Standards and Accountability

    One approach to school reform involves holding schools accountable for student performance. In 2002, President Bush signed the No Child Left Behind Act (NCLB), which dramatically expanded federal influence over the nation's public schools. NCLB is arguably the most far-reaching education policy initiative in the past four decades. The legislation compelled states to conduct annual student assessments, calculate and report the fraction of students deemed at least proficient in key subjects, and institute an increasingly severe set of sanctions for schools that did not show sufficient progress toward having all students proficient.

    In a series of papers, Thomas Dee and I study how NCLB affects school practices and student outcomes. We identify the impact of NCLB by comparing changes across states that already had school accountability policies in place prior to NCLB and those that did not. To examine student achievement, we utilize a state-year panel of student achievement scores from the National Assessment of Educational Progress (NAEP), a common metric that was low-stakes for schools.¹ Our results indicate that NCLB generated substantial increases in the average math performance of elementary students [Figure 1]. Moreover, we find evidence of improvement at both the top and bottom of the performance distribution, suggesting that the benefits were not limited to students near the proficiency threshold. There is also evidence of improvements in eighth-grade math achievement, particularly among traditionally low-achieving groups and at the lower percentiles. In contrast, we find no evidence of any effects on reading performance.
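
    The comparison described above can be sketched as a simple difference-in-differences calculation. This is a minimal illustration under stated assumptions, not the specification estimated in the papers: the scores are made-up numbers rather than NAEP data, and the names `scores`, `group_mean`, and `diff_in_diff` are hypothetical.

```python
from statistics import mean

# Hypothetical state-level average scores (made-up numbers, NOT NAEP data).
# "no_prior": states without school accountability before NCLB (newly exposed).
# "prior":    states that already had accountability (comparison group).
scores = [
    ("no_prior", "pre", 230.0),
    ("no_prior", "pre", 232.0),
    ("no_prior", "post", 238.0),
    ("no_prior", "post", 240.0),
    ("prior", "pre", 234.0),
    ("prior", "pre", 236.0),
    ("prior", "post", 237.0),
    ("prior", "post", 239.0),
]


def group_mean(records, group, period):
    """Average score for one group of states in one period."""
    return mean(s for g, p, s in records if g == group and p == period)


def diff_in_diff(records):
    """Change in newly exposed states minus change in comparison states."""
    change_exposed = (group_mean(records, "no_prior", "post")
                      - group_mean(records, "no_prior", "pre"))
    change_comparison = (group_mean(records, "prior", "post")
                         - group_mean(records, "prior", "pre"))
    return change_exposed - change_comparison


# With these made-up numbers: 8.0 - 3.0 = 5.0 points.
estimate = diff_in_diff(scores)
print(f"Illustrative difference-in-differences estimate: {estimate:.1f}")
```

    Subtracting the change in the comparison states removes any trend common to both groups, so the remaining difference is attributed to the policy only if the two groups would otherwise have moved in parallel.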

    We also use a similar design to examine the impact of NCLB on education policies and practices.² Our results indicate that NCLB increased per-pupil spending by nearly $600, which was funded primarily through increased state and local revenue. We find that NCLB increased teacher compensation and the share of elementary school teachers with advanced degrees but had no effect on class size. We also find that NCLB did not influence overall instructional time in core academic subjects, but did lead schools to reallocate time away from science and social studies and toward the tested subject of reading.

    As states have implemented school accountability systems, they have also raised standards. Since the 1970s, states have slowly increased high school graduation requirements. Recently, some have begun requiring students to pass rigorous college preparatory classes. Michigan was among the first states to do so when it began requiring students in the high school class of 2011 to pass geometry, algebra 2, biology, and chemistry/physics.

    My colleagues and I use several non-experimental strategies to study the impact of this policy.³ Our analyses suggest that the higher expectations embodied in the Michigan Merit Curriculum have had little impact on student outcomes. Looking at student performance on the ACT, the only clear evidence of a change in academic performance is in science. While our estimates for high school completion are sensitive to the sample and methodology, the weight of the evidence suggests that the policy had a small negative impact on high school graduation for students who entered high school with the weakest academic preparation.

The Teacher Labor Market

    A second area of my research focuses on teachers. A growing body of evidence finds that there is substantial variance in teacher effectiveness, but that very little of it can be explained by easily observable teacher characteristics such as certification or advanced degrees.⁴

    This naturally raises the question of whether school principals or district officials can distinguish between more and less effective teachers. Lars Lefgren and I surveyed elementary school principals and asked them to evaluate all of their teachers along a variety of dimensions.⁵ We then calculated value-added measures of teacher effectiveness, using standardized test scores as the outcome. When we compare these subjective and objective measures of teacher performance, we find that principals' assessments of teachers predict future student achievement significantly better than the traditional measures used for teacher compensation, such as educational credentials or prior experience. We find that principals are quite good at identifying those teachers who produce the largest and smallest test score gains in their schools, but have far less ability to distinguish among teachers in the middle.
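
    One common way to think about a value-added measure is as the average amount by which a teacher's students outperform the scores predicted from their prior achievement. The sketch below illustrates that idea under simplifying assumptions; it is not the model used in the study. The student records are made up, a single prior score is the only control, and `fit_line` and `value_added` are hypothetical helper names.

```python
from collections import defaultdict

# Hypothetical records: (teacher, prior_score, current_score). Made-up data.
students = [
    ("A", 40.0, 52.0), ("A", 50.0, 61.0), ("A", 60.0, 70.0),
    ("B", 40.0, 44.0), ("B", 50.0, 53.0), ("B", 60.0, 62.0),
]


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def value_added(records):
    """Mean residual per teacher after predicting current from prior score."""
    slope, intercept = fit_line([r[1] for r in records],
                                [r[2] for r in records])
    residuals = defaultdict(list)
    for teacher, prior, current in records:
        predicted = intercept + slope * prior
        residuals[teacher].append(current - predicted)
    return {t: sum(v) / len(v) for t, v in residuals.items()}


for teacher, va in sorted(value_added(students).items()):
    print(f"Teacher {teacher}: illustrative value-added {va:+.2f}")
```

    In this toy example, teacher A's students score above their predicted values and teacher B's score below, so A receives a positive measure and B a negative one. Estimates of this kind are what a principal's subjective ratings could be compared against.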

    In subsequent work, I take advantage of a policy change in Chicago to examine a similar question.⁶ The Chicago Public Schools (CPS) and Chicago Teachers Union (CTU) signed a new collective bargaining agreement in 2004 that gave principals the flexibility to dismiss probationary teachers for any reason and without the documentation and hearing process that is typically required for such dismissals. With the cooperation of the school system, I matched information on all teachers who were eligible for dismissal with records indicating which teachers were dismissed. With these data, I estimated the relative weight that school administrators place on a variety of teacher characteristics. I found evidence that principals do consider teacher absences and value-added measures, along with several demographic characteristics, in determining which teachers to dismiss [Figure 2].

    Given the large variance in teacher effectiveness and the high financial and political costs of dismissing ineffective teachers, many observers have noted that teacher selection may be a cost-effective means of improving educational quality. However, to date there has been little research that links information gathered during the hiring process to subsequent teacher performance.

    In a recent project, several colleagues and I partnered with the District of Columbia Public Schools (DCPS) to study teacher hiring.⁷ We examined detailed teacher candidate data collected during a multi-stage application process, including written assessments, a personal interview, and sample lessons. We identified a number of background characteristics, such as undergraduate GPA, as well as screening measures, such as applicant performance on a mock teaching lesson, that strongly predicted teacher effectiveness. Interestingly, we found that these measures are only weakly associated with the likelihood of being hired, suggesting considerable scope for improving teacher quality through the hiring process.

    In response to this finding, DCPS changed the way it presented information on applicant quality to principals. Specifically, the district assigned each applicant a letter "grade" that corresponded to our measures of predicted effectiveness. We are currently in the process of studying how this change affected teacher hiring and student performance.

Measurement of Student Ability

    Most recently I have written about how individual ability is measured in modern assessment systems. Economists use test scores to measure human capital in explaining wages and other employment outcomes and, increasingly, as outcome measures in evaluations of programs or policies aimed at improving human capital formation. Applied researchers typically take cognitive test scores from pre-existing surveys or datasets without exploring how they are constructed. These test scores often reflect non-trivial decisions about how to measure and scale student achievement.

    Jesse Rothstein and I discuss several important issues relating to the measurement and scaling of individual ability measures, highlighting the implications for secondary analyses.⁸ We point out that the test score measures reported in many surveys are rarely simple summaries of student performance like the fraction of items answered correctly, but rather are estimates generated by complex statistical models. The resulting scores are generally not unbiased measures of student ability. For example, scores computed for students who take the NAEP test depend not only on the examinees' responses to test items, but also on their background characteristics, including race and gender. As a consequence, if a black student and a white student respond identically to questions on the NAEP assessment, the reported ability for the black student will be lower than for the white student—reflecting the lower average performance of black students.
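
    A stylized way to see how identical responses can yield different reported scores is a normal-normal shrinkage model, in which a noisy individual score is pulled toward the mean of the examinee's group. The sketch below is only an illustration of that mechanism and is not the NAEP scoring procedure; the function name `posterior_mean`, the group means, and the variances are all hypothetical.

```python
def posterior_mean(observed, group_mean, prior_var, noise_var):
    """Posterior mean of ability under a normal prior centered on the
    group mean and normally distributed measurement error."""
    weight = prior_var / (prior_var + noise_var)  # weight on the examinee's own score
    return weight * observed + (1 - weight) * group_mean


# Two examinees with identical test performance (made-up numbers).
observed_score = 250.0
prior_var = 900.0   # assumed spread of true ability within a group
noise_var = 300.0   # assumed measurement error variance

# Hypothetical group averages for two demographic groups.
reported_lower_mean_group = posterior_mean(observed_score, 240.0, prior_var, noise_var)
reported_higher_mean_group = posterior_mean(observed_score, 260.0, prior_var, noise_var)

print(reported_lower_mean_group, reported_higher_mean_group)
```

    Because each score is shrunk toward a different group mean, the examinee from the lower-scoring group receives the lower reported score even though the two sets of responses are the same. The size of that gap grows with the measurement error variance.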

    Even when reported scores are unbiased measures of student ability, they often are transformed to scale scores. This undermines many of the purposes for which researchers use test scores, such as measuring the magnitude of a treatment effect or quantifying the difference in ability between two demographic groups. Rothstein and I currently are working on a project to characterize the magnitude of biases that arise in common applications.
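
    One way to see why the choice of scale matters is to note that an order-preserving transformation of scores can change the relative size of two effects. The sketch below assumes, for illustration only, that the scale carries ordinal information alone; the scores are made up, the square-root transformation is an arbitrary choice, and `gain` is a hypothetical helper.

```python
import math


def gain(before, after, transform=lambda s: s):
    """Score gain measured on a (possibly transformed) scale."""
    return transform(after) - transform(before)


# Made-up example: the same 10-point raw gain for a low- and a high-scoring group.
low_raw = gain(10.0, 20.0)
high_raw = gain(80.0, 90.0)

# The same gains after an order-preserving rescaling (square root).
low_rescaled = gain(10.0, 20.0, math.sqrt)
high_rescaled = gain(80.0, 90.0, math.sqrt)

print(f"raw scale:      low {low_raw:.2f}, high {high_raw:.2f}")
print(f"rescaled scale: low {low_rescaled:.2f}, high {high_rescaled:.2f}")
```

    On the raw scale the two gains are equal, while on the rescaled scale the low-scoring group's gain is larger, even though every student's rank is unchanged. Comparisons of effect sizes or group gaps can therefore depend on which scale the test developer happened to report.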

    ^1. T.S. Dee and B.A. Jacob, "The Impact of No Child Left Behind on Student Achievement," NBER Working Paper No. 15531, 2009, and Journal of Policy Analysis and Management, 30(3), 2011, pp. 418–46.
    ^2. T.S. Dee, B.A. Jacob, and N.L. Schwartz, "The Effects of NCLB on School Resources and Practices," Educational Evaluation and Policy Analysis, 35(2), 2013, pp. 252–79.
    ^3. B.A. Jacob, S. Dynarski, K. Frank, and B. Schneider, "Are Expectations Alone Enough? Estimating the Effect of a Mandatory College-Prep Curriculum in Michigan," NBER Working Paper No. 22013, February 2016.
    ^4. R. Chetty, J.N. Friedman, and J.E. Rockoff, "Measuring the Impacts of Teachers I: Evaluating Bias in Teacher Value-Added Estimates," American Economic Review, 104(9), 2014, pp. 2593–632.
    ^5. B.A. Jacob and L. Lefgren, "Principals as Agents: Subjective Performance Measurement in Education," NBER Working Paper No. 11463, July 2005, and Journal of Labor Economics, 26(1), 2008, pp. 101–36.
    ^6. B.A. Jacob, "Do Principals Fire the Worst Teachers?" NBER Working Paper No. 15715, February 2010, and Educational Evaluation and Policy Analysis, 33(4), February 2011, pp. 403–34.
    ^7. B.A. Jacob, J. Rockoff, E. Taylor, B. Lindy, and R. Rosen, "Teacher Applicant Hiring and Teacher Performance: Evidence from D.C. Public Schools," NBER Working Paper No. 22054, March 2016.
    ^8. B.A. Jacob and J. Rothstein, "The Measurement of Student Ability in Modern Assessment Systems," NBER Working Paper No. 22434, July 2016, and forthcoming in Journal of Economic Perspectives.