Power4kids Report

According to the National Assessment of Educational Progress, nearly 4 in 10 fourth graders read below the basic level. To help these struggling readers improve their skills, the nation's 16,000 school districts are spending hundreds of millions of dollars on often untested educational products and services developed by textbook publishers, commercial providers, and nonprofit organizations. Yet we know little about the effectiveness of these interventions in the school environment. Which ones work best, and for whom? Do they have the potential to close the reading gap between struggling and average readers?

To provide the nation's policymakers and educators with credible answers to these questions, the Haan Foundation for Children, the Florida Center for Reading Research, Mathematica Policy Research, the American Institutes for Research, the Allegheny Intermediate Unit, and the Institute of Education Sciences are collaborating to carry out the Power4Kids evaluation. Conducted just outside Pittsburgh, Pennsylvania, in the Allegheny Intermediate Unit, the evaluation is assessing the effectiveness of all or substantial parts of four widely used programs for elementary school students with reading problems: Corrective Reading, Failure Free Reading, Spell Read P.A.T., and Wilson Reading. The interventions were structured to offer about 100 hours of pull-out instruction, in small groups of three students, during one school year.

This model supports the heralded 3-Tier approach to reading instruction.

Under the 3-Tier, Power4Kids model, a student should receive this kind of intervention before being referred to special education services, which are costly both for schools and for students, who once referred are unlikely to exit special education during their academic careers.

The Power4Kids randomized controlled trial consists of an impact study, an implementation study, and a functional neuroimaging study. Nearly 800 children are included in the study. We have administered multiple rounds of reading achievement tests to these children; conducted surveys of their parents, teachers, and principals; and collected school records data, including test scores from the Pennsylvania System of School Assessment. This Power4Kids interim report presents impacts at the end of the intervention year, and a final report will present impacts at the end of the following school year.

Power4Kids is funded and supported by a unique partnership dedicated to finding solutions to the devastating problem of reading failure.





Power4Kids Partners

Funding Partners

Ambrose Monell Foundation
Barksdale Reading Institute
Grable Foundation
Haan Foundation for Children
Heinz Endowments
William and Flora Hewlett Foundation
W.K. Kellogg Foundation

Richard King Mellon Foundation
Raymond Foundation
Rockefeller Foundation
Rockefeller Brothers Fund
Smith Richardson Foundation
U.S. Department of Education
Institute of Education Sciences

Executive Management

Power4Kids Reading Initiative
Cinthia Haan, Haan Foundation for Children

Principal Investigator
Joseph K. Torgesen, Florida Center for Reading Research at Florida State University

Co-Principal Investigator, Impact Study Component
David Myers, Mathematica Policy Research, together with Wendy Mansfield, Elizabeth Stuart, Allen Schirm, and Sonya Vartivarian

Principal Investigator, fMRI Brain Imaging Component
Marcel Just, Carnegie Mellon University, together with Co-PIs John Gabrieli, Stanford University, and Bennett Shaywitz, Yale University

Executive Director, School-Management Study Component
Donna Durno, Allegheny Intermediate Unit, together with Rosanne Javorsky

Co-Principal Investigator, Fidelity and Implementation Study Component
George Bohrnstedt, American Institutes for Research, together with Fran Stancavage

Additionally, we acknowledge the support and direction provided throughout the study by Dr. Audrey Pendleton of the Institute of Education Sciences.


Scientific Board of Directors

Dr. Rebecca Felton
Dr. Jack Fletcher
Dr. Barbara Foorman
Dr. Ed Kame'enui
Dr. Maureen Lovett

Dr. G. Reid Lyon
Dr. Frank Manis
Dr. Gil Noam
Dr. Richard Olson
Dr. Stephen Raudenbush

Dr. Sally Shaywitz
Dr. Joe Torgesen, Chair
Dr. Maryanne Wolf

Education Board of Directors

Mr. Claiborne Barksdale
Mr. Joseph Dominic
Dr. Donna Durno, Chair
Mr. Steve Fleischman

Dr. Fred Frelow
Ms. Marion Joseph
Mr. Jeffrey Lewis
Dr. David Rose

Dr. Marshall Smith
Mr. Craig Stewart
Ms. Randi Weingarten
Dr. Rosalie Whitlock

Neuroscience Board of Directors

Dr. Susan Bookheimer
Dr. Guinevere Eden
Dr. John Gabrieli, Chair

Dr. Marcel Just, Co-PI
Dr. Andrew Papanicolaou
Dr. Russell Poldrack

Dr. Elise Temple
Dr. Bennett Shaywitz, Co-PI
Dr. Anthony Wagner

Advisors

Mr. Gerry Balbier
Mr. Jon Baron
Ms. Susan Brownlee
Dr. Michael Casserly
Mr. Christopher Cross

Mr. C. Michael Gilliland
Dr. Kent McGuire
Dr. Rebecca Maynard
Dr. Alice Parker
Dr. Robert Pasternack

Mr. John E. Porter
Mr. Howard Rubenstein
Mr. Robert Sweet
Dr. Darv Winick

Congressional Support

 
Mr. C. Michael Gilliland
Mr. John E. Porter, Hogan and Hartson, L.L.P.
Dr. Christine Warnke

Corporate Sponsors

Mattel Corporation

Scholastic, Inc.

Tommy Hilfiger, Inc.



Contract No.:
Reference No.: 8970-400

CORPORATION FOR THE ADVANCEMENT OF POLICY EVALUATION

Closing the Reading Gap:
First Year Findings from a Randomized Trial of Four Reading Interventions for Striving Readers

Executive Summary
February 2006

Joseph Torgesen, Florida Center for Reading Research
David Myers, Allen Schirm, Elizabeth Stuart, Sonya Vartivarian, and Wendy Mansfield, Mathematica Policy Research
Fran Stancavage, American Institutes for Research
Donna Durno and Rosanne Javorsky, Allegheny Intermediate Unit
Cinthia Haan, Haan Foundation

Submitted to:

    U.S. Department of Education
    Institute of Education Sciences
    Washington, DC

    Project Officer:
    Audrey Pendleton
Submitted by:

    Corporation for the Advancement of Policy Evaluation
    600 Maryland Ave., SW, Suite 500
    Washington, DC 20024-2512
    Telephone: (202) 264-3469
    Facsimile: (202) 264-3491

ABOUT CAPE

The Corporation for the Advancement of Policy Evaluation (CAPE) is a nonprofit organization that facilitates, conducts, and supports research activities to advance the development of public policy. CAPE also assists in determining the effectiveness of programs and policies designed to address social problems and informs the policy and service communities about its findings. Its major clients include private foundations and government agencies.

ACKNOWLEDGMENTS

This report reflects the contributions of many institutions and individuals. We would like to first thank the study funders. The Institute of Education Sciences of the U.S. Department of Education and the Smith Richardson Foundation funded the evaluation component of the study. Funders of the interventions included the Ambrose Monell Foundation, the Barksdale Reading Institute, the Grable Foundation, the Haan Foundation for Children, the Heinz Endowments, the W.K. Kellogg Foundation, the Raymond Foundation, the Rockefeller Foundation, and the U.S. Department of Education's Institute of Education Sciences. We also thank the Rockefeller Brothers Fund for the opportunity to hold a meeting of the Scientific Advisory Panel and research team at their facilities in 2004, and the William and Flora Hewlett Foundation and the Richard King Mellon Foundation for their support of the functional magnetic resonance imaging study, which was conducted in collaboration with our evaluation of the four supplemental reading programs.

We gratefully acknowledge Audrey Pendleton of the Institute of Education Sciences for her support and encouragement throughout the study. Many individuals at Mathematica Policy Research contributed to the writing of this report. In particular, Mark Dynarski provided critical comments and review of the report. Micki Morris and Daryl Hall were instrumental in editing and producing the document, with assistance from Donna Dorsey and Alfreda Holmes.

Important contributions to the study were received from several others. At Mathematica, Nancy Carey, Valerie Williams, Jessica Taylor, Season Bedell-Boyle, and Shelby Pollack assisted with data collection, and Mahesh Sundaram managed the programming effort. At the Allegheny Intermediate Unit (AIU), Jessica Lapinski served as the liaison between the evaluators and AIU school staff. At AIR, Marian Eaton and Mary Holte made major contributions to the design and execution of the implementation study, while Terry Salinger, Sousan Arafeh, and Sarah Shain made additional contributions to the video analysis. Paul William and Charles Blankenship were responsible for the programming effort, while Freya Makris and Sandra Smith helped to manage and compile the data. We also thank Anne Stretch, a reading specialist and independent consultant, for leading the training on test administration.

Finally, we would particularly like to acknowledge the assistance and cooperation of the teachers and principals in the Allegheny Intermediate Unit, without whom this study would not have been possible.

EXECUTIVE SUMMARY

EVALUATION CONTEXT

According to the National Assessment of Educational Progress (U.S. Department of Education 2003), nearly 4 in 10 fourth graders read below the basic level. Unfortunately, these literacy problems get worse as students advance through school and are exposed to progressively more complex concepts and courses. Historically, nearly three-quarters of these students never attain average levels of reading skill. While schools are often able to provide some literacy intervention, many lack the resources -- teachers skilled in literacy development and appropriate learning materials -- to help older students in elementary school reach grade level standards in reading.

The consequences of this problem are life changing. Young people entering high school in the bottom quartile of achievement are substantially more likely than students in the top quartile to drop out of school, setting in motion a host of negative social and economic outcomes for students and their families.

For their part, the nation's 16,000 school districts are spending hundreds of millions of dollars on often untested educational products and services developed by textbook publishers, commercial providers, and nonprofit organizations. Yet we know little about the effectiveness of these interventions. Which ones work best, and for whom? Under what conditions are they most effective? Do these programs have the potential to close the reading gap?

To help answer these questions, we initiated an evaluation of either parts or all of four widely used programs for elementary school students with reading problems. The programs are Corrective Reading, Failure Free Reading, Spell Read P.A.T., and Wilson Reading, all of which are expected to be more intensive and skillfully delivered than the programs typically provided in public schools.[1] The programs incorporate explicit and systematic instruction in the basic reading skills in which struggling readers are frequently deficient. Corrective Reading, Spell Read P.A.T., and Wilson Reading were implemented to provide word-level instruction, whereas Failure Free Reading focused on building reading comprehension and vocabulary in addition to word-level skills. Recent reports from small-scale research and clinical studies provide some evidence that the reading skills of students with severe reading difficulties in late elementary school can be substantially improved by providing, for a sustained period of time, the kinds of skillful, systematic, and explicit instruction that these programs offer (Torgesen 2005).

EVALUATION PURPOSE AND DESIGN

Conducted just outside Pittsburgh, Pennsylvania, in the Allegheny Intermediate Unit (AIU), the evaluation is intended to explore the extent to which the four reading programs can affect both the word-level reading skills (phonemic decoding, fluency, accuracy) and reading comprehension of students in grades three and five who were identified as struggling readers by their teachers and by low test scores. Ultimately, it will provide educators with rigorous evidence of what could happen in terms of reading improvement if intensive, small-group reading programs like the ones in this study were introduced in many schools.


[1] These four interventions were selected from more than a dozen potential program providers by members of the Scientific Advisory Board of the Haan Foundation for Children. See Appendix Q for a list of the Scientific Advisory Board members.

This study is a large-scale, longitudinal evaluation comprising two main elements. The first element of the evaluation is an impact study of the four interventions. This evaluation report addresses three broad types of questions related to intervention impacts: the impact of being assigned to receive any of the four interventions, the impact of being assigned to receive a word-level intervention, and the impact of being assigned to receive each of the individual interventions.

To answer these questions, the impact study was based on a scientifically rigorous experimental design that uses random assignment at two levels: (1) 50 schools from 27 school districts were randomly assigned to one of the four interventions, and (2) within each school, eligible children in grades 3 and 5 were randomly assigned to a treatment group or to a control group. Students assigned to the intervention group (treatment group) were placed by the program providers and local coordinators into instructional groups of three students. Students in the control groups received the same instruction in reading that they would have ordinarily received. Children were defined as eligible if they were identified by their teachers as struggling readers, if they scored at or below the 30th percentile on a word-level reading test, and if they scored at or above the 5th percentile on a vocabulary test. From an original pool of 1,576 third- and fifth-grade students identified as struggling readers, 1,042 also met the test-score criteria. Of these eligible students, 772 were given permission by their parents to participate in the evaluation. A sketch of this two-level assignment procedure appears below.
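To make the two-level design concrete, the following is a minimal sketch in Python of the assignment logic described above. The data structures, field names, and the simple 50/50 within-school split are illustrative assumptions, not the study's actual procedure, which also blocked schools into strata before assignment.

```python
import random

INTERVENTIONS = ["Corrective Reading", "Failure Free Reading",
                 "Spell Read P.A.T.", "Wilson Reading"]

def assign_schools(schools, seed=0):
    """Level 1: randomly assign each school to one of the four interventions."""
    rng = random.Random(seed)
    return {school: rng.choice(INTERVENTIONS) for school in schools}

def eligible(student):
    """Eligibility screen from the report: teacher-identified struggling
    reader, at or below the 30th percentile on a word-level reading test,
    and at or above the 5th percentile on a vocabulary test."""
    return (student["teacher_flag"]
            and student["word_pctile"] <= 30
            and student["vocab_pctile"] >= 5)

def assign_students(students, seed=0):
    """Level 2: within a school, randomly split eligible students into
    treatment and control groups (an even split is assumed here)."""
    rng = random.Random(seed)
    pool = [s for s in students if eligible(s)]
    rng.shuffle(pool)
    half = len(pool) // 2
    return pool[:half], pool[half:]  # (treatment, control)
```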

The second element of the evaluation is an implementation study that has two components: (1) an exploration of the similarities and differences in reading instruction offered in the four interventions and (2) a description of the regular instruction that students in the control group received in the absence of the interventions and the regular instruction received by the treatment group beyond the interventions.

Test data and other information on students, parents, teachers, classrooms, and schools are being collected several times over a three-year period. Key data collection points pertinent to this summary report include the period just before the interventions began, when baseline information was collected, and the period immediately after the interventions ended, when follow-up data were collected. Additional follow-up data for students and teachers are being collected in 2005 and again in 2006.

THE INTERVENTIONS

We did not design new instructional programs for this evaluation. Rather, we employed either parts or all of four existing and widely used remedial reading instructional programs: Spell Read P.A.T., Corrective Reading, Wilson Reading, and Failure Free Reading.

As the evaluation was originally conceived, the four interventions would fall into two instructional classifications with two interventions in each. The interventions in one classification would focus only on word-level skills, and the interventions in the other classification would focus equally on word-level skills and reading comprehension/vocabulary.

Corrective Reading and Wilson Reading were modified to fit within the first of these classifications. The decision to modify these two intact programs was justified both because it created two treatment classes that were aligned with the different types of reading deficits observed in struggling readers and because it gave us sufficient statistical power to contrast the relative effectiveness of the two classes. Because Corrective Reading and Wilson Reading were modified, results from this study do not provide complete evaluations of these interventions; instead, the results suggest how interventions using primarily the word-level components of these programs will affect reading achievement.

With Corrective Reading and Wilson Reading focusing on word-level skills, it was expected that Spell Read P.A.T. and Failure Free Reading would focus on both word-level skills and reading comprehension/vocabulary. In a time-by-activity analysis of the instruction that was actually delivered, however, it was determined that three of the programs (Spell Read P.A.T., Corrective Reading, and Wilson Reading) focused primarily on the development of word-level skills, and one (Failure Free Reading) provided instruction in both word-level skills and the development of comprehension skills and vocabulary.

MEASURES OF READING ABILITY

Seven measures of reading skill were administered at the beginning and end of the school year to assess student progress in learning to read. These measures assessed phonemic decoding (the Word Attack subtest of the Woodcock Reading Mastery Test-Revised, or WRMT-R, and the Phonemic Decoding Efficiency subtest of the TOWRE), word reading accuracy (the WRMT-R Word Identification subtest), text reading fluency (the TOWRE Sight Word Efficiency subtest and the Aimsweb oral reading fluency passages), and reading comprehension (the WRMT-R Passage Comprehension subtest and the Group Reading and Diagnostic Assessment, or GRADE).

For all tests except the Aimsweb passages, the analysis uses grade-normalized standard scores, which indicate where a student falls within the overall distribution of reading ability among students in the same grade. Scores above 100 indicate above-average performance; scores below 100 indicate below-average performance. In the population of students across the country at all levels of reading ability, standard scores are constructed to have a mean of 100 and a standard deviation of 15, implying that approximately 68 percent of all students' scores will fall between 85 and 115 and that approximately 95 percent will fall between 70 and 130. For the Aimsweb passages, the score used in this analysis is the median correct words per minute from three grade-level passages. A small sketch of the standard-score-to-percentile conversion follows.
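As a quick check on the scale just described, here is a minimal Python sketch of the conversion from grade-normalized standard scores to percentile ranks, under the assumption (implied above) that standard scores are normally distributed with mean 100 and standard deviation 15:

```python
from statistics import NormalDist

def percentile(standard_score, mean=100.0, sd=15.0):
    """Percentile rank of a grade-normalized standard score,
    assuming scores are distributed N(mean, sd**2)."""
    return 100 * NormalDist(mean, sd).cdf(standard_score)

# Figures quoted later in this summary are consistent with this scale:
print(round(percentile(93)))  # 32 -- WRMT-R Word Attack sample average
print(round(percentile(83)))  # 13 -- TOWRE Phonemic Decoding Efficiency average
```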

IMPLEMENTING THE INTERVENTIONS

The interventions were implemented from the first week of November 2003 through the first weeks in May 2004. During this time, students received, on average, about 90 hours of instruction, delivered five days a week to groups of three students in sessions that were approximately 50 minutes long. A small part of the instruction was delivered in groups of two or one-on-one because of absences and make-up sessions. Because many of the sessions took place during students' regular classroom reading instruction, teachers reported that students in the treatment groups received less reading instruction in the classroom than did students in the control group (1.2 hours per week versus 4.4 hours per week). Students in the treatment group received more small-group instruction than did students in the control group (6.8 hours per week versus 3.7 hours per week). Both groups received a very small amount of one-on-one tutoring in reading from their schools during the week.

Teachers were recruited from participating schools on the basis of experience and personal characteristics relevant to teaching struggling readers. They received, on average, nearly 70 hours of professional development and support during the implementation year.

According to an examination of videotaped teaching sessions by the research team, the training and supervision produced instruction that was judged to be faithful to each intervention model. The program providers themselves also rated the teachers as generally above average in both their teaching skill and fidelity to program requirements relative to other teachers with the same level of training and experience.

CHARACTERISTICS OF STUDENTS IN THE EVALUATION

The characteristics of the students in the evaluation sample are shown in Table 1 (see the end of this summary for all tables). About 45 percent of the students qualified for free or reduced-price lunches. In addition, about 27 percent were African American, and 73 percent were white. Fewer than two percent were Hispanic. Roughly 33 percent of the students had a learning disability or other disability.

On average, the students in our evaluation sample scored about one-half to one standard deviation below national norms (mean 100, standard deviation 15) on measures used to assess their ability to decode words. For example, on the Word Attack subtest of the Woodcock Reading Mastery Test-Revised (WRMT-R), the average standard score was 93, which translates into a percentile rank of 32. On the TOWRE test of phonemic decoding efficiency (PDE), the average standard score was 83, at approximately the 13th percentile. On the measure of word reading accuracy (the Word Identification subtest of the WRMT-R), the average score placed these students at the 23rd percentile. For text reading fluency, the average score placed them at the 16th percentile for word reading efficiency (TOWRE SWE), and third- and fifth-grade students, respectively, read 41 and 77 correct words per minute on the oral reading fluency passages (Aimsweb). In terms of reading comprehension, the average score on the WRMT-R test of passage comprehension placed students at the 30th percentile, and on the Group Reading and Diagnostic Assessment (GRADE), they scored, on average, at the 23rd percentile.

This sample, as a whole, was substantially less impaired in basic reading skills than most samples used in previous research with older reading-disabled students. These earlier studies typically examined samples in which the phonemic decoding and word reading accuracy skills of the average student were below the tenth percentile and, in some studies, at only about the first or second percentile. Students in such samples are much more impaired, and more homogeneous in their reading abilities, than the students in this evaluation and in the population of all struggling readers in the United States. Thus, it is not known whether the findings from these previous studies pertain to broader groups of struggling readers in which the average student's reading abilities fall between, say, the 20th and 30th percentiles. This evaluation can help to address this issue: it obtained a broad sample of struggling readers, and it is evaluating, in regular school settings, the kinds of intensive reading interventions that have been widely marketed by providers and widely sought by school districts to improve such students' reading skills.

DISCUSSION OF IMPACTS

This first-year report assesses the impact of the four interventions on the treatment groups in comparison with the control groups immediately after the end of the reading interventions. In particular, we provide detailed estimates of the impacts, including the impact of being randomly assigned to receive any of the interventions, being randomly assigned to receive a word-level intervention, and being randomly assigned to receive each of the individual interventions. For purposes of this summary, we focus on the impact of being randomly assigned to receive any intervention compared to receiving the instruction that would normally be provided; these findings are the most robust because of the larger sample sizes. The full report also estimates impacts for various subgroups, including students with weak and strong initial word attack skills, students with low or high beginning vocabulary scores, and students who either qualified or did not qualify for free or reduced-price school lunches.[2]

The impact of each of the four interventions is the difference between average treatment and control group outcomes. Because students were randomly assigned to the two groups, we would expect the groups to be statistically equivalent; thus, with high probability, any differences in outcomes can be attributed to the interventions. Also because of random assignment, the outcomes themselves can be defined either as test scores at the end of the school year or as the change in test scores between the beginning and end of the school year (the "gain"). In the tables of impacts (Tables 2-4), we show three types of numbers. The baseline score shows the average standard score for students at the beginning of the school year. The control gain indicates the improvement that students would have made in the absence of the interventions. Finally, the impact shows the value added by the interventions; that is, the impact is the amount by which the interventions increased students' test scores relative to the control group. The gain in the intervention group students' average test scores between the beginning and end of the school year can be calculated by adding the control group gain and the impact. These relationships are written out schematically below.
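As a schematic sketch of the relationships just described (the notation here is ours, not the full report's): let $\bar{Y}^{T}$ and $\bar{Y}^{C}$ denote average scores in the treatment and control groups, with subscripts pre and post for the beginning and end of the school year. Then

\[
\widehat{\text{impact}} = \bar{Y}^{T}_{\text{post}} - \bar{Y}^{C}_{\text{post}},
\qquad
\underbrace{\bar{Y}^{T}_{\text{post}} - \bar{Y}^{T}_{\text{pre}}}_{\text{treatment gain}}
= \underbrace{\bar{Y}^{C}_{\text{post}} - \bar{Y}^{C}_{\text{pre}}}_{\text{control gain}}
+ \widehat{\text{impact}}
+ \bigl(\bar{Y}^{C}_{\text{pre}} - \bar{Y}^{T}_{\text{pre}}\bigr),
\]

where the last term is approximately zero because random assignment equalizes baseline scores in expectation. This is why the treatment group's gain can be computed as the control gain plus the impact.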


[2] The impacts described here represent the impact of being selected to participate in one of the interventions. A small number of students selected for the interventions did not participate, and about 7.5 percent received less than a full dose (80 hours) of instruction. Estimation of the effect of an intervention on participants and those who participated for 80 or more hours requires that stronger assumptions be made than when estimating impacts for those offered the opportunity to participate, and we cannot have the same confidence in the findings as we do with the results discussed in this summary. Our full report presents estimates of the effects for participants and those who participated for at least 80 hours. These findings are similar to those reported here.

In practice, impacts were estimated using a hierarchical linear model that included a student-level model and a school-level model. In the student-level model, we included indicators for treatment status and grade level as well as the baseline test score; the baseline score was included to increase the precision with which we measured the impact, that is, to reduce the standard error of the estimated impact. The school-level model included indicators for the intervention to which each school was randomly assigned and indicators for the blocking strata used in the random assignment of schools to interventions. A schematic version of this model is sketched next.
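As a sketch only (the exact specification is given in the full report), a two-level model of this general form can be written for student $i$ in school $j$ as

\[
y_{ij} = \beta_{0j} + \beta_{1} T_{ij} + \beta_{2} G_{ij} + \beta_{3}\, y^{\text{base}}_{ij} + \varepsilon_{ij},
\qquad
\beta_{0j} = \gamma_{0} + \sum_{k} \gamma_{k}\, I_{kj} + \sum_{s} \delta_{s}\, S_{sj} + u_{j},
\]

where $T_{ij}$ indicates treatment status, $G_{ij}$ indicates grade level, $y^{\text{base}}_{ij}$ is the baseline test score, $I_{kj}$ are indicators for the intervention to which school $j$ was assigned, $S_{sj}$ are indicators for the blocking strata, and $\varepsilon_{ij}$ and $u_{j}$ are student- and school-level error terms.

Below, we describe some of the key interim findings: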

Future reports will focus on the impacts of the interventions one year after they ended. At this point, it is still too early to draw definitive conclusions about the impact of the interventions assessed in this study. Based on results from earlier research (Torgesen et al. 2001), there is a reasonable possibility that students who substantially improved their phonemic decoding skills will continue to improve in reading comprehension relative to average readers. Consistent with the overall pattern of immediate impacts, we would expect more improvement among students who were third graders when they received the intervention than among those who were fifth graders. We are currently processing second-year data (which include scores on the Pennsylvania state assessments) and expect to release a report on that analysis within the next year.


[3] In future analyses, we plan to explore another approach for estimating the impact of the interventions on closing the reading gap. This approach will contrast the percentage of students in the intervention groups and the control groups who scored within the "normal range" on the standardized tests.