We should all applaud growth and improvement, but claiming victory for NCLB is an outrageous misuse of the data and the report. It is fuzzy math, propaganda and distortion. It is spinning.
The authors of the report specifically warn against drawing any such conclusions regarding causes:
- Cautions in Interpretations
- As previously stated, the NAEP reading and mathematics trend scales make it possible to examine relationships between students' performance and various background factors measured by NAEP. However, a relationship between achievement and another variable does not reveal its underlying cause, which may be influenced by a number of other variables.
- Source: NAEP 2004 TRENDS IN ACADEMIC PROGRESS - Page 117
When did NCLB begin and what years are covered by this testing?
Signed into law in January 2002, NCLB did not really begin to shift school practice until the 2002-2003 school year, but the Report Card covers all of the school years since the 1999 Report:
| Years Covered by the Report Card | Years NCLB Was Affecting School Programs |
| --- | --- |
| 1999-2000 | . |
| 2000-2001 | . |
| 2001-2002 | . |
| 2002-2003 | . |
| 2003-2004 | 2003-2004 |
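The overlap shown in the table can be reduced to simple arithmetic. The short Python sketch below copies the table's rows exactly as it marks them and counts how many of the school years covered by the Report Card fall in the period the table flags as affected by NCLB. The variable and function names are illustrative and do not come from the report.

```python
# School years covered by the 2004 Report Card, paired with whether the
# table above marks NCLB as affecting school programs in that year.
years_covered = [
    ("1999-2000", False),
    ("2000-2001", False),
    ("2001-2002", False),
    ("2002-2003", False),
    ("2003-2004", True),
]


def nclb_share(rows):
    """Return (affected_years, total_years, fraction of years affected)."""
    affected = sum(1 for _, under_nclb in rows if under_nclb)
    total = len(rows)
    return affected, total, affected / total


affected, total, share = nclb_share(years_covered)
print(f"{affected} of {total} school years ({share:.0%})")
```

By the table's own accounting, only one of the five school years in the reporting interval overlaps with NCLB.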
Laid out coldly in black and white, it is obvious that the White House and the Ed Department have stretched the truth dramatically.
How Big are the Gains Really?
The point gains on the NAEP scaled scores amount to each child in the study getting one or two extra items correct compared to the last testing in 1999.
This is hardly a revolutionary shift in the reading and math performance of our students.
Conflicting Results
Issues of Methodology
The Report Card uses quite complicated sampling techniques, and its methodology section is hard to read and understand, but reading it is time well spent for anyone who wants to judge the reliability of the findings.
The 2004 Report Card emerges in two versions - a summary and a detailed report. While there have been some dramatic shifts in the way this report and its sampling were conducted since the 1999 Report, newspaper coverage ignored these changes and the potential validity issues emerging from such changes.
Although the authors warn that the Report's value depends upon consistent methods being applied across the years, they go on to report quite a few changes in this Report that could undermine confidence in the findings if one ever took the time to read about them.
- Measuring trends of student achievement, or change over time, requires the precise replication of past procedures. Since their inception, the design and methodology of the NAEP long-term trend assessments have remained constant, to the extent feasible, thereby enabling the continuous monitoring of a fixed set of curriculum topics.
Source: NAEP 2004 TRENDS IN ACADEMIC PROGRESS - Page 91
Sadly, the methodology is rarely reviewed or considered before the findings are passed along as gospel. Even though the authors make the statement about precise replication above, they list many potentially troubling changes that were made in the 2004 Report. The 2004 Report is not a precise replication.
The following are examples of methodology issues drawn from a careful reading of the report, listed here first as headings while linked to explanatory sections.
- Sampling techniques have changed. Explanation.
- The percentage of Hispanic 9-year-olds rose dramatically since 1999. Explanation.
- The assessment was explicitly scaled in a cross-age manner only in the base year (1971). Explanation.
- Students answer far fewer items than they would in the real NAEP tests. Explanation.
- Math items changed from year to year. Explanation.
- The basis for geographic primary sampling units changed since 1999. Explanation.
- Target population sample size was fewer than 15,000 for 9-year-olds. Explanation.
- Impact of the weighting system for sample selection. Explanation.
- Nonpoststratified weights have been used in the 2004 analysis. Explanation.
- Response rates for nonpublic schools selected for participation in the 2004 trend assessments failed to reach the necessary threshold for reporting. Explanation.
- Item response theory (IRT) was used to estimate average proficiency for the nation and various student groups of interest within the nation. Explanation.
- Degree of uncertainty. Explanation.
- NAEP results, like those from all surveys, are also subject to other kinds of errors, including the effects of necessarily imperfect adjustments for student and school nonresponse and other largely unknowable effects associated with the particular instrumentation and data collection methods used. Explanation.
- Nonsampling errors can be attributed to a number of sources. Explanation.
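Several of the items above (degree of uncertainty, sampling and nonsampling error) bear on whether a small gain can be told apart from noise. As a minimal sketch in Python, assuming two independent samples with reported standard errors, a common check divides the difference in average scores by the combined standard error. The scores and standard errors below are hypothetical placeholders, not figures from the NAEP report, and the function name is illustrative.

```python
import math


def gain_z_score(mean_old, se_old, mean_new, se_new):
    """z statistic for the difference between two independent sample means."""
    combined_se = math.sqrt(se_old ** 2 + se_new ** 2)
    return (mean_new - mean_old) / combined_se


# Hypothetical values for illustration only (NOT taken from the report).
z = gain_z_score(mean_old=210.0, se_old=1.5, mean_new=213.0, se_new=2.0)

# 1.96 is the conventional two-sided 5% threshold for a normal statistic.
significant = abs(z) > 1.96
print(f"z = {z:.2f}, significant at the 5% level: {significant}")
```

With these made-up inputs, a three-point gain does not clear the threshold, which is why the standard errors matter as much as the headline gain. This check addresses sampling error only; the nonsampling errors listed above are not captured by it.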