Volume III, Number 5, May, 2005

The Annual Testing Myth

Many of the ideas and strategies embedded in NCLB/Helter-Skelter are just plain foolish.

Beyond foolish. They are untested, unvalidated and potentially damaging.

Annual testing ranks high on the list of foolish, untested NCLB ideas.

Annual testing is imposed on all states as some Golden Rule of school reform, but there is no valid evidence to prove that it is worthwhile doing or safe. Dictating this strategy is imprudent and ill-advised.

Annual testing is a wild and irresponsible social experiment.

It is reckless and improvident.

Annual testing is a factory owner's pipe dream of how to fix schools.

It has never been proven effective in the states where it was tried.

Testing zealots substitute testing for capacity building, as if tougher tests produce better readers.

Quick to require research evidence from schools before they may select a reading program, the Feds pay little attention to research when advancing their pet projects.

Annual testing is without basis in the research.

The FED ED policy makers dress in dark suits and pretend to know what they are doing, but they are remarkably ignorant of what actually works in schools. Many of them, including Secretary Spellings, have never worked in schools as teachers or administrators. They are policy wonks, political economists and ideologues who have no business dictating to schools.

Fond of simple metaphors and untested change strategies, the brave new school reformers view schools as factories and fast food restaurants whose efficiency would improve if there were an increase in tension, surveillance and pressure for results.

It takes more than platitudes to improve the performance of students . . .
Adding insult to injury, the Feds require annual testing without paying for its true costs.

President Bush likes to brag about increasing spending for education, but his increases do not actually fund the vast NCLB expansion of federal mandates. When he took office, the federal role was very small. It was easy to make big percentage increases without actually paying the costs of the new law.

For the 1999-2000 school year, state spending accounted for 50% of all education spending, while local spending accounted for 43.2% and federal spending totaled approximately 6.9%.
Source: Rankings of the States, National Education Association, 1980, 1990 and 2000

Recent lawsuits from Connecticut as well as NEA are centered on this failure to fund expensive mandates.

But Why is Annual Testing A Bad Idea?

1. There is no convincing evidence that annual testing actually improves learning. (See below.)

2. Annual testing is expensive. (See below.)

3. Annual testing reduces the amount of time for learning. (See below.)

4. Annual testing creates pressure to teach to narrow tests. (See below.)

5. Annual testing increases anxiety. (See below.)

6. Annual testing leads to cheating in some places. (See below.)

7. Annual testing diverts resources from capacity building to testing. (See below.)

8. Annual testing ignores what we know about healthy change and systems. (See below.)

1. There is no convincing research evidence that annual testing actually improves learning.

One state offered as justification for this strategy, Texas, has employed it for years but leads the nation with Mississippi in the size of the gap between state claims of student proficiency levels and their actual performance on the more rigorous NAEP tests - the National Assessment of Educational Progress. The chart below shows the percentage by which the state proficiency claims exceeded performance on the 8th grade NAEP reading test.

Source for data above is the Rand Study, "Achieving State and National Literacy Goals, a Long Uphill Road," available at

Mississippi, Texas and North Carolina had annual testing prior to the start of NCLB but were leaders in over-stating student proficiency levels and gains.

  • Mississippi reported 87% of its eighth graders proficient while only 18% were judged proficient on the NAEP test. Gap = 69 percentage points
  • Texas reported 85% of its eighth graders proficient while only 27% were judged proficient on the NAEP test. Gap = 58 percentage points
  • North Carolina reported 81% of its eighth graders proficient while only 33% were judged proficient on the NAEP test. Gap = 48 percentage points
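The gaps listed above are simple subtractions, and readers who want to check them can do so in a few lines. The sketch below is only an illustration: the dictionary and function names are made up for this example, and the figures are the ones quoted in the three bullets.

```python
# State-reported vs. NAEP eighth-grade reading proficiency, in percent,
# copied from the three bullets above.
reported = {
    "Mississippi": (87, 18),
    "Texas": (85, 27),
    "North Carolina": (81, 33),
}


def proficiency_gap(state_pct, naep_pct):
    """Gap in percentage points between the state claim and the NAEP result."""
    return state_pct - naep_pct


gaps = {state: proficiency_gap(s, n) for state, (s, n) in reported.items()}

# Print the states from the largest gap to the smallest.
for state, gap in sorted(gaps.items(), key=lambda item: item[1], reverse=True):
    print(f"{state}: {gap} percentage points")
```

The gap is measured in percentage points (one percentage subtracted from another), not as a percentage of either figure.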

How can we look at such results and consider the annual testing strategy worth imposing on an entire nation?

One Rand commentary suggested that annual testing along with financial incentives and extremely high pressure may partially explain such performance gaps:

To sum up, states that use high-stakes exams may encounter a plethora of problems that would undermine the interpretation of the scores obtained. Some of these problems include the following:
(1) students being coached to develop skills that are unique to the specific types of questions that are asked on the statewide exam (i.e., as distinct from what is generally meant by reading, math, or the other subjects tested);
(2) narrowing the curriculum to improve scores on the state exam at the expense of other important skills and subjects that are not tested;
(3) an increase in the prevalence of activities that substantially reduce the validity of the scores; and
(4) results being biased by various features of the testing program (e.g., if a significant percentage of students top out or bottom out on the test, it may produce results that suggest that the gap among racial and ethnic groups is closing when no such change is occurring).

What Do Test Scores in Texas Tell Us? (2000)
Stephen P. Klein, Laura S. Hamilton, Daniel F. McCaffrey, Brian M. Stecher

What test enthusiasts fail to recognize is the difference between testing and capacity building. While they suggest that annual testing will increase the ability of teachers to diagnose student needs and provide the instruction each student requires, the types of testing programs implemented rarely provide much diagnostic data in a way that might shift performance in a sustained fashion. To make such a shift, one would have to implement what Stiggins calls "assessment for learning" in his 6 June 2002 Kappan article, "Assessment Crisis: The Absence Of Assessment FOR Learning."

In this article, Stiggins contrasts the use of standardized tests with assessment for learning, listing eight key elements required to improve student performance, three of which we will list here:

  • translating classroom assessment results into frequent descriptive feedback (versus judgmental feedback) for students, providing them with specific insights as to how to improve;
  • continuously adjusting instruction based on the results of classroom assessments;
  • engaging students in regular self-assessment, with standards held constant so that students can watch themselves grow over time and thus feel in charge of their own success;

Stiggins' approach would require a substantial investment in professional development and a more expensive approach to data collection than is now emerging with the NCLB mandates, but it would promote capacity building rather than emphasizing fear, sanctions, judgment and blame.

Unlike the annual testing scheme, Stiggins points out that assessment for learning has been documented as successful:

Black and Wiliam uncovered and then synthesized more than 250 articles that addressed these issues. Of these, several dozen directly addressed the question of the impact on student learning with sufficient scientific rigor and experimental control to permit firm conclusions. Upon pooling the information on the estimated effects of improved formative assessment on summative test scores, they reported unprecedented positive effects on student achievement. They reported effect sizes of one-half to a full standard deviation.
Paul Black and Dylan Wiliam, "Inside the Black Box: Raising Standards Through Classroom Assessment," Phi Delta Kappan, October 1998, p. 141. Their work is reported in more detail in idem, "Assessment and Classroom Learning," Assessment in Education, March 1998, pp. 7-74

When NCLB first arrived, fewer than a third of the states were already doing annual testing, according to a 2001 report from ECS (the Education Commission of the States) available at

Only 15 states currently test elementary and middle school students as extensively as Bush proposes – every year in grades 3-8, in both reading and math: Alabama, Arizona, California, Florida, Idaho, Louisiana, Maryland, Mississippi, New Mexico, North Carolina, South Carolina, Tennessee, Texas, Utah and West Virginia. Most states test at only two or three of those grade levels each year, and not always in both reading and math.

When the NAEP proficiency gaps mentioned above are considered for these 15 states, two of them, Mississippi and Texas, lead the nation with the largest shortfall or misstatement of proficiency, with gaps of more than 50 percentage points between state reported proficiency levels and NAEP levels, and North Carolina makes the top ten list. This would indicate that annual testing can produce the appearance of student achievement on locally developed tests without producing achievement that stands up to demanding and challenging national tests.

With a cheating scandal brewing in Texas as well as astronomical attrition rates masked by reports of low dropout rates, the data should cause grave concern about this brand of education. Note "A Lost Generation? A Million Left Behind?" and this article on the cheating scandal:

The questions of cheating arose after an investigation by The Dallas Morning News last year found strong evidence that educators were helping students cheat at nearly 400 schools statewide, including Houston.

"TAKS rates fall at probed schools - 17 Houston campuses accused of cheating see larger drop than district."

Annual testing can make teachers hyper-vigilant about teaching how to score well on a battery of tests while neglecting the broader and more important challenge of teaching students to read and understand difficult material.

2. Annual testing is expensive and reduces funds available for capacity building.

A number of groups have been calculating the gap between what Congress promised to spend with NCLB and the actual Bush funding. In its suit against FED ED for underfunding, NEA goes state by state to show the shortfalls.

In reporting on Connecticut, for example, NEA reports the following:


Public schoolchildren: 571,900
Public school districts: 191

Impact of costly federal regulations:
In FY 2005, Connecticut received $68.2 million less than it would have if No Child Left Behind was funded at the level Congress authorized. Under President Bush’s proposed budget for 2006, Connecticut would receive $109 million less than the authorized funding level.
• The proposed FY '06 budget shortchanges disadvantaged children by $71.3 million, leaving behind 24,888 children in Connecticut.

Source: NEA, "Washington Shortchanges Children and Public Schools Across the Nation"

A second group, Augenblick, Palaich and Associates, Inc., reports the following budgetary impacts on Connecticut:

In total, for FY 2004 the Connecticut study places a $5.9 million price tag on state costs and $58.8 million on the local costs of implementing NCLB, costs that by FY 2007 are expected to escalate to $7.2 million and $117.2 million, respectively. These costs do not include expanding state assessments to additional grades or adding science tests.

In FY 2004, the cost of implementing NCLB is expected to be 10 times greater for LEAs than for the SEA; by FY 2007, LEA costs are projected to have doubled and to exceed by 16 times the SEA’s cost burden.

Source: Costing Out No Child Left Behind - A Nationwide Survey of Costing Efforts April 2004 by Augenblick, Palaich and Associates, Inc.
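The ratios in that passage can be checked against the dollar figures quoted with it. The short sketch below does the division; the variable and function names are invented for illustration, and the only data used are the four Connecticut figures cited above.

```python
# Connecticut NCLB implementation costs in millions of dollars,
# as quoted above from the Augenblick, Palaich and Associates survey.
costs = {
    2004: {"state": 5.9, "local": 58.8},
    2007: {"state": 7.2, "local": 117.2},
}


def local_to_state_ratio(year):
    """How many times larger the local (LEA) cost is than the state (SEA) cost."""
    return costs[year]["local"] / costs[year]["state"]


ratio_2004 = local_to_state_ratio(2004)
ratio_2007 = local_to_state_ratio(2007)

# Growth of the local burden alone between the two fiscal years.
local_growth = costs[2007]["local"] / costs[2004]["local"]

print(f"FY 2004 local/state ratio: {ratio_2004:.1f}")
print(f"FY 2007 local/state ratio: {ratio_2007:.1f}")
print(f"Local cost growth, FY 2004 to FY 2007: {local_growth:.2f}x")
```

The results round to the "10 times," "16 times" and "doubled" figures in the quotation, so the report's ratios are consistent with its own dollar amounts.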

The gaps in NCLB funding start off at a serious level, but those gaps and shortfalls become monumental as the demands of the law spiral.

The NEA lawsuit is based on one section of the law that purports to outlaw unfunded mandates:

"Nothing in this Act shall be construed to authorize an officer or employee of the Federal Government to mandate, direct, or control a State, local educational agency, or school's curriculum, program of instruction, or allocation of State or local resources, or mandate a State or any subdivision thereof to spend any funds or incur any costs not paid for under this Act."

Since the law's enactment in 2002, there has been a $27 billion funding shortfall in what Congress was supposed to provide schools to meet the law's regulations and what has been funded. Cost studies in Ohio and Texas estimate that the price of the regulations to state taxpayers could run as high as $1.5 and $1.2 billion, respectively.

Source: NEA, "Washington Shortchanges Children and Public Schools Across the Nation"

While there has been some funding of reading programs under NCLB for the early grades, the focus on testing has strained local and state resources to fund professional development and capacity development.

In FY 2005, the nation's schools received $9.8 billion less than they would have if No Child Left Behind was funded at the level Congress promised in the law. Under President Bush's proposed budget for 2006, the nation's schools will receive $12 billion less than what Congress authorized when the federal law was enacted.

Making matters worse, in the current school year ten states and a majority of districts had their Title I funding cut, making it more difficult for them to provide extra reading and math help to disadvantaged students. Next year, nine states and two-thirds of all school districts will see less money than this year.

Source: NEA, "Washington Shortchanges Children and Public Schools Across the Nation"

3. Annual testing reduces the amount of time for learning.

Annual testing is a distraction from schools' real mission: teaching and learning, as several weeks are now required at most schools to step through test preparation exercises and the testing itself. Rather than reading new material and moving forward with state curriculum standards, students are held in a kind of testing limbo for weeks at a time. To make matters worse, anxiety over NCLB performance has led in many schools to supplemental diagnostic testing at the start of each year to make sure that each student receives highly targeted instruction in specific skills.

Between pre-testing and the actual testing, students may be involved in as much as 3-4 weeks of test-related activities distinct from normal learning. This distraction may account for as much as 10% of the year's instructional time.
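A rough check of that 10% estimate is easy to run. The sketch below assumes a 36-week (180-day) school year, a figure the article does not state and which varies by state and district, so the result should be read as a ballpark only. The names are hypothetical.

```python
# ASSUMPTION: a 36-week (180-day) school year. The article does not state
# the length of the year; this figure is used only for illustration.
SCHOOL_YEAR_WEEKS = 36


def share_of_year(test_weeks, year_weeks=SCHOOL_YEAR_WEEKS):
    """Percent of the school year taken up by test-related activities."""
    return 100 * test_weeks / year_weeks


low = share_of_year(3)
high = share_of_year(4)

print(f"3 weeks: {low:.1f}% of the year")
print(f"4 weeks: {high:.1f}% of the year")
```

Under that assumption, three to four weeks works out to a little over 8% and a little over 11% of the year, which brackets the 10% estimate.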

Rarely discussed is the impact of this lessened instruction, the actual reduction of effective teaching time because testing has become an obsession.

It would be great if we had scientifically controlled studies proving that less time spent learning leads to better reading and math performance, but those studies have not been conducted and funding for them seems unlikely under the current regime.

Common sense tells us that less time on task means less learning, but apologists for this obsessive testing approach try to mask its damage by word play, suggesting that devotion to testing is really just an integral but hitherto neglected aspect of instruction. How can we know what to give students, they ask, if we don't know what they are missing?

If teachers were implementing the type of daily instructional assessment for learning mentioned by Stiggins earlier, this argument might be persuasive, but the kind of annual testing conducted by most districts does not meet the criteria he stated. It is more about labeling, judging and sorting.

4. Annual testing creates pressure to teach to a narrowed curriculum.

Whatever happened to the notion of teaching the whole child?

In this current test-driven context, the whole child is an endangered species. Note the article this month, "Half Baked Ideas and Half Educated Children."

Sadly, high stakes annual testing leads to a primal focus on the tests themselves. Unless they are on the tests, other curriculum content, activities and goals become immaterial. Activities like silent reading, recess and art are pruned to achieve maximum test score progress.

Elementary schools phasing out recess

District sees teaching time as higher priority

By Emily Richmond

Recess at Clark County School District elementary schools is out as school officials try to wring as much teaching time as possible out of the school day.

Facing the pressure to increase test scores under the federal No Child Left Behind Act, school officials are enforcing regulations that bar the traditional elementary school ritual of recess.

Some will rejoice to see the end of recess and will applaud the narrowing of the curriculum to focus on nothing but basics, but the consequences for a democratic society are serious.

Pressurizing the lives of young children runs contrary to everything we know about the psychology of learning and narrowing the curriculum to a thin gruel of basic skills is a walmartization of the school experience.

Across the land, schools are pruning, cutting and emphasizing nothing but the tests.

Not content with the wholesale exporting of jobs overseas to low paid workers in foreign sweat shops, domestic educational policy makers have now endorsed what amounts to sweat shop practices for our youngest citizens. This would make good material for Charles Dickens.

Are there no work houses?

But even more chilling than the elimination of enrichment activities is the focus on drill and practice for tests to the exclusion of anything that might interfere with performance.

If the state tests require little in the way of thinking, why teach it?

Sadly, as we saw from the NAEP data above, students may do well on the state tests without developing the proficiencies they really need to perform well in the society.

5. Annual testing increases anxiety.

Increasing stress may work in sweat shops, but it violates almost every principle of good management for industry, and it certainly pays few dividends in schools. In the days of Greek and Roman galleys it may have paid to whip the slaves to row harder and faster, but children and teachers are not likely to respond well to this kind of pressure.

It is remarkable that such Draconian measures have crept into schools with so little challenge.

(Draconian laws, a code of laws made by Draco. Their measures were so severe that they were said to be written in letters of blood; hence, any laws of excessive rigor. Source: The Collaborative International Dictionary of English v.0.44).

The planners in charge of reforming schools seem to know little about either children or schools. They seem ignorant of basic learning psychology. They advocate a harsh, no nonsense approach to schooling and learning that makes no sense at all.

NCLB/Helter-Skelter forces teachers, students and school leaders into a survival mode - the bottom rungs of Maslow's Hierarchy of Needs.

What motivates people to do their best?

Some are fond of punishment and threat as motivators.

"How can you have any pudding if you don't eat your meat?" sang Pink Floyd a few decades back, foreshadowing the heavy handed reform policies of this century and this Congress.

This approach to changing schools and motivating teachers and schools is simply preposterous. There is no research that proves that fear and punishment lead to improved learning.

While the term "accountability" has widespread acceptance, it is employed to mean different things by different people.

The architects responsible for AYP and NCLB's system of labeling schools mean something very threatening by accountability.

  • Get your scores up or we'll close you down.
  • Get your scores up or we'll fire you.
  • Get your scores up or we will humiliate you.
  • Get your scores up or we'll call you names.
  • Get your scores up or we'll open the floodgates and encourage all your families to flee your school.
  • Get your scores up for all sub groups or we'll act like you did nothing for any of the others.

In his book, A Third Way to a Good Society, Professor Amitai Etzioni argues that a good society requires much more than this survival mode:

The evidence shows that profound contentment is found in nourishing ends-based relationships, in bonding with others, in community building and public service, and in cultural and spiritual pursuits.... The most profound problems that plague modern societies will be fully addressed only when those whose basic needs have been met shift their priorities up Maslow's scale of human needs. That is, only after they accord a higher priority to gaining and giving affection, cultivating culture, becoming involved in community service and seeking spiritual fulfillment. Page 33

How ironic that many schools have been working hard lately on bullying in the traditional sense, a welcome effort to block students from terrorizing their peers, while Washington has been bullying state governments, teachers, principals and schools.

6. Annual testing leads to cheating in some places.

We will never know how widespread the cheating in Texas became. The Dallas Morning News did an exposé in December of 2004 and identified some 400 or more schools where test results seemed so dramatically out of whack with past patterns that cheating seemed the only plausible explanation.

Why won't we know? Because Texas has inadequately funded accountability for its own system of school accountability.

That may seem strange, but the TEA budget for audits and auditing staff has been so small that educational fraud and cheating could go on without much fear of consequence.

That is worthy of repeating. There are almost no funds for audits and accounting.

How interesting that proponents of accountability would offer generous financial incentives to school leaders who "fixed" test scores and dropout rates while starving the compliance budget that would have added integrity to the system.

How can they proclaim miracles without making sure that fraud and cheating were identified and severely punished? The zeal for accountability seems distorted, uneven and selective.

While then Governor Bush was warned by TEA that the system of rewarding administrators might lead to misreporting dropout figures unless the budget for audits was increased, these warnings were ignored so the "miracle" could proceed unencumbered with veracity.

As far back as the year 2000, when the current President was Governor of Texas and the former Secretary of Education, Rod Paige, was Superintendent of the Houston ISD, the TEA (Texas Education Agency) issued a harsh report warning that the Texas system for recording dropouts (when combined with various incentive programs) would lead to serious under-reporting and under-counting. (Dropout Study: A Report to the 77th Texas Legislature) In short, the report stated that some schools and districts might sweep the dropout problem under some magic carpet. Instead of taking care of these troubled students, the system might erase them from the books. "Transferred."

Now that harsh publicity has descended upon Texas schools suspected of cheating, there has been a sudden reversal of fortunes.

The questions of cheating arose after an investigation by The Dallas Morning News last year found strong evidence that educators were helping students cheat at nearly 400 schools statewide, including Houston.

"TAKS rates fall at probed schools - 17 Houston campuses accused of cheating see larger drop than district."

Accountability evidently cuts in several directions.

7. Annual testing diverts resources from capacity building to testing.

Unfunded mandates have the effect of draining off resources from one activity to the mandated activities. Thus, in the case of annual testing, the federal failure to fund the expansion means that states like Connecticut must steal money from other programs to pay for the testing. Rather than building capacity and strengthening the delivery of sound instruction, annual testing can siphon off the funds needed for professional development and instruction so that the very resources most likely to produce results are strained and reduced in order to produce a testing program of questionable value.

8. Annual testing ignores what we know about healthy change and systems.

The premise lurking behind annual testing is that threat works wonders. It is a coercive strategy. Concealed behind rhetoric and smooth talk is an ominous message about human behavior. The architects of this plan basically hold teachers and school workers in contempt and feel they are incapable of making life better for children unless someone holds a (testing) gun at their heads and keeps them under constant pressure to perform.

"The Boss is watching you!"

In fact, it is now fashionable in many school districts to use test results as a form of supervision and management control, with principals assigned to make sure that all teachers are on the right page reading from the right script.

Lurking behind this strategy is the wish for teacher proof materials and teacher proof schools, the industrialization of schooling so that all children can benefit from a standardized, assembly-line approach.

Sadly, many schools that were doing quite well with school improvement prior to NCLB have now been judged failures under its strange rules. The impact on staff morale of such messages is devastating.

The architects of this change strategy simply do not understand schools and how they can change in healthy ways. They feel that autocratic, top down measures like annual testing will somehow scare teachers into both compliance and competence.

The success of such measures can be seen in Texas.

There are huge problems with dropouts, a huge gap between state claims of student proficiency levels and actual performance on the NAEP tests, and a potentially huge cheating scandal.

We simply have the wrong people dictating change strategies to schools. They are wrong-minded, ill-informed and hazardous to the well being of children and schools.


ECS Special Report
A CLOSER LOOK: State Policy Trends in Three Key Areas of the Bush Education Plan -- Testing, Accountability and School Choice

Costing Out No Child Left Behind - A Nationwide Survey of Costing Efforts April 2004 by Augenblick, Palaich and Associates, Inc.

© 2005, Jamie McKenzie, all rights reserved. This article may be e-mailed to individuals by individuals, but all other duplication, distribution, publication and use is prohibited without first receiving explicit permission. Contact for information.
Volume III, Number 4, April, 2005