
AGE vs GRADE Based scores

A question has come up about how to test kids who have been retained. On the Woodcock, do we score them by age, or by grade?? It would seem to me that they should be scored by grade, since it would not be fair to score them with their peers in a higher grade since they have not been exposed to the material or subject matter at the higher grade level. Let me know. Thanks! Kim

In my experience with younger kids, there is little statistical difference in the standard scores when scoring by age or by grade in the case of retention. I typically will score a retention case both ways on the scoring program and then compare them. They tend to meet the discrepancy either way. This may be different with older kids, but if they are run through the computer both ways, the examiner will be able to make very clear judgments regarding what is instructionally related and what is learner related.

Subject: RE: Grade/Age Equivalents

You are most correct in that many different interpretations of age/grade equivalents are made and most of them are WRONG. It is for this reason, and the fact that they are not equal-interval metrics, that measurement experts for the past 80 or so years have warned against their use. The usual meaning that parents and teachers come away with when grade/age equivalents are used is that the child has approximately the skills of the age/grade specified, which is not at all the case. The age/grade equivalent simply means that the child obtained the raw score corresponding to the average raw score obtained by a group of children at the specified age/grade. Most tests lack a sufficient number of items at any specific age/grade level of difficulty because of the broad age/grade range they cover. I recommend that my undergraduate and graduate students present only standard scores and percentiles and interpret those. Age/grade equivalents are so misused that they create more problems than answers. Sadly, I have heard psychologists misinterpreting age/grade equivalents to parents and teachers!


Subject: Re: Grade/Age Equivalents

A further caution about the age equivalents on the WISC-III. John Willis and I did a study examining the intra-subtest scatter on the subtests of the WISC-III. (Dumont, R. & Willis, J. O. (1995). Intra-Subtest Scatter on the WISC-III for Various Clinical Samples vs. the Standardization Sample: An examination of WISC Folklore? Journal of Psychoeducational Assessment, 13, 271-285.)

What we found was that it was quite normal for someone to have scatter within a subtest. Since the age/grade equivalents are interpolated from the raw score, the fact that a person can have quite a bit of scatter within the scale suggests that the age/grade equivalents are further confounded by normal variation. A person can have a raw score of 10 on Information by answering the first 10 items correctly and failing all others, or get a raw score of 10 by answering correctly every other item from 1 to 20. Same raw score, very different performance (but remember scatter is normal).

Hope this is useful.
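The two response patterns described above can be written out as a small sketch. Everything here is illustrative: the 20-item length, the 1/0 coding, and the `scatter_count` index are assumptions made for the example, not the scoring rules of the WISC-III or the measure used in the published study.

```python
def raw_score(responses):
    """Raw score is the number of items passed (1 = pass, 0 = fail)."""
    return sum(responses)


def scatter_count(responses):
    """Hypothetical scatter index: failed items that precede the last pass."""
    if 1 not in responses:
        return 0
    last_pass = max(i for i, r in enumerate(responses) if r == 1)
    return responses[:last_pass].count(0)


# Pattern A: the first 10 items passed, every later item failed.
pattern_a = [1] * 10 + [0] * 10
# Pattern B: every other item passed from item 1 through item 20.
pattern_b = [1, 0] * 10

for name, pattern in (("A", pattern_a), ("B", pattern_b)):
    print(name, "raw score:", raw_score(pattern), "scatter:", scatter_count(pattern))
```

Both patterns earn the same raw score, so both would be assigned the same age equivalent, even though the second child failed nine items before the last one passed.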


> I am only familiar with a table of age equivalents that appears in the WISC-III supplementary tables in the back of the Manual; I use the other Wechslers so seldom in my current assignments I'm not sure what they have. The WISC-III gives age-equivalents for subtest raw scores; and my interpretation is that this raw score was the average score obtained by students of the given age. At least that's usually the implication. I don't know how this specific table was generated, but often in drawing up age-equivalent tables the means at specific ages are plotted, and ages in between obtained means are interpolated. My understanding is an overall developmental age for the WISC-III may be estimated by computing a median among the obtained subtest age-equivalents.

> I use WISC-III age equivalents very cautiously, usually only where it may be useful to have an approximate cognitive development level. For example, a child referred as possibly having AD/HD, but who is cognitively delayed, is going to be developmentally like a much younger child in the degree of distractibility and impulsivity they demonstrate. If they demonstrate distractibility and impulsivity relative to same-age peers, it may be due to general cognitive delays, rather than AD/HD. There are times when a developmental age is useful to emphasize a student's ability to reason and use good judgement; e.g., in a manifestation determination hearing on an MR student who was caught with drugs -- it may be useful for the Multi-Disciplinary Team to understand that this is a 14 year old who reasons like an 8 year old.

> Having said all that, age and grade-equivalents must carry a lot of caveats. They are not equal-interval scales (going from age 8 to 9 is actually a greater amount of growth than going from age 11 to 12), and because of that, you cannot apply ordinary arithmetic (which is the reason for using median age-equivalent as a measure of central tendency), as you might want to do in showing growth over time. Variability in scores is not standard throughout the range of the scale, so a "two year deficit" is not as severe at some ages as it may be at others. Grade- and age-equivalents appear to be more sample-sensitive than standard scores; that is, there appears to be more variability between tests in grade- and age-equivalents, and more consistency when comparing standard scores between tests, when samples are comparable but don't contain the same subjects.

> John 
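The interpolation and the median described in the message above can be sketched in a few lines. This is only an illustration under stated assumptions: the table of mean raw scores is invented, `age_equivalent` is a hypothetical helper, and the actual WISC-III tables may well have been generated differently.

```python
from statistics import mean, median

# Hypothetical mean raw scores at whole-year ages for one subtest.
# Growth flattens with age, which is why the scale is not equal-interval.
MEAN_RAW_BY_AGE = {6: 6.0, 7: 9.0, 8: 12.0, 9: 14.0, 10: 15.5, 11: 16.5, 12: 17.0}


def age_equivalent(raw, table=MEAN_RAW_BY_AGE):
    """Linearly interpolate the age whose mean raw score equals `raw`.

    Assumes the table's mean raw scores increase with age.
    """
    ages = sorted(table)
    if raw <= table[ages[0]]:
        return float(ages[0])
    if raw >= table[ages[-1]]:
        return float(ages[-1])
    for lo, hi in zip(ages, ages[1:]):
        if table[lo] <= raw <= table[hi]:
            frac = (raw - table[lo]) / (table[hi] - table[lo])
            return lo + frac * (hi - lo)
    raise ValueError("table is not monotonically increasing")


# Hypothetical age-equivalents from five subtests for one child.
subtest_age_equivalents = [7.5, 8.0, 8.5, 9.0, 13.0]

print(age_equivalent(13.0), age_equivalent(16.0))
print(median(subtest_age_equivalents), mean(subtest_age_equivalents))
```

With these made-up numbers, the raw-score gain from age 8 to 9 (2 points) is four times the gain from age 11 to 12 (0.5 point), which is the unequal-interval problem in miniature. The median of the five subtest age-equivalents is 8.5, while the mean is pulled up to 9.2 by the single high value.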

Subject: RE: Grade/Age Equivalents

Gary, partly your sadness is due to the failure of training programs to provide a thorough grounding in the relationships between data types (categorical, interval, etc.) and the metrics used to describe them, and partly also a failure of the interpreters to understand that the underlying distributions in skills represented by different grade equivalents can be enormously different; the reading skills difference, for example, between a beginning 2nd and 3rd grader is much larger than that between a 9th and 10th grader, but both differences "look" like 1 year.


Sadly, I have heard psychologists misinterpreting age/grade equivalents to parents and teachers!

Subject: Re: Grade/Age Equivalents

Some years ago, the International Reading Association condemned the use of grade equivalents for reading tests. Good for them! Reading Teacher, January 1982, p. 464.

John Willis

Subject: Re: Grade/Age Equivalents

I had the pleasure of testifying in a due process hearing in which a doctoral-level neuropsychologist swore under oath that a student was working at a fourth grade level in arithmetic (the student's grade level on the WRAT). Since the student was in eighth grade, that seemed very sad. However, the subpoenaed protocol showed the student computing with common fractions with unlike denominators, doing long division with decimal fractions, and generally handling very advanced arithmetic competently. The low grade equivalent stemmed from simple errors on sums and products and occasional misreadings of digits and operation signs. The neuropsychologist not only misinterpreted his/her own findings, but convinced him/herself of the validity of the misinterpretation. Happily, the hearing officer was wiser.


Subject: Re: Grade/Age Equivalents

The complaints you've seen about age equivalents on the WISC also pertain to both age and grade equivalents on the WIAT. It doesn't matter how they are derived. Don't use them. Grade-level designations of items are a different matter. For example, reading inventories and some reading tests, e.g., Diagnostic Assessments of Reading with Trial Teaching Strategies and the Diagnostic Reading Scales by Spache, have the student attempt tasks designated at specified grade levels, so you can say Ralph handled the third grade material easily, the fourth grade material marginally, and crashed and burned on the fifth grade material.

John Willis

Date: 3/20/99 10:16pm

Subject: Re: [Fwd: age vs. grade norms -Forwarded]

Depends on what you are trying to prove. In the state of Colorado age norms are preferred. I generally will run a profile of both if there has been a retention because when there is a retention involved, the kid will always come out looking bad on age norms when it is actually a lack of opportunity for instruction due to the retention. If you are simply interested in whether the grade placement is appropriate for that child, then grade norms might be preferred. Otherwise, compare the child to others the same age. Should you run both age and grade norms on a child who has not been retained, you will find there is little to no difference in the standard scores.

Subject: Re: No Subject

International Reading Association (1982). Misuse of grade equivalents: resolution passed by the Delegates Assembly of the IRA, April 1981. Reading Teacher, January, p. 464.

Lyman, H. B. (1991). Test scores and what they mean. Boston: Allyn and Bacon. pp. 112-113.

Sattler, J. M. (1992). Assessment of children, rev. & updated 3rd ed. San Diego: Jerome M. Sattler, Publisher. pp. 20-21.

Willis, J. O. & Dumont, R. P. (1998). Guide to identification of learning disabilities, 1998 NY State edition. Acton, MA: Copley. pp. 69-70, 222-223.

John Willis

[NOTE: email the IRA and they may send you a copy. That's how I got one. - rogerp]

Subject: Re: Test Interpretation

Just a point about intrasubtest scatter (when a person fails "easy" items but goes on to pass "harder" items). John Willis and I investigated this phenomenon by sampling 410 students having various classifications (LD, ADHD, S/L, premature birth). [Dumont, R. & Willis, J. O. (1995). Intra-Subtest Scatter on the WISC-III for Various Clinical Samples vs. the Standardization Sample: An examination of WISC Folklore? Journal of Psychoeducational Assessment, 13, 271-285.] In our conclusions we wrote:

"Although intrasubtest scatter might be indicative of some dysfunction in adults, extreme caution must be taken when making the same interpretation with children. The face-value assumption that groups of children identified as needing some form of educational service would display more intrasubtest scatter than normal was not supported by this study. In fact, of the 12 cases where scatter was evident and significantly different from the mean of the standardization group, 7 indicated significantly less scatter. Further analysis with subgroups based on both diagnosed disability and test age in fact demonstrated no meaningful difference among students with diagnosed disabilities, students referred and found not to have educational disabilities, and students in the original WISC-III norming sample. As seen from this study, the data on intrasubtest scatter suggest that it may be fairly common in those suspected of having learning disabilities, but the intrasubtest scatter was neither large nor consistent enough to serve as a diagnostic marker."

We called this sort of issue "WISC Folklore" since it sounded like it should make sense but when examined empirically, it was not supported.


Jim wrote:

> Sattler's book does a good job of explaining inter and intra subtest scatter. If I remember correctly, if the range of subtests in the verbal or performance realm is 7 or greater (e.g. Vocabulary 12 and Information 4) intrasubtest scatter exists and must be explained in the interpretation section that the result may not be an accurate representation of the child's true ability due to the scatter in performance. Jim Lloyd

This article comes from:

CEC: TODAY: Exclusively for membership of the Council for Exceptional Children, Vol. 5 No. 4 November 1998

Assessments Fail to Give Teachers Relevant Information

Though students with disabilities are routinely given assessments to determine their eligibility for special education services, these assessments rarely provide special or general education teachers the information they need to provide effective instruction. Too often, teachers are given only an overall score for a student's IQ or achievement level, which provides only limited information about a student's functioning. In other cases, the assessments are too narrow to provide an accurate picture of the student's abilities.

To redress this situation, assessments need to be restructured and teachers need to receive more complete information, according to experts in the field. Rather than giving students a standard set of tests, such as Wechsler and Woodcock-Johnson, as is often the case, educational diagnosticians must broaden their use of assessment tools. In addition, educational diagnosticians should know when they can appropriately provide students with accommodations for individual assessments.

While assessments have always played a central role in determining a student's eligibility for special education, they have come under fire in the past few years. Critics claim that current assessment techniques do not accurately identify the presence of disabilities, particularly learning disabilities or behavioral disorders. Others say that through our assessments we focus too much on a student's weaknesses rather than his or her strengths. The need for refinement of assessment strategies has been further highlighted by the Individuals with Disabilities Education Act (IDEA) 1997, as well as current reform movements emphasizing standards. Because students with disabilities must have access to the general education curriculum and they are expected to meet the same high standards, teachers must have more information about a student's abilities and current functioning if they are to work with him or her effectively. "IDEA '97 points out the need for individual comprehensive evaluations," said Douglas Smith, professor at the University of Wisconsin, River Falls.

Weaknesses of Current Assessment Practice

The primary problem with current assessment practice appears to be the lack of comprehensive information, either because the tests fail to provide it or the assessor limits the information that could be gleaned from the process. He or she may restrict the number of tests given or the number of subtests within an assessment tool.

Ironically, computerization has contributed to the lack of relevant information about a student's functioning that is passed on to the teacher. Rather than receiving the educational diagnostician's analysis of student achievement, learning patterns, and other information that can be used to determine instructional strategies, the teacher often receives an overall computerized score, according to Rosalind Rothman, professor at the Center of SUNY and CUNY, NY. Such scores might say a student is working at the third grade level but do not give any type of analysis of the student's abilities or problems.

Following is an overview of some of the assessments often used with students with disabilities and their strengths and weaknesses.

Intelligence Tests

The way we currently test for cognitive processing shortchanges teachers in two ways: we can derive a falsely high score and we do not learn in what ways learning is being interfered with, according to Gary Hessler, consultant for Macomb Intermediate School District in Clinton Township, Mich. That is because examiners often administer only the first three subtests, which assess fluid intelligence (abstract thinking, problem solving), verbal intelligence, and non-verbal intelligence. Unfortunately, it is often the lower level cognitive processing that causes problems for our students, Hessler says. Lower level processing includes:

Long-term retrieval - The ability to retrieve information on demand.

Short-term memory - The ability to hold information in one's immediate awareness long enough to think about it.

Working memory - The ability to remember information long enough to think about it and use the information to solve a problem.

Processing speed or automaticity - How rapidly and automatically one can perform simple tasks (affects routine abilities like sight word knowledge and math facts).

Phonological awareness - How well one understands that words are made up of sounds.

Orthographic ability - How well one perceives and retains visual letter patterns.

Fine motor ability - The ability to rapidly perform fine motor tasks, such as handwriting.

To use an intelligence test, such as the Wechsler, to its best advantage, we must go beyond an overall IQ score, recommends Hessler. Examiners should administer several of the subtests, evaluate each separate subtest, and get a sampling of each. From this information, the evaluator can see patterns and discern where the student is having a problem in cognitive processing. While all 10 of the intelligence factors should be considered, the student may not need to be evaluated formally in each area, says Hessler. Examiners can learn a lot about a student's capabilities by interviewing his or her teachers, parents, and others who work with the student.

Achievement Tests

While the Woodcock-Johnson Achievement Test (WJ) and the Wechsler Individual Achievement Test (WIAT) are two of the most commonly used assessments for students with disabilities, a study by Rothman showed that the tests have several weaknesses.

Reading. The WJ passage comprehension sections were viewed to be unrelated to what is done in the classroom. The WIAT subtest questions are poorly constructed and the passages are too short for older students.

Math. Neither the WIAT nor the WJ provided enough examples of different problem types in the computational math subtest. Again, the WIAT was inappropriate for older students. The problem solving section of the WJ relies too much on time and money to give a valid assessment of students' abilities.

Written/Oral Language. Interpretation of written and oral language on both tests is very subjective. In addition, the WIAT's subtest on oral language should be redesigned. The WJ's written language subtest for the older grades is unrelated to the type of writing students are expected to perform.

While no one denies the need for testing, we must re-evaluate the instruments we currently use to make the assessments more relevant to what is expected in the classroom. The question is whether or not we can do that with our current instruments, Rothman says.

Marci, in one sense I guess it depends on what the reason is for reporting the scores. If you want to see how the child is doing compared to his grade peers, some people use the grade scores. The problem with using grade scores, however, is best seen through a little thought experiment. Let's assume a hypothetical first grade kid is expected to master adding a column of 3 single digit numbers and let's further assume our target kid has just learned to accomplish that task reliably. On the surface it sounds like he gets an "average grade standard score" for first grade on this task. Now what if I tell you the kid is 14. Hmmm.

My point is that using grade scores for kids who have been retained (or as a measurement of any child of an arbitrarily older age than his grade peers) covers up the fact that our hypothetical kid is probably better thought of as having seriously deficient math skills rather than as having solid first grade skills. My experience is that grade standard scores are often used to give an illusory sense of accomplishment and justify the retention of kids through statements like "Johnny's doing so much better now that he has been retained. His scores are average!" There is often the additional tacit assumption that because Johnny's skills are first grade level, he can function in the first grade curriculum. That's often wrong too; Johnny's functioning is probably very different from a first grader, although he happened to get the same number of problems correct on the test.

This problem again came into focus recently for me when assessing an 8th grade student. Her scores, for the most part, were close to average for 8th grade students, but she was still failing all her core courses miserably. The tests said she shouldn't be having such serious problems when grade norms were used. However, the girl had been retained twice; she was actually two years older than her peers. Using age standard scores it was easy to make a case that she needed special services. This is not the first time I have seen students experience de facto denial of services because they had been retained.

The moral of the story now that I get off the soap box? Stick with age standard scores.
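The arithmetic behind the twice-retained 8th grader can be sketched as follows. The standard-score metric (mean 100, SD 15) is the usual convention for these tests, but the raw score and both norm entries below are invented for illustration; a real scoring program uses the publisher's norm tables, not a single mean and SD.

```python
def standard_score(raw, norm_mean, norm_sd):
    """Convert a raw score to a standard score (mean 100, SD 15)
    relative to one reference group."""
    return round(100 + 15 * (raw - norm_mean) / norm_sd)


# Hypothetical norm entries: (mean raw score, SD) for each reference group.
GRADE_8_NORMS = (42, 10)   # typical 8th graders, mostly about age 13
AGE_15_NORMS = (56, 10)    # all 15-year-olds, mostly in 10th grade

raw = 40  # the retained student's raw score (made up)

grade_based = standard_score(raw, *GRADE_8_NORMS)
age_based = standard_score(raw, *AGE_15_NORMS)
print("grade-based:", grade_based, "age-based:", age_based)
```

With these numbers the same raw score yields a grade-based standard score of 97 and an age-based standard score of 76. Running both, as several posters here do, makes the gap visible instead of hiding it.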


Excellent replies from all. I'd like to add the legal aspect.


300.541 Criteria for determining the existence of a specific learning disability.
(a) A team may determine that a child has a specific learning disability if--
(1) The child does not achieve commensurate with his or her age and ability levels in one or more of the areas listed in paragraph (a)(2) of this section, if provided with learning experiences appropriate for the child's age and ability levels; and
(2) The team finds that a child has a severe discrepancy between achievement and intellectual ability in one or more of the following areas:

We all know this section. Notice that the criterion is "does not achieve commensurate with his age and ability" not "his grade and ability."

I'd also like to add the point of looking closely at the phrase "(a) A team may determine that a child has a specific learning disability if--". The team MAY determine....

It does not require you to have a severe discrepancy. Discrepancy is inclusionary but not exclusionary.

Ron Dumont

We work in a legal environment. My state requires that age based scores be used. That's based on the federal regulations, which say:

Sec. 300.541 Criteria for determining the existence of a specific learning disability. (a) A team may determine that a child has a specific learning disability if-- (1) The child does not achieve commensurate with his or her age and ability levels

Age. Not grade. You could of course invoke Section 300.354* and argue that a severe discrepancy based on age was the result of your school having failed to instruct him. In its guidebook, "Taking Responsibility for Ending Social Promotion" (issued July 1, 1999), the U.S.D.O.E suggests that retention is never an appropriate or sound educational intervention. So you could say, "Sorry. Your kid doesn't qualify because we shafted him. Since we effectively ruined your child's chances of ever doing well in life by retaining him, he doesn't qualify for any help." If you try that argument, please let me know how it turns out.

*300.541 (b) A child may not be determined to be eligible under this part if-- (1) The determinant factor for that eligibility determination is (i) Lack of instruction in reading or math.

Guy M. McBride

While age norms should always be used to consider eligibility, there is a place for grade norms. I am not talking about age and grade equivalent scores, as some have, but about the standard scores and percentiles compared with others at the same age and the same grade level. For a retained child, I will run grade norms on math, which is very much tied to teaching certain skills at certain grade levels. Once a child is a reader, reading and writing skills are tied less to instruction than to accessibility, so age norms are almost always the most appropriate. However, for young children in first and second grades (our cut-off is Dec. 2, as opposed to Sept. 1, so many of these children are young by most state standards) at the SST level I will run both grade and age norms to show teachers that even though these children seem "behind" compared to classmates, they are indeed acquiring skills at a "normal" rate for their age.

Rhonda J.

Actually, as much as I have personally railed against the use of grade equivalents, they also have a place. Some courts have accepted g.e.'s as valid indicators of a child's progress. When a school system is arguing in a tuition reimbursement case that it had provided FAPE to a child, pre-post grade equivalents showing, on average, a year's progress can be very persuasive. (Houston Independent School District v. Caius R. et al., 5th Circuit, January 2000.)

I'm not as comfortable over using grade norms in an attempt to convince teachers that children with serious classroom problems are really normal. Telling a teacher a kid didn't meet state standards is one thing; implying s/he was behaving like Chicken Little when there's nothing wrong with the child wouldn't enhance teacher-psychologist rapport in my district.

I find that grade standard scores are more misleading than age standard scores. Because curriculums differ from school to school and system to system, one cannot say that Johnny achieving at the 2.3 grade level in reading means the same thing from school to school. It depends on the curriculum. Age standard scores, however, look at where you should be academically based on your age, so the question of whether the child has been retained drops out. I think the age standard score removes a lot of possible contributing factors. For example, all children do not begin school at the same age. Just because a child is seven years old does not mean that they will be in the second grade. They may be a third grader who started school early.

That's just my $.02 worth.

I have not been paying close attention to this thread, but just wanted to add a simple FYI.

I have had considerable experience in calculating age and grade norms for major tests. I was the person who calculated all the age/grade norms for the WJ-R and am doing the same for the WJ III. When calculating norms, we applied psychometricians typically plot median scores for ordered age or grade groups and then smooth a curve through the data points. The curve smoothing is necessary because of the sampling error present in the small subsample statistics (each data point is typically based only on 50-100 subjects).

For what it is worth, the grade norm curve fitting/smoothing is always MUCH easier. The sample statistics show much less "bounce" (sampling error) and the curve fitting is like a hot knife going through a stick of butter. On the other hand, age norm plots always show much more sampling error between the different points and the curve fitting is often more difficult. In other words, the grade norm fitted solutions are more precise and have less error (the standard errors are smaller).
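The general idea of plotting subsample medians and smoothing out the "bounce" can be illustrated with a toy example. This is not the procedure used for the WJ-R or WJ III norms, which is not described here; a simple three-point weighted moving average stands in for the curve fitting, the medians are invented, and `roughness` is a hypothetical index of bounce.

```python
def smooth(values, weights=(0.25, 0.5, 0.25)):
    """Three-point weighted moving average; the two endpoints are left unchanged."""
    out = list(values)
    for i in range(1, len(values) - 1):
        out[i] = (weights[0] * values[i - 1]
                  + weights[1] * values[i]
                  + weights[2] * values[i + 1])
    return out


def roughness(values):
    """Sum of squared second differences: larger means more 'bounce'."""
    return sum((values[i - 1] - 2 * values[i] + values[i + 1]) ** 2
               for i in range(1, len(values) - 1))


# Hypothetical median raw scores for seven ordered age groups,
# each based on a small subsample and therefore noisy.
observed_medians = [10.0, 13.0, 12.0, 16.0, 15.0, 19.0, 18.0]
smoothed = smooth(observed_medians)

print(smoothed)
print(roughness(observed_medians), roughness(smoothed))
```

The observed medians zigzag even though the underlying growth is presumably steady; the smoothed values rise without reversals and the roughness index drops sharply. Noisier input (as described above for age groups) leaves more for the smoother to do and more room for error in the result.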

Just thought some folks may find this interesting.

Kevin McGrew

Just to clarify, Guy, I don't use this technique for children with serious classroom problems at the Student Study Teams. What usually occurs is the teacher brings up a child who is well-adjusted and making good academic progress but is "behind" most of the other children. This information, when combined with other supporting ecological information, is intended to provide a context for the child's performance as developmental rather than pathological and to identify appropriate areas of support and intervention given this situation.

In California, our teachers are so hyper about the type of emphasis placed on test results by teacher that some are referring everyone who is not at or above grade level, being as the state goal is to have all students above the 50th percentile (for grade) on the statewide assessments. Those politicians are SO smart, to be sure. Add to that the fact that principals and teachers can be removed if their school has not accomplished that goal in three years and, well, you can see that California has its OWN special education system.


Okay, see there were these two carpenters eating their lunches while sitting under a china berry tree. One said to the other, "I don't believe in saws; I think hammers are always better." To which the other one said, "Please pass the mustard."

I probably should just stop there, but some people will not see the relevance to the "Saw vs. Hammer" debate. See it is just real hard to cut boards with age norms and hard to hammer nails with grade norms. So, next time you want to compare a child to his grade-mates as opposed to his age-mates, please remember to use a hammer.


The only changes OSERS made in the LD definition in 1997 were to address Congress's intent to expand the rights of parents to participate in the process. It recognized that there is a lot of debate over the LD definition, but its only response (both in the preface and in Appendix B) was to make a commitment to study the issue over the next few years and make a recommendation.

There must be a severe discrepancy under the law for a child to qualify for services. OSERS has left that to the states to operationally define since 1977. Ron is quite right that the Federal Regulations do not and have not ever operationalized "severe discrepancy" which is why you have some states (like New York, for example, with its non defined 50% discrepancy rule) that have virtually no definition at all, some states with regression formulas, and other states with fixed cut offs.

In 1996, Wisconsin, after having applied its rules without criticism for eighteen years, came under the scrutiny of OSERS. Inasmuch as OSERS was sufficiently concerned that it withheld $52 million in state aid, and as there have been no substantive changes in its definition since, their concerns are worth reviewing in attempting to make some sense out of the legal implications of the definition.

Wisconsin's response can be accessed at:

Wisconsin's interpretation, unchallenged by OSERS, is summarized in the following paragraph from their letter to Thomas Hehir, then OSERS' director:

"M-teams in Wisconsin generally recognize that the criteria contained in both the federal regulations and the state rules must guide an evaluation, but do not direct an M-team to make a finding of LD eligibility or ineligibility for a particular child. Both the state and federal criteria are permissive in that they require evaluation teams to consider certain eligibility criteria, but they do not require an evaluation team to reach a conclusion solely because the child meets or fails to meet those stated criteria. The rules require evaluation teams to consider the criteria and the performance of the child against those criteria, but they also require the evaluation teams to use professional judgment in making individual eligibility determinations."

Wisconsin's rules were idiosyncratic in a number of respects, but their contention was that MD teams misconstrued them when they were applied in a manner inconsistent with the federal definition. In some cases, children with IQs below 90 were being denied services; the federal definition only excludes children who are mentally retarded. (Which, by the way, does not exclude every child with a 68 IQ from being considered LD, if their adaptive behavior scores are not significantly below average.) Wisconsin had added several areas of potential disability to the federal definition, but it required MDTs to identify two areas in order to qualify the child; in cases where a child had a discrepancy in an area not covered by the federal definition, this was not problematic, but in a few cases some otherwise very eligible kids were being excluded. Lastly, some children with severe discrepancies were being automatically identified by some teams even though there was no finding that they needed special education in order to receive FAPE. OSERS' position was that this was a flagrant violation of IDEA, and Wisconsin agreed.

It is probably not possible to list ALL the factors that a team might consider in reaching a decision, but court decisions and OSERS letters have given us some suggestions.

1. Neither a low IQ score (see above) nor a high IQ score may be used to exclude a child from consideration as LD.

2. The Department of Education, in its letter to LDA of North Carolina, wrote that it is "generally" appropriate for the multidisciplinary team to include in its written report (to determine eligibility) information regarding "outside or extra" instructional help or support which "may indicate the child's current educational achievements reflects the service augmentation, not what the child's achievement would be without such help." Within context, the Learning Disabilities Association interpreted that to mean a child need not have failing grades if he or she is only passing as a result of special service or support, such as tutoring twice a week or a parent who spends three to five hours with the child on homework each evening. (Ref: see above.)

3. Matthew Effect. If there is prior evidence of higher IQ, and present testing shows a decline that results in the child being ineligible, the team may consider whether the disability may have resulted in significantly different learning experiences which have negatively impacted the scores. While this argument has been advanced in hearings and sometimes favorably regarded (Brody v. Dare, for example, on the website), I have not seen it argued at a circuit level. Also, while I've seen research suggesting Matthew (or Mark) effect is a real consideration, I do not have data quantifying it--that is, I have no clue as to whether one would normally expect a three point drop or a twenty point drop as a consequence. I am therefore not a strong proponent for applying this on a regular basis to every child whose score dropped (why re-test if we're going to do that?), but I do recognize it as a factor to be considered.

4. North Carolina says "When there are verbal/performance IQ discrepancies of at least 20 points on the Wechsler Scale, the higher scale IQ may be used to determine the achievement-ability discrepancy providing there is evidence that the higher score accurately reflects the student's intellectual functioning. Because of the importance of the intellectual assessment to the identification process, group intelligence tests, unjustified prorated scores or extrapolated scores and abbreviated forms shall not be used." Other state definitions may offer similar clauses.

5. North Carolina also says, "If the multidisciplinary team determines that the assessment measures did not accurately reflect the discrepancy between achievement and ability, the team shall state in writing the assessment procedures used, the assessment results, the criteria applied to judge the importance of any difference between expected and current achievement, and whether a substantial discrepancy is present that is not correctible without the provision of special education. [1999 Procedures, .1505, C. (8) (c) (iii)]" With respect to that underlined phrase, I see that as saying, "If you don't like our definition, you may use your own, as long as you write down what it is." I suspect other states, however, have similar elastic clauses. It could be especially handy when reviewing a child who meets another state standard and who would otherwise be qualified for sped under Section 504. And while it is frustrating to have your state say, in effect, "Gee, guys, if you don't like our definition, make up your own," I think that what they've written is actually consistent with and a reflection of the ambiguity in the federal regulations.

Guy M. McBride

I think you are mistaken on several counts. You are right about the different curricula and how grade scores do not reflect any particular curriculum. However, neither does age. For one thing, different states, and different districts, as you have noted, have different cutoff dates for school entry, so that a child who turns 5 in October may enter school in one district and may not enter until the following year in another district. More importantly, there is no absolute "what a child should know" or "where a child should be" based on either age or grade. What we have are learning curves with a broad range of distribution along the curve for any age or grade. Furthermore, most norm referenced tests are just that, referenced to the norm, not the criterion - curricular or developmental. Standardized tests best measure how well a student performs on that test compared to other children that age or grade across the nation (or otherwise, if state or local norms are available). Use of age or grade based norms identifies the reference group, and the child's relative performance to that group. As we depart from that, we enter the realm of speculation, in which we try to apply research, theories, and experience to interpret the meaning of these scores.

Just a couple more cents, as Mr. Greenspan tries to hold back inflation.

Jonas Taub

I find this interpretation of IDEA problematic. Are you saying that a student whose achievement is commensurate with his or her age and ability is disabled? My read of this section may be faulty, but it seems to me that the intent was for lack of a discrepancy to be at least potentially exclusionary. I thought the "may" language was intended to allow teams flexibility in that just having a discrepancy does not necessarily indicate SLD eligibility. Help me to understand your interpretation.



I read IDEA to say that the determination of an educational handicapping condition is a team decision, made with adequate and accurate information. It is not, as my friend John Willis points out, "An exercise in arithmetic...but if you use arithmetic do it right." If a child has a severe discrepancy (and there is, I believe, no definition of severe or mandate that it be some numerical value) then, if all else is in place (pre-referral interventions, a disorder in basic psych. process, a need for an individualized program of specialized instruction), the child MAY be considered as a child with a handicapping condition. What I also believe (and I bet a whole bunch of attorneys do too) is that a child who does not live up to some numerical "severe discrepancy" should not be automatically excluded from being classified as a child with a handicapping condition. I think it ridiculous (but real) that a district near me has set as a criterion for severe discrepancy the need for a 24 point difference between IQ and achievement. Just yesterday I stupidly asked, "What about the child with a 23 point difference?" Answer: "No classification." States, districts, and teams have the right to determine for themselves what a severe discrepancy is. Just because you don't find one doesn't mean the child can't qualify.

I am not in any way advocating an opening of the floodgates. (We may already have that with EBD - Every Body's Disabled or was that Emotional/Behavioral Disturbed?) I simply believe that many children with true learning problems and true learning disabilities are being unfairly, and I claim illegally, excluded from services because of the overreliance on the severe discrepancy statements. I repeat that the law says MAY determine, not must determine. May does not mean must. There are other places in the revised IDEA where the schools are allowed to do things if they choose. The language there is also MAY, not must. Look at section 20 USC 1413 (g) School based improvement plan:

"Each LEA may in accordance with paragraph (2), use funds made available...."

This doesn't demand that the LEA do it, only that it has the option to do it.

Sure, there should be some "discrepancy" that triggers concern, but lack of a numerical discrepancy may be the result of inadequate assessment, improper choice of tests/tasks, misunderstanding of statistical aspects (correlation, regression), etc. Can't a discrepancy be that the child is in 3rd grade, has been given wonderful access to good teaching (avoiding the common disorder of dyspedagogia), has had interventions that seemed likely to solve the problems, and yet the child is still, based on a curriculum-based assessment, well behind where he/she should be? I say the team has the right to determine that, all things being considered, this child has a severe discrepancy and is eligible for classification.
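To make the statistical point concrete, here is a minimal sketch comparing a simple-difference discrepancy with a regression-based one. It is not any state's or district's actual procedure. The IQ-achievement correlation of 0.60, the sample scores, and the function names are all assumptions chosen for illustration, and both scores are assumed to be standard scores with a mean of 100.

```python
MEAN = 100.0  # assumed standard-score mean for both IQ and achievement


def simple_difference(iq, achievement):
    """Discrepancy as a plain subtraction of standard scores."""
    return iq - achievement


def predicted_achievement(iq, r_xy):
    """Achievement expected from IQ, allowing for regression to the mean."""
    return MEAN + r_xy * (iq - MEAN)


def regression_discrepancy(iq, achievement, r_xy):
    """Gap between regression-predicted and obtained achievement."""
    return predicted_achievement(iq, r_xy) - achievement


R_XY = 0.60  # hypothetical IQ-achievement correlation, illustration only

# Hypothetical (IQ, achievement) pairs.
for iq, ach in [(85, 70), (100, 85), (125, 101)]:
    simple = simple_difference(iq, ach)
    regressed = round(regression_discrepancy(iq, ach, R_XY), 1)
    print(f"IQ {iq}, achievement {ach}: simple {simple}, regression {regressed}")
```

Under these assumed numbers, the child with an IQ of 125 meets a 24 point simple-difference cutoff while the child with an IQ of 85 (15 points) does not, yet the regression-based gap is larger for the second child (21 versus 14). Which method a team uses can change who "has" a discrepancy, which is one reason the number alone should not settle the question.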

I think Shakespeare had it right when he wrote "Much Ado About Nothing." If Joseph Heller were alive today he could add Severe Discrepancy Discrepancies to his wonderful Catch-22. I think that a district that recognizes a problem in a child and refuses to classify because of the severe discrepancy clause is going to run the risk of alternative interpretations. I say, if the discrepancy is there you have an added indicator that the child is in trouble, but when the numbers don't fit, you still can classify. We would not use the clause as the sole reason to include a child; we certainly should not use it as the sole reason to exclude.

Hope that makes sense.


Age-based scores are typically lower than grade-based scores for those whose age is greater than average for their grade. If the average student from a given area is older than the average student in a given grade in the norm group (are we confused yet?), then the age norms are more likely to be lower than the grade norms.

In other words, Dave may be correct in his assumption that age norms are lower than grade norms if students in his district are generally older than the norm group to whom they are being compared. In a bilingual population, that may very well be true, due to the greater likelihood of retention and late enrollment. In other areas, the converse may be true.
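A small sketch may make this easier to follow. The norm values below are invented for illustration and do not come from any test manual, and the function name is hypothetical. The idea is only that the same raw score is compared to a different reference group depending on whether age or grade norms are chosen.

```python
def standard_score(raw, norm_mean, norm_sd, mean=100.0, sd=15.0):
    """Convert a raw score to a standard score against one reference group."""
    z = (raw - norm_mean) / norm_sd
    return mean + sd * z


# Hypothetical norm tables: (mean raw score, SD) for each reference group.
AGE_NORMS = {9: (40.0, 8.0), 10: (46.0, 8.0)}    # keyed by age in years
GRADE_NORMS = {3: (38.0, 8.0), 4: (44.0, 8.0)}   # keyed by grade

raw = 36            # one raw score from one child
age, grade = 10, 3  # a retained child: 10 years old, still in grade 3

by_age = standard_score(raw, *AGE_NORMS[age])
by_grade = standard_score(raw, *GRADE_NORMS[grade])
print(f"age-based: {by_age}, grade-based: {by_grade}")
```

With these made-up norms, the older-than-typical third grader scores 81.25 against age peers and 96.25 against grade peers. A child younger than typical for the grade would show the opposite pattern.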

Ron Anderson

>Go to


>This is Margaret Kay's site and you will find a short article called "Problems Associated with the Use of Grade Equivalence Scores." In addition to the excellent suggestions you have already received, you might want to look at the Assessment/Evaluation section at School Psychology Resources Online. In particular, look at Glossary of Measurement Terms and Some Things Parents Should Know About Testing. The latter has a simple explanation of grade equivalents and other testing concepts. two possible resources:




>Sandra Koser Steingart

John Willis


I had the same goal this past year in my district only I didn't do it by having a large workshop. We have a language arts person who holds grade level meetings with teachers and these meetings are much more informal and relaxed. The language arts person gave me an hour at each of the grade level meetings and I put it into my schedule.

I gave a short presentation explaining the different types of test scores, what they are, what they mean, and how certain ones influence the decision making process (contribute to Type II error). Then, I used scores from the teachers' classroom reading tests and children's cumulative files from their school (without names) as examples, and we did a short exercise where we found and interpreted real scores that the teachers use every day. That made it relevant to them.

I was surprised how little background and training teachers actually had in interpreting test results, even reading teachers who give standardized tests all the time. They were all very interested in learning, very appreciative of the chance to learn more about scores, and are using their knowledge of scores from these sessions in our Child Study Meetings and PPTs. It was a rewarding experience for everyone.

Good Luck,