By Hubert Lovett

Following a discussion among several members of this list (Matthew Warren, John Willis, Ron Dumont, and others), I decided there may be some confusion about the regression methods used to identify severe discrepancies, so I decided to write a more or less complete statement of those methods.  I do hope it will be of value to someone.  At the outset, I would like to thank John Willis, Ron Dumont, and Matthew Warren for their input.  These are classy folks, as helpful as they are knowledgeable.

I will here try to describe the rationale and method of using regression to determine severe discrepancies in diagnosing learning disabilities.  A severe discrepancy in achievement occurs when a child's achievement deviates severely from what one would expect.  It is essential, therefore, to establish expectation for a particular child.  Few test scores ever coincide exactly with what is expected.  In deciding to label one discrepancy as severe and another as normal, some criterion must be established against which to compare an actual discrepancy.  This application of decision theory will also be discussed here.  I will then compare the method presented here with a method described by Cecil Reynolds in Chapter 24 of Handbook of Psychological and Educational Assessment of Children: Personality, Behavior, and Context, edited by Cecil R. Reynolds and Randy W. Kamphaus (1990).

Terminology:

Y = Achievement score,
X = IQ score,
Y' = Predicted achievement score,
MY = Mean achievement score,
MX = Mean IQ score,
SDY = Standard Deviation for achievement scores,
SDX = Standard Deviation for IQ scores,
rYY = Reliability for achievement scores,
rXX = Reliability for IQ scores,
rXY = Correlation between achievement scores and IQ scores,
TY = True score for achievement scores for a particular child,
EY = Expected value of Y,
e = Y - EY,
SE = Standard error of estimate when Y' is determined using X, and
zpn = normal deviate for probability = p and number of tails of the test = n (one-tailed or two-tailed).

There is no particular need to translate X and Y to the same metric.  However, if this is done, it should be accomplished before calculations begin.

Assumptions:

  1. Y is normally distributed,

  2. The regression of Y on X is best described by a straight line,

  3. Variance of Y on X is independent of X, and

  4. The best method of determining EY is the method that minimizes SUM(e^2).

Because of assumption 2 above, the formula for predicting achievement given IQ is a special case of the general linear formula and is given by

Formula 1

Y' = SDY(rXY((X - MX)/SDX)) + MY.

It can be shown that, given assumption 2, using Y' as EY will minimize SUM(e^2).  Therefore, using Y' as expectation for Y will satisfy assumption 4 above, whereas using MY or X as the expected value of Y will not satisfy this basic assumption.  One object in measurement is to minimize error.  Since e is error, we would like to minimize it.  However, since e is an unknown for a particular person on a particular administration of a test, we can only hope to minimize it within a group.  Summing e across a group is fruitless.  The mean is zero, and, therefore, so is the sum.  If we square e before summing, then the result must be a nonnegative number contingent upon e.  That is why we stipulate assumption 4 above.  There are those who would like to use MY as an estimate of EY.  Others would use X.  Neither of these will minimize error.  The attractiveness of either is based mostly on concern for a child not learning as well as his age mates and on the convenience of calculation.
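The claim that Y' gives a smaller SUM(e^2) than either MY or X can be checked with a small simulation. The sketch below is only an illustration: the population values (means of 100, standard deviations of 15, rXY = .60), the sample size, and the function names are assumptions made for the demonstration, not part of the method itself.

```python
import math
import random

# Hypothetical population values, chosen only for illustration.
MX = MY = 100.0
SDX = SDY = 15.0
RXY = 0.60

def simulate_pairs(n, seed=0):
    """Draw n (X, Y) pairs from a bivariate normal with correlation RXY."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        zx = rng.gauss(0.0, 1.0)
        # Build zy so that its correlation with zx is RXY.
        zy = RXY * zx + math.sqrt(1.0 - RXY ** 2) * rng.gauss(0.0, 1.0)
        pairs.append((MX + SDX * zx, MY + SDY * zy))
    return pairs

def predicted(x):
    """Formula 1: Y' = SDY(rXY((X - MX)/SDX)) + MY."""
    return SDY * (RXY * ((x - MX) / SDX)) + MY

def sum_squared_error(pairs, expectation):
    """SUM(e^2), where e = Y - EY and EY = expectation(X)."""
    return sum((y - expectation(x)) ** 2 for x, y in pairs)

pairs = simulate_pairs(10_000)
sse_regression = sum_squared_error(pairs, predicted)    # EY = Y'
sse_mean = sum_squared_error(pairs, lambda x: MY)       # EY = MY
sse_iq = sum_squared_error(pairs, lambda x: x)          # EY = X
print(sse_regression, sse_mean, sse_iq)
```

With these assumed values, the sum for Y' should come out smallest of the three, which is the point of assumption 4.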

While using Y' as an estimate of EY minimizes SUM(e^2) for a group, it may not minimize SUM(e^2) for a particular person.  The task in the next section is to determine whether it is reasonable to believe that Y' minimizes SUM(e^2) for a particular person.  This is tantamount to asking whether Y' = TY.

To establish a criterion against which to compare actual performance, it is necessary to select a unit of measurement for deviations from expected.  Given assumption 1 above, the natural unit of measurement is some type of standard deviation.  In this case, SE is the appropriate unit.  Given assumptions 1 and 3 above, SE is given by the following formula:

Formula 2

SE = SDY(Sqrt(1 - rXY^2)).
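Formulas 1 and 2 translate directly into code. The following is a minimal sketch; the function names and the sample values (an IQ of 85 on scales with mean 100 and standard deviation 15, rXY = .60) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math

def predicted_score(x, mx, sdx, my, sdy, rxy):
    """Formula 1: Y' = SDY(rXY((X - MX)/SDX)) + MY."""
    return sdy * (rxy * ((x - mx) / sdx)) + my

def standard_error_of_estimate(sdy, rxy):
    """Formula 2: SE = SDY(Sqrt(1 - rXY^2))."""
    return sdy * math.sqrt(1.0 - rxy ** 2)

# Hypothetical child: IQ one standard deviation below the mean.
y_pred = predicted_score(x=85, mx=100, sdx=15, my=100, sdy=15, rxy=0.60)
se = standard_error_of_estimate(sdy=15, rxy=0.60)
print(y_pred, se)
```

Because rXY is less than 1, the predicted achievement score falls closer to MY than the IQ score is to MX, which is regression toward the mean.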

We next pose a question. The exact nature of the question reflects our philosophy of severe discrepancies.    The two most common methods of stating this question are:

  1. Is it reasonable to believe that, for child C, TY = Y'?

  2. Is it reasonable to believe that, for child C, TY > Y'?

As Matthew pointed out, if we think we are concerned with question 1, then we would select a probability and normal deviate such that a deviation from expectation in either direction must be explained.  Most school psychologists have tested children whose achievement scores significantly exceed expectation.  This is sometimes more difficult to explain than the child who underachieves.

If we take the approach that severe discrepancies only fall below expectation, then question two is the appropriate question.  Deciding which question is appropriate in a particular situation is of major importance.  It is one of the two chief concerns in selecting a normal deviate for use in the next step.

Those trained in research will immediately recognize that the above questions correspond to the null hypotheses used in research.  Question one evokes the use of a two-tailed test, while question two, a one-tailed test.  In a very real sense, determining whether a particular child has a severe discrepancy is testing an hypothesis about that child.  The logic is the same as in hypothesis testing.  The null hypothesis is assumed to be true.  This gives us a way of determining the probability of various events.  We can tell which events are common and which are rare.  When we test the hypothesis, we allow an event to occur and observe whether it is a rare event.  The presence of rare events creates doubt about the truth value of the null hypothesis.

For example, suppose we are playing the old game, Twenty Questions, and are trying to identify an object.  We have developed the null hypothesis that the object is a dog.  Before venturing a "guess" as to what the object is, we propose to test the hypothesis with a question, the possible answers to which have known, relatively speaking, probabilities.  We ask the question,  "How many legs does this object have?"  In my experience, the most likely answer, given that the hypothesis is true, is 4.  However, in my life I have seen several three-legged dogs, one two-legged dog, and one five-legged dog (As an aside, I must admit that I paid 50 to see the five-legged dog).  I have seen pictures of a six-legged dog and an eight-legged dog.  Suppose we get this answer to our question, "It has six legs."  This is not an impossible answer, but it is rare.  It is so rare that most of us would decide to reject the hypothesis as untenable.

The question is, "How rare must an event be before we decide to reject the hypothesis as untenable in the face of the data?"  As Reynolds (1990) argues, the traditional values are a likelihood of less than, or equal to, five in a hundred (.05 level), or less than, or equal to, 1 in a hundred  (.01 level).  Ultimately, the probability level must be set by the person making the decision.

Suppose we decide that any deviation from expectation, above or below, is of interest and that rare events have probabilities less than, or equal to, 0.05.  On the normal curve, the normal deviate that corresponds to this decision is z = 1.96.

                       xxxxx
                   xxxxxxxxx
             xxxxxxxxxxxxxxx
          xxxxxxxxxxxxxxxxxxx
      xxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
  -2         -1         0         +1         +2 
  -1.96                                         +1.96

We would, therefore, calculate two critical values, one 1.96 SE above Y' and one 1.96 SE below Y'.  Actual values between these two critical values would be common events.  Actual values outside these two critical values would be rare events.  The formula for the critical values would be:

Formula 3

Critical values = Y' +/- 1.96SE.

If, on the other hand, we decide that only values below expectation are of interest, then on the normal curve, the normal deviate that corresponds to this decision is z = 1.65.  We would, therefore, calculate only one critical value, 1.65 SE below Y'.  Actual values equal to, or below, this critical value would be rare events.  The formula for the critical value would be:

Formula 4

Critical value = Y' - 1.65SE

The two formulae may be generalized as follows:

Formula 5

Critical values = Y' +/- zp2SE, and

Formula 6

Critical value = Y' - zp1SE.
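The generalized Formulas 5 and 6 can be sketched as follows, with the normal deviate looked up from the standard normal distribution rather than from a table. The function names and the sample values (Y' = 91, SE = 12) are hypothetical. Note that the exact one-tailed deviate for .05 is about 1.645, which is rounded to 1.65 in the text.

```python
from statistics import NormalDist

def normal_deviate(p, tails):
    """Normal deviate zpn for significance level p and n = tails (1 or 2)."""
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    # A two-tailed test splits p evenly between the two tails.
    return NormalDist().inv_cdf(1.0 - p / tails)

def critical_values(y_pred, se, p, tails):
    """Formula 5 (tails=2) returns (lower, upper); Formula 6 (tails=1) returns (lower,)."""
    z = normal_deviate(p, tails)
    lower = y_pred - z * se
    if tails == 1:
        return (lower,)
    return (lower, y_pred + z * se)

# Hypothetical values for illustration only.
print(critical_values(y_pred=91.0, se=12.0, p=0.05, tails=2))
print(critical_values(y_pred=91.0, se=12.0, p=0.05, tails=1))
```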

Matthew Warren posted some data to the list for which he had accomplished the calculations necessary to decide whether a particular child has a  severe discrepancy.  I have appropriated his data and will use it to illustrate the above formulae.  The data are given below:

Matthew Warren wrote:

Scores:
FSIQ(wisc3) = 80
WJ(Writing Fluency) = 62
DATA:
Correlation (FSIQ, Writing Fluency) = .60
Reliability(FSIQ) = .95
Reliability(Writing Fluency) = .95
Calculated values:
Predicted WJ (Writing Fluency) = 88
Standard Error of Estimate = 12

Critical values (95% confidence), given that any deviation from expectation must be explained, would be an achievement score less than or equal to 64 or a score greater or equal to 112.

Critical value (95% confidence), given that only negative deviations from expectation are of interest, would be an achievement score of 68 or below.

Formula 1

Y' = SDY(rXY((X - MX)/SDX)) + MY

     = 15(.60((80 - 100)/15)) + 100 = 88.

Formula 2

SE = SDY(Sqrt(1 - rXY^2))

      = 15(Sqrt(1 - .60^2)) = 12.

Formula 5

Critical values = Y' +/- z(.05)2SE

                       = 88 + 1.96(12) = 111.52, and

                       = 88 - 1.96(12) = 64.48.

Note that in the application of Formula 5, the first value, if not an integer, always rounds up to the next possible score, and the second value, down.  In this case, rounding goes to the nearest integer, but that is not always the case.  Therefore, as Matt said, test scores of 112 and above and 64 and below indicate a severe discrepancy at the .05 level of significance.  If you have a machine that yields anything else, the machine is wrong.

Formula 6

Critical value = Y' - z(.05)1SE

                     = 88 - 1.65(12) = 68.2 

Note that in the application of Formula 6, the value, if not an integer, always rounds down to the next  possible score.  In this case, rounding goes to the nearest integer, but that is not always the case.  Therefore, as Matt said, test scores of 68 and below indicate a severe discrepancy at the .05 level of significance.  If you have a program that yields anything else, it is wrong.
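The whole worked example, including the rounding rule, can be reproduced in a few lines. This is a sketch under one stated assumption: possible scores are whole numbers, so "the next possible score" is the next integer. The variable names are my own.

```python
import math

# Values from Matthew Warren's data.
y_pred = 88.0    # predicted WJ Writing Fluency (Formula 1)
se = 12.0        # standard error of estimate (Formula 2)
observed = 62    # actual WJ Writing Fluency score

# Formula 5, two-tailed at .05: the lower value rounds down, the upper rounds up.
lower_two = math.floor(y_pred - 1.96 * se)
upper_two = math.ceil(y_pred + 1.96 * se)

# Formula 6, one-tailed at .05: the value rounds down.
lower_one = math.floor(y_pred - 1.65 * se)

# Decision for this child under each question.
severe_two = observed <= lower_two or observed >= upper_two
severe_one = observed <= lower_one

print(lower_two, upper_two, lower_one)   # 64 112 68
print(severe_two, severe_one)            # True True
```

Rounding away from Y' in both directions keeps the probability of a false positive at or below the chosen level.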

Up to this point, the formulae developed by Cecil Reynolds parallel the ones presented here.  At this point, however, Cecil shifts his focus.  When we test to see if a particular child has a severe discrepancy, there are four possible outcomes: 1) we correctly identify a child who really has a severe discrepancy [True positive]; 2) we correctly identify a child as not having a severe discrepancy [True negative]; 3) we erroneously identify a child as having a severe discrepancy when in fact he does not [False positive]; and 4) we erroneously fail to identify a child as having a severe discrepancy, when in fact he does [False negative].

The significance level that we use (.05 above), often called alpha, is the probability of a false positive.  The probability of a false negative is often called beta.  The relationship between alpha and beta is inverse and nonlinear.  If we decrease the likelihood of one type of error, then we increase the likelihood of the other.

After developing all the formulas given above, Cecil decided that he should add something to reduce the likelihood of false negatives, the value of beta.  He decided to do this without any notion of what the value of beta was.  He reduced the difference between Y' and the critical value of a two-tailed test by 1.65SEresid, where SEresid was defined in "Critical Measurement Issues in Learning Disabilities," Journal of Special Education, 18, 451-467, 1984.
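The inverse, nonlinear relationship between alpha and beta can be illustrated with a short sketch. It is not a procedure for controlling beta. It rests on an assumption that appears nowhere in the method above: that the child's true expectation lies a known number of SE units below Y' (2.0 here, purely hypothetical), and that scores scatter around it with the same SE.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def beta_one_tailed(alpha, true_shift_in_se):
    """Probability of a false negative for a lower-tailed test.

    true_shift_in_se is a hypothetical value: how far below Y', in SE
    units, the child's true expectation is assumed to lie.
    """
    z_alpha = STD_NORMAL.inv_cdf(1.0 - alpha)
    # A score is flagged when it falls z_alpha SE or more below Y'.
    # A miss occurs when the observed score lands above that cutoff.
    return 1.0 - STD_NORMAL.cdf(true_shift_in_se - z_alpha)

alphas = (0.01, 0.05, 0.10, 0.20)
betas = [beta_one_tailed(a, true_shift_in_se=2.0) for a in alphas]
for a, b in zip(alphas, betas):
    print(a, round(b, 3))
```

Under this assumption, beta falls as alpha rises, but not in proportion, and its value cannot be computed at all until the size of the true shift is specified.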

For the current example, the critical value becomes 70.90.  Clearly this does reduce the probability of a false negative, but it also increases the probability of a false positive.  We are no longer working at the .05 level, but at the .1556 level.  Cecil (1990) cites an example where the relevant z-score moved from 2.00 to 1.393.  This changed the probability from about .05 to .1646.  He seems to have been somewhat confused about the question he was asking at the time.  He changed the probability to .082, as it would have been in a one-tailed test.  He clearly started his discussion using a two-tailed test, then stated that severe discrepancies went in only one direction.  When he reduced the distance to the critical value, he subtracted a value based on a one-tailed test.  The situation then is this: he selects only one of the two critical values from a two-tailed test and uses a value from a one-tailed test to move that toward the mean.

Confused?  Worry not; it gets worse.  Cecil then drew a picture (his Figure 24.2) to clarify matters.  In this figure, he shows only one of the critical values of the original two-tailed test (it happens not to be the one discussed in the text).  Then he both subtracts 1.65 and adds 1.65 to this critical value to get two more critical values.  Thus, he takes what should be a one-tailed test at the .05 level, stacks it on top of a two-tailed test at the .05 level, but runs it in both directions so that the probability would be .10.  At this point, I think I will give up trying to explain what he proposed.  The logic is clearly muddled and the issue of significance level becomes totally garbled.  In the two examples that I ran, Cecil's and Matthew Warren's, the significance level multiplied by about 3.  I think there is no way to predetermine what will happen to the significance level, but clearly it alters drastically with the addition of Cecil's invention.
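The shift in significance level can be checked by converting each adjusted critical value back to a z-score and finding the tail area beyond it. The sketch below uses the figures quoted above (Y' = 88, SE = 12, adjusted critical value 70.90, and the z of 1.393 from the 1990 example); the function name is my own. Results may differ from the quoted figures by a few thousandths, depending on how z is rounded before a table lookup.

```python
from statistics import NormalDist

def tail_probability(z, tails):
    """Area beyond |z| in one tail (tails=1) or both tails (tails=2)."""
    return tails * (1.0 - NormalDist().cdf(abs(z)))

# Matthew Warren's data with the adjusted critical value of 70.90.
z_warren = (88.0 - 70.90) / 12.0
p_warren = tail_probability(z_warren, 2)

# The 1990 example, where z moved from 2.00 to 1.393.
p_reynolds_two = tail_probability(1.393, 2)
p_reynolds_one = tail_probability(1.393, 1)

print(round(z_warren, 3), round(p_warren, 4))
print(round(p_reynolds_two, 4), round(p_reynolds_one, 4))
# Ratio of each adjusted two-tailed level to the nominal .05 level.
print(round(p_warren / 0.05, 2), round(p_reynolds_two / 0.05, 2))
```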

Interestingly, Cecil added the "correction" to control beta.  He describes no way of determining what beta is, either before or after his fix.  There are methods of controlling beta, but not with Cecil's formula. 

Here's the damage done by this method.  Cecil gives a lengthy discussion of why the .05 level of significance is appropriate.  However, when he acknowledged that the significance level changed, he changed terminology.  Instead of significance, it became the "percent of the total population."  Would an astute reader miss this shift?  Ron Dumont and John Willis, the best in the business, missed it.  On their template for calculating severe discrepancies using Cecil's method, they specified that the results are at the .05 level.  Has anyone else missed it?  In the WIAT manual, page 188, the significance level is identified as either .05 or .01, when it clearly is not.  Big Cecil himself acknowledged that the chances changed (1990, p. 552).

Developing procedures to control beta correctly is beyond the scope of this post.  I may do that another time.

Also, the WIAT manual (p. 189) implies that Cecil's procedures were used to establish the significance bands in its tables.  I have neither the time nor inclination to check that assertion.  Certainly I would view those tables with suspicion.

Hubert

              Irish Blessing
May the road rise to meet you.
May the wind be always at your back.
May the sun shine warm upon your face.
And rains fall soft upon your fields.
And until we meet again,
May God hold you in the hollow of His hand.
