School Psychologists or Soothsayers?

Ron Dumont & Rob Finn

This article was first published in the NASP Communiqué

This past month, the editors received a letter from a school psychologist asking some questions that seem fairly common in evaluations. The questions are often raised by team members and parents, and the school psychologist is delegated to answer them.

"When I review the test scores at team meetings, sometimes I report specific weaknesses significantly below average on each scale, Verbal and Performance. Teachers and parents alike will then ask how do these specific weaknesses impact academic skill areas such as Reading, Writing, and Arithmetic. Do we know how specific low subtest scores on either Verbal or Performance scales singularly or in combination imply that there will be adverse effects on academic performance........Should I stick to saying the FSIQ is the best predictor of academic performance and ignore 1 or 2 low subtest scores or is there data to support how low subtest scores can mean trouble for an academic area?"

This is a complex question dealing with a number of important issues, and it cannot be answered with a simple yes or no. We will attempt an answer by addressing the different levels of interpretation alluded to in this letter: the subtest level, the level of subtest groupings (clusters), and the composite level.

The question of how specific subtest weaknesses affect academic skill areas is not as straightforward as it may sound. Depending on the subtest, a low score may be tapping into one component of a particular complex skill area, but it may also be nothing more than a statistical anomaly.

Subtest-level interpretation is typically thought to be the least reliable and valid strategy (Kaufman, 1979; Sattler, 1988; Elliott, 1990). When interpreting any test, it is important that the interpretations be based upon the most reliable aspects of the test. Interpretation should focus on the most general and reliable scores, and these are typically derived from the entire test (Full Scale IQ, for example). Below these measures come the Index or Cluster scores, followed by shared ability factors, and finally the individual subtests. Although subtest scores are related, the subtests differ in item content and administration, and these differences cause the scores to vary. Subtests can, and do, differ from each other. Before one can evaluate the differences between what appear to be high or low subtest scores, one must evaluate whether these apparent differences are enough to warrant interpretation. To do so we must know if the difference is large, reliable, and significant. In statistical terms, each subtest carries with it components of shared common variance, while at the same time most also have some proportion of specific, reliable variance. Before attempting individual subtest interpretation, one must be sure that the subtest being interpreted has adequate specific variance. For example, on the WISC-III, Object Assembly has much common variance but little specific variance and probably should not be interpreted in isolation. Subtest-level strengths and weaknesses must be interpreted cautiously not only because of this low specificity but also because variations (strengths or weaknesses) are common. Kaufman noted that for the Wechsler scales it was very common for children to have rather large intersubtest variability, which produced the peaks and valleys in scaled scores that often serve as interpretive points.
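The two psychometric checks described above (does a subtest have adequate specific variance, and is a difference between two subtests large enough to be reliable) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a procedure taken from this article. It assumes the commonly cited rule of thumb that specific variance (reliability minus communality) should be at least 25% of total variance and should exceed error variance, and it uses the standard error of the difference between two scaled scores (SD = 3). The function names and all reliability and communality figures are hypothetical, not values from any test manual.

```python
import math


def variance_components(reliability, communality):
    """Split a subtest's total variance (1.0) into common, specific, and error parts."""
    return {
        "common": communality,
        "specific": reliability - communality,  # reliable variance unique to the subtest
        "error": 1.0 - reliability,             # unreliable variance
    }


def has_adequate_specificity(reliability, communality, minimum=0.25):
    """Assumed rule of thumb: specific variance >= 25% and greater than error variance."""
    parts = variance_components(reliability, communality)
    return parts["specific"] >= minimum and parts["specific"] > parts["error"]


def critical_difference(rel_a, rel_b, sd=3.0, z=1.96):
    """Smallest difference between two subtest scores that is reliable at the chosen z."""
    sem_a = sd * math.sqrt(1.0 - rel_a)  # standard error of measurement, subtest A
    sem_b = sd * math.sqrt(1.0 - rel_b)  # standard error of measurement, subtest B
    return z * math.sqrt(sem_a ** 2 + sem_b ** 2)


# Hypothetical subtests (illustrative numbers only).
weak_subtest = has_adequate_specificity(reliability=0.70, communality=0.55)
strong_subtest = has_adequate_specificity(reliability=0.85, communality=0.45)
needed_gap = critical_difference(0.70, 0.85)

print(weak_subtest)          # False: specific variance is smaller than error variance
print(strong_subtest)        # True: specific variance is ample and exceeds error
print(round(needed_gap, 1))  # roughly 3.9 scaled-score points
```

With these made-up figures, a 3-point gap between the two subtests would not be a reliable difference, and the first subtest would not justify interpretation in isolation even if the gap were larger. A reliable difference may still be a common one, so base-rate tables remain a separate check.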

A second caution about predicting academic performance from individual subtest strengths and weaknesses is that academic skill areas involve multiple cognitive processes working in parallel. Reading, for example, is a complex process involving basic skills, conceptual understanding, and cognitive strategies. This can be further broken down into knowledge about letters, phonemes, morphemes, words, ideas, schema, and subject matter, as well as decoding, literal comprehension, inferential comprehension, and comprehension monitoring. It becomes clear that a particular low subtest score may only scratch the surface when it comes to fully understanding its academic implications. It must also be remembered that variations between and within the different functions do occur as a result of the individual's uniqueness, and therefore these variations may simply be describing that uniqueness and not necessarily any 'difficulty.' This is not to say that strengths and weaknesses are without value, but rather that identification of a learning problem must be made on an individual basis. Subtest analysis is only one piece in the assessment pie, and probably not the best piece at that.

Does the use of composites or 'profiles' increase one's ability to predict learning difficulties and thus academic performance? Composites are certainly more statistically reliable than individual subtests, but the cautions about psychometric properties remain applicable. Before a prediction can be made about a person based on the relative value of a composite, one needs to know the reliability and integrity of that composite. As an example, the WISC-III Freedom from Distractibility index has a high reliability coefficient yet accounts for only 3-4% of the test's common variance. Secondly, before interpreting this factor, one must be sure that it is meaningful, that is, that the subtest scores that make it up have measured a similar skill. If the Arithmetic and Digit Span subtest scores differ by 5 points, the interpretive utility of this factor is lost.

Do certain profiles exist that can be used to identify/predict learning difficulties? The continuing scholarly debate about this issue seems only to add confusion to what we do. The answer to this question seems to depend on whom you believe and the approach used to justify the analysis. For example, Kavale and Forness argued against the utility of profile analysis in their article "Meta-analysis of WISC-R Profiles-Patterns or Parodies" (1984). Their analysis of Wechsler scale data from 9,372 learning disabled children failed to distinguish these children from their normal peers on any of the ability patterns that have conventionally been held to characterize LD children's test performance. This was followed by Lawson and Inglis' "Micro-interpretation or Misinterpretation? A reply to Forness, Kavale, and Nihira" (1987). Lawson and Inglis argued that their learning disability index did reveal a pattern that distinguishes LD from a normal sample. More recently, Keith et al. (1992) added "Profile Analysis with the Wechsler Scales: Patterns, not Parodies." In this paper, the authors disagree with the conclusions of Kavale and Forness and offer the opinion that "proper analysis of the data reported in their article reveals that many such profiles and recategorizations are indeed significantly different for the two groups."

Is the FSIQ the best predictor of academic performance? To answer this, one might ask "Best in comparison to what? Better than the Verbal and/or the Performance IQs? Or better than a comprehensive achievement assessment? Or better than a review of the student's history, including report cards and grades?" Much has been written about the relationship between IQ and prediction of school achievement. Kaufman (1990, p. 18) reviewed a number of studies and noted, regarding the correlations, "The overall value of .50 is high enough to support the validity of the IQ for the purpose that Binet originally intended it, but low enough to indicate that about 75% of the variance in school achievement is accounted by factors other than IQ." On related matters, an entire issue of the Journal of Learning Disabilities (October 1989) was devoted to a debate about the relevance and usefulness of IQ in the assessment and determination of learning disabilities. Comments and questions have also been raised about the integrity of IQ scores for learning disabled children (Communiqué, 1988 and 1994). Does the presence of a learning disability affect the scores on an IQ test so as to call into question the issue of current functioning versus potential functioning?
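The arithmetic behind the "about 75%" figure in the Kaufman quotation is simply the squared correlation. The short Python sketch below works it through and, as an added illustration that is not in the original article, converts the same correlation into a standard error of estimate. That second step assumes achievement is reported as standard scores with a standard deviation of 15; the function names are hypothetical.

```python
import math


def variance_explained(r):
    """Proportion of variance in one measure accounted for by the other (r squared)."""
    return r ** 2


def standard_error_of_estimate(r, sd=15.0):
    """Typical error when predicting a score with standard deviation `sd` from a correlate."""
    return sd * math.sqrt(1.0 - r ** 2)


r = 0.50                              # overall IQ-achievement correlation cited by Kaufman
explained = variance_explained(r)     # 0.25
unexplained = 1.0 - explained         # 0.75, the "about 75%" in the quotation
see = standard_error_of_estimate(r)   # about 13 standard-score points (assumed SD of 15)

print(f"explained: {explained:.0%}, unexplained: {unexplained:.0%}")
print(f"standard error of estimate: {see:.1f}")
```

Under that assumption, a prediction of achievement from IQ alone carries an error band of roughly 13 points on either side at one standard error, which is one way to see why the score supports hypotheses better than forecasts.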

To sum up, let us suggest that by determining the specific qualities measured by a subtest, a factor, or a composite score and by analyzing its relative psychometric integrity, the clinician "may suggest" that a child has a potential for experiencing difficulty in a particular academic area. However, it is important to remember that an intelligence test is not meant to be used as a diagnostic instrument. Rather, a weak performance on a particular subtest or subtests may prove most useful as a compass guiding the course your assessment takes. IQ tests are typically very good instruments for generating a hypothesis about someone's strengths and weaknesses, but they are poor diagnostic instruments for evaluating a learning disability. Particular patterns on an intelligence test may give hints to a possible weakness or disorder, but the assessment of such things is typically done with other tools.

Identifying a person's strengths and weaknesses is a process involving empirical guides, while interpreting those strengths and weaknesses requires clinical inference and a broad theoretical base. It may be similar to the issue of sight versus insight. The identification of true strengths or weaknesses gives us some 'sight' but provides little insight. Let's stick to what we see, and not get sucked into becoming soothsayers.

