Comprehensive Test of Phonological Processing (CTOPP): Cognitive-linguistic assessment of severe reading problems

James E. Lennon

New Jersey City University

Christine Slesinski

Monroe-Woodbury Central School District

Wagner, Torgesen, & Rashotte (1999) recently offered the Comprehensive Test of Phonological Processing (CTOPP) as a measure of phonological coding. This measure may be of value to school psychologists who are interested in the etiology of severe reading disorders. Rather than seeking to identify intelligence-achievement discrepancies of limited utility, assessment approaches, such as the CTOPP, that measure phonological coding abilities may help school psychologists to more accurately differentiate students with learning disabilities from students who may be experiencing academic failure as a result of other causes.

Phonological coding consists of the analysis and synthesis of phonemes (the smallest units of recognized sound). Beginning readers who have deficits in phonological coding seem to have difficulty naming letters of the alphabet, identifying sounds for alphabet letters, segmenting words into phonemes and syllables, and applying knowledge of letter-sound correspondence to decode words (Vellutino, et al., 1996). Phonological coding is an oral language skill. It involves analysis, such as recognizing that the first sound of the word ball (/b/) can be replaced with /t/ to produce the word tall. Phonological coding abilities associated with this process of changing ball to tall include letter-sound correspondence, phonemic awareness and segmentation, and working with information in phonological memory. Phonological coding also involves the synthesis of sounds into words. Since the most common forms of severe reading problems are caused by deficits in one or more aspects of phonological coding, school psychologists should consider including measures specifically designed to address this cognitive-linguistic process in their assessment of cognitive functioning.

The recently published Comprehensive Test of Phonological Processing (CTOPP; Wagner, Torgesen, & Rashotte, 1999) has been designed as an extension and improvement over commercially available tests of phonological coding, including the Test of Phonological Awareness (TOPA; Torgesen & Bryant, 1994), the Lindamood Auditory Conceptualization Test (LAC; Lindamood & Lindamood, 1979), and the Phonological Awareness Test (PAT; Robertson & Salter, 1995). The CTOPP extends across a wider age range and has stronger psychometric properties than previous measures (Torgesen & Wagner, 1998); it is reviewed in the paragraphs that follow.

Wagner, Torgesen, & Rashotte (1999) developed the CTOPP in a manner consistent with their theoretical assumptions about the nature of phonological coding deficits. They present a three-part model, based on earlier studies in this area (e.g., Torgesen & Wagner, 1998; Wagner & Torgesen, 1987), consisting of the following:

    • (a.) Phonological awareness: analysis and synthesis of the sound structure of oral language. The order of progression of phonological awareness starts with syllables and moves toward smaller units of speech sounds (Adams, 1990). Phonological awareness provides individuals with the ability to break words into syllables and component phonemes, to synthesize words from discrete sounds, and to learn about the distinctive features of words (Torgesen & Wagner, 1998).
    • (b.) Phonological memory: coding information phonologically for temporary storage in working or short-term memory. Phonological short-term memory involves storing distinct phonological features for short periods of time to be "read off" in the process of applying the alphabetic principle to word identification.
    • (c.) Rapid naming: efficient retrieval of a series of names of objects, colors, digits, or letters from long-term memory. Rapid naming of verbal material is a measure of the fluid access to verbal names, in isolation or as part of a series, and related efficiency in activating name codes from memory (Wagner, Torgesen, & Rashotte, 1999).

Grounded in a theory of phonological processing, supported by both empirical studies and confirmatory factor analytic findings, the CTOPP was designed to measure a student’s ability in these three domains (Wagner, et al., 1997).

The CTOPP is intended to provide a reliable, valid, and standardized measure of phonological coding. The authors developed two versions of the measure, one for kindergarteners and first graders (ages 5 and 6) and the second for second graders through college students (ages 7 through 24). A total of 12 subtests (6 core and 6 supplemental) are provided. Subtests typically consist of 18 to 20 items, providing adequate floors and ceilings. The CTOPP is individually administered, and the core subtests require about 30 minutes of testing time.

The test produces three core composite scores: (a.) phonological awareness, composed of Elision, Blending Words, and Sound Matching for 5- and 6-year-olds and Elision and Blending Words for persons 7 to 24 years old; (b.) phonological memory, consisting of Memory for Digits and Nonword Repetition for all individuals; and (c.) rapid naming, composed of Rapid Color Naming and Rapid Object Naming for younger students and Rapid Digit Naming and Rapid Letter Naming for older students. Rather than relying on a single measure, the CTOPP provides at least two measures for each composite score, increasing the reliability of the measurement of the construct. Additional alternate measures of phonological awareness and rapid naming are provided for clinical and research interest. Interpretation involves the conversion of raw scores into percentile ranks and standard scores (mean = 10, standard deviation = 3 for subtests; mean = 100, standard deviation = 15 for composite scores). The authors offer, but caution against using, age and grade equivalent scores.
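The relationship between the two standard score metrics and percentile ranks can be illustrated with a short Python sketch. It is illustrative only: it assumes normally distributed scores, whereas the norm tables in the CTOPP manual remain the authoritative source for percentile ranks, and the function names are hypothetical rather than part of any published scoring software.

```python
import math

# Scale parameters stated in the text:
# subtests have mean 10 and SD 3; composites have mean 100 and SD 15.
SUBTEST = (10.0, 3.0)
COMPOSITE = (100.0, 15.0)


def z_score(score, scale):
    """Distance of a standard score from the scale mean, in SD units."""
    mean, sd = scale
    return (score - mean) / sd


def percentile_rank(score, scale):
    """Approximate percentile rank, ASSUMING a normal distribution."""
    z = z_score(score, scale)
    return 100.0 * 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def subtest_to_composite_metric(score):
    """Re-express a subtest standard score on the mean-100, SD-15 metric."""
    return COMPOSITE[0] + COMPOSITE[1] * z_score(score, SUBTEST)
```

Under these assumptions, a subtest score of 7 and a composite score of 85 both sit one standard deviation below the mean, so they map to the same approximate percentile rank.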

The measure was normed on a stratified sample of 1,656 individuals, reflecting the demographic status of the US population in 1997. Between 76 and 155 students were included in each age range, with the greater representation in the youngest age ranges. Normative information is provided by the half-year for 5- and 6-year-olds, by the year for 7- through 17-year-olds, and for an 18- to 24-year-old composite group.

CTOPP subtests were derived from experimental tasks used in the research literature to assess phonological processing. Pilot studies allowed for extensive item and subtest analyses, including classical item analyses, item-response theory, and confirmatory factor analyses. There were careful efforts to design empirical tasks that were representative of the constructs in question.

For example, the phonological memory composite includes Memory for Digits and Nonword Repetition. Though similar to the Digit Span test of the WISC-III, Memory for Digits is presented via audiocassette recorder at a faster rate of presentation (two digits per second) and with a specification of forward-only recall. These modifications were meant to stress the efficiency of the phonological loop, i.e., "brief verbatim storage of auditory information" (Wagner, Torgesen, & Rashotte, 1999, p. 5), avoiding the involvement of other cognitive processes, such as rehearsal and elaboration. On Digit Span, many students "think through" the backward recall of numbers or refresh the phonological loop through rehearsal, calling on other cognitive strategies. The modifications on the CTOPP attempt a purer measure of the underlying phonological process that is not confounded by other cognitive operations.

Nonword repetition has also been shown to be a good measure of phonological memory in experimental tasks (Gathercole & Baddeley, 1990). The authors created the orthographically legitimate, or plausible English language, items such as "nirp" by randomly combining phonemes to fill slots in syllables, discarding non-pronounceable ones. This was done to avoid the possible confound of using analogies to real words, once again avoiding the use of cognitive-linguistic processes other than phonological memory (R. Wagner, personal communication, May 15, 2000). Similar to most subtests, Nonword Repetition requires the use of an audiocassette recorder to ensure standardized administration, particularly as the items become more difficult.

Measures of both analysis (Elision) and synthesis (Blending Words) are included in the composite for phonological awareness, consistent with recent factor analytic findings (Flanagan, McGrew, & Ortiz, 2000; Wagner, et al., 1997). Alternate measures of the analysis and synthesis components of phonological awareness using nonwords are offered for experimental or clinical interest. Additionally, experimental measures with unlimited ceilings are included. For example, Phoneme Reversal requires the repetition of a nonword, reversing the order of sounds, and pronouncing the resultant word. ("Say teef. Now say teef backward." Answer: feet.) Phoneme Reversal requires the coordination of phonologic, strategic, and memory processes. As noted in the manual, the measures of the CTOPP can be quite challenging for both the examiner and examinee.

Some researchers describe Rapid Naming as part of a "double deficit hypothesis" (Wolf & Bowers, 1999), representing a separate category of severe reading deficits along with phonological coding deficits. However, on the CTOPP, as a result of confirmatory factor analytic findings, the rapid naming composite is thought to be a component process of the phonological coding construct, correlated with other components, but containing unique variance as well (Wagner, Torgesen, & Rashotte, 1999).

The CTOPP appears to have sound technical features. Internal consistency reliability estimates are provided for the items. Across age intervals, the alpha coefficients of the CTOPP subtests reach the .80 reliability criterion 76% of the time, while the CTOPP composite scores reach the .80 criterion. Standard errors of measurement for the composite scores are relatively low, suggesting the composite scores are reliable measures of student performance.
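The link between a reliability coefficient and the standard error of measurement can be sketched with the classical test theory formula SEM = SD × sqrt(1 − r). The Python below is a minimal illustration, not the manual's published SEM values: the reliability of .80 is simply the criterion mentioned above, and the function names are hypothetical.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """Classical test theory SEM: SD * sqrt(1 - reliability)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


def score_band(score, sd, reliability, z=1.0):
    """Band of +/- z SEMs around an obtained score (z=1.0 is about 68%)."""
    sem = standard_error_of_measurement(sd, reliability)
    return (score - z * sem, score + z * sem)


# Illustrative values only: a composite (SD = 15) and a subtest (SD = 3),
# each evaluated at the .80 reliability criterion discussed in the text.
composite_sem = standard_error_of_measurement(15.0, 0.80)
subtest_sem = standard_error_of_measurement(3.0, 0.80)
```

Because the SEM shrinks as reliability rises, composites with reliabilities above .80 would yield narrower bands than this illustration suggests.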

Reliability over time was estimated by the test-retest method, and ranged from .70 to .97 for individual subtests and .78 to .95 for composite scores. Measurement reliability is improved by using more than a single subtest to report composite scores.
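One common way to see why combining subtests improves reliability is the Spearman-Brown formula, sketched below in Python. This is an assumption-laden illustration rather than the procedure used in the CTOPP manual: it treats the combined subtests as parallel measures with equal reliability, and the input of .70 is simply the lowest subtest test-retest value reported above.

```python
def spearman_brown(reliability, k):
    """Predicted reliability when k parallel measures are combined.

    ASSUMES the measures are parallel (equal reliability, equal
    intercorrelation); real subtests only approximate this.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    return (k * reliability) / (1.0 + (k - 1) * reliability)


# Illustrative: two subtests, each with reliability .70, combined into
# a single composite.
two_subtest_composite = spearman_brown(0.70, 2)
```

Under these assumptions the two-subtest composite is predicted to be more reliable than either subtest alone, which is the direction of the effect described in the text.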

Validity information is offered in the form of (a.) a detailed discussion of the rationale used in selecting items and subtest format, (b.) conventional item analysis and item response theory modeling, and (c.) logistic regression and delta scores to detect bias. Little or no bias in the groups investigated was reported. Item discrimination and item difficulty statistics reach acceptable levels. Criterion-related validity is reported between concurrent measures, such as the Lindamood Auditory Conceptualization Test, and predictive measures, such as the Woodcock Reading Mastery Test – R (Word Attack and Word Identification subtests). Finally, construct validity is reported in the form of confirmatory factor analysis and studies of age group differentiation.

Using the CTOPP: Domain-specific deficits vs. IQ/Achievement discrepancies

Consider the case of an entering student, Hannah, a five-year, eight-month-old kindergarten student. Hannah, the oldest of three children, currently lives at home with both parents and her younger siblings. Hannah was born following a full-term, uncomplicated pregnancy and achieved developmental milestones within normal limits. Between the ages of one and three, Hannah suffered from recurrent ear infections. At the age of three, she underwent surgery to remove her adenoids and have drainage tubes placed in her ears. Prior to attending kindergarten, Hannah attended an academically oriented preschool for two years. Based on her below average performance on a kindergarten screening measure, Hannah was placed in a regular kindergarten classroom, but referred for remedial reading instruction.

After the first marking period, Hannah’s classroom and remedial teachers referred her to the CSE. At the time of the referral, her teachers described Hannah as an intelligent student who had made a good social transition to kindergarten. They noted that she was friendly, eager to please, and attentive in class. Despite this, they expressed concerns that Hannah just didn’t seem to be "getting it" when it came to reading. They noted that she could identify only 8 letters consistently, did not evidence knowledge of letter-sound correspondence, had difficulty rhyming words, and was unable to identify sounds in spoken words. While she could write her name, they described this ability as a "rote-learned" skill. Psychological testing revealed average intelligence (WPPSI Full Scale IQ = 105), with no significant inter-subtest scatter. Commensurate with her intelligence, Hannah’s achievement was measured to be in the average range (WJ-R Broad Reading standard score = 95; WJ-R Broad Written Language standard score = 101). Despite average intelligence, average achievement, and seemingly appropriate early educational experiences, Hannah was having considerable academic difficulty in her kindergarten classroom. The school psychologist was left to wonder what was going on. The absence of an aptitude-achievement discrepancy made it difficult to attribute Hannah’s reading problems to a learning disability. Rather than waiting to see if a learning disability "developed," the school psychologist wanted to take a closer direct look at the cognitive-linguistic operations that underlie beginning reading.

In this case, the teachers suspected that the student’s reading difficulties resulted from learning disabilities, prompting a referral to the CSE. In turn, the school psychologist attempted to diagnose learning disabilities in the student by identifying an aptitude-achievement discrepancy. However, the search for aptitude-achievement discrepancies left important questions unanswered. While many IQ tests do not address the processes that are associated with significant reading difficulties (Fletcher, et al., 1998; Flanagan, McGrew, & Ortiz, 2000), one might argue that IQ tests are necessary to rule out basic process disorders. However, Stanovich has argued that the concept of unexpected intelligence-achievement discrepancies has "led us astray" (1991, p. 7). He suggests cognitive-linguistic deficits are not necessarily restricted to students in the average to above average range and argues that the measurement of such deficits must be a domain-specific process.

School psychologists search for sources of severe reading problems in various ways. Typically, the search involves identifying students who have significant aptitude-achievement discrepancies as learning disabled. However, serious concerns have been raised about the validity and reliability of this practice (for cogent summaries, see Fletcher, Francis, Shaywitz, Lyon & Shaywitz, 1998; Siegel, 1989; Stanovich, 1991; Vellutino, Scanlon, & Lyon, 2000). These studies, in part, question the relevance of administering global measures of intelligence, which do not tap reading-related cognitive abilities, to students suspected of having learning disabilities (see also Flanagan, McGrew, & Ortiz, 2000 for a related discussion). Converging research evidence strongly suggests that the most common forms of severe reading problems are caused by deficits in one or more aspects of phonological coding, a cognitive-linguistic ability (Morris, et al., 1998; Torgesen & Wagner, 1998; Vellutino, et al., 1996). Deficits in phonological coding distinguish between average and deficient beginning readers, and predict which deficient readers will demonstrate a limited response to instruction (Vellutino, et al., 1996).

How then might school psychologists best gain diagnostic information about the cognitive processes underlying severe reading problems? How can Hannah’s problems be more clearly understood? Many educational researchers are suggesting that the direct assessment of phonological coding is an appropriate avenue of inquiry, because deficits in this area seem to serve as a bottleneck, impeding the development of robust reading skills (Morris, et al., 1998).

Returning to our case study, Hannah received the core version of the CTOPP appropriate for her chronological age. Her composite scores were all below average (Phonological Awareness, 75; Phonological Memory, 80; Rapid Naming, 79). Her scores were consistently below average on all of the core subtests. Since the opportunity for instruction should precede disability determinations (Vellutino, et al., 1996), Hannah received phonological segmentation training within a balanced, intensive remedial program (see Lennon & Slesinski, 1999; Wagner, et al., 1998; Vellutino, et al., 1996, for discussions of phonological processing and reading).
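The contrast between Hannah's IQ-achievement comparison and her CTOPP composites can be made concrete with a few lines of arithmetic. The Python sketch below uses only the scores reported in this review and expresses them in standard deviation units on the mean-100, SD-15 metric. It is an illustration rather than a diagnostic procedure, and the helper names are hypothetical.

```python
# Scores reported for Hannah in the text (all on a mean-100, SD-15 metric).
MEAN, SD = 100.0, 15.0
full_scale_iq = 105
broad_reading = 95
ctopp_composites = {
    "Phonological Awareness": 75,
    "Phonological Memory": 80,
    "Rapid Naming": 79,
}


def sd_units(score, mean=MEAN, sd=SD):
    """Distance of a score from the mean, in standard deviation units."""
    return (score - mean) / sd


def discrepancy_in_sd(aptitude, achievement, sd=SD):
    """Aptitude-achievement difference expressed in SD units."""
    return (aptitude - achievement) / sd


iq_reading_gap = discrepancy_in_sd(full_scale_iq, broad_reading)
composite_z = {name: sd_units(s) for name, s in ctopp_composites.items()}

print(f"IQ minus Broad Reading: {iq_reading_gap:.2f} SD")
for name, z in composite_z.items():
    print(f"{name}: {z:+.2f} SD from the mean")
```

The IQ-reading gap amounts to about two thirds of a standard deviation, while each CTOPP composite falls well over one standard deviation below the mean, which is the pattern the case is meant to illustrate.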

Hannah did not make as much relative improvement after 10 weeks as was hoped for. She could identify more alphabet letters than when first seen, but continued to have difficulty associating the appropriate letter sounds. In oral language skills she had difficulty segmenting compound words into syllables and syllables into phonemes. After 20 weeks of remediation, the Committee on Special Education concluded that Hannah was a difficult-to-remediate child, who would benefit from being followed in a formal manner by the Committee. Additional medical information about transient, recurrent ear infections was sought. She was identified as having a learning disability, but was continued in a similar remedial program because progress was noted in both phonological awareness and word reading after 20 weeks.


School psychologists may find the CTOPP challenging to administer. Familiarity and practice in using the audiocassette recorder are needed, since many of the subtests require its use and because many items are novel constructions. School psychologists may not be familiar with listening for subtle distinctions in phoneme repetition, for example. While experienced examiners may be tempted to discard the audiocassette recorder, standardization requires its use. Half-year age norms were helpful in Hannah’s case, but only year-level norms are available for examinees aged 7 and older. The face sheet and protocol are laid out logically, but do not provide enough space to record the examinee’s errors for subsequent clinical interest. The manual provides a discussion of severe discrepancies and guidance in computing difference scores, but little direction as to the meaning of such discrepancies. This section would probably be best omitted, particularly in light of the need for a more complete discussion of discrepancy scores as noted above.

Overall, the CTOPP appears to be an example of a theory-based, well-researched instrument measuring a domain-specific aspect of cognitive-linguistic functioning. It provides important information regarding processes that underlie beginning reading skills and, when used in conjunction with curriculum-based measures, trial teaching, and other formal assessment information, is likely to aid in understanding the problems some children have learning to read.


Adams, M. J. (1990). Beginning to read: Thinking and learning about print. Cambridge, MA: MIT Press.

Flanagan, D. P., McGrew, K. S., & Ortiz, S. O. (2000). The Wechsler intelligence scales and Gf-Gc theory: A contemporary approach to interpretation. Boston: Allyn and Bacon.

Fletcher, J. M., Francis, D. J., Shaywitz, S. E., Lyon, G. R., Foorman, B. R., Stuebing, K. K., & Shaywitz, B. A. (1998). Intelligent testing and the discrepancy model for children with learning disabilities. Learning Disabilities Research and Practice, 13, 186-203.

Gathercole, S. E., & Baddeley, A. D. (1990). Phonological memory deficits in language disordered children: Is there a causal connection? Journal of Memory and Language, 29, 336-360.

Lennon, J. E., & Slesinski, C. (1999). Early intervention in reading: Results of a screening and intervention program for Kindergarten students. School Psychology Review, 28, 353-364.

Lindamood, C., & Lindamood, P. (1971). Lindamood auditory conceptualization test. Austin, TX: PRO-ED.

Morris, R. D., Stuebing, K. K., Fletcher, J. M., Shaywitz, S. E., Lyon, G. R., Shankweiler, D. P., Katz, L., Francis, D. J., & Shaywitz, B. A. (1998). Subtypes of reading disability: Variations around a phonological core. Journal of Educational Psychology, 90, 347-373.

Robertson, C., & Salter, W. (1995). Phonological Awareness Test. East Moline, IL: LinguiSystems.

Siegel, L. S. (1989). IQ is irrelevant to the definition of learning disabilities. Journal of Learning Disabilities, 22, 469-479.

Stanovich, K.E. (1991). Discrepancy definitions of reading disability: Has intelligence led us astray? Reading Research Quarterly, 26, 7-27.

Torgesen, J. K., & Bryant, B. R. (1994). Test of phonological awareness. Austin, TX: PRO-ED.

Torgesen, J. K., & Wagner, R. K. (1998). Alternative diagnostic approaches for specific developmental reading disabilities. Learning Disabilities Research and Practice, 13, 220-232.

Vellutino, F. R., Scanlon, D. M., & Lyon, G. R. (2000). Differentiating between difficult-to-remediate and readily remediated poor readers: More evidence against the IQ-Achievement discrepancy definition of reading disability. Journal of Learning Disabilities, 33, 223-238.

Vellutino, F. R., Scanlon, D. M., Sipay, E. R., Pratt, A., Chen, R., & Denckla, M. B. (1996). Cognitive profiles of difficult-to-remediate and readily remediated poor readers: Early intervention as a vehicle for distinguishing between cognitive and experiential deficits as basic causes of specific reading disability. Journal of Educational Psychology, 86, 601-638.

Wagner, R. K., & Torgesen, J. K. (1987). The nature of phonological processing and its causal role in the acquisition of reading skills. Psychological Bulletin, 101, 192-212.

Wagner, R. K., Torgesen, J. K., & Rashotte, C. A. (1999). Comprehensive test of phonological processing. Austin, TX: PRO-ED.

Wagner, R. K., Torgesen, J. K., Rashotte, C. A., Hecht, S. A., Barker, T. A., Burgess, S. R., Donahue, J., & Garon, T. (1997). Changing relations between phonological processing abilities and word-level reading as children develop from beginning to skilled readers: A 5-year longitudinal study. Developmental Psychology, 33, 468-479.

Wolf, M., & Bowers, P. G. (1999). The double-deficit hypothesis for developmental dyslexias. Journal of Educational Psychology, 91, 415-438.