
Obtained Score Differences on the Mesulam Continuous Performance Test: A Comparison between ADHD Subjects and Controls

Ron Dumont, Casey Stevens, Margaret M. Dawson, Richard Guare, and Michael Weiler

Normative data for the Mesulam Continuous Performance Test were gathered from a sample of approximately 1500 controls aged 6-14. These data were used as the basis for comparing the Mesulam performance of 170 children independently identified as having an attention-deficit disorder (ADHD). Due to noted methodological shortcomings, this article serves merely to highlight differences in task performance between groups. As these differences were quite drastic, and thus compelling, research within a more rigorous framework is warranted. Should the Mesulam withstand closer scrutiny, it appears to have advantages over other, more expensive CPTs. It may be found useful as a screener in schools, administered to large groups of children at once, and as a useful tool to be employed in comprehensive psychoeducational evaluations.

Attention-deficit/hyperactivity disorder (ADHD), a disorder originating in childhood and characterized by inattention, impulsivity, and hyperactivity, is estimated to occur in 3 to 9 percent of the school-aged population (Szatmari, 1992), although it has been argued that rates might be much higher than commonly estimated (Shaywitz & Shaywitz, 1988). In recent years, the widespread recognition of the disorder by the popular press has led to increasing numbers of children referred for diagnosis. Along with this has come concern, among both professionals and the lay public, that the disorder is being over-diagnosed. Thus, there is a pressing need for accurate diagnostic procedures.

Diagnosing ADHD typically requires a medical and developmental history of the child along with parent and teacher ratings of behavior. While such information is essential, there is a subjective aspect to these data that can restrict their utility. Evaluators have sought to supplement this information with methods that enable attention problems to be observed and quantified on a more objective basis. In an effort to satisfy this need, laboratory tasks have been employed in the assessment of attention disorders. For example, beginning in the 1970s, Sykes (Sykes et al., 1973) engaged subjects in laboratory tasks to demonstrate deficits in sustained attention and vigilance. The early laboratory tasks required subjects to attend to visual or auditory stimuli and respond differentially to target and non-target stimuli. Thus, for example, letters would be flashed sequentially on a screen and the child would be asked to press a button each time the letter X appeared when it was preceded by the letter A.

More recently, common laboratory measures of sustained attention have included computerized continuous performance measures such as the Test of Variables of Attention (TOVA, Greenberg, 1991), the Conners’ Continuous Performance Test (CPT, Conners, 1994), and the Intermediate Visual and Auditory Continuous Performance Test (IVA, Sanford & Turner, 1994). Although presentation formats and task demands are similar to earlier attention tasks, the computerized nature of these measures moves them out of the realm of research tasks requiring sophisticated data analysis skills and specialized equipment and makes them accessible to evaluators for use as part of their diagnostic procedures. However, these tasks typically require, in addition to computers, an initial investment in software and in some cases, a per test administration fee.

Reviews of research on computerized continuous performance tasks have generally been favorable, and they are seen as playing a role, albeit limited, in the evaluation of attention disorders. Barkley and Grodzinsky (1994), for instance, evaluated the utility of neuropsychological measures, including continuous performance tests (CPTs), for distinguishing children with ADHD from normal controls and children with learning disabilities. They found CPT measures among the most useful of the assessment procedures investigated. Nonetheless, they noted that positive but not negative CPT findings can have diagnostic utility. Thus, while poor performance on a CPT measure was indicative of an attention disorder, good performance did not necessarily rule out attention disorders. These results have been replicated by Matier-Sharma and her colleagues (Matier-Sharma et al., 1995). With the proliferation of research on attention and other executive skill tasks, there remains hope that some combination of tests and behavioral observation procedures will further refine the diagnosis.

Paper and pencil tasks have also been used with varying results to examine aspects of attention. Cancellation tasks, which require the individual to cross out a figure or letter from a visual array, have been among the more popular paper and pencil tasks. Aman and Turbott (1986) found that a cancellation task discriminated between hyperactive and control subjects. Voeller and Heilman (1988), using a letter cancellation task, found that boys with attention disorders made more errors of omission than a group of normal controls. Weyandt and Willis (1994), however, did not find a significant difference between ADHD children and normal controls on a visual search task.

Should pencil-and-paper cancellation tasks prove effective in discriminating between ADHD and non-ADHD populations, they have several advantages over computerized measures. They are convenient, economical, and portable. Furthermore, they require less time to administer and score than do computerized measures.

As Barkley and others (1994) have noted, in order for a test to be diagnostically useful, it must be able to not only identify the children with ADHD, but it must also accurately identify children without ADHD.

Ellwood (1993) discusses parameters that can be used to examine a test’s diagnostic usefulness. Test specific parameters include sensitivity, or the proportion of individuals with a disorder that exhibit the sign (i.e., the proportion of children with ADHD who receive scores within the abnormal range) and specificity, or the proportion of individuals without a disorder that do not exhibit the sign (i.e., the proportion of controls who receive scores within the normal range). These two parameters are calculated in the research setting by first knowing the diagnosis of the children (through test-independent criteria) and noting how they perform on the test of interest. However, as Ellwood (1993) points out, this is the opposite of the way an evaluator uses a test. The evaluator starts with the test score and attempts to determine the child’s diagnosis. In order to judge the usefulness of a test for this purpose, the evaluator will need to look at a test’s sensitivity and specificity in light of the disorder’s base rate in their referral population.

For example, if a test was used as a screening measure on a population of 1000 children in which 4% (40) of the children have ADHD, and that test gives an abnormal score for 90% of the children with ADHD (i.e., sensitivity) and gives a normal score for 90% of the children without ADHD (specificity), the following diagnostic properties result.

Table 1.  Calculation of Sensitivity, Specificity, PPP, and NPP

                    ADHD        Control       Total
Abnormal Score      a = 36      b = 96          132
Normal Score        c = 4       d = 864         868
Total               40          960            1000

Sensitivity = a/(a+c) = .90

Specificity = d/(b+d) = .90

PPP = a/(a+b) = .27

NPP = d/(c+d) = .99

Using this table, one can calculate Positive Predictive Power (PPP), or the chances that a child who receives an abnormal test score actually has ADHD: PPP = a/(a+b) = 36/132 = 0.27. Negative Predictive Power (NPP), or the chances that a child who receives a normal test score does not have ADHD, is calculated in the same way: NPP = d/(c+d) = 864/868. A test with 90% sensitivity and specificity has restricted usefulness as a diagnostic tool if it is used on a population with a 4% base rate of the disorder, because a child who receives an abnormal score is still much more likely to be a control than a child with ADHD.
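The arithmetic behind Table 1 can be reproduced with a short sketch. The function name `predictive_power` and its parameter names are illustrative assumptions rather than anything defined in the original study; the inputs are the hypothetical values used in the example above (1000 children, a 4% base rate, and sensitivity and specificity of .90).

```python
def predictive_power(n, base_rate, sensitivity, specificity):
    """Return the 2x2 cell counts plus PPP and NPP for a screening test.

    Hypothetical helper for illustration; cells a, b, c, d follow Table 1.
    """
    with_disorder = n * base_rate
    without_disorder = n - with_disorder
    a = sensitivity * with_disorder       # abnormal score, has the disorder
    c = with_disorder - a                 # normal score, has the disorder
    d = specificity * without_disorder    # normal score, no disorder
    b = without_disorder - d              # abnormal score, no disorder
    return {
        "a": a, "b": b, "c": c, "d": d,
        "ppp": a / (a + b),               # Positive Predictive Power
        "npp": d / (c + d),               # Negative Predictive Power
    }


result = predictive_power(n=1000, base_rate=0.04, sensitivity=0.90, specificity=0.90)
print(round(result["ppp"], 2))  # 0.27
```

Working from counts rather than probabilities keeps the sketch aligned with the cells of the table, so each intermediate value can be checked against it by eye.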

There are two ways of making such a test more useful diagnostically: 1) Use it on a population with a higher base rate of the condition (e.g., the base rate of ADHD at an ADHD clinic is likely much higher than the 3-9% identified in the general population), and/or 2) increase the cutoff point for abnormal scores so that specificity is as high as possible.
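The effect of the first option can be illustrated with Bayes' theorem written in terms of proportions. This is a minimal sketch: the function name `ppp` is hypothetical, and the 20% and 50% base rates are assumed values standing in for clinic-type referral populations, not figures reported in the study.

```python
def ppp(sensitivity, specificity, base_rate):
    """Positive Predictive Power via Bayes' theorem (hypothetical helper)."""
    true_positive = sensitivity * base_rate
    false_positive = (1.0 - specificity) * (1.0 - base_rate)
    return true_positive / (true_positive + false_positive)


# Same test (sensitivity = specificity = .90) applied to populations
# with different, assumed base rates of the disorder.
for base_rate in (0.04, 0.20, 0.50):
    print(f"base rate {base_rate:.2f}: PPP = {ppp(0.90, 0.90, base_rate):.2f}")
```

With sensitivity and specificity both held at .90, PPP rises from about .27 at a 4% base rate to about .69 at 20% and .90 at 50%.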

Greenberg and Crosby (manuscript in preparation) examined sensitivity and specificity by comparing the performance of persons diagnosed with ADHD with that of normal children on the Test of Variables of Attention (TOVA). Using a cutoff set for a 20% false positive rate, the TOVA yielded sensitivity and specificity rates of .73. If this test were used to screen a 4% base rate population for ADHD, only 10% of the test positives would be true positives (PPP), while 97% of the children who received a normal score would be true negatives (NPP).

In addition to distinguishing between children with and without an attention disorder, an instrument that purports to be a valid measure of attention should also be sensitive to the developmental changes associated with attention. As with other cognitive traits such as memory, research suggests that as children age, their attentional capacity increases, and a variety of measures have been used to demonstrate this phenomenon.

Routh, Schroeder, and O’Tuama (1974) found that activity levels in children declined with age when measured by open-field activity ratings and parent-completed behavior checklists. Levy (1980) found age-related changes in sustained attention using a continuous performance test, in motor inhibition using a line drawing task, and motility using a "Ballistographic Chair." These results suggest that cancellation tasks, such as those employed in the present comparative study, should also demonstrate age-related trends.

The purpose of these studies was to develop a set of normative data for a population of normal children, ages 6-14, on a simple letter cancellation task, and to compare these data with the performance of an ADHD population.

METHOD

Materials

The Mesulam Continuous Performance Test consists of two pages with the letters of the alphabet printed in uppercase. On one page (Ordered), the letters are placed in neat, orderly rows and columns, while on the second page (Random), the letters are placed in a haphazard fashion, with no apparent order imposed. On both pages, 60 A’s are placed among the other letters. Regardless of the page, the A’s are in the same location, dispersed symmetrically, with approximately 15 in each of the 4 quadrants. Figure I shows a portion of the Ordered and Random page.

Figure I

Study 1

Normative Sample

Normative data for the Mesulam were initially gathered from 1540 children (825 male, 715 female; age range 6 to 14 years, grade range 1 to 8). Participants were drawn from eleven schools (nine public and two parochial) located in three predominantly white, middle-class, urban and rural school districts in southern New Hampshire. The classroom teachers were surveyed to identify those children already diagnosed as, or suspected of, having learning difficulties. Approximately 11% (169 children) of the total sample were thus flagged by their classroom teachers as having a specified learning disability, as having a physical or cognitive impairment that prevented them from attempting the Mesulam test, or as having been formally identified with an attention-deficit disorder with or without hyperactivity. These children were administered the Mesulam but were later excluded from the "normative" group, which resulted in a total of 1371 children included in this normative study. Testing was done at the end of one school year (May) and at the beginning of the next school year (September, 1993).

Procedure

All children were tested as a group in their classrooms. Separate grade level classrooms were chosen at random from the participating schools. Each child was given a crayon and the Mesulam protocol face down on the classroom desk. The children were given complete instructions that included the explanation that they were to "find all the ‘A’s and circle them. Do this as quickly as you can and when you think you have circled all that there are, turn your paper over." When the tester told the children to start, they were to pick up their crayon, turn the paper over, and begin the task. Because the children were tested as a group and not individually, and because this task was not considered to be a speeded task, the administration procedure included only approximate, not precise, timing of each completion. Children in all grades were limited to 7 minutes to complete the task (very few children reached this time limit, and those who did were generally finished and simply rechecking the protocols at the end of the 7 minutes). The Mesulam was administered in a balanced manner, with 849 children being administered the Ordered page first (OP1 Administration), while 691 were administered the Random page first (RP1 Administration).

Measures of performance on the Mesulam include:

    • Ordered Page Errors (OPE): This is the total number of "A"s that were missed on the Ordered page, regardless of administration order.
    • Random Page Errors (RPE): This is the total number of "A"s that were missed on the Random page, regardless of administration order.
    • Total Errors (TE): This is the total number of "A"s that were missed on both the Ordered and Random pages, regardless of administration order.

Each child’s protocol was scored for errors (missing an "A"). Ordered page errors (OPE), Random page errors (RPE) and Total errors (TE) were calculated for every student.
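As a rough sketch of this scoring scheme, the three error counts can be derived by comparing the 60 targets on each page with the targets a child actually circled. The identifiers below (`count_errors`, target IDs 1 through 60, and the example child) are hypothetical conveniences for illustration; the source does not describe any scoring software.

```python
TARGETS_PER_PAGE = 60  # each page contains 60 A's


def count_errors(circled_targets):
    """Number of A's missed on one page (errors of omission)."""
    all_targets = set(range(1, TARGETS_PER_PAGE + 1))
    return len(all_targets - set(circled_targets))


# Hypothetical child: misses targets 7 and 41 on the Ordered page
# and target 12 on the Random page.
ordered_circled = [t for t in range(1, 61) if t not in (7, 41)]
random_circled = [t for t in range(1, 61) if t != 12]

ope = count_errors(ordered_circled)  # Ordered Page Errors
rpe = count_errors(random_circled)   # Random Page Errors
te = ope + rpe                       # Total Errors
print(ope, rpe, te)  # 2 1 3
```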

RESULTS

Means were computed for each of the nine age groups (6 to 14), and for both administration procedures (OP1 and RP1). The effects of sex, age, and order of administration were analyzed. No significant differences were found between groups according to sex. In contrast, highly significant effects were found for both page administration order and age. The effects of administration order and age were assessed by a two-factor analysis of variance. OPE were found to be significantly affected by both administration order (F=6.709, P=.0097) and age level (F=14.654, P=.0001). RPE and TE were not significantly affected by administration order (RPE: F=.098, P=.7652; TE: F=2.934, P=.0869), but significant effects were noted for age level (RPE: F=10.379, P=.0001; TE: F=18.045, P=.0001). Because of these effects, separate normative tables based upon order of administration and age are provided. Table II presents data for the entire sample separated by age and order of page administration.

Table II: Mesulam CPT Means and Standard Deviations for Normative Population by Page Administration Order

Ordered Page First (OP1) Administration

                     OPE             RPE             TE
Age          n       M      SD       M      SD       M      SD
six         80    3.60    4.07    1.79    1.97    5.39    4.81
seven      124    2.90    3.23    2.30    2.96    5.20    4.93
eight      142    2.51    3.18    1.56    2.06    4.07    4.14
nine       102    1.40    1.62    1.32    1.63    2.72    2.40
ten         99     .91    1.21     .80    1.05    1.71    1.58
eleven      74    1.00    1.45     .77    1.35    1.77    1.94
twelve      76     .61     .93     .43     .79    1.04    1.19
thirteen   102     .25     .48     .31     .69     .56     .84
fourteen    50     .52     .89     .40     .76     .92    1.21

Random Page First (RP1) Administration

                     RPE             OPE             TE
Age          n       M      SD       M      SD       M      SD
six         23    1.96    1.85     .74    1.05    2.70    2.16
seven       56    1.98    2.28    1.89    2.05    3.88    3.04
eight      167    1.59    1.85    1.20    1.54    2.79    2.50
nine        71    1.49    1.71    1.13    1.34    2.62    2.26
ten         45    1.22    1.59     .98    1.42    2.20    2.19
eleven      51     .55     .92     .35     .52     .90    1.22
twelve      59     .54     .92     .17     .38     .71    1.00
thirteen    34     .32     .64     .21     .54     .53     .83
fourteen    16     .50     .97     .38     .81     .88    1.31

Errors on each page, regardless of administration order, demonstrated an expected age effect, with each variable generally improving from one age to the next. For all measures, error scores are highest for the 6- to 9-year-old groups and tend to level out in the older groups. Order of page administration most strongly affected the results for the younger children, ages 6, 7, and 8.

For ages 6, 7, and 8 there was a pronounced difference in the total number of errors (TE) made depending on the order in which the pages were administered. For example, six-year-olds given OP1 Administration averaged 5.39 total errors, while six-year-olds given RP1 Administration averaged only 2.70 total errors. Interestingly, most errors occurred on the first page presented (Ordered or Random). Whichever page was administered to the child first yielded comparatively more errors than the succeeding page. Six-year-olds given OP1 Administration averaged 3.60 errors on the Ordered page, compared to 1.79 errors on the Random page. Similarly, six-year-olds given RP1 Administration averaged 1.96 errors on the Random page, while averaging .74 errors on the Ordered page.

Because of the effects of age and order of administration on test performance, the age-based norms for OP1 administration were selected for comparison with a sample drawn from an ADHD population.

Study 2

ADHD Sample

A total of 170 children between the ages of 6 and 14 were identified as having an attention-deficit disorder with or without hyperactivity. The children had been referred to an independent center connected to a major medical facility for evaluation of attentional disorders. The children were diagnosed as having ADHD on the basis of history, psychological interview by one of three psychologists, psychological/educational testing, and behavior rating scales. Determination of ADHD followed criteria set forth in DSM-III/IV. Included in the testing battery, but not used as a criterion for diagnosis, was the Mesulam CPT. All children were administered the Mesulam with the Ordered page first (OP1). Table III shows the Ordered Page Error (OPE), Random Page Error (RPE), and Total Error (TE) means and standard deviations for the ADHD group. Also included are the results of two-tailed independent t-tests conducted to determine if there were differences in TE at each age between the control and ADHD groups.

Table III: AD/HD Sample Means and Standard Deviations for OP1 Administration and Two-Tailed Independent t-Test Comparisons for Total Error - AD/HD Means vs. Normative Means

Ordered Page First (OP1) Administration

                     OPE             RPE             TE
Age          n       M      SD       M      SD       M      SD       t
six          9   13.78    9.62   13.67   11.14   27.44   18.86    3.51 **
seven       27   10.56    9.81   13.46    8.97   24.23   16.70    5.81 **
eight       32    6.88    7.50    9.68    7.57   16.52   13.03    5.32 **
nine        21    6.62    7.77   10.57    9.33   17.19   15.68    4.23 **
ten         32    3.97    2.98    6.16    5.74   10.13    7.24    6.58 **
eleven       8    1.13    1.46    2.88    4.09    4.00    5.18    1.22 *
twelve      19    3.68    3.87    5.42    4.72    9.06    6.77    4.89 **
thirteen     9    2.00    1.23    2.00    1.41    4.25    2.05    5.08 **
fourteen    13    2.31    2.14    3.15    3.81    5.46    4.22    3.89 **

Normative (n=849) and ADHD (n=170)

* p<.01, ** p<.0001
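A t statistic of this kind can be recomputed from published means, standard deviations, and group sizes alone. The sketch below assumes an unequal-variance (Welch) test; the article does not state which form of the independent t-test was used, so both that choice and the function name `welch_t` are assumptions. The inputs are the age-six Total Error values for the ADHD group and for the OP1 normative group.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and approximate degrees of freedom
    computed from summary statistics (hypothetical helper)."""
    v1 = sd1 ** 2 / n1  # squared standard error, group 1
    v2 = sd2 ** 2 / n2  # squared standard error, group 2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Age six, Total Errors: ADHD (M=27.44, SD=18.86, n=9)
# vs. OP1 normative group (M=5.39, SD=4.81, n=80)
t, df = welch_t(27.44, 18.86, 9, 5.39, 4.81, 80)
print(f"t = {t:.2f}, df = {df:.1f}")
```

Under this assumption the age-six comparison gives a t of roughly 3.49, close to but not identical with the tabled 3.51; the small gap may reflect rounding in the published means and standard deviations or a different test variant.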

The ADHD group had more errors on each of the three measures than the normative group at each age level. Comparing the ADHD group’s performance on the Ordered and Random pages to that of the normative group showed an unexpected reversal in the pattern of errors. While the normative group typically did better on the second, Random page when given Ordered Page First (OP1) Administration, the ADHD group did not show this positive trend. ADHD children given OP1 Administration typically had more errors on the second, Random page. At all ages except one (age eleven), the ADHD group committed significantly more TE than did the normative group.

DISCUSSION

This paper presents the results of norming procedures for the Mesulam Continuous Performance Test for a large sample of school-aged children. Results demonstrate the need for caution when administering such tasks, since minor changes (page administration order) had major effects on the results. This set of studies suggests that the Mesulam may potentially serve as a quick, inexpensive evaluative tool in the diagnosis of ADHD.

Evaluators in schools may find the Mesulam a useful tool in screening large numbers of children quickly and inexpensively. Both pages of the Mesulam can be easily group-administered to whole classrooms in less than 15 minutes. The normative data imply developmental trends in attentional capacity, thereby lending credence to its validity.

The Mesulam task should be considered only one component of the complete multi-method approach that is needed in making accurate diagnoses of this disorder.

Another possible limiting factor of this study is that this sample may not be representative of the broader population and additional norming may be needed. Evaluators may wish to develop their own "local norms".

The ability of the Mesulam to distinguish children with ADHD from those with other disorders has not yet been established. Children may often display characteristics that resemble ADHD, but suffer from symptoms that are more appropriately attributed to other disorders. Multi-method evaluations should address whether the learning problem is a manifestation of a specific learning style, underlying process disorder, neurological deficit, or attentional disorder.

REFERENCES

Aman, M.G., & Turbott, S.H. (1986). Incidental learning, distraction, and sustained attention in hyperactive and control subjects. Journal of Abnormal Child Psychology, 14(3), 441-455.

Barkley, R. A., & Grodzinsky, G. M. (1994). Are tests of frontal lobe functions useful in the diagnosis of Attention Deficit Disorders? The Clinical Neuropsychologist, 8, 121-139.

Conners, C. K. (1994). Conners’ Continuous Performance Test, User’s Manual. Toronto: Multi-Health Systems.

Ellwood, R.W. (1993). Clinical discriminations and neuropsychological tests: An appeal to Bayes’ theorem. The Clinical Neuropsychologist, 7, 224-233.

Greenberg, L. M., & Crosby, R. D. (Manuscript in preparation). The specificity and sensitivity of the Tests of Variables of Attention.

Greenberg, L. M. (1991). T.O.V.A. Interpretation Manual. Minneapolis, MN: Author.

Levy, F. (1980). The development of sustained attention (vigilance) and inhibition in children: Some normative data. Journal of Child Psychology and Psychiatry, 21, 77-84.

Matier-Sharma, K., Perachio, N., Newcorn, J.H., Sharma, V., & Halperin, J.M. (1995). Differential diagnosis of ADHD: Are objective measures of attention, impulsivity, and activity level helpful? Child Neuropsychology, 1, 118-127.

Routh, D.K., Schroeder, C.S., & O’Tuama, L. (1974). Development of activity level in children. Developmental Psychology, 10, 163-168.

Sanford, J.A., & Turner, A. (1995). Intermediate Visual and Auditory Continuous Performance Test. Richmond: BrainTrain.

Shaywitz, S. E., & Shaywitz, B. A. (1988). Attention-deficit disorder: Current perspectives. In J. F. Kavanagh & T. J. Truss (Eds.), Learning disabilities: Proceedings of the national conference. Parkton, MD: York Press.

Sykes, D.H., Douglas, V.I., & Morgenstern, G. (1973). Sustained attention in hyperactive children. Journal of Child Psychology and Psychiatry, 14, 213-220.

Szatmari, P. (1992). Epidemiology of attention-deficit hyperactivity disorder. In G. Weiss & M. Lewis (Eds.), Child and adolescent psychiatric clinics in North America, Attention-deficit hyperactivity disorder (pp. 361-371). Philadelphia: W. B. Saunders.

Voeller, K.K.S., & Heilman, K.M. (1988, September). Motor impersistence in children with attention-deficit/hyperactivity disorder: Evidence for right-hemisphere dysfunction. Presentation made at the seventeenth national meeting of the Child Neurological Society, Halifax, Canada.

Weyandt, L.L., & Willis, W.G. (1994). Executive functions in school-aged children: Potential efficacy of tasks in discriminating clinical groups. Developmental Neuropsychology, 10(1), 27-38.

Zametkin, A. J. (1995). Attention-Deficit Disorder: Born to be hyperactive? Journal of the American Medical Association, 273, 1871-1874.