- OUTLINE FOR THE EVALUATION OF A TEST
- John Willis, Ed.D. & Ron P. Dumont, Ed.D., NCSP
- Title of the Test
- Author(s)
- Publisher
- Date of Publication
- Date of previous editions
- Forms Available
- Cost
- 1. Manual
- Does a manual accompany the test?
- Adequacy
- Is there a separate technical manual and at what cost?
- 2. Stated purpose of the test?
- Definition of Construct
- "Dumbed Down Tests" (tests designed for adults or adolescents redone for children)
- 3. Does the name of the test reflect the test content?
- Do the names of the individual subtests (where applicable) reflect the content?
- 4. Form(s) of the items (oral, hands-on, multiple-choice, fill-ins, etc.)
- Are there problems with this form or content?
- Is scoring ambiguous?
- Do the items appear to measure what was intended? (e.g., Do reading items really test memory?)
- 5. Basis of the arrangement of the items in the test?
- Subtests
- Scales
- Spiral Omnibus
- Random
- Hierarchical
- Homogeneity: Changes within subtests
- Distinctness
- Sexism and other biases
- 6. Printing, format and arrangement of test items.
- Easels and other hardware
- Color use: does it help or hurt?
- Readability
- 7. Protocols
- Room to write
- Answers to examinee
- Report forms
- Clarity
- Ease of use
- Do they encourage use of confidence bands? Do they offer 90% and 95% bands?
- 8. Directions for administration
- Clarity and adequacy?
- Location (manual/protocol/both)
- Flexibility
- Age appropriateness
- 9. Directions to the examinee
- Clarity and adequacy
- Natural or Stilted
- Boehm's basic concepts
- Alternative directions
- 10. Time limits and bonuses?
- Are they justified?
- Are there alternatives?
- 11. Teaching items?
- Scored or unscored
- Adequacy of instructions
- Can you teach over and over?
- 12. Test materials
- Child safety
- Ease of use
- Durability
- 13. Scoring
- Is scoring easy? objective? subjective? arbitrary? agreed upon?
- Are there adequate samples of correct answers?
- Rotation errors: differences on tests
- Are printed norms tables also available?
- Is a computer program necessary?
- Is a computer program provided?
- 14. Raw score conversions
- Interpolation
- Which standard scores are reported?
- Age scores: Why/why not
- Grade scores: Why/why not
- Percentiles
- Standard scores:
- Z
- T
- Stanines
- Deviation quotients (M=100, s.d.=15 or 16)
- Others
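The standard-score types listed under item 14 are all rescalings of the same z score. A minimal Python sketch of those relationships follows. It assumes a linear transformation from a known raw-score mean and standard deviation and a normal distribution for percentiles. Published tests normally use empirically derived norms tables instead, and the function names here are illustrative, not taken from any test manual.

```python
import math
from statistics import NormalDist


def z_score(raw, mean, sd):
    """Distance of a raw score from the norm-group mean, in SD units."""
    return (raw - mean) / sd


def t_score(z):
    """T score: mean 50, SD 10."""
    return 50 + 10 * z


def deviation_quotient(z, sd=15):
    """Deviation quotient: mean 100, SD 15 (or 16 on some older tests)."""
    return 100 + sd * z


def stanine(z):
    """Stanine: mean 5, SD 2, whole numbers limited to 1 through 9."""
    return max(1, min(9, math.floor(2 * z + 5.5)))


def percentile_rank(z):
    """Percentile rank under the ASSUMPTION of a normal distribution."""
    return 100 * NormalDist().cdf(z)


z = z_score(130, 100, 15)          # hypothetical raw-score mean and SD
print(z, t_score(z), deviation_quotient(z, 16), stanine(z))
```

Seeing the scores side by side makes it easier to check whether a manual's reported scores are consistent with one another, and whether interpolated values in a norms table look plausible.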
- 15. Standardization groups?
- Total
- Number per year of age
- National representation
- Breakdowns
- 16. For what groups is the test designed?
- Recent
- Relevant
- Representational
- Age
- Grade
- Sex
- SES
- Education
- Geographic regions
- Urban vs. rural
- Ethnicity
- Disabilities
- 17. Reliability coefficients
- Internal (split halves)
- Alternate forms
- Test-retest
- practice effect
- inflation of r
- Length of test
- Test-retest interval
- SEm
- SEest
- Inter-rater reliability
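Several entries under item 17 reduce to short classical test theory formulas, sketched below in Python. The sketch makes three assumptions. SEm is read as the standard error of measurement. SEest is read as the standard error of estimation for an estimated true score. The effect of test length is shown with the Spearman-Brown formula. The function names and the example values are hypothetical and are not drawn from any particular test.

```python
import math
from statistics import NormalDist


def sem(sd, reliability):
    """Standard error of measurement: SD * sqrt(1 - r)."""
    return sd * math.sqrt(1 - reliability)


def confidence_band(score, sd, reliability, level=0.95):
    """Band around the OBTAINED score, e.g. level=0.90 or 0.95."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sem(sd, reliability)
    return score - half_width, score + half_width


def estimated_true_score(score, mean, reliability):
    """Obtained score regressed toward the mean: M + r * (X - M)."""
    return mean + reliability * (score - mean)


def see(sd, reliability):
    """Standard error of estimation (assumed meaning of SEest)."""
    return sd * math.sqrt(reliability * (1 - reliability))


def spearman_brown(reliability, length_factor):
    """Predicted reliability if the test were length_factor times as long."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


low, high = confidence_band(100, 15, 0.91, level=0.95)
print(round(sem(15, 0.91), 2), round(low, 1), round(high, 1))
```

A reviewer can use functions like these to check whether the confidence bands printed on a protocol match the reliability coefficients reported in the manual.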
- 18. Validity
- For what purpose?
- Content
- are the questions appropriate?
- are there enough questions?
- level of mastery being measured?
- Criterion
- concurrent vs. predictive
- Construct
- Discriminant use vs. divergent use
- 19. Factor analysis
- Exploratory
- Confirmatory
- Rotations
- Different groups
- Variance
- Common
- Error
- Specificity
- 20. User friendliness
- Administrator
- Client: Take it yourself
- 21. References
- Antiquity
- Authors of bibliography
- Relevance to current edition
- 22. Interpretation
- Base rate
- Definitions for constructs and shared abilities
- Multiple comparison tables (critical values)
- Significance vs. abnormality (unusualness vs. importance) (scatter)
- Testing the metaphysically handicapped (dead)
- What a difference a day makes
- Table Games
- Floors and Ceilings
- Descriptive terms
- Errors
- Cautions
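The entries on base rates, multiple comparisons, and significance versus abnormality under item 22 can be illustrated with two small calculations, sketched here in Python. The first gives the critical value a difference between two scores must exceed to be statistically significant, with an optional Bonferroni adjustment when several comparisons are made. The second estimates how often a difference that large occurs in the population (its base rate). It assumes normally distributed scores on the same scale and uses the correlation between the two scores. Manuals usually report base rates from the actual standardization sample instead. All names and values are hypothetical.

```python
import math
from statistics import NormalDist


def critical_difference(sem_a, sem_b, alpha=0.05, comparisons=1):
    """Smallest difference that is significant at alpha (two-tailed).

    With comparisons > 1, alpha is divided by the number of comparisons
    (a Bonferroni adjustment).
    """
    adjusted_alpha = alpha / comparisons
    z = NormalDist().inv_cdf(1 - adjusted_alpha / 2)
    return z * math.sqrt(sem_a ** 2 + sem_b ** 2)


def base_rate(difference, sd, r_ab):
    """Proportion expected to show a difference this large in either direction.

    ASSUMES both scores share the same SD and are normally distributed;
    r_ab is the correlation between the two scores.
    """
    sd_of_difference = sd * math.sqrt(2 - 2 * r_ab)
    return 2 * (1 - NormalDist().cdf(abs(difference) / sd_of_difference))


print(round(critical_difference(3, 4), 1))     # one comparison
print(round(critical_difference(3, 4, comparisons=10), 1))
print(round(base_rate(15, 15, 0.5), 2))
```

The two results answer different questions: a difference can be statistically reliable yet common in the population, which is why both critical values and base rates are worth looking for in a manual.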