The purpose of a traditional articulation test is to assess phonological and articulatory development. Theoretically, two different articulation tests administered to the same child should yield similar results (Schissel & James, 1979). However, previous investigations have suggested that this may not be the case (Ogburn et al., 2008). In this investigation, two groups of children, ages 3 and 4, were tested with two articulation assessments, the Arizona Articulation Proficiency Scale-Third Revision (Arizona-3; Fudala, 2000) and the Clinical Assessment of Articulation and Phonology (CAAP; Secord & Donohue, 2002). Results showed a statistically significant difference between the two instruments, with children scoring 4.3 points lower on the CAAP than on the Arizona-3. Several explanations for this discrepancy are examined, including differences in test composition and normative data.
Speech-language pathologists (SLPs) administer tests of articulation to measure the accuracy of articulatory behavior and phonological development. Generally speaking, if two different, standardized, age-appropriate tests of articulation are administered under similar conditions, most clinicians expect that the outcomes will be similar (Schissel & James, 1979). However, several previous investigations have suggested that this is not necessarily the case.
Schissel and James (1979) conducted an investigation to determine whether differences existed between the standardized scores of two articulation tests, specifically the Deep Test of Articulation (DTA; McDonald, 1964) and the Arizona Articulation Proficiency Scale-Revised (AAPS; Fudala, 1974). They evaluated the participants' performance for individual phonemes, as well as global test performance, and found significant differences between the two assessment instruments. Most interestingly, these researchers found accuracy differences for only 8.2% of the phonemes tested, but this effect was predominant throughout the participant sample (i.e., approximately 83% of children tested displayed this trend). This indicates that the children's performance varied between the two instruments, an effect that could potentially influence the clinician's decision whether to treat the child. The authors went on to note that out of 29 participants, 18 displayed more accurate production on the DTA, whereas 14 participants demonstrated greater accuracy on the AAPS. Also, of the participants tested, eight children were more accurate on some phonemes on the DTA and on others on the AAPS, and only five children presented with performance that was similar on both measures.
The investigators also indicated that the DTA identified three children in need of treatment, whereas the AAPS did not identify them. Based on these findings, Schissel and James suggested that the scoring process of the AAPS may be deficient for the following reasons: (1) the AAPS assumes that a child will demonstrate consistently correct production, but the results of the DTA indicate that this is far from the case; (2) the number of trials on the AAPS is too low, so that, in contrast to the DTA, a child is credited with either 0% or 100% correct production, especially for consonant phonemes known to be frequently in error; and (3) the AAPS does not take into account the frequency of misarticulated consonants as compared to vowels.
The limitations of the AAPS notwithstanding, Schissel and James suggested that when examining overall test performance, clinicians would likely come to a similar conclusion regarding the need for treatment regardless of which test was administered. One reason for this could be the large number of phonemes that the children produced consistently on both tests. In fact, 91.8% of items were produced consistently across the AAPS and the DTA.