Clinical variables at age 2 predictive of mental retardation at age 5 in children with pervasive developmental disorder


Toshinobu Takeda, MD, Department of Neuropsychiatry, Graduate School of Medicine, University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, Japan.


Abstract  This study attempted to find clinical variables evaluated at age 2 that would predict mental retardation (MR, IQ/cognition-adaptation developmental quotient [C-A DQ] < 70) at age 5 in 57 children with pervasive developmental disorder (PDD). About two-thirds of subjects had MR at both initial and outcome evaluations. The C-A DQ at initial evaluation was significantly lower in mentally retarded PDD (MRPDD) than in high-functioning (IQ ≥ 70) PDD (HFPDD). MRPDD changed less than HFPDD in IQ/C-A DQ between ages 2 and 5. The C-A DQ at age 2 was a potent predictor for MR at age 5 and the total score and three item scores of Childhood Autism Rating Scale-Tokyo Version evaluated at age 2 were also useful in predicting MR at age 5.


Pervasive developmental disorders (PDD) are now considered to have a higher prevalence rate than previously thought: 30–60 per 10 000 children according to Fombonne's review.1 Although high-functioning PDD children seem to be the majority of PDD,2 mentally retarded PDD children need more intensive early interventions3 than their high-functioning counterparts. Several studies have demonstrated the effectiveness of early intervention for PDD children;4–8 early prediction of mental retardation in PDD infants is therefore important to facilitate their development and to help their families.

Prognostic studies on PDD to date9–13 have demonstrated that an IQ or developmental level around age 5 was a good predictor of mental development of PDD children up to adolescence and adulthood. If there is a clue in early infancy to predict mental retardation in PDD children around age 5, it would help professionals to plan early therapeutic intervention for such infants. However, we do not seem to have such a predictor.

In this study, we attempted to identify predictors of mental retardation (MR, IQ < 70) at age 5 in PDD children from clinical variables including developmental quotient (DQ) and autistic symptoms evaluated at age 2.


The subjects of this study were 57 PDD children (45 males and 12 females) extracted from all PDD children visiting two clinics in and near the Tokyo region, specializing in developmental disorders and related conditions, provided that the subjects had received both initial evaluation at age 2 (mean age at initial evaluation [MI] = 31.4 months, SD = 3.3, range = 23–35) and outcome evaluation at age 5 or later (mean age at outcome evaluation [MO] = 67.0 months, SD = 7.0, range = 60–90), around the age of entering a primary school in Japan, on instruments introduced later.

At the time of their first visit to each clinic, an experienced social worker observed them and interviewed their parents about their family and developmental histories for an average of 1.5 h. Then, experienced pediatric neurologists and child psychologists conducted neurological and psychological evaluations, including the administration of developmental tests. Based on these data, an experienced child psychiatrist (HK) performed a child psychiatric evaluation of the children. After that, clinical teams consisting of the experienced professionals who had participated in the evaluation of the children made diagnoses of PDD and its subcategories by consensus according to DSM-IV.14 This process was repeated at each follow-up visit. Since no standardized and structured diagnostic instruments for autism were yet available in Japan, close attention was paid when making diagnoses at every visit.

We employed ICD-1015 criteria for atypical autism with atypicality in symptomatology to diagnose PDDNOS, which is defined as a residual category of PDD and has no operational criteria in DSM-IV. Buitelaar and van der Gaag proposed similar diagnostic criteria for PDDNOS which required a total of three or more items from (1), (2), and (3), including at least one item from (1) of criterion A of DSM-IV criteria for autistic disorder.16

Of the 57 PDD children, 18 (12 males, six females) were diagnosed as having autistic disorder (AD), 38 (32 males, six females) as having PDDNOS, and one boy as having Asperger's disorder (AS) at age 2. All 38 PDDNOS children also satisfied Buitelaar and van der Gaag's PDDNOS criteria.16 By the time of outcome evaluation, the diagnoses remained what they had been at initial evaluation in all but three PDDNOS children. In one PDDNOS child the diagnosis was changed to AS at age 3, because restricted repetitive patterns of interest became prominent enough to satisfy AS criteria. In the other two PDDNOS children the diagnosis was later changed to AD for similar reasons.

The 57 PDD children had no comorbid medical conditions other than mental retardation (MR), except for one female with AD who had her first seizure during the follow-up period.

Of the 57 PDD children, 39 (MI = 31.7 months, SD = 3.1; MO = 66.2, SD = 6.0; 31 males, eight females; 15 AD, 24 PDDNOS) had an IQ/C-A DQ (as introduced later) of under 70 at outcome evaluation (called mentally retarded PDD: MRPDD) and 18 (MI = 30.9 months, SD = 3.8; MO = 68.6, SD = 8.7; 14 males, four females; five AD, 11 PDDNOS, two AS) had an IQ/C-A DQ of 70 or over at outcome evaluation (called high-functioning PDD: HFPDD). There was no significant difference between the MRPDD and HFPDD groups in age at initial or outcome evaluation, sex ratio, or the distribution of PDD subcategories (Table 1).

Table 1.  Demographics of subjects

                                      Outcome evaluation
                                      MRPDD (n = 39)     HFPDD (n = 18)
Female                                8                  4
AD                                    15                 5
AS                                    0                  2
Age (months) at initial evaluation    31.7 (SD = 3.1)    30.9 (SD = 3.8)
Age (months) at outcome evaluation    66.2 (SD = 6.0)    68.6 (SD = 8.7)

AD, Autistic Disorder; AS, Asperger's Disorder; HF, High-Functioning (70 ≤ IQ); MR, Mental Retardation (IQ or C-A DQ < 70); PDD, pervasive developmental disorder; PDDNOS, PDD not otherwise specified.


Tanaka-Binet intelligence test (B-test)

The B-test is the Japanese version of Stanford-Binet standardized and revised in 1987.17 In this study, the B-test was administered to 24 subjects (16 HFPDD, eight MRPDD) at outcome evaluation. Only two subjects (both HFPDD) were administered the Japanese version of WISC-III at outcome evaluation.

Kyoto scale of psychological development (K-test)

The K-test is a widely used developmental test standardized on 1562 Japanese children aged 0 to 13 years, with satisfactory reliability and validity.18 It is also frequently used to assess the development of mentally handicapped infants and young children in Japan. The K-test consists of three subtests: posture-movement (P-M), cognition-adaptation (C-A) and language-sociability (L-S). The P-M subtest consists of two items: standing and footstep. The C-A subtest consists of 10 items: toy blocks, block design, task box/square composition, little bell, puzzle/figure discrimination, puzzle/folding paper, drawing, figure drawing, nesting cup/weight comparison, and cups/memory/hitting blocks. The L-S subtest consists of nine items: digit span, counting fingers, counting/calculation, selecting numbers, comparison, pointing by finger/vocabulary/naming, pointing figures/name/omission, gender/right-and-left discrimination, and comprehension/words definition. In each subtest, a score is converted to a developmental age (DA), and a full-scale DA is also obtained. Developmental quotients (DQ) (full-scale DQ, P-M DQ, C-A DQ, and L-S DQ) are calculated as the ratio of each DA to chronological age (CA), expressed as a percentage and rounded to the nearest whole number.
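The quotient described above can be sketched in a few lines. This is only an illustration: it assumes both ages are given in months and that exact halves are rounded up, neither of which is specified here, and the function names are hypothetical. The threshold of 70 is the study's definition of MR.

```python
import math


def developmental_quotient(developmental_age, chronological_age):
    """Return DQ = DA / CA x 100, rounded to the nearest whole number.

    Both ages must be in the same unit (months are assumed here).
    Exact halves are rounded up; that tie rule is an assumption.
    """
    if chronological_age <= 0:
        raise ValueError("chronological age must be positive")
    return math.floor(100 * developmental_age / chronological_age + 0.5)


def has_mental_retardation(quotient):
    """Apply the study's definition of MR: IQ or C-A DQ below 70."""
    return quotient < 70


# Illustrative values only, not data from the study.
example_dq = developmental_quotient(22, 31)
print(example_dq, has_mental_retardation(example_dq))
```

For a child with a developmental age of 22 months at a chronological age of 31 months, the sketch gives a DQ of 71, which falls just above the MR threshold.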

Since the B-test is difficult or even impossible to administer to PDD infants, the K-test was administered to all subjects at initial evaluation and to 31 (54.4%) subjects at outcome evaluation. As a whole, the K-test was administered to subjects who were estimated to have a developmental age under 30 months, judging from their latest evaluations. Koyama et al. reported that C-A DQ and B-test IQ were reasonably close and highly positively correlated with each other (r = 0.86), suggesting that C-A DQ can be used as a substitute for B-test IQ in PDD infants.19 In their study, subjects consisted of 75 PDD children (mean age at K-test evaluation = 52.3 months, SD = 14.9, range = 28–103; mean C-A DQ = 62.4, SD = 15.8; 56 male, 19 female; 18 AD, 54 PDDNOS, three AS).

Childhood autism rating scale-Tokyo version (CARS-TV)

For rating autistic symptoms, the Japanese version of the Childhood Autism Rating Scale (CARS),20 the Childhood Autism Rating Scale – Tokyo Version (CARS-TV)21 was used. CARS-TV had satisfactory reliability and validity21 and its cut-offs to distinguish PDD from non-PDD MR and AD from PDDNOS were 25.5/26.0 and 30/30.5, respectively.22 Diagnoses for autism based on the Autism Diagnostic Interview-Revised and CARS were examined in two countries, Iceland and Israel. The agreement between systems reached 66.7 and 85.7%, respectively.23,24

Evaluation of speech and pointing by finger

In order to evaluate the presence of useful language and pointing at initial evaluation, the K-test items ‘vocabulary-three words’ and ‘pointing’ were employed. Whether children had three words or not was judged according to the K-test criteria as follows: ‘if having three words or more, a child is judged as passing the item. If utterance by the subject is not confirmed during evaluation, the tester asks the primary caregiver to list the words which the child uses and decides whether the child passes or fails.’ The criterion for ‘pointing’ is that ‘if pointing is observed at least once during testing, a subject is judged as passing. Either voluntary pointing or reactive pointing in response to a question is acceptable’.


Data used in the present study were extracted from medical records of the above-mentioned two clinics on the condition that the subjects were children with PDD who had received both initial evaluation at age 2 and outcome evaluation at age 5 or later. Values of all the variables at initial evaluation were dichotomized at cutoffs chosen for predictive value, so that logistic regression analysis could be used to estimate the effect of each variable at initial evaluation on the binarized outcome (i.e. with MR or not) at age 5. For DQ, odds ratios were calculated at 10-point intervals and the cutoffs with the highest odds ratios were employed. For the CARS-TV total score, although the odds ratio was highest at a cutoff of 31/31.5, a cutoff of 29.5/30 was employed for clinical utility. For CARS-TV items, although setting the cutoff at the same value, 2.0/2.5 (i.e. above mild abnormality), for every item might be helpful for clinical use, a cutoff was set for each item individually, since the distribution of item scores varied from item to item: cutoffs were 1.5/2.0 in five items (i.e. body use; listening response; taste, touch and smell use; fear or nervousness; and activity level), 2.0/2.5 in eight items (i.e. relationship to people, emotional response, object use, adaptation to change, visual response, non-verbal communication, consistency of intellectual response, and general impressions), and 2.5/3.0 in two items (i.e. imitation and verbal communication).
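The cutoff search described for DQ can be illustrated with a minimal sketch. The records below are hypothetical, not the study data, and the function names are made up for this example. Skipping cutoffs that produce an empty cell is also an assumption, since how such cutoffs were handled is not stated.

```python
def odds_ratio(a, b, c, d):
    """Cross-product odds ratio for a 2x2 table.

    a: below cutoff with MR        b: below cutoff without MR
    c: at/above cutoff with MR     d: at/above cutoff without MR
    Returns None when any cell is empty (assumed handling).
    """
    if 0 in (a, b, c, d):
        return None
    return (a * d) / (b * c)


def scan_cutoffs(records, cutoffs):
    """Dichotomize DQ at each cutoff and compute the odds ratio for MR."""
    results = {}
    for cut in cutoffs:
        a = sum(1 for dq, mr in records if dq < cut and mr)
        b = sum(1 for dq, mr in records if dq < cut and not mr)
        c = sum(1 for dq, mr in records if dq >= cut and mr)
        d = sum(1 for dq, mr in records if dq >= cut and not mr)
        results[cut] = odds_ratio(a, b, c, d)
    return results


def best_cutoff(results):
    """Return the cutoff with the highest defined odds ratio."""
    valid = {cut: value for cut, value in results.items() if value is not None}
    return max(valid, key=valid.get)


# Hypothetical (DQ at age 2, MR at age 5) pairs, for illustration only.
HYPOTHETICAL_RECORDS = (
    [(dq, True) for dq in (42, 48, 52, 55, 58, 62, 66, 72, 85)]
    + [(dq, False) for dq in (57, 64, 68, 69, 75, 78, 83, 88)]
)

results = scan_cutoffs(HYPOTHETICAL_RECORDS, range(50, 90, 10))
print(results, best_cutoff(results))
```

With these made-up records the cutoff of 60 gives the highest odds ratio (8.75), while the cutoff of 50 is skipped because no child without MR falls below it.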

Statistical analysis

For categorical data, the χ2 test was applied to compare MRPDD and HFPDD at outcome evaluation. When necessary, analysis of covariance (ANCOVA) or logistic regression analysis was applied. All statistical analyses were performed with SPSS 11.0 J for Windows with the significance level set at P < 0.05 (two-tailed test).
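For readers unfamiliar with how logistic regression relates to odds ratios, the following sketch fits the simplest case, one dichotomized predictor and a binary outcome, by Newton-Raphson iteration. It is not the SPSS procedure used in the study; the counts are hypothetical and the function name is illustrative. In this single-predictor case the exponentiated slope equals the cross-product odds ratio of the 2x2 table.

```python
import math


def fit_logistic_binary(n11, n10, n01, n00, iterations=25):
    """Fit logit P(y = 1) = b0 + b1 * x for a single binary predictor x.

    n11: x = 1, y = 1    n10: x = 1, y = 0
    n01: x = 0, y = 1    n00: x = 0, y = 0
    All four counts are assumed to be non-zero.
    """
    b0 = b1 = 0.0
    groups = [(1, n11, n10), (0, n01, n00)]  # (x, events, non-events)
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, events, non_events in groups:
            n = events + non_events
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            residual = events - n * p      # score contribution
            weight = n * p * (1.0 - p)     # information contribution
            g0 += residual
            g1 += residual * x
            h00 += weight
            h01 += weight * x
            h11 += weight * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (information matrix) * step = score.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Hypothetical counts: predictor present 20 MR / 5 not, absent 10 MR / 15 not.
b0, b1 = fit_logistic_binary(20, 5, 10, 15)
print(math.exp(b1))  # compare with (20 * 15) / (5 * 10)
```

A multiple logistic regression, with several predictors entered together, extends the same iteration to more coefficients.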


As expected, there was a significant difference in initial C-A DQ between MRPDD and HFPDD (P < 0.001). As shown in Table 2, of 10 children with C-A DQ below 50 at initial evaluation, nine remained with an IQ below 50 and no child became high-functioning (HF, IQ ≥ 70) at outcome evaluation. The lowest C-A DQ at initial evaluation in a child who became HF at outcome evaluation was 59; the highest C-A DQ at initial evaluation in a child who turned out to be MR at outcome evaluation was 76; that is, the range of C-A DQ within which there was a change from MR at initial evaluation to HF at outcome evaluation or vice versa was relatively narrow, from 59 to 76.

Table 2.  Relationship of mental retardation levels at initial and outcome evaluations

                        Outcome evaluation
Initial evaluation      Severe    Mild    Normal
Severe (n = 10)         9         1       0
Mild (n = 29)           13        13      3
Normal (n = 18)         3         0       15

Severe: profound, severe, or moderate MR (IQ or C-A DQ < 50); mild: mild MR (50 ≤ IQ or C-A DQ < 70); normal: borderline or normal intelligence (70 ≤ IQ or C-A DQ).
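The counts in Table 2 can be tallied with a short sketch to show how stable the MR level was between the two evaluations. The cell values are those of Table 2, with the outcome columns read in the order severe, mild, normal. The dictionary layout and function names are illustrative only.

```python
LEVELS = ("severe", "mild", "normal")

# Rows: level at initial evaluation; columns: level at outcome evaluation.
TABLE2 = {
    "severe": {"severe": 9, "mild": 1, "normal": 0},
    "mild": {"severe": 13, "mild": 13, "normal": 3},
    "normal": {"severe": 3, "mild": 0, "normal": 15},
}


def unchanged(table):
    """Number of children whose level was the same at both evaluations."""
    return sum(table[level][level] for level in LEVELS)


def crossed_mr_boundary(table):
    """Return (MR -> normal, normal -> MR) counts across the IQ 70 boundary."""
    mr_to_normal = table["severe"]["normal"] + table["mild"]["normal"]
    normal_to_mr = table["normal"]["severe"] + table["normal"]["mild"]
    return mr_to_normal, normal_to_mr


total = sum(sum(row.values()) for row in TABLE2.values())
mr_at_outcome = sum(TABLE2[row][col] for row in LEVELS for col in ("severe", "mild"))
print(total, unchanged(TABLE2), crossed_mr_boundary(TABLE2), mr_at_outcome)
```

On these counts, 37 of the 57 children stayed at the same level, three moved from MR to the normal range, three moved in the opposite direction, and 39 had MR at outcome.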

The difference in outcome IQ/C-A DQ between the two groups was also tested by ANCOVA with initial C-A DQ as a covariate. Outcome IQ/C-A DQ was still significantly higher in HFPDD than in MRPDD (P < 0.001).

As shown in Table 3, low scores on full-scale DQ, C-A DQ, and L-S DQ, a total CARS-TV score of 30 or over, and scores at or above the cutoffs in three items of CARS-TV (i.e. emotional response, listening response, and general impressions) had significant odds ratios for MR at age 5. In multiple logistic regression analysis with the binarized outcome as the dependent variable and the three item scores of CARS-TV, total CARS-TV score and C-A DQ (representing DQs) as independent variables, only C-A DQ was significant.

Table 3.  Association between items evaluated at age 2 and MR (IQ < 70) at age 5 in 57 PDD children

Items at age 2                                   Odds ratio    95% CI          P
Full-scale DQ < 60                               26.07         3.13–216.97     < 0.001
P-M DQ < 70                                      1.79          0.43–7.50       0.541
C-A DQ < 70                                      58.33         10.54–322.78    < 0.001
L-S DQ < 50                                      6.96          1.99–24.37      0.002
AD                                               1.35          0.40–4.62       0.763
PDDNOS                                           0.74          0.22–2.53       0.763
Sex (male)                                       1.07          0.28–4.16       1.000
Speech                                           0.70          0.22–2.24       0.560
Pointing                                         0.59          0.16–2.19       0.494
CARS-TV: total score ≥ 30.0                      5.83          1.26–27.03      0.024
CARS-TV items:
  Relationship to people ≥ 2.5                   2.88          0.89–9.23       0.092
  Imitation ≥ 3.0                                3.03          0.91–10.16      0.089
  Emotional response ≥ 2.5                       4.29          1.07–17.2       0.041
  Body use ≥ 2.0                                 2.25          0.71–7.09       0.240
  Object use ≥ 2.5                               1.75          0.48–6.39       0.540
  Adaptation to change ≥ 2.5                     1.75          0.48–6.39       0.530
  Visual response ≥ 2.5                          2.44          0.68–8.77       0.240
  Listening response ≥ 2.0                       4.20          1.10–15.96      0.040
  Taste, touch and smell use ≥ 2.0               0.78          0.18–3.39       1.000
  Fear or nervousness ≥ 2.0                      1.60          0.51–5.02       0.560
  Verbal communication ≥ 3.0                     2.32          0.72–7.51       0.220
  Nonverbal communication ≥ 2.5                  2.61          0.65–10.55      0.260
  Activity level ≥ 2.0                           1.94          0.56–6.77       0.330
  Consistency of intellectual response ≥ 2.5     0.69          0.21–2.32       0.550
  General impressions ≥ 2.5                      4.08          1.14–14.64      0.043

AD, autistic disorder; C-A, cognition-adaptation; CARS-TV, Childhood Autism Rating Scale-Tokyo Version; CI, confidence interval; DQ, developmental quotient; L-S, language-sociability; MR, mental retardation; PDD, pervasive developmental disorder; PDDNOS, PDD not otherwise specified; P-M, posture-movement.

In the 18 HFPDD children at outcome evaluation, the numbers of children who had passed ‘vocabulary-three words’ and ‘pointing’ at the initial evaluation were seven (39%) and five (28%), respectively. In the 39 MRPDD children, the corresponding numbers at the initial evaluation were 12 (31%) and seven (18%), respectively. As for the presence of useful language and pointing, there was no significant difference between the two groups. In this study, five children did not pass the item ‘vocabulary-three words’ and four children did not pass the item ‘pointing’ at outcome evaluation.
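As a worked example, the speech counts just given (12 of 39 MRPDD and seven of 18 HFPDD children passing ‘vocabulary-three words’) can be turned into an odds ratio with a 95% confidence interval. The sketch below uses the logit (Woolf) interval. Which interval method the original analysis used is not stated, so this choice is an assumption, although it gives the same rounded values as the Speech row of Table 3. The function name is illustrative.

```python
import math


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Odds ratio and logit (Woolf) confidence interval for a 2x2 table.

    a: predictor present, MR     b: predictor present, no MR
    c: predictor absent, MR      d: predictor absent, no MR
    All four counts must be non-zero.
    """
    ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_ratio = math.log(ratio)
    lower = math.exp(log_ratio - z * se_log)
    upper = math.exp(log_ratio + z * se_log)
    return ratio, lower, upper


# Speech at age 2: 12 of 39 MRPDD and 7 of 18 HFPDD children passed.
speech_or, speech_low, speech_high = odds_ratio_with_ci(12, 7, 39 - 12, 18 - 7)
print(round(speech_or, 2), round(speech_low, 2), round(speech_high, 2))
```

The interval spans 1, which agrees with the absence of a significant association between early speech and MR at age 5.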


Outcome IQ/C-A DQ was significantly lower in MRPDD children than in HFPDD children by ANCOVA with initial C-A DQ controlled for. This result indicates that HFPDD made a greater IQ/C-A DQ improvement than MRPDD during the preschool years. Total DQ under 60, C-A DQ under 70, L-S DQ under 50, total CARS-TV score of 30 or above, and three item scores of CARS-TV (‘emotional response’ of 2.5 or above, ‘listening response’ of 2.0 or above, and ‘general impressions’ of 2.5 or above) had significant odds ratios for having mental retardation at age 5.

Analyzing home videotapes of the first birthday parties of autistic and normally developing children, Osterling and Dawson reported that no behavior of the autistic children at age 1 was predictive of their later cognitive delays.25 Although the methodology was different, this study, which focused on slightly older children, showed that development and behavior at age 2 can predict later intellectual competence in PDD children.

In this study, C-A DQ was a principal prognosticator even within the preschool years, which is consistent with previous studies reporting that IQ at intake is predictive of later IQ.10,13,26 The finding that no item other than C-A DQ was significant in multiple logistic regression analysis indicates a dominant influence of the initial C-A DQ on the outcome IQ/C-A DQ, at least in infancy. Our findings on the K-test, which is used only in Japan, may merit replication with comparable developmental tests in other countries, because the difficulty of administering a standardized intelligence test to PDD infants and the importance of early identification of mentally retarded PDD infants are the same worldwide.

Although less powerful predictors than C-A DQ, scores on the CARS evaluated at age 2 seem to be useful in predicting the outcome of PDD children around the age of entering an elementary school, especially in a clinical situation where relevant developmental test data are not readily available.

According to the CARS manual, the criteria for moderately abnormal affective responses are: definite signs of inappropriate affect, reactions quite inhibited or excessive or often unrelated to the stimulus, grimacing, rigidity. Snow et al. found that autistic children displayed less positive affect than delayed children matched for chronological and mental age, but they did not mention the relation between IQ levels and affect.27 The results of the present study may be interpreted as indicating that the severity of intellectual impairment is related to abnormality of affect, although further study is needed.

As for ‘listening response’ in children with autism after preschool age, Rutter et al. have already reported that a lack of response to sounds profound enough to raise suspicion of deafness was a poor prognosticator.26 In developmentally retarded infants and children, Mochizuki et al. reported that the more severe the degree of mental retardation, the higher the incidence of abnormal auditory brain stem responses.28 If abnormal auditory responses in PDD children are the expression of a disturbance in acoustic information processing, it is likely that auditory abnormality hinders intellectual development.

Among the 15 items of CARS-TV, general impressions showed the strongest correlation (r = 0.83, P < 0.001) with the CARS-TV total score in this study. Therefore, it is quite natural that a general impressions score of 2.5 or above increased the odds ratio significantly, if a strong correlation between CARS-TV total score and IQ is taken into consideration.

In the present study, neither useful language nor pointing was a significant predictor, possibly because almost all PDD children brought to the clinics at age 2 had not yet reached the developmental level needed to display such functions.

There was a significant difference in IQ/C-A DQ change between MRPDD and HFPDD. Almost all of the 2-year-old children with C-A DQ below 50 at initial evaluation remained at that level. This might suggest the existence of a ‘DQ threshold’ around 50. Nevertheless, this should not be taken to suggest that children with C-A DQ lower than 50 at age 2 do not merit early intervention. Even if IQ improvement is difficult in some PDD children, there is much room for therapeutic intervention for them. Children who do not yet produce spoken words can be helped to be more communicative by various means, including behavioral therapy to foster desired behaviors and to decrease problem behaviors.3

In conclusion, this study found that low C-A DQ at age 2 was the best prognosticator of mental retardation at age 5 in PDD children, and that the total score and three item scores (‘emotional response’, ‘listening response’, and ‘general impressions’) of CARS-TV evaluated at age 2 were also predictive of MR at age 5 in PDD children. Although these findings may be important clinically, they need to be tested by further study with more rigorous methodology based on a larger number of PDD infants.


We would like to thank Mr Hiromi Ishida, Mr Jyunichi Yukimoto, Mr Takeo Tanaka, and Ms. Mika Tobari for their help in data collection.