Full attainment of oral feeding is one of the most complex tasks of infancy[1, 2] and one of three criteria required for discharge from the neonatal intensive care unit (NICU). Infants in the NICU are often delayed in oral feeding and are at increased risk of oral-motor dysfunction, non-organic failure to thrive, and dysphagia.[4-6] Additionally, poor feeding in neonates has been associated with subsequent developmental delay and feeding difficulties in childhood. Children with feeding complications have higher incidences of brain injury, and often experience long-term motor, behavioral, and cognitive dysfunction.[5-9] Early identification of these problems is important to enable interventions to optimize function, and relies on the availability of valid and reliable assessment tools.
The Neonatal Oral Motor Assessment Scale (NOMAS), developed by Marjorie Meyer Palmer in 1985, is a widely used neonatal feeding evaluation.[2, 7, 11-13] The tool is not commercially available. Raters are certified through a 3-day course, conducted most often by Palmer. As a requirement set by Palmer, participants are certified at the end of the course if they correctly classify five videotaped feedings as ‘normal’, ‘disorganized’, or ‘dysfunctional’. The NOMAS is one of the few available neonatal feeding evaluations that can be used with term or preterm infants and with infants who are breastfed or bottle-fed.
The NOMAS is a 28-item observational checklist of tongue and jaw movement. Following observation of non-nutritive sucking, the first two minutes of oral feeding are evaluated. Feeding is classified as normal, disorganized, or dysfunctional. Using the NOMAS, normal defines an infant demonstrating coordination of suck–swallow–breathe responses and feeding efficiency. Infants who demonstrate difficulty coordinating suck–swallow–breathe are classified as disorganized. Any abnormal movement interrupting the feeding process leads to a classification of dysfunctional.[11, 12] Dysfunctional characteristics include excessively wide jaw excursions interrupting the seal on the nipple, lateral jaw deviation, a flaccid/retracted tongue, or total absence of movement.
Though the NOMAS was developed over 20 years ago, there is limited research investigating its psychometric properties. One study reported that at age 2 years, 100% of infants classified as normal on the NOMAS will not have developmental delay, and 100% of infants classified as dysfunctional will have developmental delay. The ability of the NOMAS to predict developmental delay in infants categorized as normal or dysfunctional is cited during the certification course. However, this study assessed 17 infants and was retrospective, with many confounding factors, weakening its results and implications. Another study found the NOMAS to have modest internal consistency within the normal and disorganized categories (Cronbach α>0.70) and acceptable convergent validity for infants 32 to 35 weeks postmenstrual age (Spearman's r=0.51–0.69). The study did not include calculations for the dysfunctional category and called for further investigation of the NOMAS. Another study found the interrater reliability to be ‘moderate’ to ‘substantial,’ with kappa (κ)=0.40 to 0.65 and intrarater reliability to be variable at κ=0.33 to 0.95. This was considered unacceptable for a diagnostic tool. The NOMAS has been associated with 12-month outcome; infants categorized as normal performed better on developmental tests than infants persistently categorized as disorganized. However, this study excluded infants with cerebral injury, which is common in preterm infants and other high-risk populations.
The ability of assessments like the NOMAS to identify those in need of early intervention could enhance the short- and long-term care of preterm infants, and help minimize the effects of feeding-related morbidities. It is therefore necessary to further investigate the psychometric properties of the NOMAS to provide clinicians with accurate information and to work toward developing standardized and reliable tools to assess feeding in high-risk populations. The aim of this study was to evaluate the concurrent and predictive validity of the NOMAS, as well as its interrater and intrarater reliability.
This research was part of a longitudinal study investigating the developmental trajectory of preterm infants. It was approved by the human subjects committee at the study site. Infants were recruited from the 75-bed NICU of St Louis Children's Hospital, MO, USA. Parents provided informed consent for serial neurobehavioral evaluation, feeding assessment, and magnetic resonance imaging (MRI) during hospitalization, as well as developmental assessment at age 2 years corrected age.
Consecutive admissions to the NICU were recruited from 2008 to 2010. Participants were born at or before 30 weeks gestation and recruited within 72 hours of birth. Infants with known or suspected congenital anomalies were excluded from this study. Infants were not excluded based on any research procedures/testing.
With a primary outcome of the NOMAS predicting the presence of developmental delay, a power analysis was conducted. A sample size of 50 would provide 80% power when the true difference in proportions is 0.31, using a Pearson's χ² test with an alpha of 0.05. A large difference in proportions was chosen because of previous reports of 100% predictive power of the NOMAS, with 100% of infants with dysfunctional feeding having developmental delay at 2 years corrected age.
Independent variable: NOMAS
At term equivalent age (37–42wks postmenstrual age), an oral feeding session was video-recorded and scored using the NOMAS. Video recordings included a close-up lateral view of the neck, jaw, and mouth. The videos included a clear view of the lips in contact with the nipple. Videotaping commenced prior to placement of the nipple in the infant's mouth and was stopped after 2 minutes of oral feeding. The feeding was conducted by parents/family when present, therapists, or nurses. A single, certified evaluator determined the NOMAS category, based on bedside clinical observations and video analysis. This evaluator understood the study purpose to be investigating the longitudinal development of preterm infants.
Information collected from the medical record included birthweight, gestational age at birth, Critical Risk Index for Babies (CRIB) scores, sex and days of ventilation, continuous positive airway pressure, and total parenteral nutrition. The CRIB score is a measure of medical severity calculated in the first 12 hours of life where a higher score indicates increased medical severity. All medical factors were analyzed for associations with NOMAS scores.
At term equivalent, infants were assessed using the Dubowitz neurological examination (Dubowitz), NICU Network Neurobehavioral Scale (NNNS) and MRI testing.
The Dubowitz is a 34-item neurological assessment yielding an optimality score, which has been shown to decrease with increased severity of cerebral injury.[16, 18] The Dubowitz was used to analyze concurrent validity as a measure of neurological function.
The NNNS is a comprehensive assessment of neurobehavioral function yielding 13 summary scores including the following: habituation, orientation, handling, quality of movement, self-regulation, suboptimal reflexes, stress, arousal, hypertonia, hypotonia, asymmetry, excitability, and lethargy. Summary scores were used to analyze concurrent validity as a measure of term neurobehavior.
Infants underwent sedative-free MRI to obtain anatomical images with an axial magnetization-prepared rapid gradient echo T1-weighted sequence (TR/TE 1500/3ms, voxel size 1 × 0.7 × 1mm³) and a turbo spin echo T2-weighted sequence (TR/TE 8600/160ms, voxel size 1 × 1 × 1mm³, echo train length 17). Images were interpreted by a trained neuroradiologist. MRI and routine cranial ultrasound results determined the presence of cerebral injury, defined as the presence of grade III–IV intraventricular hemorrhage, cystic periventricular leukomalacia, and/or cerebellar hemorrhage.
Brain metrics data were obtained using reference points on specific MRI slices, with measurements taken by computer analysis. These included bifrontal diameter, biparietal diameter (BPD), bone–biparietal diameter ratio (BBPD), interhemispheric distance (IHD), and transcerebellar diameter. To compare brain and ventricle size, another measure was calculated by dividing the ventricular diameter by the BBPD. This incorporates the IHD and BPD to correlate cerebral mass with skull size.
Diffusion tensor imaging (DTI) measures were obtained from MRI bilaterally in the following regions: anterior limb of the internal capsule, posterior limb of the internal capsule, optic radiation, frontal lobe, cingulum bundle, centrum semiovale, and the corpus callosum. DTI was interpreted by a trained evaluator.
To assess concurrent validity, associations between NOMAS scores and Dubowitz scores, NNNS summary scores, the presence of cerebral injury, brain metrics, and DTI measures were analyzed.
Infants returned for testing at 2 years corrected age, using the Bayley Scales of Infant and Toddler Development, third edition (BSID-III). The BSID-III is a norm-referenced assessment, considered to be the gold standard in developmental evaluation. Composite subscores for cognitive, language, and motor skills were investigated. In addition, a categorical variable of developmental delay was created for analysis. Developmental delay was defined as having any BSID-III composite score less than 70; two standard deviations below the mean (100).[20, 21]
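The categorical definition of developmental delay given above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name, argument names, and example scores are hypothetical and are not taken from the study's analysis files; only the cutoff of 70 on any composite score comes from the text.

```python
# Cutoff from the text: a composite score below 70 is two standard
# deviations below the normative mean of 100.
DELAY_CUTOFF = 70


def has_developmental_delay(cognitive, language, motor, cutoff=DELAY_CUTOFF):
    """Return True if ANY BSID-III composite score falls below the cutoff.

    The function and argument names are hypothetical; only the rule
    (any composite < 70) is taken from the text.
    """
    return min(cognitive, language, motor) < cutoff


# Hypothetical example scores, not study data.
print(has_developmental_delay(85, 69, 90))  # one composite below 70
print(has_developmental_delay(70, 70, 70))  # exactly 70 is not below the cutoff
```

Because the rule uses a strict inequality, a composite score of exactly 70 does not count as delay in this sketch.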
Recent evidence suggests that the BSID-III overestimates developmental scores. Adjustments are suggested to BSID-III scores to account for this difference. Analyses with both the original and adjusted BSID-III scores were conducted.
Interrater and intrarater reliability
Six certified NOMAS raters completed reliability testing. Five NOMAS recordings were randomly selected, presented in a randomized order, and scored. Raters were blinded to medical history. After a minimum two-week period, raters re-scored the same recordings, presented in another randomized order. Scores were collected and processed to determine interrater and intrarater reliability.
Independent samples t-tests and Mann–Whitney U tests were used to determine associations between the NOMAS and the NNNS and Dubowitz optimality scores. MRI outcomes were analyzed using linear regression, while controlling for postmenstrual age at the time of scan. Predictive validity was investigated using χ² analysis to determine the presence of developmental delay related to NOMAS scores, and independent samples t-tests to determine differences in BSID-III scores across NOMAS categories. A 95% confidence interval (CI) was calculated to estimate the true proportion of dysfunctional infants with developmental delay at age 2 years. Additionally, a comparison between the proportion of dysfunctional infants with developmental delay in this cohort and the corresponding proportion in the 1999 Palmer cohort (100%) was made using Fisher's exact test. Cohen's kappa statistics were used to calculate reliability. Analyses were performed using IBM SPSS 20 software (IBM Corporation, Chicago, IL, USA) and/or SAS 9.3 software (SAS Institute, Cary, NC, USA), α=0.05.
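Cohen's kappa, the reliability statistic named above, compares the observed agreement between two raters with the agreement expected by chance from each rater's category frequencies. The following is a minimal sketch in plain Python, not the SPSS or SAS procedure used in the study; the function name and the example ratings for five recordings are hypothetical.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters scoring the same items.

    kappa = (p_observed - p_expected) / (1 - p_expected)
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("Both raters must score the same non-empty set of items")
    n = len(ratings_a)
    # Proportion of items on which the two raters agree.
    p_observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement from each rater's marginal category counts.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    categories = set(counts_a) | set(counts_b)
    p_expected = sum(counts_a[c] * counts_b[c] for c in categories) / n ** 2
    if p_expected == 1.0:
        # Both raters used one identical category throughout.
        return 1.0
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical ratings of five recordings (not study data).
rater_x = ["disorganized", "disorganized", "dysfunctional", "dysfunctional", "disorganized"]
rater_y = ["disorganized", "dysfunctional", "dysfunctional", "disorganized", "disorganized"]
rater_z = ["dysfunctional", "dysfunctional", "disorganized", "disorganized", "dysfunctional"]

print(round(cohen_kappa(rater_x, rater_y), 2))  # agreement slightly above chance
print(round(cohen_kappa(rater_x, rater_z), 2))  # negative: agreement below chance
```

With only five recordings per comparison, a single changed rating shifts kappa substantially, so estimates from such a small set of items are inherently unstable.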
Ninety-eight infants were enrolled. Of these, five withdrew and 14 died, leaving 79 infants in the cohort who were discharged from the NICU. NOMAS scores were unavailable for four infants: one early discharge, one transfer, one infant not orally feeding at discharge, and one missing data point. There were no differences in medical factors between those who withdrew and those who remained in the study, between those who had and did not have NOMAS scores, or between those who did and did not return for developmental testing. Data from 75 infants (39 females, 36 males; mean gestational age 26.56wks [SD 1.90], range 23–30wks; mean birthweight 967.33g [SD 288.54g], range 480–2240g; n=73 bottle-fed, n=2 breastfed during the NOMAS evaluation) were included in analysis. Of these, 44 (59%) were categorized as disorganized feeders and 31 (41%) as dysfunctional feeders, with none categorized as normal feeders. Infants categorized as dysfunctional had longer periods of mechanical ventilation (t=2.60, mean difference 12.39d [CI=2.11–22.67]; p=0.02). No other medical variables were associated with NOMAS scores (see Table 1).
Table 1. Description of the sample
| ||NOMAS category|| || |
|Characteristic||Disorganized (n=44) mean (SD) or n (%)||Dysfunctional (n=31) mean (SD) or n (%)||Mean difference|| p a |
|Gestational age at birth (wks)||27.14 (1.76)||26.06 (2.70)||1.07||0.06|
|Birthweight, g||997.61 (240.46)||923.87 (345.24)||73.74||0.28|
|Female sex||24 (55%)||15 (48%)||–||0.60|
|Total parenteral nutrition (d)||20.73 (15.56)||31.35 (28.04)||−10.63||0.06|
|CPAP (d)||6.57 (8.70)||7.42 (11.27)||−0.85||0.71|
|Days on mechanical ventilation||8.55 (16.41)||20.94 (24.91)||−12.39|| 0.02 |
|Critical Risk Index for Babies||3.32 (3.32)||4.47 (3.87)||−1.15||0.18|
|Cerebral injuryb||8 (18.2%)||6 (19.4%)||–||0.92|
Dysfunctional NOMAS scores were associated with increased signs of stress on the NNNS (t=2.61, mean difference 0.073 [95% CI 0.017–0.129]; p=0.011) and decreased neurological function on the Dubowitz (t=−2.14, mean difference −2.32 [95% CI −4.49 to −0.157]; p=0.036). No other associations were found between NOMAS category and neurobehavior at term (see Table 2).
Table 2. Associations between NOMAS, neurobehavior, and brain structure
|Variable||NOMAS disorganized (n=44) mean (SD)||NOMAS dysfunctional (n=31) mean (SD)|| p b |
|Dubowitz optimality score||19.77 (4.48)||17.45 (4.84)|| 0.04 |
| Habituationa||8.17 (6.50, 9.00)||7.50 (7.00, 9.00)||0.12|
| Orientation||3.32 (1.15)||3.2 (1.43)||0.73|
| Handlinga||0.63 (0.63, 0.75)||0.75 (0.63, 0.81)||0.11|
| Quality of movement||3.60 (0.72)||3.39 (0.84)||0.23|
| Self regulation||4.51 (0.82)||4.20 (0.69)||0.09|
| Suboptimal reflexes||6.48 (2.09)||7.42 (2.46)||0.08|
| Stress||0.33 (0.12)||0.41 (0.13)|| 0.01 |
| Arousal||4.13 (0.86)||4.26 (0.95)||0.55|
| Hypertoniaa||1.00 (1.00, 2.00)||2.00 (1.00, 3.00)||0.25|
| Hypotoniaa||1.00 (0.00, 1.00)||1.00 (0.00, 1.00)||0.42|
| Asymmetrya||2.00 (0.00, 4.00)||3.00 (1.00, 4.00)||0.21|
| Excitability||5.05 (2.57)||6.10 (2.51)||0.08|
| Lethargy||6.77 (2.67)||7.16 (3.15)||0.57|
| Bifrontal diameter||60.00 (6.15)||58.26 (6.26)||0.26|
| Biparietal diameter||70.43 (4.90)||69.08 (6.00)||0.31|
| Interhemispheric distance||3.15 (1.57)||3.39 (1.43)||0.51|
| Transcerebellar diameter||48.59 (3.37)||46.55 (4.27)|| 0.03 |
| Right ventricle: brain size ratio||8.53 (1.99)||7.88 (1.33)||0.20|
| Left ventricle: brain size ratio||8.71 (1.96)||7.84 (1.61)||0.09|
Cerebral injury was present in 18.6% (n=14) of the cohort that survived until NICU discharge and had NOMAS scores. No significant associations between presence of cerebral injury and NOMAS scores were observed. See Table 2 for associations between brain metrics and NOMAS categorization. Dysfunctional NOMAS scores were associated with decreased transcerebellar diameter (t=−2.22, mean difference −2.04, [CI=−3.89 to −0.203]; p=0.03). No other MRI measures were associated with NOMAS scores.
Following discharge, an additional two infants died and one withdrew from the study, leaving 72 infants in the cohort. Fifty-nine (82%) infants returned for developmental testing at 2 years (see Table 3 for mean BSID-III scores by NOMAS category). Seventeen percent (n=10) of the cohort had developmental delay at age 2 years. There were no associations between NOMAS scores and BSID-III cognitive, language, and motor outcome. There was no association between NOMAS scores and developmental delay, using adjusted and unadjusted scores on the BSID-III.
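The participant flow reported in the Results can be checked with simple arithmetic. The sketch below only restates counts given in the text; the variable names are illustrative.

```python
# Counts as reported in the text; variable names are illustrative.
enrolled = 98
withdrew_before_discharge = 5
died_before_discharge = 14
discharged = enrolled - withdrew_before_discharge - died_before_discharge

missing_nomas = 4
analysed = discharged - missing_nomas

died_after_discharge = 2
withdrew_after_discharge = 1
eligible_for_follow_up = analysed - died_after_discharge - withdrew_after_discharge

returned = 59
delayed = 10
follow_up_rate = returned / eligible_for_follow_up
delay_rate = delayed / returned

print(discharged, analysed, eligible_for_follow_up)
print(f"follow-up rate: {follow_up_rate:.0%}, developmental delay: {delay_rate:.0%}")
```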
Table 3. Predictive validity: associations between NOMAS and 2-year outcome
| ||NOMAS categories|| || |
|Domain||Disorganized (n=35) mean (SD) or n (%)||Dysfunctional (n=24) mean (SD) or n (%)||Mean difference or odds ratio|| p a |
| Cognitive||84.69 (10.19)||87.71 (10.32)||−3.02||0.27|
| Language||91.62 (12.16)||86.96 (13.25)||4.66||0.17|
| Motor||86.29 (11.28)||84.42 (10.48)||1.87||0.52|
| DD||6 (17.1%)||4 (16.7%)||OR=0.93b||0.62|
Seventeen percent (n=4) of infants categorized as dysfunctional had developmental delay at age 2 years. The 95% CI for this proportion was 0.01 to 0.33. The proportions in the two cohorts (100% in the 1999 cohort of n=6 and 17% in the current cohort) were significantly different (Fisher's exact test p=0.017).
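As an illustration of how an interval for this proportion can be obtained, the sketch below computes a normal-approximation (Wald) 95% interval for 4 of 24 infants. The text does not state which interval method was used, so the Wald method and the function name are assumptions made for illustration, and the bounds need not match the reported values exactly.

```python
import math


def wald_interval(successes, n, z=1.96):
    """Normal-approximation (Wald) confidence interval for a proportion.

    Assumption: the source does not specify its interval method; this is
    one common choice, shown for illustration only.
    """
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    # Clamp to the valid range for a proportion.
    return max(0.0, p - half_width), min(1.0, p + half_width)


low, high = wald_interval(4, 24)
print(f"proportion {4 / 24:.3f}, 95% CI {low:.3f} to {high:.3f}")
```

Under this approximation the interval is roughly 0.02 to 0.32, close to the reported 0.01 to 0.33. Other interval methods give slightly different bounds, particularly with a sample this small.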
Interrater reliability ranged between κ=−0.43 and 0.62. Intrarater reliability ranged from κ=0.33 to 1.00 (see Table 4).
Table 4. Kappa scores for interrater and intrarater reliability
| ||Rater 1a||Rater 2||Rater 3||Rater 4||Rater 5a||Rater 6a|
|Rater 1a|| ||−0.11||−0.43||0.29||0.17||0.62|
|Rater 2|| || ||0.38||0.06||−0.39||−0.18|
|Rater 3|| || || ||−0.25||−0.23||−0.36|
|Rater 4|| || || || ||0.29||−0.36|
|Rater 5a|| || || || || ||−0.15|
|Years of infant experience||18||0||0||0||10||30|
Key findings include that predictive validity of the NOMAS was not supported, contradicting previous reports of 100% predictive validity for infants with dysfunctional feeding. However, the NOMAS does have some concurrent validity through associations with measures of neurobehavior and brain structure at term. Interrater and intrarater reliability were variable and suboptimal.
Associations between NOMAS scores and days of mechanical ventilation were found. Adverse oral stimuli, such as prolonged mechanical ventilation are suspected to relate to long-term feeding outcomes.[8, 22] This was supported in the current study, manifesting in poorer NOMAS scores. However, no other perinatal exposures were associated with NOMAS outcome.
Some concurrent validity of the NOMAS was established through associations with measures of early neurobehavior. Infants categorized as dysfunctional on the NOMAS demonstrated lower scores on the Dubowitz, which incorporates motor-based items, including assessment of suck and tone.[15] As efficient oral feeding requires efficient oral-motor control and skills,[23, 24] the association between the two measures suggests that they are testing similar domains of function.
Infants categorized as dysfunctional on the NOMAS also demonstrated more signs of stress on the NNNS. Stress signs may include an inability to achieve an awake state, startles/tremors, cyanosis, and other disruptive behaviors. Stress signs may interrupt the feeding process and manifest in poor performance on the NOMAS. No other study to date has investigated concurrent associations of the NOMAS with neurobehavior; therefore, these associations are new contributions to the literature and merit further investigation.
The NOMAS was associated with one variable in cerebral structural analysis. Dysfunctional feeders demonstrated decreased transcerebellar diameter. The cerebellum is associated with fine and gross motor coordination and in infancy plays a vital role in the coordination of reflexive feeding movements.[23-25] Studies have established that the cerebellum is particularly vulnerable in preterm infants. Cerebellar volume in preterm infants correlates with cerebral injury and volume loss, later head circumference measurements, and overall growth. This may be related to later feeding difficulties, which have been linked with neonatal cerebral injury and long-term neurological conditions. No other study has investigated associations between NOMAS scores and concurrent brain development. The results of this study demonstrate that although NOMAS scores are not associated with many types of cerebral injury, the cerebellum can have important associations with feeding performance.
Predictive validity of the NOMAS was not supported because of lack of associations with measures of functional outcome at age 2 years. Less than 20% of dysfunctional feeders demonstrated developmental delay. This contradicts earlier research establishing 100% predictive validity of the dysfunctional category. Feeding problems have been associated with functional impairment;[5, 8, 24] however, we have demonstrated low predictive validity of this tool in a prospective study with high-risk preterm infants.
Reliability values were variable and substandard. NOMAS-certified raters in this study failed to demonstrate adequate interrater reliability. This contradicts research describing moderate to substantial interrater reliability of the NOMAS. Some values for interrater reliability were negative. Negative κ values indicate agreement below the level expected by chance and are generally interpreted as no agreement. Failure to achieve adequate interrater and intrarater reliability jeopardizes the clinical utility of the tool. The rater who determined NOMAS category for the entire cohort achieved an intrarater κ of 1.00 and had 18 years of experience in the NICU. The other rater with the highest agreement (κ=0.62) was another therapist with significant feeding experience in the NICU. Those with less experience in neonatal feeding, including students and clinicians not primarily working with neonates, are represented in the lower range of κ values. This may indicate that the course itself is not sufficient to prepare raters to use the NOMAS reliably. Clinical experience, in conjunction with NOMAS training, may enhance the utility of the tool and the ability of clinicians to assess infants appropriately. This merits further investigation to fully discern factors influencing reliability of the NOMAS.
The sample in this study was diverse, and infants were exposed after discharge to a range of factors, including young maternal age, family structure, socio-economic status, and therapeutic interventions. All of these factors may influence outcome measured at follow-up. The NOMAS cannot account for these factors, which may impede its ability to predict later function.
This is the first study to investigate psychometrics of the NOMAS prospectively with this population. Previous studies included small sample sizes (n≤35), were retrospective, and/or confounded.[1, 13, 14] This study included a larger sample size and compared the NOMAS to medical factors in the NICU, neurobehavior, brain structure, and developmental outcomes at 2 years corrected age. Comparing the NOMAS to these factors is both a novel addition to the literature and critical to understanding the utility of the NOMAS.
The NOMAS is considered one of the best available tools for the evaluation of neonatal feeding. Some changes that may improve the utility of the NOMAS include establishing objective scoring criteria in a manual, which is not currently available, and changes to the certification, emphasizing clinical experience beyond the course. The NOMAS has not been recently updated, and modifications may improve its clinical utility.
There were some limitations to this study. Though this study was conducted using a larger and more diverse sample than previous investigations, the sample size remains smaller than ideal for a psychometric investigation. However, adequate power was achieved to state that the NOMAS dysfunctional category failed to predict developmental delay 100% of the time, as previously reported. Interrater and intrarater reliability were determined using a convenience sample of raters. Although certified to score the NOMAS, three raters were occupational therapy students and had limited NICU clinical experience. However, the upper range of interrater and intrarater reliability is represented in the statistics and remains suboptimal. Finally, none of the infants in the cohort were categorized as normal on the NOMAS. All participants were preterm infants born at or below 30 weeks gestation with a high risk of feeding impairment and developmental problems, which lends credibility to the test as it discerned abnormalities in all infants, as expected. Because many factors were examined at term, the likelihood of chance findings arising from multiple comparisons was increased.
Future directions for research include further and more detailed evaluations of NOMAS items (vs categories) and investigation into cerebral structural differences that were found in this study. Further investigation into developmental differences between NOMAS categories observed in this cohort is warranted. The BSID-III is one of the best available tools for measuring developmental function. It was expected that any existing differences between NOMAS groups would be detected at follow-up. More research is needed to analyze why differences were not found.
The NOMAS appears to have some clinical utility, but modifications to the tool and/or certification may better prepare evaluators to administer and score the assessment appropriately. Currently, the NOMAS lacks adequate evidence to support the claim of predicting future function. With very few structured assessments available to use in the NICU, an enhanced tool may improve early identification of feeding problems to enable services to optimize outcome.