• emigration and immigration;
• Surveillance, Epidemiology, and End Results (SEER) program;
• multiple imputation;
• Hispanic Americans;
• health status disparities


Although birthplace data are routinely collected in the participating Surveillance, Epidemiology, and End Results (SEER) registries, such data are missing in a nonrandom manner for a large percentage of cases. This hinders analysis of nativity-related cancer disparities. In the current study, the authors evaluated multiple imputation of nativity status among Hispanic patients diagnosed with cervical, prostate, and colorectal cancer and demonstrated the effect of multiple imputation on apparent nativity disparities in survival.


Multiple imputation by logistic regression was used to generate nativity values (US-born vs foreign-born) from variables defined a priori. The accuracy of the method was evaluated among a subset of cases. Kaplan-Meier curves were used to illustrate the effect of imputation by comparing survival among US-born and foreign-born Hispanics, with and without imputation of nativity.
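As an illustration of the general technique only, multiple imputation of a binary variable by logistic regression can be sketched as follows. This is a minimal sketch and not the authors' implementation: the predictor names (`predictor_a`, `predictor_b`), the toy data, the number of imputations, and the use of a bootstrap refit to carry model uncertainty into each imputation are all assumptions made for the example. For simplicity the toy data are also made missing completely at random, unlike the registry data described above.

```python
import math
import random


def predict_proba(weights, x):
    """Return P(foreign_born = 1 | x) under a logistic model (weights[0] is the intercept)."""
    z = weights[0] + sum(w * v for w, v in zip(weights[1:], x))
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.5, epochs=300):
    """Fit logistic regression by plain batch gradient descent."""
    weights = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for xi, yi in zip(X, y):
            err = predict_proba(weights, xi) - yi
            grad[0] += err
            for j, v in enumerate(xi):
                grad[j + 1] += err * v
        weights = [w - lr * g / n for w, g in zip(weights, grad)]
    return weights


def impute_nativity(records, predictors, m=5, seed=0):
    """Return m completed copies of records with missing nativity filled in.

    Each copy refits the model on a bootstrap sample of the observed cases
    (an assumed way to reflect parameter uncertainty), then draws each
    missing value from a Bernoulli distribution with the predicted probability.
    """
    rng = random.Random(seed)
    observed = [r for r in records if r["foreign_born"] is not None]
    completed = []
    for _ in range(m):
        boot = [rng.choice(observed) for _ in observed]
        X = [[r[p] for p in predictors] for r in boot]
        y = [r["foreign_born"] for r in boot]
        weights = fit_logistic(X, y)
        dataset = []
        for r in records:
            value = r["foreign_born"]
            if value is None:
                p = predict_proba(weights, [r[k] for k in predictors])
                value = 1 if rng.random() < p else 0
            dataset.append({**r, "foreign_born": value})
        completed.append(dataset)
    return completed


# Hypothetical toy data: two binary predictors, about 40% of nativity values missing.
toy_rng = random.Random(42)
records = []
for _ in range(200):
    a, b = toy_rng.randint(0, 1), toy_rng.randint(0, 1)
    p_true = 1.0 / (1.0 + math.exp(-(-1.0 + 1.5 * a + 1.0 * b)))
    nativity = 1 if toy_rng.random() < p_true else 0
    if toy_rng.random() < 0.4:
        nativity = None  # birthplace not recorded
    records.append({"predictor_a": a, "predictor_b": b, "foreign_born": nativity})

completed = impute_nativity(records, ["predictor_a", "predictor_b"], m=5)
```

In a typical multiple imputation workflow, the analysis of interest is then run on each completed data set and the results are pooled, so that uncertainty about the imputed values carries into the final estimates.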


Birthplace was missing for 31% of cervical, 49% of prostate, and 39% of colorectal cancer cases. The sensitivity of the imputation strategy for detecting foreign-born status was ≥ 90% and the specificity was ≥ 86%. The agreement between the true and imputed values was ≥ 0.80 and the misclassification error was ≤ 10%. Kaplan-Meier survival curves indicated different associations between nativity and survival when nativity was imputed versus when cases with missing birthplace were omitted from the analysis.
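The accuracy measures reported above can be computed from a 2 × 2 comparison of true and imputed values. The sketch below is illustrative only: the function name and the toy vectors are hypothetical, and agreement is computed as Cohen's kappa, which is an assumption because the text does not name the agreement statistic. Foreign-born is coded as 1 and US-born as 0, and both classes are assumed to be present.

```python
def accuracy_metrics(true_vals, imputed_vals):
    """Compare imputed nativity with known nativity (1 = foreign-born, 0 = US-born)."""
    tp = fp = tn = fn = 0
    for t, i in zip(true_vals, imputed_vals):
        if t == 1 and i == 1:
            tp += 1
        elif t == 1 and i == 0:
            fn += 1
        elif t == 0 and i == 0:
            tn += 1
        else:
            fp += 1
    n = tp + fp + tn + fn
    observed_agreement = (tp + tn) / n
    # Agreement expected by chance, from the marginal totals.
    chance_agreement = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / n ** 2
    return {
        "sensitivity": tp / (tp + fn),        # foreign-born correctly imputed
        "specificity": tn / (tn + fp),        # US-born correctly imputed
        "misclassification": (fp + fn) / n,   # share of cases imputed incorrectly
        "kappa": (observed_agreement - chance_agreement) / (1 - chance_agreement),
    }


# Hypothetical example: known values for ten cases and one set of imputed values.
true_vals = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
imputed_vals = [1, 1, 1, 0, 0, 0, 0, 1, 1, 0]
metrics = accuracy_metrics(true_vals, imputed_vals)
```

With several imputed data sets, one option is to compute these measures for each set and summarize them across imputations.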


Multiple imputation using variables available in the SEER data file can be used to accurately detect foreign-born status. This simple strategy may help researchers to disaggregate analyses by nativity and uncover important nativity disparities in regard to cancer diagnosis, treatment, and survival. Cancer 2014;120:1203–1211. © 2014 American Cancer Society.