The first 2 authors contributed equally to this work.
Uncovering nativity disparities in cancer patterns: Multiple imputation strategy to handle missing nativity data in the Surveillance, Epidemiology, and End Results data file
Article first published online: 16 JAN 2014
© 2013 American Cancer Society
Volume 120, Issue 8, pages 1203–1211, 15 April 2014
How to Cite
Montealegre, J. R., Zhou, R., Amirian, E. S. and Scheurer, M. E. (2014), Uncovering nativity disparities in cancer patterns: Multiple imputation strategy to handle missing nativity data in the Surveillance, Epidemiology, and End Results data file. Cancer, 120: 1203–1211. doi: 10.1002/cncr.28533
The content of this article is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health or the Cancer Prevention and Research Institute of Texas.
- Issue published online: 8 APR 2014
- Manuscript Accepted: 12 NOV 2013
- Manuscript Revised: 22 OCT 2013
- Manuscript Received: 7 JUN 2013
- emigration and immigration;
- Surveillance, Epidemiology, and End Results (SEER) program;
- multiple imputation;
- Hispanic Americans;
- health status disparities
Although birthplace data are routinely collected in the participating Surveillance, Epidemiology, and End Results (SEER) registries, such data are missing in a nonrandom manner for a large percentage of cases. This hinders analysis of nativity-related cancer disparities. In the current study, the authors evaluated multiple imputation of nativity status among Hispanic patients diagnosed with cervical, prostate, and colorectal cancer and demonstrated the effect of multiple imputation on apparent nativity disparities in survival.
Multiple imputation by logistic regression was used to generate nativity values (US-born vs foreign-born) using a priori-defined variables. The accuracy of the method was evaluated among a subset of cases. Kaplan-Meier curves were used to illustrate the effect of imputation by comparing survival among US-born and foreign-born Hispanics, with and without imputation of nativity.
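As a rough illustration of the imputation step described above, the following Python sketch imputes a binary nativity indicator by logistic regression. It is not the authors' implementation: the two binary predictors, the function names, and the toy data are hypothetical, and bootstrap resampling of the observed cases is an assumption used here as a simple stand-in for drawing model parameters between imputations.

```python
import math
import random


def predict_proba(weights, features):
    """Logistic model probability that a case is foreign-born (coded 1)."""
    z = weights[0] + sum(w * x for w, x in zip(weights[1:], features))
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.5, epochs=300):
    """Fit logistic regression by batch gradient descent.

    Returns [intercept, coef_1, ..., coef_k].
    """
    weights = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for features, label in zip(X, y):
            err = predict_proba(weights, features) - label
            grad[0] += err
            for j, x in enumerate(features):
                grad[j + 1] += err * x
        weights = [w - lr * g / n for w, g in zip(weights, grad)]
    return weights


def multiply_impute(X, nativity, m=5, seed=0):
    """Return m completed copies of `nativity` (1 = foreign-born, 0 = US-born).

    Missing entries are None. Each imputation refits the model on a bootstrap
    resample of the observed cases (an assumption of this sketch), then draws
    each missing value from a Bernoulli with the predicted probability.
    """
    rng = random.Random(seed)
    observed = [i for i, v in enumerate(nativity) if v is not None]
    completed = []
    for _ in range(m):
        sample = [rng.choice(observed) for _ in observed]
        weights = fit_logistic(
            [X[i] for i in sample], [nativity[i] for i in sample]
        )
        filled = [
            v if v is not None
            else int(rng.random() < predict_proba(weights, X[i]))
            for i, v in enumerate(nativity)
        ]
        completed.append(filled)
    return completed


# Toy data: two hypothetical binary predictors per case (not the SEER variables).
X = [[1, 0], [1, 1], [0, 0], [0, 1], [1, 0], [0, 0], [1, 1], [0, 1]]
nativity = [1, 1, 0, 0, None, None, 1, 0]
datasets = multiply_impute(X, nativity, m=5)
```

In a full analysis, each of the completed datasets would be analyzed separately and the estimates pooled, for example with Rubin's rules.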
Birthplace was missing for 31%, 49%, and 39%, respectively, of cases of cervical, prostate, and colorectal cancer. The sensitivity of the imputation strategy for detecting foreign-born status was ≥ 90% and the specificity was ≥ 86%. The agreement between the true and imputed values was ≥ 0.80 and the misclassification error was ≤ 10%. Kaplan-Meier survival curves indicated different associations between nativity and survival when nativity was imputed versus when cases with missing birthplace were omitted from the analysis.
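The accuracy measures reported above can be computed from a 2×2 comparison of true and imputed nativity. The sketch below is illustrative only: the function name and the toy vectors are hypothetical, and treating the reported "agreement" as Cohen's kappa is an assumption, since the abstract does not name the statistic.

```python
def imputation_accuracy(true_vals, imputed_vals):
    """Compare true and imputed nativity (1 = foreign-born, 0 = US-born)."""
    pairs = list(zip(true_vals, imputed_vals))
    tp = sum(1 for t, i in pairs if t == 1 and i == 1)
    tn = sum(1 for t, i in pairs if t == 0 and i == 0)
    fp = sum(1 for t, i in pairs if t == 0 and i == 1)
    fn = sum(1 for t, i in pairs if t == 1 and i == 0)
    n = len(pairs)

    observed_agreement = (tp + tn) / n
    # Agreement expected by chance, from the marginal totals.
    chance_agreement = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / n**2

    return {
        "sensitivity": tp / (tp + fn),  # foreign-born correctly detected
        "specificity": tn / (tn + fp),  # US-born correctly detected
        # Cohen's kappa: assumed here to be the agreement statistic.
        "kappa": (observed_agreement - chance_agreement) / (1 - chance_agreement),
        "misclassification": (fp + fn) / n,
    }


# Toy vectors for illustration; these are not study data.
true_vals = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
imputed_vals = [1, 1, 1, 0, 0, 0, 0, 1, 1, 0]
metrics = imputation_accuracy(true_vals, imputed_vals)
```

For these toy vectors, the sensitivity and specificity are both 0.8, the misclassification error is 0.2, and kappa is approximately 0.6.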
Multiple imputation using variables available in the SEER data file can be used to accurately detect foreign-born status. This simple strategy may help researchers to disaggregate analyses by nativity and uncover important nativity disparities in regard to cancer diagnosis, treatment, and survival. Cancer 2014;120:1203–1211. © 2014 American Cancer Society.