Multiple imputation: an alternative to top coding for statistical disclosure control


Roderick J. A. Little, Department of Biostatistics, University of Michigan, 1420 Washington Heights M4045, Ann Arbor, MI 48109-2029, USA.


Summary. Top coding of extreme values of variables such as income is a common method of statistical disclosure control, but it creates problems for the data analyst. The paper proposes two alternatives to top coding for statistical disclosure control, both based on multiple imputation. We show in simulation studies that the multiple-imputation methods yield better inferences from the publicly released data than top coding does, using straightforward multiple-imputation methods of analysis, while maintaining good statistical disclosure control properties. We illustrate the methods on data from the 1995 Chinese Household Income Project.
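
As a rough illustration of the two ideas named in the summary, the following Python sketch shows what top coding does to a variable and how estimates from several imputed data sets can be combined. It is a minimal sketch under stated assumptions, not the paper's procedure: the function names `top_code` and `combine_rubin` are hypothetical, the threshold is arbitrary, and the combining step assumes the standard multiple-imputation rules (mean of the estimates, within-imputation variance plus inflated between-imputation variance), which may differ from the rules used in the paper.

```python
def top_code(values, threshold):
    """Replace every value above `threshold` by the threshold itself."""
    return [min(v, threshold) for v in values]


def combine_rubin(estimates, variances):
    """Combine m point estimates and their variances from m imputed data sets.

    Assumes the standard combining rules for multiple imputation:
    total variance = within + (1 + 1/m) * between.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two estimates with matching variances")
    q_bar = sum(estimates) / m                               # combined estimate
    w_bar = sum(variances) / m                               # within-imputation variance
    b = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)   # between-imputation variance
    total = w_bar + (1 + 1 / m) * b
    return q_bar, total


# Hypothetical incomes; top coding at 100 pulls the mean downwards.
incomes = [20, 35, 50, 80, 400]
coded = top_code(incomes, 100)
mean_original = sum(incomes) / len(incomes)
mean_coded = sum(coded) / len(coded)

# Hypothetical estimates of a mean from three imputed data sets.
estimate, variance = combine_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

In this toy example the top-coded mean is necessarily no larger than the original mean, which is the kind of distortion that the analyst of top-coded data has to contend with.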