A web service for the supernova classification method used in this paper can be found at http://supernovaclass.info/.
Semi-supervised learning for photometric supernova classification★
Article first published online: 28 OCT 2011
© 2011 The Authors. Monthly Notices of the Royal Astronomical Society © 2011 RAS
Monthly Notices of the Royal Astronomical Society
Volume 419, Issue 2, pages 1121–1135, January 2012
How to Cite
Richards, J. W., Homrighausen, D., Freeman, P. E., Schafer, C. M. and Poznanski, D. (2012), Semi-supervised learning for photometric supernova classification. Monthly Notices of the Royal Astronomical Society, 419: 1121–1135. doi: 10.1111/j.1365-2966.2011.19768.x
- Issue published online: 16 DEC 2011
- Accepted 2011 September 4. Received 2011 September 2; in original form 2011 March 30
- methods: data analysis;
- methods: statistical;
- techniques: photometric;
- supernovae: general
We present a semi-supervised method for photometric supernova typing. Our approach is to first use the non-linear dimension reduction technique diffusion map to detect structure in a data base of supernova light curves and subsequently employ random forest classification on a spectroscopically confirmed training set to learn a model that can predict the type of each newly observed supernova. We demonstrate that this is an effective method for supernova typing. As supernova numbers increase, our semi-supervised method efficiently utilizes this information to improve classification, a property not enjoyed by template-based methods. Applied to supernova data simulated by Kessler et al. to mimic those of the Dark Energy Survey, our methods achieve (cross-validated) 95 per cent Type Ia purity and 87 per cent Type Ia efficiency on the spectroscopic sample, but only 50 per cent Type Ia purity and 50 per cent Type Ia efficiency on the photometric sample, owing to the spectroscopic follow-up strategy assumed in those simulations. To improve the performance on the photometric sample, we search for better spectroscopic follow-up procedures by studying the sensitivity of our machine-learned supernova classification to the specific strategy used to obtain training sets. With a fixed amount of spectroscopic follow-up time, we find that, despite collecting data on a smaller number of supernovae, deeper magnitude-limited spectroscopic surveys are better for producing training sets. For Type Ia (II-P) typing, we obtain a 44 per cent (1 per cent) increase in purity to 72 per cent (87 per cent) and a 30 per cent (162 per cent) increase in efficiency to 65 per cent (84 per cent) of the sample using a 25th (24.5th) magnitude-limited survey instead of the shallower spectroscopic sample used in the original simulations. When redshift information is available, we incorporate it into our analysis using a novel method of altering the diffusion map representation of the supernovae. Incorporating host redshifts leads to a 5 per cent improvement in Type Ia purity and a 13 per cent improvement in Type Ia efficiency.
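
The percentage gains quoted above are easiest to read as relative increases over the photometric-sample baseline: for Type Ia, going from 50 to 72 per cent purity is a 44 per cent increase, and going from 50 to 65 per cent efficiency is a 30 per cent increase. The short Python sketch below checks this arithmetic. It assumes the usual definitions of purity (true positives over all objects classified as the type) and efficiency (true positives over all objects truly of that type), which the abstract does not spell out. The function names and the example counts are hypothetical. The Type II-P baselines are not quoted in the abstract, so the values computed here are only what the rounded figures imply.

```python
def purity(true_pos, false_pos):
    """Fraction of objects classified as the target type that truly are that type."""
    return true_pos / (true_pos + false_pos)


def efficiency(true_pos, false_neg):
    """Fraction of objects truly of the target type that the classifier recovers."""
    return true_pos / (true_pos + false_neg)


def relative_increase_pct(before, after):
    """Relative change from `before` to `after`, in per cent of `before`."""
    return 100.0 * (after - before) / before


def implied_baseline(after, increase_pct):
    """Baseline value implied by a final value and a relative increase in per cent."""
    return after / (1.0 + increase_pct / 100.0)


# Type Ia, photometric sample: the abstract quotes a 50 per cent baseline
# for both purity and efficiency, rising to 72 and 65 per cent.
ia_purity_gain = relative_increase_pct(50.0, 72.0)
ia_efficiency_gain = relative_increase_pct(50.0, 65.0)

# Type II-P: baselines are NOT given in the abstract; these are only the
# values implied by the rounded numbers (87 after +1 per cent, 84 after +162 per cent).
iip_purity_baseline = implied_baseline(87.0, 1.0)
iip_efficiency_baseline = implied_baseline(84.0, 162.0)

# Hypothetical counts (not from the paper) that would reproduce the Type Ia
# figures: 72 true Ia and 28 contaminants among objects labelled Ia,
# and 65 of 100 true Ia recovered.
example_purity = purity(true_pos=72, false_pos=28)
example_efficiency = efficiency(true_pos=65, false_neg=35)

if __name__ == "__main__":
    print(f"Ia purity gain:           {ia_purity_gain:.0f} per cent")
    print(f"Ia efficiency gain:       {ia_efficiency_gain:.0f} per cent")
    print(f"II-P implied purity base: {iip_purity_baseline:.1f} per cent")
    print(f"II-P implied effic. base: {iip_efficiency_baseline:.1f} per cent")
```

Read this way, the large 162 per cent figure for Type II-P efficiency reflects a low implied starting point (roughly a third of the sample) rather than a final efficiency above 100 per cent.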