• methods: data analysis
• methods: observational
• methods: statistical


We present an application of unsupervised machine learning, the self-organized map (SOM), as a tool for visualizing, exploring and mining the catalogues of large astronomical surveys. Self-organization culminates in a low-resolution representation of the ‘topology’ of a parameter volume, and this can be exploited in various ways pertinent to astronomy. Using data from the Cosmological Evolution Survey (COSMOS), we demonstrate two key astronomical applications of the SOM: (i) object classification and selection, using galaxies with active galactic nuclei as an example, and (ii) photometric redshift estimation, illustrating how SOMs can be used as purely empirical predictive tools. With a training set of ∼3800 galaxies with z_spec ≤ 1, we achieve photometric redshift accuracies competitive with other (mainly template fitting) techniques that use a similar number of photometric bands [σ(Δz) = 0.03 with a ∼2 per cent outlier rate when using u* band to 8 μm photometry]. We also test the SOM as a photo-z tool using the PHoto-z Accuracy Testing (PHAT) synthetic catalogue of Hildebrandt et al., which compares several different photo-z codes using a common input/training set. We find that the SOM can deliver accuracies that are competitive with many of the established template fitting and empirical methods. This technique is not without clear limitations, which are discussed, but we suggest it could be a powerful tool in the era of extremely large (‘petabyte’) databases, where efficient data mining is a paramount concern.
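
To make the empirical photo-z idea concrete, the following is a minimal sketch of how a SOM might be trained on photometric features and then used to predict redshifts. It is not the implementation used in this work. The function names (`train_som`, `best_matching_units`, `cell_labels`), the 8 × 8 grid, the learning-rate and neighbourhood schedules, and the synthetic "colours" are all assumptions chosen for illustration; no survey data are involved.

```python
import numpy as np

def train_som(data, grid=(8, 8), n_iter=2000, lr0=0.5, sigma0=3.0, seed=0):
    """Train a small rectangular SOM (illustrative schedule, not tuned)."""
    rng = np.random.default_rng(seed)
    rows, cols = grid
    # Initialise each cell's weight vector randomly within the data range.
    weights = rng.uniform(data.min(axis=0), data.max(axis=0),
                          size=(rows, cols, data.shape[1]))
    yy, xx = np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij")
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                  # decaying learning rate
        sigma = sigma0 * (1.0 - frac) + 0.5      # shrinking neighbourhood
        x = data[rng.integers(len(data))]        # one random training object
        dist = np.linalg.norm(weights - x, axis=2)
        bi, bj = np.unravel_index(np.argmin(dist), dist.shape)
        # Gaussian neighbourhood centred on the best-matching cell.
        h = np.exp(-((yy - bi) ** 2 + (xx - bj) ** 2) / (2.0 * sigma ** 2))
        weights += lr * h[:, :, None] * (x - weights)
    return weights

def best_matching_units(weights, data):
    """Return the flattened index of the closest cell for each object."""
    flat = weights.reshape(-1, weights.shape[2])
    d = np.linalg.norm(data[:, None, :] - flat[None, :, :], axis=2)
    return np.argmin(d, axis=1)

def cell_labels(weights, data, labels):
    """Mean label (e.g. spectroscopic redshift) per cell; NaN if empty."""
    bmu = best_matching_units(weights, data)
    out = np.full(weights.shape[0] * weights.shape[1], np.nan)
    for c in np.unique(bmu):
        out[c] = labels[bmu == c].mean()
    return out

# Synthetic stand-in for a catalogue: three made-up "colours" that vary with z.
rng = np.random.default_rng(1)

def fake_colours(z):
    clean = np.column_stack([z, z ** 2, np.sin(3.0 * z)])
    return clean + rng.normal(0.0, 0.02, clean.shape)

z_train = rng.uniform(0.0, 1.0, 500)
z_test = rng.uniform(0.0, 1.0, 100)
x_train, x_test = fake_colours(z_train), fake_colours(z_test)

weights = train_som(x_train)                     # unsupervised: z is not used
cell_z = cell_labels(weights, x_train, z_train)  # calibrate cells with z_spec
z_pred = cell_z[best_matching_units(weights, x_test)]
```

In this sketch the map itself is trained without redshift information; the spectroscopic redshifts only label the cells afterwards, and a test object inherits the mean redshift of the cell it maps to. Objects that land in a cell with no training members receive NaN, which is one simple way to flag regions of parameter space the training set does not cover.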