[1308.4145] The first analytical expression to estimate photometric redshifts suggested by a machine

Authors:  A. Krone-Martins, E.E.O. Ishida, R. S. de Souza
Abstract:  We report the first analytical expression purely constructed by a machine to determine photometric redshifts ($z_{\rm phot}$) of galaxies. A simple and reliable functional form is derived using $41,214$ galaxies from the Sloan Digital Sky Survey Data Release 10 (SDSS-DR10) spectroscopic sample. The method automatically dropped the $u$ and $z$ bands, relying only on $g$, $r$ and $i$ for the final solution. Applying this expression to other $1,417,181$ SDSS-DR10 galaxies, with measured spectroscopic redshifts ($z_{\rm spec}$), we achieved a mean $\langle (z_{\rm phot} - z_{\rm spec})/(1+z_{\rm spec})\rangle\lesssim 0.0086$ and a scatter $\sigma_{(z_{\rm phot} - z_{\rm spec})/(1+z_{\rm spec})}\lesssim 0.045$ when averaged up to $z \lesssim 1.0$. This work is the first use of symbolic regression in cosmology, representing a leap forward in astronomy-data-mining connection.
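
The two figures of merit quoted in the abstract can be computed as in the minimal sketch below. It is an illustration, not the authors' code: the function name `photoz_metrics` and the sample values are hypothetical, and the use of a population standard deviation for the scatter is an assumption, since the abstract does not say which estimator was used.

```python
from statistics import fmean, pstdev


def photoz_metrics(z_phot, z_spec):
    """Return (mean, scatter) of the normalized residuals
    (z_phot - z_spec) / (1 + z_spec).

    Assumption: the scatter is the population standard deviation of
    the normalized residuals.
    """
    if len(z_phot) != len(z_spec):
        raise ValueError("z_phot and z_spec must have the same length")
    residuals = [(zp - zs) / (1.0 + zs) for zp, zs in zip(z_phot, z_spec)]
    return fmean(residuals), pstdev(residuals)


# Made-up redshifts, for illustration only (not SDSS data).
z_spec = [0.0, 1.0, 3.0]
z_phot = [0.1, 1.0, 2.6]
bias, scatter = photoz_metrics(z_phot, z_spec)
print(f"mean = {bias:.4f}, scatter = {scatter:.4f}")
```

Dividing by $1+z_{\rm spec}$ keeps errors at high redshift from dominating the statistics, which is why both quantities are normalized this way.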

brian nord
Posts: 1
Joined: July 15 2006
Affiliation: Fermi National Accelerator Laboratory


Post by brian nord » February 06 2014

This work uses machine-learning software to estimate photometric redshifts; more importantly, it estimates the functional form of the relationship between photometry and redshift. The method searches a space of increasingly complicated functional forms using a software package called EUREQA, which has also been developed into both educationally and commercially licensed software. It appears very mature.

Does a method like this sufficiently sample the space of possible relationships between photometry and redshift?

Is there a probabilistic interpretation of their results? That is, how would one obtain a p-value or significance for the output? The question here concerns assigning objects to classes probabilistically, e.g., types of galaxies.

How reliable is this work's methodology? Are there major drawbacks that the paper does not discuss? If not, why is this method not more widely used?

Maciej Bilicki
Posts: 21
Joined: May 12 2010
Affiliation: Center for Theoretical Physics PAS, Warsaw


Post by Maciej Bilicki » March 21 2014

You can easily imagine that the best-fit formula will depend on the training data, as well as on other choices such as which passbands or colors are used.

The same happens with any empirical method: you need to retrain, e.g., your neural networks if the properties of your input data change.

After all, there's no one fixed relation between magnitudes and redshifts: galaxies are not all the same.
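
A toy sketch of this point, not taken from the paper: two hypothetical galaxy populations follow different, made-up linear color-redshift relations, and an ordinary least-squares line is fitted to each training set and to their mixture. The helper `fit_line`, the population names, and all numbers are assumptions chosen purely for illustration.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic colors and redshifts; the two relations are invented.
color = [0.2, 0.4, 0.6, 0.8, 1.0]
z_red = [0.5 * c + 0.05 for c in color]   # hypothetical "red" population
z_blue = [0.2 * c + 0.10 for c in color]  # hypothetical "blue" population

fit_red = fit_line(color, z_red)
fit_blue = fit_line(color, z_blue)
fit_mixed = fit_line(color + color, z_red + z_blue)

# Each training set yields a different best-fit formula.
for name, (slope, intercept) in [("red", fit_red),
                                 ("blue", fit_blue),
                                 ("mixed", fit_mixed)]:
    print(f"{name}: z = {slope:.3f} * color + {intercept:.3f}")
```

The mixed sample recovers neither input relation, so a formula trained on one mix of galaxy types need not transfer to a sample with a different mix.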
