
CosmoCoffee

Authors:  Eric V. Linder, Ramon Miquel 
Abstract:  Interpretation of cosmological data to determine the number and values of
parameters describing the universe must not rely solely on statistics but
involve physical insight. Statistical techniques such as "model selection" or
"integrated survey optimization" blindly apply Occam's Razor  this can lead to
painful results. We emphasize that the sensitivity to prior probabilities and
to the number of models compared can lead to "prior selection" rather than
robust model selection. A concrete example demonstrates that Information
Criteria can in fact misinform over a large region of parameter space. 

John Peacock
Joined: 02 Mar 2007 Posts: 3 Affiliation: University of Edinburgh

Posted: May 09 2007 


Moncy Vilavinal John wrote:
Quote: 
As I understand it, the authors and also Peacock are worried that a competing model in a model comparison can gain undue advantage by picking a suitable prior for some new parameter. But this anxiety is unfounded, and can be dispelled once we recognize that Bayesian model comparison is not a one-time exercise. The posterior for that parameter, obtained in that analysis, must be used as the prior in future observations.

I agree with this, but it's *NOT* what happens in model selection: your original prior is never forgotten. Let's say your prior was that n_{s} varies between 0 and 1000 with uniform probability. Then you make a measurement which yields a posterior that is a Gaussian peak near n_{s}=1 with width 0.1. You might want to adopt this Gaussian as the prior for the next time you do model comparison, but the question is: are you allowed to renormalise the probability under it to unity, rather than 0.0001? If you do so, then the Evidence ratio always ends up being just the likelihood ratio. This seems sensible to me, but it's not the official approach. 
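A minimal numerical sketch of this renormalisation point, using only the toy numbers from the post above (the broad U(0, 1000) prior and a Gaussian likelihood of width 0.1 at n_s = 1; everything else is illustrative, not from the paper):

```python
import numpy as np

# Toy numbers from the post: prior n_s ~ U(0, 1000), and a measurement
# yielding a Gaussian likelihood peaked at n_s = 1 with width 0.1.
ns = np.linspace(0.0, 1000.0, 1_000_001)
like = np.exp(-0.5 * ((ns - 1.0) / 0.1) ** 2)     # likelihood, peak value 1

# Evidence for the model with n_s free, under the original broad prior
evidence_broad = np.trapz(like / 1000.0, ns)      # ~ sqrt(2*pi)*0.1/1000

# Evidence if we instead adopt the renormalised posterior (a unit-area
# Gaussian of width 0.1) as the prior for a second, identical measurement
prior_new = np.exp(-0.5 * ((ns - 1.0) / 0.1) ** 2) / (np.sqrt(2 * np.pi) * 0.1)
evidence_new = np.trapz(like * prior_new, ns)     # ~ 1/sqrt(2) ~ 0.71

# Against a model with n_s fixed at 1 (evidence = peak likelihood = 1),
# the Occam penalty is ~4000 with the original broad prior but only ~1.4
# after renormalisation: the evidence ratio collapses towards the
# likelihood ratio, which is exactly the question being debated.
print(1.0 / evidence_broad, 1.0 / evidence_new)
```

Whether the second calculation is legitimate is precisely the "official approach" question: the renormalised prior has forgotten the factor of ~0.0001 that the Gaussian carried under the original prior.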



Roberto Trotta
Joined: 27 Sep 2004 Posts: 18 Affiliation: Imperial College London

Posted: May 09 2007 


John,
I'm not sure anybody is in a position to define "an official approach". My personal point of view on this is that the prior updating business is absolutely fine for parameter inference – after all, our current knowledge becomes our starting point for the next measurement. I think nobody disputes that.
However, when we are talking about model selection I'd rather take the view that the prior ought to be motivated by theory, and hence ought not to be updated every time. This is because the point of the whole exercise is to compare models, i.e. theories. So you want to define your prior, ideally motivated by the theory or physical model you have in mind, and then confront it with successively improved data sets, while keeping the same prior all of the time. After all, your theory (i.e., your model prior) does not change when new data come along (and if it does, then it's because it has become a different theory in response to the data, and to be fair you should compare this new theory with the old one, i.e. do model comparison!).
In this sense you are right that your original prior does not go away. But if you do this consistently (and, of course, if the original prior was sensible and physically motivated), successive layers of data will, with Occam's razor, shave away unnecessary theories, leaving you in the end with the best model (in terms of the data) from the original collection you started with. 
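A toy illustration of this fixed-prior view (all numbers invented for illustration, not from the thread): compare a model M0 that predicts a parameter w = -1 exactly against a model M1 with w free under a theory prior U(-2, 0) that is never updated. As the data improve while continuing to favour w = -1, the Bayes factor for the simpler model grows, which is Occam's razor doing the shaving:

```python
import numpy as np

# M0: w = -1 exactly (no free parameter); M1: w free, fixed prior U(-2, 0).
# The likelihood is a Gaussian peaked at w = -1 with shrinking width sigma.
w = np.linspace(-2.0, 0.0, 200_001)
prior_m1 = 1.0 / 2.0                                # fixed theory prior, never updated

factors = []
for sigma in (0.5, 0.2, 0.05):                      # successively better data
    like = np.exp(-0.5 * ((w + 1.0) / sigma) ** 2)  # peak value 1 at w = -1
    evidence_m1 = np.trapz(like * prior_m1, w)
    factors.append(1.0 / evidence_m1)               # evidence of M0 = like(-1) = 1
    print(f"sigma = {sigma}: B01 = {factors[-1]:.1f}")
```

The Bayes factor B01 rises with each round because M1 keeps "wasting" prior mass on values of w the data have excluded; updating the prior each time would erase exactly this penalty.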



John Peacock
Joined: 02 Mar 2007 Posts: 3 Affiliation: University of Edinburgh

Posted: May 10 2007 


Kate Land wrote: 
I understand the notion of updating priors, etc. But this leads to the question of what the first-ever prior that you use should be, i.e. when you have absolutely no information whatsoever, and no theoretical ideas either. There is no suitable prior in this case!

I also agree with this. It's easy to think of cases where there is a correct prior, but you have no idea what it is. A simple example is where you want to use a noisy flux measurement to put some confidence limits on the flux from some astronomical object. You will get sensible answers if you use the actual number counts for that population, but if you insist on using the Jeffreys prior (1/flux in this case), you get nonsense. If you don't know what the counts are (maybe this is the first member of this class of object ever observed), it's silly to guess, since you are bound to guess wrong. I refer to this as the "Bayesian 5th amendment": a statistical procedure shouldn't force you to incriminate yourself by picking a prior when you have no idea how to make a sensible choice. 
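A toy numerical version of this flux example (all numbers are invented: an assumed measurement S_obs = 3 with unit noise, an assumed Euclidean-counts slope, and an arbitrary faint-end cutoff) shows how strongly the inference depends on the prior when the data are noisy:

```python
import numpy as np

# Invented setup: a source measured at S_obs = 3 with Gaussian noise sigma = 1.
S = np.linspace(0.1, 20.0, 200_001)            # flux grid; 0.1 is an arbitrary
                                               # faint cutoff (the answer is
                                               # sensitive to this choice too)
like = np.exp(-0.5 * ((S - 3.0) / 1.0) ** 2)   # Gaussian measurement likelihood

def posterior_mean(prior):
    p = like * prior
    return np.trapz(S * p, S) / np.trapz(p, S)

mean_counts = posterior_mean(S ** -2.5)        # steep Euclidean counts dN/dS
mean_jeff = posterior_mean(1.0 / S)            # 1/flux prior

print(mean_counts, mean_jeff)
```

Both priors pull the estimate below the naive S_obs = 3, but the steep counts prior pulls it down far more: in a steep-counts population a marginal detection is most likely a faint source scattered up by noise. If you do not know the counts, the two "uninformative" guesses give substantially different answers, which is the self-incrimination problem above.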



Roberto Trotta
Joined: 27 Sep 2004 Posts: 18 Affiliation: Imperial College London

Posted: May 10 2007 


John Peacock wrote: 
I also agree with this. It's easy to think of cases where there is a correct prior, but you have no idea what it is. A simple example is where you want to use a noisy flux measurement to put some confidence limits on the flux from some astronomical object. 
This seems to me like a parameter inference problem, not a model selection one. I also dispute the notion of there being "a correct prior": priors can be more or less meaningful depending on how faithfully they describe your state of knowledge (and this requires physical insight). But from a Bayesian perspective they do not reflect a physical property of the system under study, but rather the state of knowledge of the observer (based on the full extent of her scientific knowledge, expertise, etc. so far) before she collects the data.
Jaynes argues that there is no such thing as "perfect ignorance", and that ultimately our understanding of the physical properties of the object under scrutiny must allow us to set some (possibly loose) boundaries for the prior.
In the case you are describing, it seems to me that a Jeffreys prior is adequate, since it gives equal weight to every order of magnitude of the flux (between some sensibly defined minimum and maximum values). Indeed, Jaynes shows that such a prior is the only consistent one in this case, since any other choice would lead different scientists sharing the same state of knowledge to different conclusions, e.g. simply by using different units to measure the flux (which is clearly paradoxical).
Parameter inference then proceeds to update this through the likelihood, and I have little doubt that the correct magnitude for the flux will be singled out in the posterior (if you have "informative" data, of course; if not, then your posterior equals the prior and you have learned that your experiment is rubbish at constraining the flux). I don't see how this procedure could give you "nonsense". But I stress once more that this is not model selection, but parameter inference. 
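The scale-invariance property claimed for the 1/flux prior is easy to check directly. A short sketch (with arbitrary toy bounds [a, b] standing in for the "sensibly defined minimum and maximum"): the prior assigns equal mass to every decade, and that assignment is unchanged by a change of units S -> k*S.

```python
import numpy as np

def mass(lo, hi, a=1e-3, b=1e3):
    """Probability the normalised 1/S prior on [a, b] assigns to [lo, hi]."""
    return np.log(hi / lo) / np.log(b / a)

print(mass(1e-2, 1e-1))   # one decade out of six carries mass 1/6
print(mass(1e1, 1e2))     # any other decade: also 1/6

# Change of units, e.g. Jy -> mJy: rescale the interval *and* the bounds
# by k; the mass assigned to the same physical interval is unchanged.
k = 1000.0
print(mass(k * 1e-2, k * 1e-1, a=k * 1e-3, b=k * 1e3))  # still 1/6
```

A uniform-in-S prior fails this test: rescaling the units redistributes its mass across decades, so two scientists in the same state of ignorance but using different units would reach different conclusions.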






