This must be a dumb question, but I can't find an easy answer: if I have sets of MCMC chains from two different experiments how do I combine them optimally? I know that it would be best to just run new chains with the joint likelihood, but you know the old story about estimating parameters with the army you have rather than the one you would like to have.
I can think of ways to do it that involve either further likelihood calculations or some kind of binning. The first seems unnecessary and could be computationally prohibitive, and the second is obviously suboptimal, especially in a high-dimensional space. I've been trying to use Poisson statistics, treating each point in the chains as a Poisson sample of the underlying distribution, but a few minutes of doodling has not led to enlightenment.
Anyone know of a good reference on this?
combining independent MCMC chains

 Posts: 27
 Joined: September 25 2004
 Affiliation: McGill University

 Posts: 183
 Joined: September 24 2004
 Affiliation: Brookhaven National Laboratory
What you could do is importance-sample one chain with the other. You write your own importance sampler that reads points from one chain and weights them according to the importance-sampling rule. To do this you of course need the chi2 at each point under the other chain's data: you can interpolate it using a glass interpolator (I can mail you one) if there are a few samples close enough, and you set the weight to zero if there are no samples close enough (as this means that region is forbidden by data2 anyway). Then you do the same thing the other way round, importance-sampling chain2 with chain1, and at the end just concatenate the two resulting weighted chains.
The two chains must use completely disjoint datasets for this to work; otherwise I think there is no way of escaping the likelihood recalculation (because you have simply lost the information on how constraining the common datasets are).
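For what it's worth, here is a minimal sketch of one half of that recipe (reweighting chain A by the chi2 of chain B's data), assuming each chain is stored as an (N, d) array of parameter samples with a matching array of chi2 = -2 ln L values. The inverse-distance nearest-neighbour interpolation and the `r_max` cutoff below are illustrative stand-ins for whatever interpolator you actually use; none of the names come from the thread:

```python
import numpy as np
from scipy.spatial import cKDTree


def importance_reweight(chain_a, chain_b, chi2_b, k=4, r_max=None):
    """Weight chain_a's samples by the likelihood behind chain_b.

    chi2_b holds -2 ln L at each sample of chain_b.  The chi2 at each
    chain_a point is estimated by inverse-distance interpolation over
    its k nearest chain_b samples; points with no chain_b sample
    within r_max get weight zero (that region is effectively excluded
    by the second dataset).
    """
    tree = cKDTree(chain_b)
    dist, idx = tree.query(chain_a, k=k)      # shapes (N_a, k)
    if r_max is None:
        r_max = np.median(dist)               # crude default scale
    # Inverse-distance interpolation of chi2 onto chain_a's points.
    w = 1.0 / np.maximum(dist, 1e-12)
    chi2_interp = np.sum(w * chi2_b[idx], axis=1) / np.sum(w, axis=1)
    # Importance weights, normalised so the best point has weight 1.
    weights = np.exp(-0.5 * (chi2_interp - chi2_interp.min()))
    # Zero out points with no nearby support in chain_b.
    weights[dist[:, 0] > r_max] = 0.0
    return weights


# Toy usage: two 2-parameter chains from slightly offset posteriors.
rng = np.random.default_rng(0)
chain_a = rng.normal(0.0, 1.0, size=(500, 2))
chain_b = rng.normal(0.5, 1.0, size=(400, 2))
chi2_b = np.sum((chain_b - 0.5) ** 2, axis=1)

weights_a = importance_reweight(chain_a, chain_b, chi2_b)
```

Running the same function with the roles of the two chains swapped gives weights for chain B, after which the two weighted chains can simply be concatenated, as described above. In practice you would want to check that the effective sample size (sum(w)^2 / sum(w^2)) stays reasonable; if most weights come out near zero, the two posteriors barely overlap and this scheme degrades badly.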