[astro-ph/0608007] Estimation of Polarized Power Spectra by Gibbs sampling

Authors:  D. L. Larson, H. K. Eriksen, B. D. Wandelt, K. M. Gorski, Greg Huey, J. B. Jewell, I. J. O'Dwyer
Abstract:  Earlier papers introduced a method of accurately estimating the angular cosmic microwave background (CMB) temperature power spectrum based on Gibbs sampling. Here we extend this framework to polarized data. All advantages of the Gibbs sampler still apply, and exact analysis of mega-pixel polarized data sets is thus feasible. These advantages may be even more important for polarization measurements than for temperature measurements. While approximate methods can alias power from the larger E-mode spectrum into the weaker B-mode spectrum, the Gibbs sampler (or equivalently, exact likelihood evaluations) allows for a clean separation of these modes. To demonstrate the method, we analyze two simulated data sets: 1) a hypothetical future CMBPol mission, with the focus on B-mode estimation; and 2) a Planck-like mission, to highlight the computational feasibility of the method.
[PDF]  [PS]  [BibTex]  [Bookmark]

Discussion related to specific recent arXiv papers
Post Reply
Antony Lewis
Posts: 1941
Joined: September 23 2004
Affiliation: University of Sussex
Contact:

[astro-ph/0608007] Estimation of Polarized Power Spectra by

Post by Antony Lewis » August 02 2006

A nice paper implementing the Gibbs sampling technique to CMB polarization.

I think some of the comments about E/B separation are potentially confusing. There are two issues here: 1. the fundamental property than on an incomplete sky there exist ambiguous E/B modes; E/B cannot be separated exactly (e.g. astro-ph/0305545 and refs); 2. the problems that arise mixing E/B power when estimating the power spectra (for which naive Pseudo-C_l is not very good). Gibbs certainly helps with the latter, but they do not appear to have implemented E/B separation for the former in this paper.

Consider a noise free sky. By my understanding of the Gibbs method, with a sky cut the full sky samples should include non-zero B modes even if the true sky has B=0. This is just because there are ambiguous modes, so the method will sample all possible combinations of E and B that give the same observed ambiguous modes on the observed section. To this extent the method exhibits E/B mixing: B modes are present in the posterior samples even when there were none originally. I think one could if desired do E/B-separation by studying the covariance of the posterior full sky E_lm and B_lm samples.

Kendrick Smith
Posts: 3
Joined: August 03 2006
Affiliation: University of Chicago

[astro-ph/0608007] Estimation of Polarized Power Spectra by

Post by Kendrick Smith » August 03 2006

Hi Antony,

My intuition says the opposite: that Gibbs sampling should do an optimal job of separating EB because it samples the exact likelihood, which should give optimal separation (e.g. maximum likelihood estimators saturate Cramer-Rao). The likelihood should be the same in the end whether ambiguous modes are separated out explicitly or the separation is hidden in the linear algebra. It would be great to see this demonstrated true or false, say for an azimuthally symmetric survey where the Cramer-Rao bound can be computed feasibly because all covariance matrices are diagonal in m (I tried this comparison for pseudo-Cl in appendix F of astro-ph/0511629).

David Larson
Posts: 11
Joined: September 25 2004
Affiliation: Johns Hopkins University

[astro-ph/0608007] Estimation of Polarized Power Spectra by

Post by David Larson » August 03 2006

Just trying to make sure I understand the question...

In some sense, the Gibbs sampling does not have to deal with cut sky issues, since we sample full skies. No code is needed to deal with cut sky issues, since we do our harmonic transforms on the full sky.

However, for the present paper, we ignore all the data inside the mask (since we consider it to be unusably contaminated by foregrounds). Because there exist B modes that are nonzero only inside the mask, then the data do not constrain these B modes, even with zero noise outside the mask. Thus the Gibbs sampler would sample those B modes, and would not set the
[tex]C^{BB}_{\ell}[/tex] spectrum to zero. I believe this should be the desired behavior.

Our goal is to do as much E/B separation as the data will allow, and no more. :)

Antony Lewis
Posts: 1941
Joined: September 23 2004
Affiliation: University of Sussex
Contact:

Re: [astro-ph/0608007] Estimation of Polarized Power Spectra

Post by Antony Lewis » August 03 2006

I agree likelihoods should be optimal, but I don't think the algorithm in itself is doing any E/B mode separation (all full sky noise free maps will have some non-zero B mode everywhere, including in the observed region). To separate the modes you'd need to project the full sky sampled E_lm and B_lm into the linear combinations of the full sky E and B modes that are pure B and pure E over the observed area?

I'm not saying Gibbs is not a good method - I just don't see that it is doing E/B mode separation (which is redundant for getting exact likelihoods if everything is as expected).

Antony Lewis
Posts: 1941
Joined: September 23 2004
Affiliation: University of Sussex
Contact:

Re: [astro-ph/0608007] Estimation of Polarized Power Spectra

Post by Antony Lewis » August 03 2006

I would also like to know what happens when C^B_l = 0 (or pure lensing) - this is the case of most worry for E/B mixing/convergence problems. In particular, how does the method arrange to give finite probability at C^B_l =0 given that C^{-1} is singular there?

I would be quite surprised if the method does not have some convergence problems for low C^B_l, as this is rather like the troublesome low signal to noise regime. I'd therefore be interested to see if the method works if r=0 is assumed in the simulation (so C^E >> C^B at low l), or whether it takes a very long convergence time for low C^B to be sampled correctly. If you imaging measuring the polarization at a single point, there is an exact degeneracy between possible E and B full-sky samples, and hence potentially just the kind of correlated distribution Gibbs sampling becomes very bad at exploring.

It is also worth noting that the C^B_l due to lensing is quite non-Gaussian and correlated, hence use of the Gaussian uncorrelated results is not strictly justified (though does not of course prevent you from constructing non-Gaussian sky samples if the observed sky is). For a really convincing demonstration one could generate a non-Gaussian lensed sky (e.g. using LensPix) and see if the method still works or can easily be adapted to give correct results.

Hans Kristian Eriksen
Posts: 60
Joined: September 25 2004
Affiliation: ITA, University of Oslo
Contact:

Re: [astro-ph/0608007] Estimation of Polarized Power Spectra

Post by Hans Kristian Eriksen » August 03 2006

Antony Lewis wrote:I would also like to know what happens when C^B_l = 0 (or pure lensing) - this is the case of most worry for E/B mixing/convergence problems. In particular, how does the method arrange to give finite probability at C^B_l =0 given that C^{-1} is singular there?
It's actually never singular, because C^B_l will always be strictly positive, due to noise. Remember that we are not trying to find the single "true" value, but rather a full distribution of possible values. So for a realization that truly has B=0, we will get a distribution that peaks at 0, and then drops off quickly at high values.
I would be quite surprised if the method does not have some convergence problems for low C^B_l, as this is rather like the troublesome low signal to noise regime. I'd therefore be interested to see if the method works if r=0 is assumed in the simulation (so C^E >> C^B at low l), or whether it takes a very long convergence time for low C^B to be sampled correctly. If you imaging measuring the polarization at a single point, there is an exact degeneracy between possible E and B full-sky samples, and hence potentially just the kind of correlated distribution Gibbs sampling becomes very bad at exploring.
This, on the other hand, is true. However, this is only a problem with the current implementation of the Gibbs sampler, where the step size is given by cosmic variance (which of course is non-zero even for B=0, because of the noise contribution discussed above) and the noise is big. We are currently investigating other sampling schemes, and these will hopefully resolve this. It's important to distinguish between the framework as such and the current implementation, I think -- I have a hard time identifying any other methods that can do what the "Gibbs sampling framework" as such can do, but I have no problem identifying issues with the current implementation.
It is also worth noting that the C^B_l due to lensing is quite non-Gaussian and correlated, hence use of the Gaussian uncorrelated results is not strictly justified (though does not of course prevent you from constructing non-Gaussian sky samples if the observed sky is). For a really convincing demonstration one could generate a non-Gaussian lensed sky (e.g. using LensPix) and see if the method still works or can easily be adapted to give correct results.
Sure. Yet another thing on our to-do list.. :-)

David Larson
Posts: 11
Joined: September 25 2004
Affiliation: Johns Hopkins University

[astro-ph/0608007] Estimation of Polarized Power Spectra by

Post by David Larson » August 03 2006

I agree likelihoods should be optimal, but I don't think the algorithm in itself is doing any E/B mode separation (all full sky noise free maps will have some non-zero B mode everywhere, including in the observed region). To separate the modes you'd need to project the full sky sampled Elm and Blm into the linear combinations of the full sky E and B modes that are pure B and pure E
I don't see why we would need to project the full sky sampled Elm and Blm into pure B and pure E modes outside the mask. Working with pure E and B modes outside the mask is a useful technique if you don't already have the Elm and Blm coefficients for the full sky. However, we already do have those coefficients, so we can go directly from there to estimates of the power spectra: [tex]C_\ell^{EE}[/tex] and [tex]C_\ell^{BB}[/tex] (and [tex]C_\ell^{EB}[/tex], if you like).
The simplest way to do that would be to use formulas like
[tex]\hat{C}_\ell^{EE} = \frac{1}{2\ell+1}\sum_m |a^{E}_{\ell m}|^2[/tex]

I guess I have been assuming we are primarily interested in the power spectrum. If we want to know what the actual B modes are, perhaps to map out the weak lensing potential, then Gibbs sampling still samples realizations of the B modes. (We would, however, have to modify our assumption of isotropic Gaussianity for the B modes, since weak lensing doesn't satisfy that.)

It is true that our E-B separation happens after the fact. All the Gibbs sampling code does is sample full-sky polarized realizations according to their likelihood. However, our point is that when you have a full-sky realizations of the polarization, then breaking that up into the E and B parts is trivial (now that the HEALPix code has been written, anyway).
(all full sky noise free maps will have some non-zero B mode everywhere, including in the observed region)
I don't understand this statement. It's possible to determine the "curl" of a polarization field with a differential operator on the sphere (act with edthbar twice and take the imaginary part). This curl operator determines how much B mode there is at a given location. If the Gibbs sampler were run on a zero-noise masked sky that contained no B modes at all, then I believe the resulting samples would only have "curl" or B-ness inside the mask, not everywhere.

Let me know if I've misunderstood what you said.

Benjamin Wandelt
Posts: 12
Joined: September 24 2004
Affiliation: IAP, UPMC, Sorbonne Universites, and Illinois
Contact:

[astro-ph/0608007] Estimation of Polarized Power Spectra by

Post by Benjamin Wandelt » August 03 2006

One short comment: it is easy to compute the posterior mean of the E and B maps using a Gibbs code. These should be the least square optimal maps of E and B modes separately given the data. So in that sense the Gibbs codes do perform E and B mode separation. This will certainly be small in the masked region, but use the power spectrum information obtained on the rest of the sky for the reconstruction, so it will use smoothness of the large scale modes to reconstruct them partially into the mask.

We decided to show the separation of E and B using a single all-sky reconstruction, but given this discussion I think it would be very useful to add a figure that shows the posterior mean E and B maps.

Ben

Kendrick Smith
Posts: 3
Joined: August 03 2006
Affiliation: University of Chicago

[astro-ph/0608007] Estimation of Polarized Power Spectra by

Post by Kendrick Smith » August 04 2006

Hi Antony,

Now I see what you were saying in your original post (sorry, I think I responded to it without reading it carefully!) You're just saying (correct me if I'm wrong...) that language like "clean separation" of E and B is confusing because these are not actually decoupled in the Gibbs sampler; the E_{lm}'s do affect the posterior distribution of the B_{lm}'s on a cut sky. Maybe something like "optimal separation" (in the same sense as maximum likelihood estimation) would be clearer?

Antony Lewis
Posts: 1941
Joined: September 23 2004
Affiliation: University of Sussex
Contact:

Re: [astro-ph/0608007] Estimation of Polarized Power Spectra

Post by Antony Lewis » August 04 2006

I don't see why we would need to project the full sky sampled Elm and Blm into pure B and pure E modes outside the mask.
Absolutely, you don't need to. I'm not saying you should, I'm just querying your claim to be doing E/B mode separation. To my mind E/B mode separation requires generating modes which are pure B and pure E in the observed patch, whereas you are generating ensembles of full sky samples each of which has different amounts of E and B in the patch.
I don't understand this statement. It's possible to determine the "curl" of a polarization field with a differential operator on the sphere (act with edthbar twice and take the imaginary part). This curl operator determines how much B mode there is at a given location. If the Gibbs sampler were run on a zero-noise masked sky that contained no B modes at all, then I believe the resulting samples would only have "curl" or B-ness inside the mask, not everywhere.
The problem is the boundary integrals when you have an incomplete sky; you cannot take derivatives of the polarization field near the boundary unless it is band limited. The presence of small scale E-mode power (or noise) means you effectively cannot do it (see proofs and refs in astro-ph/0305545). This is why there are ambiguous modes for non-band-limited partial skies.
It's actually never singular, because CBl will always be strictly positive, due to noise. Remember that we are not trying to find the single "true" value, but rather a full distribution of possible values. So for a realization that truly has B=0, we will get a distribution that peaks at 0, and then drops off quickly at high values.
But what happens to the Blackwell-Rao estimator? This involves C^{-1}, and I might want to know the likelihood of a B=0 model.
One short comment: it is easy to compute the posterior mean of the E and B maps using a Gibbs code. These should be the least square optimal maps of E and B modes separately given the data.
Maybe something like "optimal separation" (in the same sense as maximum likelihood estimation) would be clearer?
Doing this sounds like a good idea, giving a Wiender filtered E/B separation given the model. Assuming everything works well I think this should be 'better', but less general, than doing direct mode separation. (less general because there might be some unexpected non-Gaussian B mode that doesn't fit the assumed Gaussian model at all well - direct mode separation makes no assumptions about the distribution of the modes; better because it includes modes with only very small E contamination - though this does mean it still isn't clean separation)
We are currently investigating other sampling schemes, and these will hopefully resolve this
I'd be very interested to know what non-Gibbs sampling methods you think might work. (Hobson also has some ideas, see his Sussex talk)

Hans Kristian Eriksen
Posts: 60
Joined: September 25 2004
Affiliation: ITA, University of Oslo
Contact:

Re: [astro-ph/0608007] Estimation of Polarized Power Spectra

Post by Hans Kristian Eriksen » August 04 2006

But what happens to the Blackwell-Rao estimator? This involves C^{-1}, and I might want to know the likelihood of a B=0 model.
Right. If you want to study B=0 models, I think the best solution is to start out the analysis with that constraint. It's very straightforward to enforce C_l^B = 0 everywhere, and sample just T and E. (The signal sampling step goes through just as before, while the C_l sampling step is reduced to a 2x2 matrix problem.) You don't get a full posterior that way, but rather a conditional one, P(C_l|d, C_l^B = 0). In fact, this has already been implemented in our codes.

Antony Lewis
Posts: 1941
Joined: September 23 2004
Affiliation: University of Sussex
Contact:

Re: [astro-ph/0608007] Estimation of Polarized Power Spectra

Post by Antony Lewis » September 12 2006

I think the following is a much better way to get sensible results for low C_B, and also a good way to resolve convergence issues at low signal to noise and get more robust Blackwell-Rao estimators (at least in toy problems like the one considered in this paper where the noise is close to uniform apart from a sky cut).

Basically: just add in the isotropic component of the noise with the signal. ie. define X_i = X + n_i, where X is the pure CMB sky and n_i is Gaussian isotropic noise with covariance N_i. The total noise covariance is N_i + N_a where N_a is not isotropic. Then sample from

[tex]
-2\log P(S,X_i|d) = (d-X_i)^\dagger N_a^{-1}(d-X_i) + X_i^\dagger(S+N_i)^{-1}X_i + \log|S+N_i|
[/tex]

using the standard method, where d is the data and S is the signal covariance. The distribution of S +N_i given X_i is now nice and broad, so Gibbs iterations converge quickly. Furthermore the Blackwell-Rao likelihoods are well behaved for small S. The only cost of doing this is that N_a is now more anisotropic, so conjugate gradient solutions are somewhat slower. Using someting like N_i = 0.9 min(N) should usually be a significant net help.

For previous discussion relating to parameter esimation see

http://cosmocoffee.info/viewtopic.php?t=137#324

Post Reply