[1003.3451] Primordial Non-Gaussianity and the NRAO VLA Sky Survey
Authors:  Jun-Qing Xia, Matteo Viel, Carlo Baccigalupi, Gianfranco De Zotti, Sabino Matarrese, Licia Verde 
Abstract:  The NRAO VLA Sky Survey (NVSS) is the only dataset that allows an accurate determination of the autocorrelation function (ACF) on angular scales of several degrees for Active Galactic Nuclei (AGNs) at typical redshifts $z \simeq 1$. Surprisingly, the ACF is found to be positive on such large scales while, in the framework of the standard hierarchical clustering scenario with Gaussian primordial perturbations, it should be negative for a redshift-independent effective halo mass of order of that found for optically-selected quasars. We show that a small primordial non-Gaussianity can add sufficient power on very large scales to account for the observed NVSS ACF. The best-fit value of the parameter $f_{\rm NL}$, quantifying the amplitude of primordial non-Gaussianity of local type, is $f_{\rm NL}=62 \pm 27$ ($1\,\sigma$ error bar) and $25<f_{\rm NL}<117$ ($2\,\sigma$ confidence level), corresponding to a detection of non-Gaussianity significant at the $\sim 3\,\sigma$ confidence level. The minimal halo mass of NVSS sources is found to be $M_{\rm min}=10^{12.47\pm0.26}h^{-1}M_{\odot}$ ($1\,\sigma$), strikingly close to that found for optically selected quasars. We discuss caveats and possible physical and systematic effects that can impact on the results. 

 Posts: 16
 Joined: November 06 2004
 Affiliation: CITA
 Contact:
[1003.3451] Primordial Non-Gaussianity and the NRAO VLA Sky
This paper claims to find ~3 sigma evidence for local-form non-Gaussianity
based on a measurement that the NVSS correlation function does not go to zero
by separations ~8 degrees, when it should have been consistent with zero by
~2-3 degrees for standard Gaussian LCDM (given their error bars).
My first question when seeing any correlation function that "does not go to
zero" is "doesn't any measured correlation function *need* to become
consistent with zero as you go to large separations, and why haven't the
authors shown me that theirs does??" In other words: you may or may not have
built in an integral constraint that forces your points to go negative
somewhere, but in any case you should have a term in the covariance
matrix corresponding to adding a constant to the correlation function, which
guarantees that any fit results will not be sensitive to a shift in the mean
of the survey (or scales much larger than the ones you think you can control).
It looks to me like the authors' measurement is essentially completely based on
this kind of constant term in the correlation function, i.e., if I am allowed
to add a constant to the Gaussian case, it doesn't look like it can be
distinguished from the non-Gaussian case. If the authors have some reason to
believe this is not a problem, it should be clarified in the paper.
That is a general observational consideration (you can't measure fluctuations
in the mean of your survey), but there is a related theoretical issue:
Naively, the constant contribution to the correlation function is
infinite in this model for non-Gaussianity (although they may be
leaving out the part of the calculation that gives that), so it is not clear
what their predictions that look like they are going flat at large separations
mean. There is an easy solution to both problems: marginalize over a free
constant added to the correlation function prediction. It seems pretty clear
though that they will find nothing of any significance when doing this.
This is really a general problem with the correlation function. Looking at it
at a given separation does not really correspond to what one intuitively has
in mind as a certain "scale", e.g., the value at a relatively small
separation is affected by fluctuations in something as large-scale as
the mean of the survey. Usually this doesn't matter, however, because one has
measurements showing that the correlation goes to zero on scales larger than
the ones of interest (to better precision than the changes of interest on
your scale), which implicitly constrain the kind of constant term you should
generally marginalize over.
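
As an illustration of what "marginalize over a free constant" means in a fit,
here is a minimal sketch in Python/NumPy. It assumes a Gaussian likelihood with
a known covariance matrix, and uses the standard result that adding
s^2 * 1 1^T to the covariance and letting s go to infinity projects the
constant mode out of the inverse covariance. The function name, the toy
power-law model and all numbers are hypothetical; none of them are NVSS data or
the authors' pipeline.

```python
import numpy as np


def chi2_marginalized_over_constant(data, model, cov):
    """Chi-square with a free additive constant analytically marginalized out."""
    resid = np.asarray(data, dtype=float) - np.asarray(model, dtype=float)
    ones = np.ones_like(resid)
    cinv = np.linalg.inv(cov)
    cinv_ones = cinv @ ones
    # Limit of (C + s^2 * 1 1^T)^{-1} as s -> infinity (Sherman-Morrison):
    # the direction "add the same constant to every bin" gets zero weight.
    cinv_marg = cinv - np.outer(cinv_ones, cinv_ones) / (ones @ cinv_ones)
    return float(resid @ cinv_marg @ resid)


# Toy example (illustrative numbers only): the "data" are the model plus a
# constant offset, mimicking a shift in the survey mean.
theta = np.linspace(1.0, 8.0, 8)                  # separations in degrees
cov = np.diag(np.full(theta.size, (2e-4) ** 2))   # assumed diagonal errors
model = 1e-3 * theta ** -0.8                      # hypothetical power law
data = model + 3e-4                               # same shape, shifted up

resid = data - model
chi2_plain = float(resid @ np.linalg.inv(cov) @ resid)
chi2_marg = chi2_marginalized_over_constant(data, model, cov)
print(chi2_plain, chi2_marg)
```

In this toy case the plain chi-square strongly disfavours the model, while the
marginalized one is consistent with zero: a fit done this way cannot tell a
constant offset from a genuinely different model, which is the sense in which
the result stops depending on the mean of the survey.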

 Posts: 1
 Joined: March 19 2010
 Affiliation: U. of Michigan
[1003.3451] Primordial Non-Gaussianity and the NRAO VLA Sky
The authors do claim to take into account the integral constraint.
At the end of Sec. 4 they state that "the correction proposed by Wands & Slozar (2009) to account for the infrared divergence of the non-Gaussian halo correlation function is fully negligible for our best-fit fnl value".
The Wands & Slozar correction is just the sample variance at the scale of the survey.
But, considering how large their best-fit fnl is and the volume of the survey, it is surprising that they find that the correction is negligible.

 Posts: 16
 Joined: November 06 2004
 Affiliation: CITA
 Contact:
[1003.3451] Primordial Non-Gaussianity and the NRAO VLA Sky
The basic problem doesn't really have anything to do with f_NL, and isn't really accessible to a calculation of corrections. It is basically just: if your correlation function is positive at every separation you say you trust, you necessarily have sensitivity to constant shifts coming from larger scale structure, whether it is real large-scale structure or just systematic errors in the measurement.

 Posts: 22
 Joined: January 02 2005
 Affiliation: SISSA, Italy
[1003.3451] Primordial Non-Gaussianity and the NRAO VLA Sky
Unfortunately, in the small space for a Letter we could not spend too
many words on reviewing the literature on the NVSS correlation
function in general.
Here's some additional information that may be useful to understand
the issue.
The auto-correlation function of radio sources, and of NVSS in particular,
has been studied exhaustively. References include:
Cress et al 1996, Cress & Kamionkowski 1998, Magliocchetti et al
1998, Blake and Wall 2002a (Mon.Not.Roy.Astron.Soc. 329 (2002) L37-L41),
Blake and Wall 2002b (Mon.Not.Roy.Astron.Soc. 337 (2002) 993),
Negrello et al 2006 (Mon.Not.Roy.Astron.Soc. 377:1557-1568, 2007) and
(Mon.Not.Roy.Astron.Soc. 368:935-942, 2006).
Since the first analyses, both the FIRST and NVSS surveys have shown that
their correlation function does not have the shape expected
in a standard LCDM model; one would have to play with the source
redshift distribution and biasing to make them match (for a head-to-head
comparison of FIRST and NVSS see Blake and Wall 2002b).
Blake and Wall 2002a present probably the first robust determination
of the NVSS correlation function. There they show that the
correlation function is well described by two power laws: at small
scales (separations of less than 0.1 degrees) one sees the effect of
the size distribution of the sources (these are not points at these
frequencies and these resolutions) and at larger scales one sees the
source clustering.
In our paper we exclude the small scales as we are not interested in
the source sizes.
The large scales, >0.1 deg and <10 deg (where "large" is still small
compared to the survey size), where one should see the clustering
properties, may be to some extent affected by systematic effects
involving non-uniform source sampling.
Blake & Wall 2002a, sec 2.3, discuss how the effect of
varying source density may enhance the correlation function. There
they quantify that the effect depends on the flux cut: it is very
small for sources >10mJy but increases rapidly below that flux cut.
We work at the high flux cut and anyway report what happens when we
apply a possible correction (the 10^{-4} mentioned). This is a
conservative choice, see sec 4.3 of Blake and Wall 2002b and 2.3 of
Blake and Wall 2002a.
The correlation function we measure is fully consistent with the
Blake & Wall one (see discussion at the end of page 4 where we
swapped our data and errors with the Blake & Wall ones).
The part of the correlation function least affected by possible
density gradients (arising from the difficulty in calibrating and
matching the different configurations of the array; see discussion in
Blake & Wall) is at separation <10 deg.
In order to zoom in on the interesting part of the correlation
function we use and show only scales up to 8 degrees separation.
We are throwing away data and information at larger separation,
but scales <10 deg are most reliable. The survey's large sky
coverage allows us to compute the correlation to much larger
separations. In fact the mean of the survey is computed from the full
survey but then only correlations at separations <9 deg are shown.
For an example of the correlation function at larger separations,
see fig 10 of Hernandez-Monteagudo 2009 (0909.4294),
concentrating on the blue/green symbols.
In addition to that, it may still be that non-zero fnl introduces extra
fluctuations (see discussion in Wands and Slosar) on the survey size. We
compute that this effect is fully subdominant compared to the non-uniform
source density effect and anyway does not change the results (that is
the meaning of the sentence "the correction proposed by Wands & Slozar
(2009) to account for the infrared divergence of the non-Gaussian halo
correlation function is fully negligible for our best-fit fnl value").
Indeed the theoretical variance on the scale of the survey is
5e-5 \times f_{NL}/f_{NL}(best fit), which is smaller than the effect due
to density gradients for all f_{NL}<100.
So to summarize: the correlation function does go to zero on large
(>10 deg) scales but, whatever we do, a) these are larger than the
scales of interest here, and b) it does not do so as sharply, or at
the scales predicted by LCDM (~2 degrees), for a biasing model for
the sources in agreement with that of their optical counterparts.
There are several ways out, each of which involves some "non-standard"
solution:
a) a bias model which decreases with z, based on a redshift-dependent
hosting halo mass (but then radio-loud and radio-quiet sources are not
the same beast); see papers by Massardi et al 2010 and refs therein.
b) primordial non-Gaussianity (but then you will have to be willing
to drop Gaussianity)
c) some other systematic error in the survey that we could not find
or think of
d) a combination of different effects (misestimate of off-diagonal
covariance terms + strange bias evolution + strange Mmin + bigger
correction due to source density gradients than quantified by Blake
and Wall), each of them making our error bars slightly
underestimated until the integrated effect of all gives an fnl
consistent with zero.
The literature so far had concentrated on option a), so we hope we
have opened the discussion about b).
It is surely early to draw any definitive conclusions, but still one
can conclude as we do that "our work should be seen as a “proof of
principle”, indicating that future surveys probing scales ~ 100
Mpc at substantial redshifts can put stringent constraints on
primordial non-Gaussianity".
I hope this answers your questions, but please ask if you need
further clarifications.

 Posts: 183
 Joined: September 24 2004
 Affiliation: Brookhaven National Laboratory
 Contact:
[1003.3451] Primordial Non-Gaussianity and the NRAO VLA Sky
I am still confused about some of the issues here. The correlation function must do more than just go to zero at large scales: it must integrate to zero, which means it must necessarily go below zero at large separations.
At what distance does your correlation function actually go through zero? Could we see xi(theta) theta^2 plotted for some large values of theta?
I also disagree with Pat on the importance of this effect. What one is doing is essentially enforcing the integral constraint by setting the mean n to the mean n of the survey, which is different from the mean n of the universe. This essentially ignores modes larger than the survey. But it doesn't matter for large surveys because the cosmic mean n is going to be pretty close to the measured mean n. But true, it is still safest to marginalise over it.
Also, in Wands and Slosar (for the definition of Slozar (note z), see http://www.urbandictionary.com/define.php?term=slozar ) we worked out how to calculate xi to be compared with observations, because xi, when calculated from P(k), is formally divergent for any r. So in that sense it is not a correction, because it takes you from infinity to a finite value. But I agree that this is not the only possible way to calculate theory to be compared with your observations and I tend to trust Matarrese and Verde....
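
The integral constraint described above can be checked numerically. The
following is a minimal sketch in Python/NumPy, assuming a toy pixelized map of
Poisson source counts (the map size, the "cosmic" mean of 50 and all variable
names are hypothetical, not taken from NVSS). When the overdensity is defined
with the survey's own mean, the sum of delta_i * delta_j over all pixel pairs,
which equals (sum of delta)^2, vanishes identically.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy pixelized source counts; the "true" cosmic mean per pixel is assumed
# to be 50 (hypothetical number, for illustration only).
nbar_true = 50.0
counts = rng.poisson(lam=nbar_true, size=(64, 64)).astype(float)

# Overdensity defined with the mean measured from the survey itself.
nbar_survey = counts.mean()
delta_survey = counts / nbar_survey - 1.0

# Overdensity defined with the (normally unknown) cosmic mean.
delta_true = counts / nbar_true - 1.0

# Sum over ALL pixel pairs (i == j included) of delta_i * delta_j.
pair_sum_survey = delta_survey.sum() ** 2
pair_sum_true = delta_true.sum() ** 2

print(pair_sum_survey)  # zero up to rounding: the integral constraint
print(pair_sum_true)    # generally non-zero: it carries the offset in the mean
```

Because the pair-weighted sum is forced to zero, a measured correlation that is
positive at small separations has to be compensated by negative values
somewhere else, and the difference between the two definitions is just the
offset between the survey mean and the cosmic mean.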

 Posts: 16
 Joined: November 06 2004
 Affiliation: CITA
 Contact:
[1003.3451] Primordial Non-Gaussianity and the NRAO VLA Sky
If you don't trust scales >R, it seems like the most generally correct thing
to do would be to high-pass filter both the theory and observations to remove
the power on scales >R. This would remove a constant from the correlation
function at separations <<R, and have a similar effect on fits as
marginalizing over a constant. This is basically what you were saying in
Wands & Slosar about the scale of the full survey, except I'm saying if you
don't trust the data on scales smaller than the full survey, you have to bring
that smoothing scale down to the scale where you are confident in the data.
It's all about the potential systematic errors - if you can't show a model
that fits the full correlation function, it becomes extremely important to
explain why it is that the part of the function you do fit is reliable, while
the rest isn't. It is hard to imagine a systematic error that affects the
correlation at 20 deg., yet does not enter smaller scales as even a
constant - maybe such a thing is possible, but it isn't what you get for
generic large-scale contamination.
The "possible correction" of Blake and Wall that they apply does remove the
statistical significance of the f_NL measurement (to 1.4 sigma), so it
becomes critical to know how "conservative" that is supposed to be (and if it
is really safe to assume any systematic is a constant, etc.).
Maybe all this can be explained, and just isn't as well as I'd like it to be
in the paper.
The broad point I was trying to make is that, to make this kind of measurement
where the signal amounts to very large scale power believable, it will be
completely critical to demonstrate that systematic errors taking the form of
large-scale power are under control. Just chopping a correlation function at
some separation is generally *not* the best way to separate
"clean, small scale" from "unclean, large-scale" - it is important to
understand the form of the systematics and how they change the correlation
function. We don't need a data-based proof that this can work in principle *if
there were no systematics* - that's what Fisher matrices are for :)
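
The statement that power on scales >R enters the correlation function at
separations <<R as a constant can be made explicit. The following is a sketch
assuming a full-sky Legendre expansion of the angular correlation function with
a sharp split at a multipole ell_R ~ pi/R; the symbols w, C_ell, ell_R and c_R
are illustrative notation, not taken from the paper.

```latex
w(\theta) = \sum_{\ell} \frac{2\ell+1}{4\pi}\, C_\ell\, P_\ell(\cos\theta)
          = \underbrace{\sum_{\ell \le \ell_R} \frac{2\ell+1}{4\pi}\, C_\ell\, P_\ell(\cos\theta)}_{\text{scales} > R}
          + \sum_{\ell > \ell_R} \frac{2\ell+1}{4\pi}\, C_\ell\, P_\ell(\cos\theta),
\qquad
P_\ell(\cos\theta) \simeq 1 - \frac{\ell(\ell+1)}{4}\,\theta^2 \quad (\ell\,\theta \ll 1)
\;\Rightarrow\;
\sum_{\ell \le \ell_R} \frac{2\ell+1}{4\pi}\, C_\ell\, P_\ell(\cos\theta)
\simeq \sum_{\ell \le \ell_R} \frac{2\ell+1}{4\pi}\, C_\ell \equiv c_R
\quad \text{for } \theta \ll R .
```

So any power at ell <= ell_R, real or systematic, shows up in the fitted range
as the single number c_R, up to corrections of order (ell_R theta)^2. That is
why filtering those modes out of both theory and data acts on a fit much like
marginalizing over a free constant.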