GetDist Error with disttest.ini

Victor Buza
Posts: 4
Joined: July 11 2013
Affiliation: Harvard, CfA

Post by Victor Buza » September 18 2013

Hi,

I am trying to work through the test example with CosmoMC/GetDist. After running "mpirun -np 8 ./cosmomc test.ini", I get the following output:

 Number of MPI processes:           8
 Random seeds:  7991, 26877 rand_inst:   6
 Random seeds:  7560, 26884 rand_inst:   1
 Using clik with likelihood file ./data/clik/CAMspec_v6.2TN_2013_02_26_dist.clik
 Random seeds:  8331, 26891 rand_inst:   8
 Random seeds:  8296, 26897 rand_inst:   7
 Random seeds:  7869, 26904 rand_inst:   2
 Random seeds:  8232, 26911 rand_inst:   5
 Random seeds:  8200, 26918 rand_inst:   4
 Random seeds:  8163, 26924 rand_inst:   3

 Using clik with likelihood file data/clik/CAMspec_v6.2TN_2013_02_26_dist.clik 
---- 
clik version 5887 MAKEFILE 
  CAMspec e61cec87-3a37-43ca-8ed1-edcfcaf5c00a 
Checking likelihood 'data/clik/CAMspec_v6.2TN_2013_02_26_dist.clik' on test data. got -3908.71 expected -3908.71 (diff -3.76099e-08) 
---- 
   TT from l=0 to l=        2500 
 Clik will run with the following nuisance parameters: 
 A_ps_100 
 A_ps_143 
 A_ps_217 
 A_cib_143 
 A_cib_217 
 A_sz 
 r_ps 
 r_cib 
 n_Dl_cib 
 cal_100 
 cal_217 
 xi_sz_cib 
 A_ksz 
 Bm_1_1 
 Using clik with likelihood file data/clik/commander_v4.1_lm49.clik 
---- 
clik version 5887 MAKEFILE 
  gibbs d462e865-e178-449a-ac29-5c16ab9b38f5 
Checking likelihood 'data/clik/commander_v4.1_lm49.clik' on test data. got 3.2784 expected 3.2784 (diff -2.55579e-10) 
---- 
   TT from l=0 to l=          49 
 Using clik with likelihood file data/clik/lowlike_v222.clik 
 Initializing Planck low-likelihood, version v2.1 
---- 
clik version 5887 MAKEFILE 
  lowlike "lowlike v222" 
Checking likelihood 'data/clik/lowlike_v222.clik' on test data. got -1007.04 expected -1007.04 (diff -1.97381e-05) 
---- 
   TT from l=0 to l=          32 
   EE from l=0 to l=          32 
   BB from l=0 to l=          32 
   TE from l=0 to l=          32 
 adding parameters for: lowlike_v222.clik 
 adding parameters for: commander_v4.1_lm49.clik 
 adding parameters for: CAMspec_v6.2TN_2013_02_26_dist.clik 
 WARNING: zero padding ext cls in LoadFiducialHighLTemplate 
 Computing tensors: F 
 Doing CMB lensing: T 
 Doing non-linear Pk:           0 
 lmax              = 6500 
 lmax_computed_cl  = 2500 
 max_eta_k         =    6625.00000000000      
 transfer kmax     =   0.800000011920929      
 Number of C_ls =    4 
 Fast divided into            1  blocks 
 Varying 20 parameters ( 6 slow ( 2 semi-slow), 14 fast ( 0 semi-fast)) 
 starting Monte-Carlo 
 Initialising BBN Helium data... 

I trimmed the repeated lines, since the run used 8 MPI processes. When I run with just 1 process, after "Initialising BBN Helium data..." I start getting lines that I don't see in the 8-process run, of the following form:

Initialising BBN Helium data...
 Chain:0 drag accpt:  0.4054054     fast/slow   44.40298     slow:          67
 Chain:0 drag accpt:  0.4316547     fast/slow   49.45312     slow:         128
 Chain:0 drag accpt:  0.4265403     fast/slow   50.03608     slow:         194
 Chain1, MPI done 'burn', Samples =243, like =    4905.653
 Time:    2036.12491989136      output lines=         101
 slow changes         156 semi-slow changes          66
 MPI_Min_Sample_Update          88         243
           1 all_burn done
           1 DoUpdates
 Chain:0 drag accpt:  0.4137931     fast/slow   50.20076     slow:         264

I wouldn't have paid attention to this, but after I let the 8-process run execute and then try to run

./getdist disttest.ini

I get the following error:

./getdist disttest.ini
 skipped unused params: omegak mnu nnu yhe Alens nrun r
 reading chains/test_1.txt
 reading chains/test_2.txt
 reading chains/test_3.txt
 reading chains/test_4.txt
 reading chains/test_5.txt
 reading chains/test_6.txt
 reading chains/test_7.txt
 reading chains/test_8.txt
 outlier fraction   9.0909094E-02
 Number of chains used =             8
 WARNING: Gelman-Rubin covariance not invertible
 RL: Not enough samples to estimate convergence stats
forrtl: info (58): format syntax error at or near f8.3,"  \Omega_b h^2")
forrtl: severe (62): syntax error in format, unit 40, file /n/home01/vbuza/cosmomc/cosmomc/test.converge
Image              PC                Routine            Line        Source
getdist            0000000000543E3E  Unknown               Unknown  Unknown
getdist            00000000005428D6  Unknown               Unknown  Unknown
getdist            00000000004F0462  Unknown               Unknown  Unknown
getdist            00000000004A12FC  Unknown               Unknown  Unknown
getdist            00000000004A081C  Unknown               Unknown  Unknown
getdist            00000000004E030B  Unknown               Unknown  Unknown
getdist            000000000046AA37  Unknown               Unknown  Unknown
getdist            000000000045178A  Unknown               Unknown  Unknown
getdist            000000000040A8DC  Unknown               Unknown  Unknown
libc.so.6          00000036D6E1ECDD  Unknown               Unknown  Unknown
getdist            000000000040A7D9  Unknown               Unknown  Unknown

Sometimes the "WARNING: Gelman-Rubin covariance not invertible" line changes to a statement about the value of R-1, but aside from that I get the same error. The problem seems similar to the one in this thread: http://cosmocoffee.info/viewtopic.php?t ... ht=getdist
Unfortunately, I don't understand the solution given there. Could you please let me know whether it's the same problem, or whether the difference between running 1 process and 8 processes matters here? I would appreciate any pointers.

Thank you in advance.

Björn Sörgel
Posts: 13
Joined: April 05 2013
Affiliation: Institute of Astronomy/Kavli Institute for Cosmology Cambridge

Post by Björn Sörgel » September 18 2013

Hi,
the error you have seems to me to be independent of whether you run 1 or 8 chains.
Try the solution by Antony in the other thread, following these steps:
1) find the problematic lines in source/GetDist.f90
2) replace them with the new code
3) rerun "make"

The "R-1" value or the "WARNING: Gelman-Rubin ..." message relates only to chain convergence. I assume that if you cancel a run after a short time, or analyze it right after starting, it won't be able to estimate the R-1 convergence criterion. The standard convergence criterion is R-1 = 0.02; when this is reached, the chains stop automatically.
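
To make the R-1 statement concrete, here is a rough Python sketch of a Gelman-Rubin style statistic for a single parameter: the variance of the chain means divided by the mean of the chain variances. This is a simplified illustration, not the code GetDist runs. It assumes unweighted samples and one parameter, whereas GetDist works with weighted samples and the covariance across parameters (which is where a "covariance not invertible" warning can arise). The function name is hypothetical.

```python
import numpy as np


def gelman_rubin_r_minus_1(chains):
    """Single-parameter R-1 sketch (hypothetical helper, not GetDist's code).

    chains: list of 1-D sequences, one per chain, holding unweighted
    samples of the same parameter.
    """
    if len(chains) < 2:
        # A between-chain variance cannot be formed from a single chain.
        raise ValueError("need at least two chains for a multi-chain test")
    if any(len(c) < 2 for c in chains):
        raise ValueError("each chain needs at least two samples")

    means = np.array([np.mean(c) for c in chains])
    variances = np.array([np.var(c, ddof=1) for c in chains])

    between = np.var(means, ddof=1)  # variance of the chain means
    within = np.mean(variances)      # mean of the chain variances
    return between / within
```

Chains that sample the same distribution give a value close to zero, while chains that still sit in different regions give a large one. With only a handful of samples per chain the estimate is very noisy, which fits the messages seen when GetDist is run shortly after the chains start.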

Cheers,
Bjoern

Victor Buza
Posts: 4
Joined: July 11 2013
Affiliation: Harvard, CfA

Post by Victor Buza » September 18 2013

Hi Bjorn,

Thanks for suggesting where to look. I checked the GetDist.f90 file, though, and the condition on maxoff is already set to > 0. As Antony suggested in that old post, the fix has been included in the new releases.

Would you have any other suggestions?

Antony Lewis
Posts: 1943
Joined: September 23 2004
Affiliation: University of Sussex

Post by Antony Lewis » September 19 2013

If you only run one chain it can't perform multi-chain convergence tests, hence you will not get R-1 errors. It doesn't mean that it is actually converging any better. Or perhaps you just didn't run either chain long enough to get a reasonable number of samples.
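
One quick way to check whether the chains hold a reasonable number of samples is to count the rows in each chain file before running GetDist. The sketch below assumes the usual CosmoMC plain-text chain layout, where each row is one accepted point and the first column is its weight (multiplicity). The helper name is hypothetical, and the file pattern simply matches the chains/test_N.txt names shown in the GetDist output above.

```python
import glob


def summarize_chain(lines):
    """Return (rows, total_weight) for an iterable of chain-file lines.

    Assumes whitespace-separated columns with the sample weight first.
    Blank lines are skipped.
    """
    rows = 0
    total_weight = 0.0
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        rows += 1
        total_weight += float(fields[0])
    return rows, total_weight


if __name__ == "__main__":
    # Assumed location and root name, taken from the log in this thread.
    for path in sorted(glob.glob("chains/test_*.txt")):
        with open(path) as handle:
            rows, weight = summarize_chain(handle)
        print(f"{path}: {rows} rows, total weight {weight:g}")
```

If each file has only a few tens of rows, the convergence statistics are unlikely to be meaningful yet, and letting the run continue for longer is the first thing to try.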
