CosmoMC crashing with action=2

Levon Pogosian
Posts: 25
Joined: September 25 2004
Affiliation: Simon Fraser University

CosmoMC crashing with action=2

Post by Levon Pogosian » June 19 2020

I wonder if anyone else has encountered the same issue when running CosmoMC with action=2 (version downloaded and installed on March 9th, 2020).

I am trying to get the best-fit theory Cls for vanilla LCDM with Planck18+lensing, so I set action=2 and use a *.ini file that calls the default batch3/common.ini. I run it on a cluster (Niagara on Compute Canada) using 4 nodes with 40 CPUs per node. It seems to run for 6-7 minutes with no issues, then exits with a segmentation fault (exit code 174). The mpi_output_* file generated by the cluster seems to suggest that the task was completed but that some problem then caused the crash. I could be wrong in this interpretation, though.

I tried running an older version of CosmoMC with action=2 using Planck2015, using a very similar *.ini file, and it completes fine in under 10 minutes, producing all the files as it should.

Any idea what could be causing the issue? I saw some posts about Seg Fault issues when running action=2 with varying neutrino masses, but nothing else. My neutrino mass is fixed at 0.06 eV.

I have asked a colleague to try an action=2 run using their independent installation of CosmoMC on a different cluster, and it also crashed. This makes me think it could be a bug.

If anyone reading this has run CosmoMC successfully with action=2, would you please share your parameters.ini file for me to try?

Any help is much appreciated!

Levon

Antony Lewis
Posts: 1610
Joined: September 23 2004
Affiliation: University of Sussex

Re: CosmoMC crashing with action=2

Post by Antony Lewis » June 19 2020

I've heard this, and I think seen it myself with some ifort versions. Are you using ifort? Is it crashing on this%CP=P in camb/fortran/results.f90?


Re: CosmoMC crashing with action=2

Post by Levon Pogosian » June 19 2020

Thanks, Antony!

Yes, I am using ifort, intel/2018.2.

This must be a dumb question, but how can I tell if it crashes on this%CP=P in results.f90? If it is crashing there, is there a way around it other than using a different compiler? The same cluster also has intel/2017.7 and intel/2018.1, but the only version of intelmpi is 2018.2. I am also not sure whether Planck18 would work with gfortran (I hear it does with gcc9?).

Any suggestions?


Re: CosmoMC crashing with action=2

Post by Antony Lewis » June 21 2020

You can make and run cosmomc_debug to build a debug version, or do a code bisection to find where it crashed.

You can try using minimize_mcmc_refine_num=0 which I think avoids the problem code. Or use Cobaya.


Re: CosmoMC crashing with action=2

Post by Levon Pogosian » June 22 2020

OK, I'll give it a try, thanks.


Re: CosmoMC crashing with action=2

Post by Levon Pogosian » June 22 2020

Setting minimize_mcmc_refine_num=0 resolved the issue!

Thanks again!

Levon
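
For anyone landing here with the same crash, the change that worked above can be sketched as an ini fragment. This is only a minimal illustration, not the actual file used in this thread: the DEFAULT line points at the batch3/common.ini mentioned in the first post, and the data/likelihood includes for Planck18+lensing, file_root and all other run settings are omitted and would need to come from your own setup.

```ini
# Sketch only: the rest of the run configuration (likelihoods, file_root, etc.) is omitted.
DEFAULT(batch3/common.ini)

# Run the minimizer to find the best-fit point instead of sampling.
action = 2

# Workaround suggested in this thread; reported above to avoid the segmentation fault.
minimize_mcmc_refine_num = 0
```

With this setting the crash did not occur in the run reported above, which used ifort from intel/2018.2. Whether it helps with other compiler versions is not established in this thread.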
