CosmoMC crashing with action=2
-
- Posts: 30
- Joined: September 25 2004
- Affiliation: Simon Fraser University
- Contact:
CosmoMC crashing with action=2
I wonder if anyone else has encountered the same issue when running CosmoMC with action=2 (version downloaded and installed on March 9th, 2020).
I am trying to get the best-fit theory Cls for vanilla LCDM with Planck18+lensing, so I set action=2 and use a *.ini file that calls the default batch3/common.ini. I run it on a cluster (Niagara on Compute Canada) using 4 nodes with 40 CPUs per node. It seems to run for 6-7 minutes with no issues, then it exits with a segmentation fault (exit code was 174). The mpi_output_* file generated by the cluster seems to suggest that the task was completed but then some problem caused the crash. But I could be wrong about this interpretation.
I tried running an older version of CosmoMC with action=2 using Planck2015, using a very similar *.ini file, and it completes fine in under 10 minutes, producing all the files as it should.
Any idea what could be causing the issue? I saw some posts about Seg Fault issues when running action=2 with varying neutrino masses, but nothing else. My neutrino mass is fixed at 0.06 eV.
I have asked a colleague to try an action=2 run using their independent installation of CosmoMC on a different cluster and it also crashed. This makes me think it could be a bug.
If anyone reading this has run CosmoMC successfully with action=2, would you please share your parameters.ini file for me to try?
Any help is much appreciated!
Levon
-
- Posts: 1944
- Joined: September 23 2004
- Affiliation: University of Sussex
- Contact:
Re: CosmoMC crashing with action=2
I've heard of this, and I think I've seen it myself with some ifort versions. Are you using ifort? Is it crashing on this%CP=P in camb/fortran/results.f90?
-
- Posts: 30
- Joined: September 25 2004
- Affiliation: Simon Fraser University
- Contact:
Re: CosmoMC crashing with action=2
Thanks, Antony!
Yes, I am using ifort, intel/2018.2.
This must be a dumb question, but how can I tell if it crashes on this%CP=P in results.f90? If it is crashing there, is there a way around it other than using a different compiler? The same cluster also has intel/2017.7 and intel/2018.1, but the only version of intelmpi is 2018.2. I'm also not sure if Planck18 would work with gfortran (I hear it does with gcc9?).
Any suggestions?
-
- Posts: 1944
- Joined: September 23 2004
- Affiliation: University of Sussex
- Contact:
Re: CosmoMC crashing with action=2
You can make and run cosmomc_debug to build a debug version, or do a code bisection to find where it crashed.
You can try using minimize_mcmc_refine_num=0 which I think avoids the problem code. Or use Cobaya.
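For anyone trying the same workaround, a minimal sketch of the relevant lines in the run's *.ini file might look like the following. Only the two parameter names come from this thread; everything else in the file (data sets, file_root, the include of batch3/common.ini) is assumed to stay as it was, and the override is assumed to be placed where it takes precedence over the value set in the included defaults.

```ini
# Sketch only: the two settings discussed in this thread.
# Run the minimizer to get the best-fit point.
action = 2
# Workaround suggested above; reported to avoid the code path that crashes with some ifort versions.
minimize_mcmc_refine_num = 0
```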
-
- Posts: 30
- Joined: September 25 2004
- Affiliation: Simon Fraser University
- Contact:
Re: CosmoMC crashing with action=2
OK, I'll give it a try, thanks.
-
- Posts: 30
- Joined: September 25 2004
- Affiliation: Simon Fraser University
- Contact:
Re: CosmoMC crashing with action=2
Setting minimize_mcmc_refine_num=0 resolved the issue!
Thanks again!
Levon