CosmoMC segmentation fault

Akhilesh Nautiyal(akhi)
Posts: 72
Joined: June 13 2007
Affiliation: Malaviya National Institute of Technology Jaipur

Post by Akhilesh Nautiyal(akhi) » March 16 2024

Hello everyone,

I am trying to run CosmoMC on a virtual machine running Ubuntu 22.04 with gfortran 11.4.0. The MPI version is mpiexec (OpenRTE) 4.1.2.
CosmoMC compiles without any errors, but when I run it I get a segmentation fault. Running cosmomc_debug gives the following output:

Code:

   export OMP_NUM_THREADS=0
   nohup mpiexec -np 4 ./cosmomc_debug test_planck.ini >out.txt &
   
    Number of MPI processes:           4
 file_root:test
 Random seeds: 26005, 25593 rand_inst:   2
 Random seeds: 26105, 25593 rand_inst:   3
 Random seeds: 26205, 25593 rand_inst:   4
 Random seeds: 25905, 25593 rand_inst:   1
 Using clik with likelihood file ./data/clik_14.0/hi_l/plik/plik_rd12_HM_v22b_TTTEEE.clik
----
clik version plc_3.1
  smica
Checking likelihood './data/clik_14.0/hi_l/plik/plik_rd12_HM_v22b_TTTEEE.clik' on test data. got -1172.47 expected -1172.47 (diff -4.34056e-07)
----
   TT from l=0 to l=        2508
   EE from l=0 to l=        2508
   TE from l=0 to l=        2508
----
clik version plc_3.1
  gibbs_gauss b13c8fda-1837-41b5-ae2d-78d6b723fcf1
Checking likelihood './data/clik_14.0/low_l/commander/commander_dx12_v3_2_29.clik' on test data. got -11.6257 expected -11.6257 (diff -1.07424e-09)
----
   TT from l=0 to l=          29
Initializing SimAll
----
clik version plc_3.1
  simall simall_EE_BB_TE
Checking likelihood './data/clik_14.0/low_l/simall/simall_100x143_offlike5_EE_Aplanck_B.clik' on test data. got -197.99 expected -197.99 (diff -4.1778e-08)
----
   EE from l=0 to l=          29
----
clik version plc_3.1
  smica
----
clik version plc_3.1
  smica
Checking likelihood './data/clik_14.0/hi_l/plik/plik_rd12_HM_v22b_TTTEEE.clik' on test data. got -1172.47 expected -1172.47 (diff -4.34056e-07)
----
   TT from l=0 to l=        2508
   EE from l=0 to l=        2508
   TE from l=0 to l=        2508
----
clik version plc_3.1
  gibbs_gauss b13c8fda-1837-41b5-ae2d-78d6b723fcf1
Checking likelihood './data/clik_14.0/low_l/commander/commander_dx12_v3_2_29.clik' on test data. got -11.6257 expected -11.6257 (diff -1.07424e-09)
----
   TT from l=0 to l=          29
Checking likelihood './data/clik_14.0/hi_l/plik/plik_rd12_HM_v22b_TTTEEE.clik' on test data. got -1172.47 expected -1172.47 (diff -4.34056e-07)
----
   TT from l=0 to l=        2508
   EE from l=0 to l=        2508
   TE from l=0 to l=        2508
----
clik version plc_3.1
  gibbs_gauss b13c8fda-1837-41b5-ae2d-78d6b723fcf1
Checking likelihood './data/clik_14.0/low_l/commander/commander_dx12_v3_2_29.clik' on test data. got -11.6257 expected -11.6257 (diff -1.07424e-09)
----
   TT from l=0 to l=          29
Initializing SimAll
Initializing SimAll
----
clik version plc_3.1
  simall simall_EE_BB_TE
----
clik version plc_3.1
  simall simall_EE_BB_TE
Checking likelihood './data/clik_14.0/low_l/simall/simall_100x143_offlike5_EE_Aplanck_B.clik' on test data. got -197.99 expected -197.99 (diff -4.1778e-08)
----
   EE from l=0 to l=          29
Checking likelihood './data/clik_14.0/low_l/simall/simall_100x143_offlike5_EE_Aplanck_B.clik' on test data. got -197.99 expected -197.99 (diff -4.1778e-08)
----
   EE from l=0 to l=          29
----
clik version plc_3.1
  smica
Checking likelihood './data/clik_14.0/hi_l/plik/plik_rd12_HM_v22b_TTTEEE.clik' on test data. got -1172.47 expected -1172.47 (diff -4.34056e-07)
----
   TT from l=0 to l=        2508
   EE from l=0 to l=        2508
   TE from l=0 to l=        2508
 Clik will run with the following nuisance parameters:
 
A_cib_217^@
 cib_index^@
 xi_sz_cib^@
 A_sz^@
 ps_A_100_100^@
 ps_A_143_143^@
 ps_A_143_217^@
 ps_A_217_217^@
 ksz_norm^@
 gal545_A_100^@
 gal545_A_143^@
 gal545_A_143_217^@
 gal545_A_217^@
 galf_EE_A_100^@
 galf_EE_A_100_143^@
 galf_EE_A_100_217^@
 galf_EE_A_143^@
 galf_EE_A_143_217^@
 galf_EE_A_217^@
 galf_EE_index^@
 galf_TE_A_100^@
 galf_TE_A_100_143^@
 galf_TE_A_100_217^@
 galf_TE_A_143^@
 galf_TE_A_143_217^@
 galf_TE_A_217^@
 galf_TE_index^@
 A_cnoise_e2e_100_100_EE^@
 A_cnoise_e2e_143_143_EE^@
 A_cnoise_e2e_217_217_EE^@
 A_sbpx_100_100_TT^@
 A_sbpx_143_143_TT^@
 A_sbpx_143_217_TT^@
 A_sbpx_217_217_TT^@
 A_sbpx_100_100_EE^@
 A_sbpx_100_143_EE^@
 A_sbpx_100_217_EE^@
 A_sbpx_143_143_EE^@
 A_sbpx_143_217_EE^@
 A_sbpx_217_217_EE^@
 calib_100T^@
 calib_217T^@
 calib_100P^@
 calib_143P^@
 calib_217P^@
 A_pol^@
 A_planck^@
 Using clik with likelihood file ./data/clik_14.0/low_l/commander/commander_dx12_v3_2_29.clik
----
clik version plc_3.1
  gibbs_gauss b13c8fda-1837-41b5-ae2d-78d6b723fcf1
Checking likelihood './data/clik_14.0/low_l/commander/commander_dx12_v3_2_29.clik' on test data. got -11.6257 expected -11.6257 (diff -1.07424e-09)
----
   TT from l=0 to l=          29
 Clik will run with the following nuisance parameters:
 A_planck^@
 Using clik with likelihood file ./data/clik_14.0/low_l/simall/simall_100x143_offlike5_EE_Aplanck_B.clik
Initializing SimAll
----
clik version plc_3.1
  simall simall_EE_BB_TE
Checking likelihood './data/clik_14.0/low_l/simall/simall_100x143_offlike5_EE_Aplanck_B.clik' on test data. got -197.99 expected -197.99 (diff -4.1778e-08)
----
   EE from l=0 to l=          29
 Clik will run with the following nuisance parameters:
 A_planck^@
 read jla dataset data/Pantheon/full_long.dataset
 reading WL data set: DES_1YR_final
 read jla dataset data/Pantheon/full_long.dataset
 reading WL data set: DES_1YR_final
 read jla dataset data/Pantheon/full_long.dataset
 reading WL data set: DES_1YR_final
 read jla dataset data/Pantheon/full_long.dataset
 reading BAO data set: 6DF
 reading BAO data set: MGS
 reading BAO data set: DR12BAO
 reading WL data set: DES_1YR_final
 Doing non-linear Pk: T
 Doing CMB lensing: T
 Doing non-linear lensing: T
 TT lmax =  2508
 EE lmax =  2508
 ET lmax =  2508
 BB lmax =  2500
 PP lmax =  2500
 lmax_computed_cl  =  2508
 Computing tensors: F
 max_eta_k         =    14000.0000
 transfer kmax     =    10.1999998
 adding parameters for: smicadx12_Dec5_ftl_mv2_ndclpp_p_teb_consext8
 adding parameters for: 6DF
 adding parameters for: JLA
 adding parameters for: DR12BAO
 adding parameters for: MGS
 adding parameters for: commander_dx12_v3_2_29
 adding parameters for: simall_100x143_offlike5_EE_Aplanck_B
 adding parameters for: BK15_dust
 adding parameters for: plik_rd12_HM_v22b_TTTEEE
 adding parameters for: DES_1YR_final
 Fast divided into            3  blocks
 Block breaks at:           15          35
 54 parameters ( 7 slow ( 0 semi-slow), 47 fast ( 0 semi-fast))
 
 Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:
#0  0x14fb05f72960 in ???
#1  0x14fb05f71ac5 in ???
#2  0x14fb05c1551f in ???
        at ./signal/../sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c:0
#3  0x55685f96f47e in __fileutils_MOD_writeitemtxt
        at ../FileUtils.f90:1572
#4  0x55685f97029d in __fileutils_MOD_writeinlineitem
        at ../FileUtils.f90:926
#5  0x55685f970360 in __fileutils_MOD_writeinlineitems
        at ../FileUtils.f90:903
#6  0x55685f9702fa in __fileutils_MOD_writeitemstxt
        at ../FileUtils.f90:917
#7  0x55685f55ee56 in __paramnames_MOD_paramnames_writefile
        at /home/user/akhilesh/CosmoMC-master/source/ObjectParamNames.f90:398
#8  0x55685f5924c7 in __baseparameters_MOD_tbaseparameters_outputparamnames
        at /home/user/akhilesh/CosmoMC-master/source/BaseParameters.f90:264
#9  0x55685f77499b in cosmomc
        at /home/user/akhilesh/CosmoMC-master/source/driver.F90:210
#10  0x55685f775f8c in main
        at /home/user/akhilesh/CosmoMC-master/source/driver.F90:3
--------------------------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpiexec noticed that process rank 0 with PID 0 on node ubuntu204ltsserver exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------


   
     
I tried MPICH as well, but the same error occurred.
I also tried running the command

Code:

ulimit -s unlimited

before running cosmomc, but the error remains.

Can anyone please help me resolve this issue?

Thanks.

Antony Lewis
Posts: 1943
Joined: September 23 2004
Affiliation: University of Sussex

Re: CosmoMC segmentation fault

Post by Antony Lewis » March 18 2024

I don't know; it seems to be crashing in general-purpose code, which suggests earlier memory corruption (or possibly a compiler bug). If it works without clik, it may be a clik issue.

Post by Akhilesh Nautiyal(akhi) » March 18 2024

Dear Antony,

Thanks for the reply.
The issue remains even when running without the Planck likelihood code and without MPI.
Here is the output:

Code:

./cosmomc test.ini

Code:

file_root:test
 Random seeds:  8583, 29119 rand_inst:   0
 read jla dataset data/Pantheon/full_long.dataset
 reading BAO data set: 6DF
 reading BAO data set: MGS
 reading BAO data set: DR12BAO
 reading WL data set: DES_1YR_final
 Doing non-linear Pk: T
 Doing CMB lensing: T
 Doing non-linear lensing: T
 TT lmax =  2500
 EE lmax =  2500
 ET lmax =  2500
 BB lmax =  2500
 PP lmax =  2500
 lmax_computed_cl  =  2500
 Computing tensors: F
 max_eta_k         =    14000.0000    
 transfer kmax     =    10.1999998    
 adding parameters for: smicadx12_Dec5_ftl_mv2_ndclpp_p_teb_consext8
 adding parameters for: MGS
 adding parameters for: DR12BAO
 adding parameters for: 6DF
 adding parameters for: JLA
 adding parameters for: BK15_dust
 adding parameters for: DES_1YR_final
 Fast divided into            2  blocks
 Block breaks at:           15
 34 parameters ( 7 slow ( 0 semi-slow), 27 fast ( 0 semi-fast))

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:
#0  0x14fe3215a960 in ???
#1  0x14fe32159ac5 in ???
#2  0x14fe31dfd51f in ???
	at ./signal/../sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c:0
#3  0x5607159ec336 in __fileutils_MOD_writeitemtxt
	at ../FileUtils.f90:1572
#4  0x5607159ed155 in __fileutils_MOD_writeinlineitem
	at ../FileUtils.f90:926
#5  0x5607159ed218 in __fileutils_MOD_writeinlineitems
	at ../FileUtils.f90:903
#6  0x5607159ed1b2 in __fileutils_MOD_writeitemstxt
	at ../FileUtils.f90:917
#7  0x5607155e8d60 in __paramnames_MOD_paramnames_writefile
	at /home/user/akhilesh/CosmoMC-master/source/ObjectParamNames.f90:398
#8  0x56071561c27a in __baseparameters_MOD_tbaseparameters_outputparamnames
	at /home/user/akhilesh/CosmoMC-master/source/BaseParameters.f90:264
#9  0x5607157f1c76 in cosmomc
	at /home/user/akhilesh/CosmoMC-master/source/driver.F90:210
#10  0x5607157f31ac in main
	at /home/user/akhilesh/CosmoMC-master/source/driver.F90:3
Segmentation fault (core dumped)


Post by Antony Lewis » March 19 2024

You'll have to debug which string is causing the issue when it is written (or use Cobaya, or try another compiler).
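
As a starting point for that string check, here is a minimal Python sketch. It assumes the parameter names have first been dumped somewhere readable (for example with a temporary print statement just before the write at ObjectParamNames.f90:398, the frame shown in the backtrace). The function name `find_suspect_names` and the sample names are hypothetical and not part of CosmoMC. It flags names that are empty, padded with whitespace, or contain control characters such as NUL, which is what the `^@` after the clik nuisance parameter names in the first log usually denotes.

```python
def find_suspect_names(names):
    """Return (index, repr(name), reasons) for names that look malformed.

    A name is flagged if it is empty, contains ASCII control characters
    (including NUL), contains non-ASCII characters, or has leading or
    trailing whitespace.
    """
    suspects = []
    for i, name in enumerate(names):
        reasons = []
        if name == "":
            reasons.append("empty")
        if any(ord(c) < 32 or ord(c) == 127 for c in name):
            reasons.append("control character")
        if any(ord(c) > 127 for c in name):
            reasons.append("non-ASCII")
        if name != name.strip():
            reasons.append("leading/trailing whitespace")
        if reasons:
            # repr() makes invisible characters such as \x00 visible.
            suspects.append((i, repr(name), ", ".join(reasons)))
    return suspects


if __name__ == "__main__":
    # Hypothetical sample: one name carries a trailing NUL, as a C string
    # copied with its terminator would.
    sample = ["omegabh2", "A_planck\x00", "calib_100T"]
    for idx, shown, why in find_suspect_names(sample):
        print(idx, shown, why)
```

Since the crash also occurs in the run without clik, trailing NULs may not be the cause here; a check like this only narrows down which names are worth inspecting in a debugger.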

Post by Akhilesh Nautiyal(akhi) » March 20 2024

Dear Antony,

Thanks for the reply.

I will try that.
