MPI error while running CosmoMC
-
Priscilla Linda Larasati
- Posts: 2
- Joined: November 11 2014
- Affiliation: Bandung Institute of Technology
by Priscilla Linda Larasati » November 14 2014
Hello, when I tried to run CosmoMC, I got this error:
Code:
cilla@ubuntu:~$ source /home/cilla/Cilla/Programs/intel/bin/compilervars.sh intel64
cilla@ubuntu:~$ cd Cilla/workspace/cosmomc/
cilla@ubuntu:~/Cilla/workspace/cosmomc$ ./cosmomc test.ini
Number of MPI processes: 1
file_root:test
NOTE: use_CMB now set internally from likelihoods
Random seeds: 7510, 6531 rand_inst: 1
compile with CLIK to use clik - see Makefile
MpiStop: 0
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
with errorcode 807209920.
NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
I am using Ubuntu 14.04 LTS (64-bit) with ifort 2015, and I have installed Open MPI.
Does anyone know how to solve this problem?
Thank you in advance.
-
Jason Dossett
- Posts: 97
- Joined: March 19 2010
- Affiliation: The University of Texas at Dallas
by Jason Dossett » November 14 2014
It looks like you haven't installed the Planck likelihood code. If you just want to run a test without it, comment out the following lines in your test.ini file:
Code:
DEFAULT(batch1/CAMspec_defaults.ini)
DEFAULT(batch1/lowLike.ini)
DEFAULT(batch1/lowl.ini)
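For anyone who would rather script this step, here is a minimal sketch. It assumes that `#` starts a comment in CosmoMC .ini files and that the three lines appear in test.ini exactly as quoted above. `comment_out_planck` is a hypothetical helper name, not part of CosmoMC.

```shell
# Hypothetical helper: prefix the three Planck DEFAULT lines with '#'.
# Assumption: '#' marks a comment in CosmoMC .ini files.
comment_out_planck() {
    # $1 is the path to the .ini file; the original is kept as "$1.bak".
    sed -i.bak -E \
        's,^(DEFAULT\(batch1/(CAMspec_defaults|lowLike|lowl)\.ini\)),#\1,' "$1"
}
```

Running `comment_out_planck test.ini` from the cosmomc directory should leave every other line untouched. Removing the leading `#` (or restoring test.ini.bak) re-enables the Planck likelihoods once they are installed.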
-
Priscilla Linda Larasati
by Priscilla Linda Larasati » November 18 2014
Thank you for the reply. I ran cosmomc with those lines commented out, so without the Planck likelihood, and now it gives me this error:
Code:
cilla@ubuntu:~/Cilla/workspace/cosmomc$ mpirun -np 2 ./cosmomc test.ini
Number of MPI processes: 2
file_root:test
Random seeds: 26149, 3896 rand_inst: 1
Random seeds: 26248, 3896 rand_inst: 2
reading BAO data set: DR11CMASS
reading BAO data set: DR11CMASS
reading BAO data set: DR11LOWZ
Doing non-linear Pk: F
transfer kmax = 0.8000000
adding parameters for: DR11LOWZ
adding parameters for: DR11CMASS
Fast divided into 1 blocks
6 parameters ( 6 slow ( 2 semi-slow), 0 fast ( 0 semi-fast))
skipped unused params: aps100 aps143 aps217 acib143 acib217 asz143 psr cibr ncib cal0 cal2 xi aksz bm_1_1
1 Reading checkpoint from chains/test_1.chk
reading BAO data set: DR11LOWZ
skipped unused params: aps100 aps143 aps217 acib143 acib217 asz143 psr cibr ncib cal0 cal2 xi aksz bm_1_1
starting Monte-Carlo
Initialising BBN Helium data...
2 Reading checkpoint from chains/test_2.chk
Done. Interpolation table is 48 by 13
Initialising BBN Helium data...
Done. Interpolation table is 48 by 13
Reionization_zreFromOptDepth: Did not converge to optical depth
tau = 0.496454721692255 optical_depth = 0.523473177237487
50.0000000000000 49.9984741210938
MpiStop: 1
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 1 in communicator MPI_COMM_WORLD
with errorcode 0.
NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
forrtl: error (78): process killed (SIGTERM)
Image PC Routine Line Source
cosmomc 00000000006E7BD1 Unknown Unknown Unknown
cosmomc 00000000006E6327 Unknown Unknown Unknown
cosmomc 0000000000689F74 Unknown Unknown Unknown
cosmomc 0000000000689D86 Unknown Unknown Unknown
cosmomc 0000000000621AC4 Unknown Unknown Unknown
cosmomc 000000000062A77B Unknown Unknown Unknown
libpthread.so.0 00007F813E400340 Unknown Unknown Unknown
cosmomc 00000000006DAA7B Unknown Unknown Unknown
cosmomc 00000000005EE685 Unknown Unknown Unknown
cosmomc 00000000005DF663 Unknown Unknown Unknown
cosmomc 00000000005F158D Unknown Unknown Unknown
cosmomc 000000000058636E Unknown Unknown Unknown
cosmomc 00000000005D6195 Unknown Unknown Unknown
cosmomc 00000000004E59B3 Unknown Unknown Unknown
cosmomc 00000000004E3A3F Unknown Unknown Unknown
cosmomc 0000000000555AD5 Unknown Unknown Unknown
cosmomc 0000000000553219 Unknown Unknown Unknown
cosmomc 00000000004BED2B Unknown Unknown Unknown
cosmomc 00000000004C003F Unknown Unknown Unknown
cosmomc 00000000004C0ED5 Unknown Unknown Unknown
cosmomc 00000000004BF1AA Unknown Unknown Unknown
cosmomc 00000000004D88CD Unknown Unknown Unknown
cosmomc 000000000055D1E0 Unknown Unknown Unknown
cosmomc 000000000040B22E Unknown Unknown Unknown
libc.so.6 00007F813E04BEC5 Unknown Unknown Unknown
cosmomc 000000000040B139 Unknown Unknown Unknown
However, I have already installed the Planck likelihood, because I intend to use it in my work (also with cluster gas later).
Do you know why this is happening and how to fix it?
Thanks in advance.
-
Jason Dossett
by Jason Dossett » November 19 2014
That error usually comes from a bad parameter combination that causes the reionization module of CAMB to fail. Try changing the following in batch1/common_batch1.ini
to
You will still see lines like:
Code:
Reionization_zreFromOptDepth: Did not converge to optical depth
tau = 0.496454721692255 optical_depth = 0.523473177237487
50.0000000000000 49.9984741210938
but the job should still run just fine.
-Jason
-
Antony Lewis
- Posts: 1943
- Joined: September 23 2004
- Affiliation: University of Sussex
by Antony Lewis » November 19 2014
Or just put some sensible prior or data constraint on the optical depth.
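Before tightening anything, it may help to see where the optical depth is currently configured. This is a minimal sketch, assuming the parameter is named `tau` (as in the log output above) and that settings are written with the name in square brackets, for example `param[tau]`. `find_tau_settings` is a hypothetical helper name.

```shell
# Hypothetical helper: list every .ini line under a directory that
# mentions [tau], with file name and line number.
# Assumption: the optical-depth settings use the bracketed name "[tau]".
find_tau_settings() {
    # $1 is the directory to search, e.g. the cosmomc checkout.
    grep -rn --include='*.ini' -F '[tau]' "$1"
}
```

Running `find_tau_settings .` from the cosmomc directory should show which files set the allowed range, so the limit or prior can be edited in one place.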