CosmoMC -- MPI_TYPE_FREE - Invalid Datatype argument

Subharthi Ray
Posts: 1
Joined: April 14 2005
Affiliation: University of Kwazulu Natal, Durban

Post by Subharthi Ray » March 28 2007

Hi,

I frequently run into a problem where CosmoMC aborts after running for several hours. Has anyone else seen this, and can you suggest a remedy? We are running on an HP cluster of 64-bit 8x4 AlphaServer ES45 68/1250 systems at 1.25 GHz, with 8 GB of memory per node. The Fortran compiler is Digital/Compaq Fortran.

The last few lines of the screen output are copied below.

Best regards,
Subharthi


*******************************************************************************

2 rat: 0.3771705 in 7026 (M) best: 1780.353
0 rat: 0.3739511 in 10964 (M) best: 1780.384
3 rat: 0.3800676 in 7104 (M) best: 1780.495
1 rat: 0.3702122 in 7023 (M) best: 1780.352
4 rat: 0.3629092 in 6751 (M) best: 1780.360
Chain 6 MPI communicating
Chain 5 MPI communicating
Chain 3 MPI communicating
Chain 4 MPI communicating
Chain 2 MPI communicating
Chain 1 MPI communicating
[ 3] MPID Die - ump2chck.c:91 "ump_wait failure" (-1)
[ 4] MPID Die - ump2chck.c:91 "ump_wait failure" (-1)
[ 2] MPID Die - ump2chck.c:91 "ump_wait failure" (-1)
UMP_W_CLOSE event arrived while awaiting long dataUMP_W_CLOSE event arrived while awaiting long data[ 1] MPID Die - ump2chck.c:280 "ump_wait failure" (-1)
Current convergence R-1 = 1.1793140E-02 chain steps = 11032
param 1 lim err 0.1343304
param 1 lim err 0.1511654
param 2 lim err 0.1771570
param 2 lim err 0.1512259
param 3 lim err 2.4209348E-02
param 3 lim err 0.1409388
param 4 lim err 8.0031656E-02
param 4 lim err 0.1537758
param 5 lim err 0.1232639
param 5 lim err 0.1846910
param 6 lim err 9.1186695E-02
param 6 lim err 0.1527823
Current worst limit error = 0.1846910
for parameter 5 samps = 21612
Total time: 22286 ( 6.19063032892015 hours)
Slow proposals: 11032
0 - MPI_TYPE_FREE : Invalid datatype argument: datatype argument is not a valid datatype
Special bit pattern 20022c90 in datatype is incorrect. May indicate an
out-of-order argument or a deleted datatype
[0] Aborting program !
MPI process 1928472 exited with status 3
MPI process 3489773 exited with status 255
MPI process 3929590 exited with status 255
MPI process 1373436 exited with status 255
MPI process 2515532 exited with status 255

**********************************************************************************
