cosmoMC with MPI error

Ana Vasile
Posts: 25
Joined: March 26 2006
Affiliation: Institute for Space Sciences

Post by Ana Vasile » June 22 2006

Hi!

I am running cosmoMC on 8 nodes and after a while I am getting this error message:

[cli_2]: aborting job:
Fatal error in MPI_Testall: Other MPI error, error stack:
MPI_Testall(237)..........................: MPI_Testall(count=7, req_array=0xb4e3ff38, flag=0xbff9ca70, status_array=0xbff9ca90) failed
MPIDI_CH3_Progress_test(102)..............: an error occurred while handling an event returned by MPIDU_Sock_Wait()
MPIDI_CH3I_Progress_handle_sock_event(422):
MPIDU_Socki_handle_read(649)..............: connection failure (set=0,sock=7,errno=104:(strerror() not found))
rank 7 in job 2 wn-1-2.spacescience.ro_34766 caused collective abort of all ranks
exit status of rank 7: killed by signal 9
rank 2 in job 2 wn-1-2.spacescience.ro_34766 caused collective abort of all ranks
exit status of rank 2: killed by signal 9


Could you please tell me what the problem is?
