CosmoMC: incomplete checkpoint files

Geraint Harker
Posts: 3
Joined: March 15 2013
Affiliation: University College London

Post by Geraint Harker » March 15 2013

Is anyone else seeing incomplete checkpoint files in CosmoMC MPI runs? Some are written in full, but others simply end prematurely. This happens both when the job scheduler kills the job for hitting its time limit and when CosmoMC terminates normally because the requested number of samples has been obtained. It also happens for quite small test runs, and regardless of whether I write the checkpoints to home space or to fast, parallel scratch space.

When I examine the checkpoint files by hand, they appear to be written correctly right up to the point where they just end.
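
For anyone who wants to automate that by-hand check, here is a minimal sketch. It assumes the checkpoint is a Fortran sequential unformatted file in which each record is framed by a 4-byte little-endian length marker before and after the data, which is the usual default for ifort and gfortran but which I have not confirmed for CosmoMC's checkpoint files. The function name scan_fortran_records is made up for this illustration.

```python
import struct


def scan_fortran_records(path):
    """Count complete records and report whether the file ends mid-record.

    Assumption: sequential unformatted layout with 4-byte little-endian
    length markers around every record (not verified for CosmoMC).
    Returns a tuple (complete_records, truncated).
    """
    complete = 0
    with open(path, "rb") as f:
        while True:
            head = f.read(4)
            if not head:
                # Clean end of file: the last record was closed properly.
                return complete, False
            if len(head) < 4:
                # The file stops inside a leading length marker.
                return complete, True
            (nbytes,) = struct.unpack("<i", head)
            if nbytes < 0:
                # Negative markers (split records) are not handled here;
                # treat them as a sign that the assumed layout is wrong.
                return complete, True
            payload = f.read(nbytes)
            tail = f.read(4)
            if len(payload) < nbytes or len(tail) < 4 or tail != head:
                # Missing data or a missing/mismatched trailing marker.
                return complete, True
            complete += 1
```

Running this on each chain's checkpoint after a job ends would show how many records made it to disk and whether the last one was cut off, which may help to tell whether the truncation always falls at the same point in the write sequence.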

I'm running CosmoMC as a generic sampler with num_hard = 54.

I'm compiling with the Intel compiler from ICS 12.1.4 and OpenMPI 1.4. I've tried replacing the call to flush() in the FlushFile subroutine in utils.F90 with a call to the COMMITQQ function provided by ifort, which should behave like FLUSH but in a blocking mode. No improvement, though.
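
To make the flush-versus-commit distinction concrete, here is a rough Python analogue; it is not CosmoMC code and not something I have tried in the Fortran source. A plain flush only hands the runtime's buffer to the operating system, whereas os.fsync blocks until the OS reports the data as written to the device, which is roughly the role COMMITQQ is meant to play. The sketch also writes to a temporary file and renames it afterwards, so a reader never sees a half-written checkpoint. The function name and the ".tmp" suffix are hypothetical, and the rename is assumed to be atomic, which holds on POSIX file systems when both paths are on the same file system.

```python
import os


def write_checkpoint_atomically(path, payload):
    """Write payload (bytes) to path so that path is never left half-written.

    Illustration only: the name and the ".tmp" suffix are hypothetical.
    """
    tmp_path = path + ".tmp"
    with open(tmp_path, "wb") as f:
        f.write(payload)
        f.flush()             # push the runtime's buffer to the OS
        os.fsync(f.fileno())  # block until the OS reports the data on disk
    # Swap the finished file into place; assumed atomic on POSIX when
    # tmp_path and path are on the same file system.
    os.replace(tmp_path, path)
```

With this pattern a job killed mid-write leaves the previous complete checkpoint in place plus a stray temporary file, rather than a truncated checkpoint.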

Thanks in advance for any help!

Geraint
