I have compiled CosmoMC as a generic MCMC sampler and am having a problem with checkpointing. Any single-chain job generates its .chk file without trouble, but whenever I run a job using MPI with more than one chain, no .chk files are ever generated. (I have tried anywhere from 2 to 144 chains; only single-chain jobs checkpoint.) Does anyone have advice on a workaround or a fix? Running shorter jobs is much to my advantage, so I need checkpointing to work for larger numbers of chains. Thanks!
-Gavin
CosmoMC checkpointing with MPI
- Posts: 1
- Joined: August 20 2013
- Affiliation: Michigan State University