CosmoMC: Scaling with OpenMP & MPI

Björn Sörgel
Posts: 13
Joined: April 05 2013
Affiliation: Institute of Astronomy/Kavli Institute for Cosmology Cambridge

CosmoMC: Scaling with OpenMP & MPI

Post by Björn Sörgel » September 18 2013

Dear all,
has anyone tested how the new CosmoMC scales with OpenMP and MPI? In general, is it more efficient to run more chains or to assign more CPUs to a small number of chains?
What would be the most efficient combination of MPI x OpenMP using up to 12 or 16 cores? 16 is the maximum for a single job on our cluster.
If I want to use more than 16 cores, I have to submit more than one job, and the different jobs cannot communicate via MPI. Does it make sense to submit several jobs and merge their chains "by hand"? Again, what would the optimal combination of MPI x OpenMP be?

Thanks a lot.
Cheers,
Bjoern

Antony Lewis
Posts: 1943
Joined: September 23 2004
Affiliation: University of Sussex

Re: CosmoMC: Scaling with OpenMP & MPI

Post by Antony Lewis » September 19 2013

It depends what you mean by efficient. In terms of numerical cost (i.e. energy or total dollars), it's better to use a small number of cores per chain and wait (most chains will still converge in well under a day with, e.g., 2-4 cores per chain). But it scales moderately well up to 8-16 cores per chain as long as the likelihoods are fast. Increasing the number of chains beyond a few does not speed up the burn-in of each chain much, and hence is inefficient when you go to large numbers. I usually recommend running 4-8 chains, each on 2-8 CPUs; there's rarely any point in generating more than 8 chains.

If 16 cores are available, I'd run 4 chains on 4 cores each if you are using likelihoods that are fast or parallelize well; this is a configuration that works well on many systems. If any likelihoods you are using are slow and do not parallelize well, then run 8 chains on 2 cores each.
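For the 4 chains x 4 threads case, the launch would look roughly like the sketch below (just an illustration: the mpirun options, the ./cosmomc binary path and the test.ini file name are placeholders that depend on your MPI installation and job script):

```python
import os
import subprocess

# Rough sketch: 4 MPI chains x 4 OpenMP threads each = 16 cores in total.
# CosmoMC runs one chain per MPI process; OMP_NUM_THREADS sets how many
# OpenMP threads each chain uses (mainly inside CAMB and the likelihoods).
chains = 4
threads_per_chain = 4

env = os.environ.copy()
env["OMP_NUM_THREADS"] = str(threads_per_chain)

# Placeholder paths: adjust "./cosmomc" and "test.ini" for your own setup,
# and use your scheduler's MPI launcher if it is not plain mpirun.
subprocess.check_call(
    ["mpirun", "-np", str(chains), "./cosmomc", "test.ini"],
    env=env,
)
```

On most clusters you would put the equivalent commands directly in your batch script rather than a Python wrapper; the point is simply one MPI process per chain, with OMP_NUM_THREADS cores per process.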

Björn Sörgel
Posts: 13
Joined: April 05 2013
Affiliation: Institute of Astronomy/Kavli Institute for Cosmology Cambridge

CosmoMC: Scaling with OpenMP & MPI

Post by Björn Sörgel » September 19 2013

Hi Antony,
thanks for the quick reply. Let me add one more question:
How much do the MPI features like MPI_Learn_Propose speed up the code? Or, to put it the other way round: is it a waste of CPU time to run, say, on 2 nodes with 2 chains per node, if MPI communication is only possible among chains on the same node?
At the moment I'd rather minimize wall-clock time, but of course without wasting a large amount of CPU time.

Thanks a lot.
Cheers,
Bjoern

Antony Lewis
Posts: 1943
Joined: September 23 2004
Affiliation: University of Sussex

Re: CosmoMC: Scaling with OpenMP & MPI

Post by Antony Lewis » September 20 2013

MPI options will help a lot unless you already have a good .covmat covariance file for your posterior. MPI should work between nodes in a cluster.
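If you already have chains from a similar run, you can also build a starting covariance matrix yourself. Below is a rough sketch (the chain file pattern, parameter names and burn-in cut are just placeholders for your own run; the output format shown is the one recent CosmoMC versions read, a "#" header of parameter names followed by the matrix):

```python
import glob
import numpy as np

# Rough sketch: estimate a proposal covariance from existing CosmoMC chains.
# Placeholder assumptions: chains are in chains/test_*.txt, the first third of
# each chain is dropped as burn-in, and the parameter order below matches the
# chain columns (column 0 = weight, column 1 = -log(like), then the parameters).
param_names = ["omegabh2", "omegach2", "theta", "tau", "ns", "logA"]

samples, weights = [], []
for fname in glob.glob("chains/test_*.txt"):
    data = np.loadtxt(fname)
    data = data[len(data) // 3:]           # crude burn-in removal
    weights.append(data[:, 0])
    samples.append(data[:, 2:2 + len(param_names)])

weights = np.concatenate(weights)
samples = np.concatenate(samples)
cov = np.cov(samples.T, aweights=weights)  # weighted parameter covariance

# Write in .covmat form: a "#" header of parameter names, then the matrix.
with open("test.covmat", "w") as f:
    f.write("# " + " ".join(param_names) + "\n")
    np.savetxt(f, cov)
```

You would then point propose_matrix in your .ini file at the resulting file.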
