CAMB: Running on 2 cores is not faster?

Savvas Nesseris
Posts: 70
Joined: April 05 2005
Affiliation: UAM/IFT

Post by Savvas Nesseris » October 17 2008

Hi,
I'm trying to run CAMB on 2 cores (under Cygwin with MPICH2), but I seem to have a problem. Although the task manager shows both cores sharing the load, CAMB takes the same amount of time to run as it does on one core.

The relevant parts of the makefile are (nothing else is changed):
#G95 compiler
F90C = g95
FFLAGS = -O2 -DMPI -mno-cygwin -I/cygdrive/c/MPICH2/include
LINKFFLAGS=-L/cygdrive/c/MPICH2/lib -lfmpich2g -lfmpe -lmpi

camb: $(CAMBOBJ) $(DRIVER)
	$(F90C) $(F90FLAGS) $(CAMBOBJ) $(DRIVER) $(LINKFFLAGS) -o $@

I have successfully run simple examples on both cores, e.g. one that calculates Pi=3.1415..., and confirmed that they take about half the time on two cores compared to one, so both g95 and MPICH2 must be working properly.

I'm obviously missing something... Any ideas?
Thanks

PS g95 does not need a switch like -openmp

Post by Savvas Nesseris » October 20 2008

First of all, I have to admit that posting this thread was a mistake on my part. Being a newbie to parallel programming, I didn't know the difference between OpenMP and MPI.

After a lot of searching and reading, the reason (now obvious to me) turned out to be this: CAMB is parallelized with OpenMP, which compilers enable through a switch such as -openmp or similar, while MPICH2 is one implementation of MPI (Open MPI is another). So building against MPICH2 does not make CAMB run any faster on 2 cores. The correct makefile entries for gfortran under Cygwin (at least, these are what worked for me) are:
#Gfortran compiler: if pre v4.3 add -D__GFORTRAN__
F90C = gfortran
FFLAGS = -O2 -fopenmp -D__GFORTRAN__
LINKFFLAGS= -lgomp -lpthread

camb: $(CAMBOBJ) $(DRIVER)
	$(F90C) $(F90FLAGS) $(CAMBOBJ) $(DRIVER) $(LINKFFLAGS) -o $@
Then, for 2 processors (dual core), run:
export OMP_NUM_THREADS=2
./camb ./params.ini

Now CAMB runs properly, and there is a substantial decrease in the amount of time it takes to run!
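
To put a rough number on the speed-up, something like the sketch below could be used. It is only an illustration: run_with_threads is a helper name made up here, the timing is in whole seconds (it relies on `date +%s`, which Cygwin and most Unix systems provide), and the comparison loop only runs if ./camb and ./params.ini are actually present in the current directory.

```shell
#!/bin/sh
# Hypothetical helper (not part of CAMB): run a command with a given
# OpenMP thread count and report the elapsed wall-clock time on stderr.
# Usage: run_with_threads N command [args...]
run_with_threads() {
    threads=$1
    shift
    start=$(date +%s)
    # env sets OMP_NUM_THREADS only for this one command.
    env OMP_NUM_THREADS="$threads" "$@"
    status=$?
    end=$(date +%s)
    echo "threads=$threads elapsed=$((end - start))s" >&2
    return $status
}

# Compare 1 and 2 threads, but only if the CAMB binary is really here.
if [ -x ./camb ] && [ -f ./params.ini ]; then
    for n in 1 2; do
        run_with_threads "$n" ./camb ./params.ini > /dev/null
    done
fi
```

If the OpenMP build is working, the elapsed time reported for 2 threads should come out noticeably lower than for 1; if the two are the same, the binary was probably built without -fopenmp.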

Finally, regarding my comment that g95 doesn't need an OpenMP switch such as -fopenmp: the reason is that g95 doesn't support OpenMP, and it seems it won't in the future:
http://groups.google.com/group/gg95/bro ... 55966ba9ea
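
For anyone comparing their own setup against the two makefiles in this thread, a quick check along these lines may help spot the same mistake. It is a sketch only: parallel_mode is a made-up helper, it looks just at the last uncommented FFLAGS line, and it assumes the OpenMP switch is spelled -fopenmp or -openmp (other compilers may use a different spelling).

```shell
#!/bin/sh
# Hypothetical helper (not part of CAMB): classify the FFLAGS line of a
# makefile. Prints "openmp", "mpi-only" or "serial".
# Usage: parallel_mode path/to/Makefile
parallel_mode() {
    # Take the last uncommented FFLAGS assignment, since a later
    # assignment overrides an earlier one in make.
    flags=$(grep -E '^[[:space:]]*FFLAGS[[:space:]]*=' "$1" | tail -n 1)
    case "$flags" in
        *-fopenmp*|*-openmp*) echo "openmp" ;;   # threads via OpenMP
        *-DMPI*)              echo "mpi-only" ;; # the setup from the first post
        *)                    echo "serial" ;;
    esac
}

# Only inspect a makefile if one is present in the current directory.
if [ -f ./Makefile ]; then
    parallel_mode ./Makefile
fi
```

With the g95 flags from the first post this prints "mpi-only", and with the gfortran flags from the second post it prints "openmp".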
