[SOLVED] Segfault

Will Kinney
Posts: 15
Joined: September 14 2005
Affiliation: Univ. at Buffalo, SUNY

Post by Will Kinney » May 08 2013

OK, a search of the recent cosmomc threads doesn't turn up anything that looks like this. Running cosmomc using test.ini gives segfaults with malloc() errors indicating corrupted memory. Blech. Using intel/13.0, intel-mpi/4.1.0, mkl/10.3. Any thoughts?

Code:

[whkinney@k07n14:cosmomc]more pbs-cpi-bono.out 
Job 3975901.d15n41.ccr.buffalo.edu has requested 8 cores/processors per node.
working directory = /projects/whkinney/Planck_WHK/cosmomc
/util/intel/impi/4.1.0.024/intel64/bin/mpiexec
running mpdallexit on d16n07
LAUNCHED mpd on d16n07  via  
RUNNING: mpd on d16n07
LAUNCHED mpd on d16n06  via  d16n07
RUNNING: mpd on d16n06
d16n07
d16n06
 Number of MPI processes:                    16
 Random seeds:  1866,  5732 rand_inst:   2
 Random seeds:  3417,  5733 rand_inst:   4
 Random seeds:  3926,  5734 rand_inst:   3
 Random seeds:  3462,  5732 rand_inst:  11
 Random seeds:  4597,  5734 rand_inst:   5
 Random seeds:  4723,  5734 rand_inst:   6
 Random seeds:  3630,  5733 rand_inst:  10
 Random seeds:  3770,  5733 rand_inst:  12
 Random seeds:  4660,  5735 rand_inst:   1
 WMAP options (beam TE TT) T T T
 Using clik with likelihood file 
 /projects/whkinney/Planck_WHK/cosmomc/data/clik/CAMspec_v6.2TN_2013_02_26_dist.
 clik
 Random seeds:  4193,  5733 rand_inst:   9
 Random seeds:  5667,  5735 rand_inst:   8
 Random seeds:  5074,  5734 rand_inst:  16
 Random seeds:  6040,  5735 rand_inst:   7
 Random seeds:  5205,  5734 rand_inst:  14
 Random seeds:  5768,  5734 rand_inst:  15
 Random seeds:  5894,  5735 rand_inst:  13
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image              PC                Routine            Line        Source      
       
libclik.so         00002B5245C1A6D0  Unknown               Unknown  Unknown
libclik.so         00002B5245C1A9BD  Unknown               Unknown  Unknown
libclik.so         00002B5245C1A8F3  Unknown               Unknown  Unknown
libclik.so         00002B5245C0ADFE  Unknown               Unknown  Unknown
libclik_f90.so     00002B5244888D4F  Unknown               Unknown  Unknown
libclik_f90.so     00002B524488CE33  Unknown               Unknown  Unknown
cosmomc            000000000048C460  Unknown               Unknown  Unknown
cosmomc            000000000048AF45  Unknown               Unknown  Unknown
cosmomc            00000000004B189B  Unknown               Unknown  Unknown
cosmomc            00000000004D312B  Unknown               Unknown  Unknown
cosmomc            00000000004F9CB9  Unknown               Unknown  Unknown
cosmomc            000000000040E6CC  Unknown               Unknown  Unknown
libc.so.6          000000318201ECDD  Unknown               Unknown  Unknown
cosmomc            000000000040E5C9  Unknown               Unknown  Unknown
*** glibc detected *** ./cosmomc: malloc(): memory corruption: 0x0000000003ff8720 ***
*** glibc detected *** ./cosmomc: malloc(): memory corruption: 0x0000000003ff8720 ***
rank 0 in job 1  d16n07_35710   caused collective abort of all ranks
  exit status of rank 0: killed by signal 9 
All Done!

Post by Will Kinney » May 10 2013

We solved the problem by changing my MKL link line to use the "Single Dynamic Library" (mkl_rt) instead of a standard set of dynamically linked libraries.

Linking like this causes segfaults in the Planck likelihood:

Code:

#LAPACKL =   -L$(MKLROOT)/lib/intel64 -lmkl_intel_ilp64 -lmkl_sequential -lmkl_core -lpthread -lm

Linking like this does not segfault:

Code:

LAPACKL =  -L$(MKLROOT)/lib/intel64 $(MKLROOT)/lib/intel64/libmkl_lapack95_ilp64.a -lmkl_rt -lpthread -lm

Thanks to Azadeh Moradinezhad for figuring this out.
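
The thread does not say why the first link line fails, so after a change like this it can help to check which MKL libraries each binary actually resolves at load time. Below is a minimal sketch, assuming a Linux system with `ldd` and `grep`. The function names `filter_mkl` and `show_mkl_deps` are made up for illustration, and the files to check are whatever your own build and clik install produce.

```shell
#!/bin/sh
# Keep only the lines of `ldd` output (read from stdin) that mention an MKL library.
filter_mkl() {
    grep -i 'libmkl' || true   # finding no MKL line is not an error here
}

# Print the MKL shared libraries that one binary or shared object resolves at load time.
show_mkl_deps() {
    echo "== $1"
    ldd "$1" 2>/dev/null | filter_mkl
}

# Check every file given on the command line (does nothing if none are given).
for f in "$@"; do
    show_mkl_deps "$f"
done
```

Saved under a hypothetical name such as check_mkl.sh, it could be run as `sh check_mkl.sh ./cosmomc /path/to/libclik.so`. With the working link line above you would expect cosmomc to list libmkl_rt.so rather than the separate interface, threading, and core libraries. Comparing that against what libclik.so lists is one way to spot the two being linked against different MKL configurations.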
