SegFault in CosmoMC

Vinicius Miranda
Posts: 9
Joined: August 20 2014
Affiliation: Upenn

Post by Vinicius Miranda » October 13 2016

Dear Prof Lewis,

My name is Vinicius Miranda, and I am facing the following issue in CosmoMC:

forrtl: severe (173): A pointer passed to DEALLOCATE points to an object that cannot be deallocated
Image              PC                Routine            Line        Source             
cosmomc_debug      0000000000D702E2  Unknown               Unknown  Unknown
cosmomc_debug      00000000004749A4  objectlists_mp_fr         272  ObjectLists.f90
cosmomc_debug      0000000000483579  objectlists_mp_th         494  ObjectLists.f90
cosmomc_debug      00000000007192C3  samplecollector_m         302  SampleCollector.f90
cosmomc_debug      000000000072029A  samplecollector_m         446  SampleCollector.f90
cosmomc_debug      00000000006CF30D  montecarlo_mp_tch         148  MCMC.f90
cosmomc_debug      000000000072F763  generalsetup_mp_t         137  GeneralSetup.f90
cosmomc_debug      00000000009E945E  MAIN__                    292  driver.F90
cosmomc_debug      00000000004129DE  Unknown               Unknown  Unknown
libc.so.6          00007F0FB9FCDD5D  Unknown               Unknown  Unknown
cosmomc_debug      0000000000412869  Unknown               Unknown  Unknown

I am using the latest CosmoMC, modified to include extra reionization and inflation parameters, so it is possible that I introduced a bug somewhere that caused this. However, I have never faced a similar problem before, so I created this thread to ask for some guidance. My chains are very long and have a very large number of parameters; it took almost two weeks of continuous running for this bug to appear for the first time. Now, when I restart the same chain, the bug appears in a matter of minutes.

Non-linear lensing is off and the semi-fast sampler is on. The likelihood is the Planck 2015 unbinned TT only, at both high-l and low-l. Chains with other Planck likelihood choices, including polarization, have not had this problem, but they have not been running for as long. I can send more information if you need it.

Thanks a lot in advance.
Best Regards
Vinicius Miranda

Post by Vinicius Miranda » October 14 2016

The problem seems to be related to this piece of code in SampleCollector.f90, in the subroutine TMpiChainCollector_UpdateCovAndCheckConverge. All my chains failed at around the same execution time, which is consistent with them all reaching the condition Count > 500000. As a workaround I increased this number, but given that my chains will run for quite some time, at some point I will run out of memory.

 if (this%Samples%Count > 500000) then
   !Try not to blow memory by storing too many samples 
   call this%Samples%Thin(2)
   this%Mpi%MPI_thin_fac = this%Mpi%MPI_thin_fac*2
 end if
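
For readers unfamiliar with this pattern, here is a minimal Python sketch of a "thin by 2 and double the thinning factor" sample store. It is not CosmoMC's implementation: the function name collect, the list-based storage, and the assumption that new samples are only stored every thin_fac steps after thinning are all illustrative.

```python
def collect(n_steps, cap=500_000):
    """Simulate a sample store that thins itself by 2 whenever it exceeds cap.

    Hypothetical illustration only: each "sample" is just the step index.
    """
    samples = []
    thin_fac = 1  # plays the role of MPI_thin_fac in the excerpt above
    for step in range(n_steps):
        # Assumption: after thinning, only every thin_fac-th step is stored,
        # so the stored samples stay evenly spaced.
        if step % thin_fac == 0:
            samples.append(step)
        if len(samples) > cap:
            samples = samples[::2]  # keep every second stored sample
            thin_fac *= 2
    return samples, thin_fac
```

With a small cap, for example collect(100, cap=8), the store never holds more than cap + 1 entries at any moment, and the surviving samples are all multiples of the final thin_fac. The memory used is therefore bounded regardless of chain length, at the cost of a coarser chain.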

Antony Lewis
Posts: 1659
Joined: September 23 2004
Affiliation: University of Sussex

Post by Antony Lewis » October 15 2016

This may be related to some ifort quirks, e.g.

https://software.intel.com/en-us/forums ... pic/390944

(Note that you should rarely need that many samples; usually there is an issue if the chain doesn't converge with many fewer.)

Post by Vinicius Miranda » October 15 2016

Thank you, Antony.

There is a good reason why I need that many samples: it is related to the Principal Components method I used here (https://arxiv.org/abs/1609.04788) and here (https://arxiv.org/abs/1411.5956).

So the answer is that we need Intel to fix the bug?

Post by Antony Lewis » October 15 2016

You can just increase the 500000 number. In practice you're not likely to run out of memory.
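
As a rough back-of-the-envelope check on the memory point, the sketch below estimates the size of the stored samples. The figures are assumptions, not taken from CosmoMC: 8 bytes per stored value (double precision), a hypothetical 100 values stored per sample, and no allowance for any per-sample overhead.

```python
def sample_store_bytes(n_samples, n_params, bytes_per_value=8):
    """Approximate bytes needed to hold n_samples rows of n_params values."""
    return n_samples * n_params * bytes_per_value


# Hypothetical parameter count; substitute the real number for your chains.
N_PARAMS = 100

for n_samples in (500_000, 5_000_000):
    gib = sample_store_bytes(n_samples, N_PARAMS) / 2**30
    print(f"{n_samples:>9} samples x {N_PARAMS} values: about {gib:.2f} GiB")
```

Under these assumptions, raising the threshold tenfold keeps the store at a few GiB per chain, which is why a moderately larger limit is unlikely to exhaust memory on a typical cluster node.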

Post by Vinicius Miranda » October 17 2016

Thank you.