Cosmomc running slow

Use of Healpix, camb, CLASS, cosmomc, compilers, etc.
Akhilesh Nautiyal(akhi)
Posts: 68
Joined: June 13 2007
Affiliation: Malaviya National Institute of Technology Jaipur

Cosmomc running slow

Post by Akhilesh Nautiyal(akhi) » July 05 2019

Hi,

I am running CosmoMC, downloaded recently from Git, on my workstation using gfortran-7 and mpich; the OS is CentOS. CosmoMC is taking a long time to finish.
I started it on 14th June using

Code: Select all

nohup mpiexec -np 4 ./cosmomc test_planck.ini >output.txt &
but it is still running and has only reached

Code: Select all

Current convergence R-1 =    2.83716731E-02  chain steps =       15365
 slow changes       13590  power changes           0
 updating proposal density
I have set the convergence stop to 0.01. This is the first time I am running it on this system, and I have left propose_matrix blank.
The configuration of my system is

Code: Select all

[akhilesh@cosmos CosmoMC-master]$ lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                32
On-line CPU(s) list:   0-31
Thread(s) per core:    2
Core(s) per socket:    16
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 85
Model name:            Intel(R) Xeon(R) Gold 6130 CPU @ 2.10GHz
Stepping:              4
CPU MHz:               2101.000
CPU max MHz:           2101.0000
CPU min MHz:           1000.0000
BogoMIPS:              4200.00
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              1024K
L3 cache:              22528K
NUMA node0 CPU(s):     0-31
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb cat_l3 cdp_l3 intel_ppin intel_pt ssbd mba ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm mpx rdt_a avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts pku ospke md_clear spec_ctrl intel_stibp flush_l1d
I have 32 GB RAM and 15 GB swap. I would be grateful if someone could let me know how to increase the speed and get results faster.

thanks,
akhilesh

Antony Lewis
Posts: 1485
Joined: September 23 2004
Affiliation: University of Sussex
Contact:

Re: Cosmomc running slow

Post by Antony Lewis » July 05 2019

Having a blank propose_matrix will certainly slow it down. Check that it is actually running the four chain processes at once, each using 4 OpenMP threads. Chains should run in 12-48 hours in most cases.
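A sketch of the relevant .ini line for the propose_matrix point, assuming one of the proposal covariance matrices shipped with CosmoMC in its planck_covmats directory roughly matches the parameters being varied (the filename below is illustrative; pick whichever covmat corresponds to your data combination):

```ini
# Illustrative: reuse a covariance matrix from a comparable earlier run
# so the proposal density starts out well-tuned instead of being learned
# from scratch during the run.
propose_matrix = planck_covmats/base_TT_lowl_lowE.covmat
```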

Akhilesh Nautiyal(akhi)
Posts: 68
Joined: June 13 2007
Affiliation: Malaviya National Institute of Technology Jaipur

Re: Cosmomc running slow

Post by Akhilesh Nautiyal(akhi) » July 08 2019

Thanks for the reply.

I think it is running four chains at a time. Here is the output of the top command.

Code: Select all

[akhilesh@cosmos CosmoMC-master]$ top

top - 11:29:10 up 23 days, 16:37,  2 users,  load average: 111.84, 115.10, 115.3
Tasks: 493 total,   5 running, 488 sleeping,   0 stopped,   0 zombie
%Cpu(s): 99.8 us,  0.2 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem : 32396640 total, 21606228 free,  5182228 used,  5608184 buff/cache
KiB Swap: 16318460 total, 16318460 free,        0 used. 26543620 avail Mem 

   PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND    
 24421 akhilesh  20   0 3193164 717864   6184 R  1082  2.2 271835:57 cosmomc    
 24419 akhilesh  20   0 3193164 720492   6168 R  1002  2.2 271755:50 cosmomc    
 24418 akhilesh  20   0 3193160 720660   6408 R 963.4  2.2 272629:29 cosmomc    
 24420 akhilesh  20   0 3193164 717584   6192 R 132.7  2.2 271897:46 cosmomc    
 10845 polkitd   20   0  723984  19512   5424 S   4.0  0.1   1123:44 polkitd    
 22944 akhilesh  20   0 6167772 246296  55768 S   2.0  0.8   9:31.69 gnome-she+ 
 11956 root      20   0  848188  67056  47708 S   1.0  0.2   1:33.40 X          
 10819 dbus      20   0   71904   6016   1940 S   0.7  0.0 126:42.78 dbus-daem+ 
163820 akhilesh  20   0  670940  29800  16556 S   0.7  0.1   0:00.70 gnome-ter+ 
 10837 root      20   0  396484   4264   3256 S   0.3  0.0  80:41.32 accounts-+ 
 84347 root      20   0       0      0      0 S   0.3  0.0   0:17.72 kworker/2+ 
144558 root      20   0       0      0      0 S   0.3  0.0   0:01.31 kworker/7+ 
147916 root      20   0       0      0      0 S   0.3  0.0   0:01.04 kworker/1+ 
163098 root      20   0       0      0      0 S   0.3  0.0   0:00.15 kworker/2+ 
165512 akhilesh  20   0  162376   2668   1596 R   0.3  0.0   0:00.23 top        
     1 root      20   0  196068   9236   4208 S   0.0  0.0   2:13.11 systemd    
     2 root      20   0       0      0      0 S   0.0  0.0   0:02.01 kthreadd   

Code: Select all

top - 11:29:56 up 23 days, 16:38,  2 users,  load average: 113.09, 115.00, 115.3
Tasks: 493 total,   5 running, 488 sleeping,   0 stopped,   0 zombie
%Cpu0  : 99.7 us,  0.3 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu1  :100.0 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu2  :100.0 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu3  :100.0 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu4  :100.0 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu5  : 99.7 us,  0.3 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu6  :100.0 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu7  : 99.3 us,  0.7 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu8  :100.0 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu9  :100.0 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu10 :100.0 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu11 :100.0 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu12 :100.0 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu13 :100.0 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu14 : 97.7 us,  2.3 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu15 :100.0 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu16 :100.0 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu17 :100.0 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu18 :100.0 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
%Cpu19 :100.0 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st  
I would be grateful if you could let me know how to check the number of threads.
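One way to check this, sketched here assuming a Linux `ps` that supports the `nlwp` (thread count) output field:

```shell
# List the thread count (NLWP = number of light-weight processes, i.e.
# threads) and CPU usage for every running cosmomc rank. Each of the
# four MPI processes should report roughly the same NLWP, equal to the
# OpenMP team size plus any MPI helper threads.
for pid in $(pgrep -x cosmomc); do
    ps -o pid=,nlwp=,pcpu=,comm= -p "$pid"
done
```

Alternatively, `top -H` shows each thread as a separate row, so four busy processes with several rows each indicates OpenMP is active.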

Antony Lewis
Posts: 1485
Joined: September 23 2004
Affiliation: University of Sussex
Contact:

Re: Cosmomc running slow

Post by Antony Lewis » July 08 2019

The "%CPU" should be about the same for all instances; check you have OMP_NUM_THREADS set appropriately (and possibly MPI placement options).
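A sketch of one way to set this up before launching, assuming 4 chains on the 16 physical cores reported by lscpu above (`-bind-to` is an MPICH/Hydra option; Open MPI spells it `--bind-to`):

```shell
# Give each of the 4 chains an equal share of the physical cores;
# hyperthreads rarely help CAMB's OpenMP loops, so budget on the
# 16 physical cores rather than the 32 logical CPUs.
CHAINS=4
PHYSICAL_CORES=16
export OMP_NUM_THREADS=$(( PHYSICAL_CORES / CHAINS ))
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS"

# Then launch with explicit placement so chains do not share cores:
# nohup mpiexec -np $CHAINS -bind-to numa ./cosmomc test_planck.ini > output.txt &
```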

Akhilesh Nautiyal(akhi)
Posts: 68
Joined: June 13 2007
Affiliation: Malaviya National Institute of Technology Jaipur

Re: Cosmomc running slow

Post by Akhilesh Nautiyal(akhi) » July 17 2019

I tried

Code: Select all

export OMP_NUM_THREADS=8
and set num_threads = 8 in the test .ini file. Then I executed the command

Code: Select all

nohup mpiexec -np 4 ./cosmomc test_planck.ini >output.txt &
but it still took 167.54036 hours. Here is the final output:

Code: Select all

 Chain           3  MPI communicating
 Chain           4  MPI communicating
 Chain           2  MPI communicating
 Chain           1  MPI communicating
 Current convergence R-1 =    7.68585829E-03  chain steps =       38404
 slow changes       33930  power changes           0
              omegabh2 lim err   0.050
              omegabh2 lim err   0.032
              omegach2 lim err   0.064
              omegach2 lim err   0.113
                 theta lim err   0.077
                 theta lim err   0.025
                   tau lim err   0.119
                   tau lim err   0.080
                  logA lim err   0.123
                  logA lim err   0.079
                    ns lim err   0.078
                    ns lim err   0.113
             calPlanck lim err   0.077
             calPlanck lim err   0.107
                BBdust lim err   0.026
                BBdust lim err   0.065
                BBsync lim err   0.007
                BBsync lim err   0.095
           BBalphadust lim err   0.009
           BBalphadust lim err   0.027
            BBbetadust lim err   0.038
            BBbetadust lim err   0.059
           BBalphasync lim err   0.015
           BBalphasync lim err   0.010
            BBbetasync lim err   0.024
            BBbetasync lim err   0.026
        BBdustsynccorr lim err   0.031
        BBdustsynccorr lim err   0.045
               acib217 lim err   0.043
               acib217 lim err   0.021
                    xi lim err   0.014
                    xi lim err   0.005
                asz143 lim err   0.019
                asz143 lim err   0.030
                aps100 lim err   0.064
                aps100 lim err   0.042
                aps143 lim err   0.029
                aps143 lim err   0.026
             aps143217 lim err   0.025
             aps143217 lim err   0.053
                aps217 lim err   0.033
                aps217 lim err   0.050
                  aksz lim err   0.008
                  aksz lim err   0.025
               kgal100 lim err   0.063
               kgal100 lim err   0.010
               kgal143 lim err   0.015
               kgal143 lim err   0.030
            kgal143217 lim err   0.016
            kgal143217 lim err   0.076
               kgal217 lim err   0.049
               kgal217 lim err   0.039
                  cal0 lim err   0.052
                  cal0 lim err   0.041
                  cal2 lim err   0.026
                  cal2 lim err   0.051
                DES_b1 lim err   0.015
                DES_b1 lim err   0.035
                DES_b2 lim err   0.037
                DES_b2 lim err   0.048
                DES_b3 lim err   0.062
                DES_b3 lim err   0.041
                DES_b4 lim err   0.033
                DES_b4 lim err   0.029
                DES_b5 lim err   0.021
                DES_b5 lim err   0.069
                DES_m1 lim err   0.060
                DES_m1 lim err   0.069
                DES_m2 lim err   0.031
                DES_m2 lim err   0.065
                DES_m3 lim err   0.049
                DES_m3 lim err   0.044
                DES_m4 lim err   0.016
                DES_m4 lim err   0.048
               DES_AIA lim err   0.022
               DES_AIA lim err   0.070
           DES_alphaIA lim err   0.010
           DES_alphaIA lim err   0.032
              DES_DzL1 lim err   0.045
              DES_DzL1 lim err   0.049
              DES_DzL2 lim err   0.051
              DES_DzL2 lim err   0.043
              DES_DzL3 lim err   0.027
              DES_DzL3 lim err   0.024
              DES_DzL4 lim err   0.047
              DES_DzL4 lim err   0.047
              DES_DzL5 lim err   0.017
              DES_DzL5 lim err   0.047
              DES_DzS1 lim err   0.034
              DES_DzS1 lim err   0.036
              DES_DzS2 lim err   0.042
              DES_DzS2 lim err   0.027
              DES_DzS3 lim err   0.019
              DES_DzS3 lim err   0.017
              DES_DzS4 lim err   0.036
              DES_DzS4 lim err   0.054
Current limit err = 0.1231 for logA; samps = 38404
 Requested limit convergence achieved
Total time:  603145  ( 167.54036 hours  )
  
It would be of great help if someone could suggest some further modifications.
Thanks,
akhilesh

Antony Lewis
Posts: 1485
Joined: September 23 2004
Affiliation: University of Sussex
Contact:

Re: Cosmomc running slow

Post by Antony Lewis » July 18 2019

Do you really want both DES and BICEP/Keck? The more nasty nuisance parameters you have, the longer it will take.

Akhilesh Nautiyal(akhi)
Posts: 68
Joined: June 13 2007
Affiliation: Malaviya National Institute of Technology Jaipur

Re: Cosmomc running slow

Post by Akhilesh Nautiyal(akhi) » July 31 2019

Thanks. Without DES and BK15, the run finished much faster:

Code: Select all

Current limit err = 0.1674 for r; samps = 22885
 Requested limit convergence achieved
Total time:  123061  (  34.18356 hours  )
  
