Which Criterion of convergence on a MPI run of cosmoMC to end up ?

Use of Cobaya. camb, CLASS, cosmomc, compilers, etc.
Post Reply
Dournac Fabien
Posts: 65
Joined: May 18 2019
Affiliation: IRAP
Contact:

Which Criterion of convergence on a MPI run of cosmoMC to end up ?

Post by Dournac Fabien » February 25 2022

Hello,

I have an issue with a MPI run of cosmomc which seems to not end up.

1) I have at work a 64 AMD cores/128 htreads and a cosmoc compiled with Intel and qopenmp Intel : with this configuration, which is, from your point of view, the best compromise to apply between the number of MPI processses tha I can launch while not penalizing the OpenMP needed by each call of CAMB.

For the moment, I did :

Code: Select all

$ nohup mpirun -np 32 ./cosmomc des_lensing_curvature.ini &
with setting OMP_NUM_THREADS=128

Here the params*.ini files I have used (it is about DES survey in order plot Omega_k vs HO or HO vs Omega_m) :

des_lensing_curvature :

Code: Select all

#general settings, test file without require Planck likelihood code to be installed.
#see test_planck.ini for test file including Planck clik likelihoods

#DES 1-yr joint
DEFAULT(batch3/DES_lensing.ini)

#Planck 2018 lensing (native code, does not require Planck likelihood code)
DEFAULT(batch3/lensing.ini)

#general settings
DEFAULT(batch3/common.ini)

param[omegak] = 0. -0.5 0.5 0.01 0.01

#high for new runs, to start learning new proposal matrix quickly
MPI_Max_R_ProposeUpdate = 30

propose_matrix= planck_covmats/base_TT_lensing_lowE_lowl_plikHM.covmat

#Folder where files (chains, checkpoints, etc.) are stored
root_dir = chains_des_lensing_curvature/

#Root name for files produced
file_root=des_lensing_omegak
#action= 0 runs chains, 1 importance samples, 2 minimizes
#use action=4 just to quickly test likelihoods
action = 0

#Don't need to set this if OMP_NUM_THREADS set appropriately
num_threads = 0

#if you want to get theory cl for test point
#test_output_root = output_cl_root

start_at_bestfit =F
feedback=1
use_fast_slow = T

#turn on checkpoint for real runs where you want to be able to continue them
checkpoint = T

#sampling_method=7 is a new fast-slow scheme good for Planck
sampling_method = 7
dragging_steps  = 3
propose_scale = 2

#Set >0 to make data files for importance sampling
indep_sample=10

#these are just small speedups for testing
get_sigma8=T
You can see whe have inside the file :

Code: Select all

batch3/DES_lensing.ini
:

Code: Select all

DEFAULT(DES.ini)
wl_dataset[DES,used_data_types]=xip xim

wl_use_weyl = T

param[DES_DzL1] = 0
param[DES_DzL2] = 0
param[DES_DzL3] = 0
param[DES_DzL4] = 0
param[DES_DzL5] = 0
prior[DES_DzL1] =
prior[DES_DzL2] =
prior[DES_DzL3] =
prior[DES_DzL4] =
prior[DES_DzL5] =

param[DES_b1] = 1
param[DES_b2] = 1
param[DES_b3] = 1
param[DES_b4] = 1
param[DES_b5] = 1
and also the file :

Code: Select all

batch3/lensing.ini
:

Code: Select all

cmb_dataset[lensing]=  %DATASETDIR%planck_lensing_2018/smicadx12_Dec5_ftl_mv2_ndclpp_p_teb_consext8.dataset
cmb_dataset_speed[lensing] =-1

use_nonlinear_lensing = T

##block_semi_fast = F

#avoid really ruled out models
H0_min=40

DEFAULT(planck_calibration.ini)
and finally the file

Code: Select all

batch3/common.ini
:

Code: Select all

##Sample file of common parameters for baseline Planck set of runs

batch_name = batch3

local_dir = %LOCALDIR%
#directory, e.g. window functions in directory windows under data_dir
data_dir = %LOCALDIR%data/

INCLUDE(likelihood.ini)
INCLUDE(params_CMB_defaults.ini)

#whether to include prior on a parameter if it has non-varying value
include_fixed_parameter_priors = F

#Feedback level ( 2=lots,1=chatty,0=none)
feedback = 1

#Force computation of sigma_8 even if use_mpk = F
get_sigma8 = T

#This only has a small effect at very high L
use_nonlinear_lensing = T
#if using non-linear lensing, better turn of power spectrum fast/slow since is now non-linear
block_semi_fast = F

#Temperature at which to Monte-Carlo
temperature = 1

#Maximum number of chain steps
samples = 4000000

#Scale of proposal relative to covariance; 2.4 is recommended by astro-ph/0405462 for Gaussians
#If propose_matrix is much broader than the new distribution, make proportionately smaller
#Generally make smaller if your acceptance rate is too low
propose_scale = 1.9

#Increase to oversample fast parameters more, e.g. if space is odd shape
oversample_fast = 1

#if non-zero number of steps between sample info dumped to file file_root.data
#WANT THIS ON so we can do importance sampling runs quickly later for likelihood updates
#Off to save lots of disk space
indep_sample = 10

#number of samples to disgard at start; usually set to zero and remove later
burn_in = 0

#If zero set automatically
num_threads = 0

#MPI mode multi-chain options (recommended)
#MPI_Converge_Stop is a (variance of chain means)/(mean of variances) parameter that can be used to stop the chains
#Set to a negative number not to use this feature. Does not guarantee good accuracy of confidence limits.
MPI_Converge_Stop = 0.01


#Do initial period of slice sampling; may be good idea if 
#cov matrix or widths are likely to be very poor estimates
MPI_StartSliceSampling  = F

#Can optionally also check for convergence of confidence limits (after MPI_Converge_Stop reached)
#Can be good idea as small value of MPI_Converge_Stop does not (necessarily) imply good exploration of tails
MPI_Check_Limit_Converge = T

#if MPI_Check_Limit_Converge = T, give tail fraction to check (checks both tails):
MPI_Limit_Converge = 0.025
#permitted quantile chain variance in units of the standard deviation (small values v slow):
MPI_Limit_Converge_Err = 0.2
#which parameters tails to check. If zero, check all parameters:
MPI_Limit_Param = 0

#if MPI_LearnPropose = T, the proposal density is continally updated from the covariance of samples so far (since burn in)
MPI_LearnPropose = T
#can set a value of converge at which to stop updating covariance (so that it becomes rigorously Markovian)
#e.g. MPI_R_StopProposeUpdate = 0.4 will stop updating when (variance of chain means)/(mean of variances) < 0.4
MPI_R_StopProposeUpdate = 0

#If have covmat, R to reach before updating proposal density (increase if covmat likely to be poor)
#Only used if not varying new parameters that are fixed in covmat
MPI_Max_R_ProposeUpdate = 3
#As above, but used if varying new parameters that were fixed in covmat
MPI_Max_R_ProposeUpdateNew = 50

#Initial power spectrum amplitude pivots (Mpc^{-1})
#if tensor_pivot_k/=pivot_k then r is defined so to that P_t(tensor_pivot_k)=r P_s(tensor_pivot_k)
pivot_k = 0.05
#tensor_pivot_k defaults to same as pivot_k
#tensor_pivot_k = 0.05

#Whether the CMB should be lensed (slows a lot unless also computing matter power)
CMB_lensing = T
accuracy_level = 1

high_accuracy_default = T

#1: Simple Metropolis, 2: slice sampling, 3: slice sampling fast parameters, 4: directional gridding
#7 is new dragging method
sampling_method = 7

dragging_steps  = 3
use_fast_slow = T

##Rest are fairly irrelevant


#if sampling_method =4, iterations per gridded direction
directional_grid_steps = 20

#action = 0:  MCMC, action=1: postprocess .data file, action=2: find best fit point only
action = 0


#If propose_matrix is blank (first run), can try to use numerical Hessian to 
#estimate a good propose matrix. As a byproduct you also get an approx best fit point
estimate_propose_matrix = F

#when estimating best fit point (action=2 or estimate_propose_matrix), 
#required relative accuracy of each parameter in units of the covariance width
max_like_radius = 0.05
max_like_iterations = 5000
minimization_points_factor = 2
minimize_loglike_tolerance = 0.05
minimize_separate_fast = T
#if non-zero do some low temperature MCMC steps to check minimum stable
minimize_mcmc_refine_num = 20
minimize_refine_temp = 0.01
minimize_temp_scale_factor = 5
minimize_random_start_pos = T

#max_like_radius = 0.002
#max_like_iterations = 40000
#minimization_points_factor = 6
#minimize_loglike_tolerance=0.05

#if blank this is set from system clock
rand_seed = 

#If true, generate checkpoint files and terminated runs can be restarted using exactly the same command
#and chains continued from where they stopped
#With checkpoint=T note you must delete all chains/file_root.* files if you want new chains with an old file_root
checkpoint = T

#whether to stop on CAMB error, or continue ignoring point
stop_on_error=  T

#If action = 1
redo_likelihoods = T
redo_theory = F
redo_cls = F
redo_pk = F
redo_skip = 0
redo_outroot = 
redo_thin = 1
redo_add = F
redo_from_text = F
#If large difference in log likelihoods may need to offset to give sensible weights
#for exp(difference in likelihoods)
redo_likeoffset =  0
Unfortunatley, by launching this MCMC run (

Code: Select all

$ mpirun -np 32 ./cosmomc des_lensing_curvature.ini
, the run never stops.

So I conclude the criterion of convergence is never reached.

Maybe my prior on omegak is too large ( see the line

Code: Select all

param[omegak] = 0. -0.5 0.5 0.01 0.01
).

I attach the nohup.log associated to my last run : here is the repeated message which appears about tau reionization :

Code: Select all

TTanhReionization_zreFromOptDepth: Did not converge to optical depth
 tau =  0.404508363477996      optical_depth =   0.407329941538398
   33.3251953125000        33.3190917968750
 (If running a chain, have you put a constraint on tau?)
and I can't understand why I have this message : have I got to put a prior on "
tau
" or "
optical_depth
" ?

2) From a general point of view, among all these parameters in these files, which is the main criterion parameter which controls all the ending up of the MCMC run ? There are so many parameters that may change things.

Any clarification is welcome.

Best regards
Attachments
nohup.dat.gz
Log Results of my MCMC run with des_lensing_curvature.ini
(14.5 KiB) Downloaded 13 times

Antony Lewis
Posts: 1765
Joined: September 23 2004
Affiliation: University of Sussex
Contact:

Re: Which Criterion of convergence on a MPI run of cosmoMC to end up ?

Post by Antony Lewis » February 25 2022

You should fix tau if not using CMB data. Usually not much point running many more than 8 chains. This case is no doubt highly degenerate.

OMP_NUM_THREADS should be the number of cores divided by the number of mpi processes.

Dournac Fabien
Posts: 65
Joined: May 18 2019
Affiliation: IRAP
Contact:

Re: Which Criterion of convergence on a MPI run of cosmoMC to end up ?

Post by Dournac Fabien » February 26 2022

Thanks Antony,

Which line to add to fix tau in the file
des_curvature.ini
below :

Code: Select all

#general settings, test file without require Planck likelihood code to be installed.
#see test_planck.ini for test file including Planck clik likelihoods

#DES 1-yr joint
DEFAULT(batch3/DES_lensing.ini)

#general settings
DEFAULT(batch3/common.ini)

param[omegak] = 0. -0.5 0.5 0.01 0.01

#high for new runs, to start learning new proposal matrix quickly
MPI_Max_R_ProposeUpdate = 30

propose_matrix= planck_covmats/base_TT_lensing_lowE_lowl_plikHM.covmat

#Folder where files (chains, checkpoints, etc.) are stored
root_dir = chains/

#Root name for files produced
file_root=des_omegak
#action= 0 runs chains, 1 importance samples, 2 minimizes
#use action=4 just to quickly test likelihoods
action = 0

#Don't need to set this if OMP_NUM_THREADS set appropriately
num_threads = 0

#if you want to get theory cl for test point
#test_output_root = output_cl_root

start_at_bestfit =F
feedback=1
use_fast_slow = T

#turn on checkpoint for real runs where you want to be able to continue them
checkpoint = T

#sampling_method=7 is a new fast-slow scheme good for Planck
sampling_method = 7
dragging_steps  = 3
propose_scale = 2

#Set >0 to make data files for importance sampling
indep_sample=10

#these are just small speedups for testing
get_sigma8=T
??

I can't see where I use Planck in this file.

I have seen the line : propose_matrix= planck_covmats/base_TT_lensing_lowE_lowl_plikHM.covmat

Is this the line to remove ?

Best regards

Dournac Fabien
Posts: 65
Joined: May 18 2019
Affiliation: IRAP
Contact:

Re: Which Criterion of convergence on a MPI run of cosmoMC to end up ?

Post by Dournac Fabien » February 26 2022

I have added into file
des_curvature.ini
:

Code: Select all

param[tau_reio] = 0.05430842
I will see if this is the right thing to do.

Post Reply