I have an issue with an MPI run of cosmomc that never seems to finish.
1) At work I have a machine with 64 AMD cores / 128 threads, and cosmomc compiled with the Intel compiler and the qopenmp option. With this configuration, what is, in your view, the best compromise for the number of MPI processes I can launch without penalizing the OpenMP threads needed by each call to CAMB?
For the moment, I ran:
$ nohup mpirun -np 32 ./cosmomc des_lensing_curvature.ini &
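To frame question 1, here is a rough sketch (plain Python, not part of cosmomc) that lists the ways of splitting the 64 physical cores between MPI processes and OpenMP threads per process. It assumes one thread per physical core and ignores the extra hardware threads; the helper name `core_splits` is only mine, for illustration.

```python
def core_splits(physical_cores=64, min_threads=1):
    """Return (mpi_processes, omp_threads) pairs that use every core exactly once."""
    splits = []
    for n_mpi in range(1, physical_cores + 1):
        if physical_cores % n_mpi == 0:
            n_omp = physical_cores // n_mpi
            if n_omp >= min_threads:
                splits.append((n_mpi, n_omp))
    return splits


# Print one candidate launch line per split (single-node run assumed).
for n_mpi, n_omp in core_splits(64):
    print(f"OMP_NUM_THREADS={n_omp} mpirun -np {n_mpi} ./cosmomc des_lensing_curvature.ini")
```

Counting physical cores only, my current -np 32 leaves 2 OpenMP threads per chain, which is why I wonder whether fewer chains with more threads each would be better.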
Here are the .ini parameter files I used (the goal is to use the DES survey to plot Omega_k vs H0, or H0 vs Omega_m):
des_lensing_curvature.ini :
#general settings, test file without require Planck likelihood code to be installed.
#see test_planck.ini for test file including Planck clik likelihoods
#DES 1-yr joint
DEFAULT(batch3/DES_lensing.ini)
#Planck 2018 lensing (native code, does not require Planck likelihood code)
DEFAULT(batch3/lensing.ini)
#general settings
DEFAULT(batch3/common.ini)
param[omegak] = 0. -0.5 0.5 0.01 0.01
#high for new runs, to start learning new proposal matrix quickly
MPI_Max_R_ProposeUpdate = 30
propose_matrix= planck_covmats/base_TT_lensing_lowE_lowl_plikHM.covmat
#Folder where files (chains, checkpoints, etc.) are stored
root_dir = chains_des_lensing_curvature/
#Root name for files produced
file_root=des_lensing_omegak
#action= 0 runs chains, 1 importance samples, 2 minimizes
#use action=4 just to quickly test likelihoods
action = 0
#Don't need to set this if OMP_NUM_THREADS set appropriately
num_threads = 0
#if you want to get theory cl for test point
#test_output_root = output_cl_root
start_at_bestfit =F
feedback=1
use_fast_slow = T
#turn on checkpoint for real runs where you want to be able to continue them
checkpoint = T
#sampling_method=7 is a new fast-slow scheme good for Planck
sampling_method = 7
dragging_steps = 3
propose_scale = 2
#Set >0 to make data files for importance sampling
indep_sample=10
#these are just small speedups for testing
get_sigma8=T
batch3/DES_lensing.ini :
DEFAULT(DES.ini)
wl_dataset[DES,used_data_types]=xip xim
wl_use_weyl = T
param[DES_DzL1] = 0
param[DES_DzL2] = 0
param[DES_DzL3] = 0
param[DES_DzL4] = 0
param[DES_DzL5] = 0
prior[DES_DzL1] =
prior[DES_DzL2] =
prior[DES_DzL3] =
prior[DES_DzL4] =
prior[DES_DzL5] =
param[DES_b1] = 1
param[DES_b2] = 1
param[DES_b3] = 1
param[DES_b4] = 1
param[DES_b5] = 1
batch3/lensing.ini :
cmb_dataset[lensing]= %DATASETDIR%planck_lensing_2018/smicadx12_Dec5_ftl_mv2_ndclpp_p_teb_consext8.dataset
cmb_dataset_speed[lensing] =-1
use_nonlinear_lensing = T
##block_semi_fast = F
#avoid really ruled out models
H0_min=40
DEFAULT(planck_calibration.ini)
batch3/common.ini :
##Sample file of common parameters for baseline Planck set of runs
batch_name = batch3
local_dir = %LOCALDIR%
#directory, e.g. window functions in directory windows under data_dir
data_dir = %LOCALDIR%data/
INCLUDE(likelihood.ini)
INCLUDE(params_CMB_defaults.ini)
#whether to include prior on a parameter if it has non-varying value
include_fixed_parameter_priors = F
#Feedback level ( 2=lots,1=chatty,0=none)
feedback = 1
#Force computation of sigma_8 even if use_mpk = F
get_sigma8 = T
#This only has a small effect at very high L
use_nonlinear_lensing = T
#if using non-linear lensing, better turn of power spectrum fast/slow since is now non-linear
block_semi_fast = F
#Temperature at which to Monte-Carlo
temperature = 1
#Maximum number of chain steps
samples = 4000000
#Scale of proposal relative to covariance; 2.4 is recommended by astro-ph/0405462 for Gaussians
#If propose_matrix is much broader than the new distribution, make proportionately smaller
#Generally make smaller if your acceptance rate is too low
propose_scale = 1.9
#Increase to oversample fast parameters more, e.g. if space is odd shape
oversample_fast = 1
#if non-zero number of steps between sample info dumped to file file_root.data
#WANT THIS ON so we can do importance sampling runs quickly later for likelihood updates
#Off to save lots of disk space
indep_sample = 10
#number of samples to disgard at start; usually set to zero and remove later
burn_in = 0
#If zero set automatically
num_threads = 0
#MPI mode multi-chain options (recommended)
#MPI_Converge_Stop is a (variance of chain means)/(mean of variances) parameter that can be used to stop the chains
#Set to a negative number not to use this feature. Does not guarantee good accuracy of confidence limits.
MPI_Converge_Stop = 0.01
#Do initial period of slice sampling; may be good idea if
#cov matrix or widths are likely to be very poor estimates
MPI_StartSliceSampling = F
#Can optionally also check for convergence of confidence limits (after MPI_Converge_Stop reached)
#Can be good idea as small value of MPI_Converge_Stop does not (necessarily) imply good exploration of tails
MPI_Check_Limit_Converge = T
#if MPI_Check_Limit_Converge = T, give tail fraction to check (checks both tails):
MPI_Limit_Converge = 0.025
#permitted quantile chain variance in units of the standard deviation (small values v slow):
MPI_Limit_Converge_Err = 0.2
#which parameters tails to check. If zero, check all parameters:
MPI_Limit_Param = 0
#if MPI_LearnPropose = T, the proposal density is continally updated from the covariance of samples so far (since burn in)
MPI_LearnPropose = T
#can set a value of converge at which to stop updating covariance (so that it becomes rigorously Markovian)
#e.g. MPI_R_StopProposeUpdate = 0.4 will stop updating when (variance of chain means)/(mean of variances) < 0.4
MPI_R_StopProposeUpdate = 0
#If have covmat, R to reach before updating proposal density (increase if covmat likely to be poor)
#Only used if not varying new parameters that are fixed in covmat
MPI_Max_R_ProposeUpdate = 3
#As above, but used if varying new parameters that were fixed in covmat
MPI_Max_R_ProposeUpdateNew = 50
#Initial power spectrum amplitude pivots (Mpc^{-1})
#if tensor_pivot_k/=pivot_k then r is defined so to that P_t(tensor_pivot_k)=r P_s(tensor_pivot_k)
pivot_k = 0.05
#tensor_pivot_k defaults to same as pivot_k
#tensor_pivot_k = 0.05
#Whether the CMB should be lensed (slows a lot unless also computing matter power)
CMB_lensing = T
accuracy_level = 1
high_accuracy_default = T
#1: Simple Metropolis, 2: slice sampling, 3: slice sampling fast parameters, 4: directional gridding
#7 is new dragging method
sampling_method = 7
dragging_steps = 3
use_fast_slow = T
##Rest are fairly irrelevant
#if sampling_method =4, iterations per gridded direction
directional_grid_steps = 20
#action = 0: MCMC, action=1: postprocess .data file, action=2: find best fit point only
action = 0
#If propose_matrix is blank (first run), can try to use numerical Hessian to
#estimate a good propose matrix. As a byproduct you also get an approx best fit point
estimate_propose_matrix = F
#when estimating best fit point (action=2 or estimate_propose_matrix),
#required relative accuracy of each parameter in units of the covariance width
max_like_radius = 0.05
max_like_iterations = 5000
minimization_points_factor = 2
minimize_loglike_tolerance = 0.05
minimize_separate_fast = T
#if non-zero do some low temperature MCMC steps to check minimum stable
minimize_mcmc_refine_num = 20
minimize_refine_temp = 0.01
minimize_temp_scale_factor = 5
minimize_random_start_pos = T
#max_like_radius = 0.002
#max_like_iterations = 40000
#minimization_points_factor = 6
#minimize_loglike_tolerance=0.05
#if blank this is set from system clock
rand_seed =
#If true, generate checkpoint files and terminated runs can be restarted using exactly the same command
#and chains continued from where they stopped
#With checkpoint=T note you must delete all chains/file_root.* files if you want new chains with an old file_root
checkpoint = T
#whether to stop on CAMB error, or continue ignoring point
stop_on_error= T
#If action = 1
redo_likelihoods = T
redo_theory = F
redo_cls = F
redo_pk = F
redo_skip = 0
redo_outroot =
redo_thin = 1
redo_add = F
redo_from_text = F
#If large difference in log likelihoods may need to offset to give sensible weights
#for exp(difference in likelihoods)
redo_likeoffset = 0
The run is launched with:
$ mpirun -np 32 ./cosmomc des_lensing_curvature.ini
It never stops by itself, so I conclude that the convergence criterion is never reached.
Maybe my prior on omegak is too wide; see the line:
param[omegak] = 0. -0.5 0.5 0.01 0.01
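To be explicit about how I read this line, here is a small sketch that parses it, assuming the usual cosmomc field order (center, min, max, start width, propose width). The function `parse_param_line` is hypothetical and only for illustration; it is not a cosmomc routine.

```python
import re


def parse_param_line(line):
    """Parse 'param[name] = center min max start_width propose_width'."""
    match = re.match(r"\s*param\[(\w+)\]\s*=\s*(.*)", line)
    if match is None:
        raise ValueError(f"not a param line: {line!r}")
    name = match.group(1)
    values = [float(v) for v in match.group(2).split()]
    # Assumed field order; fixed parameters only carry the first field.
    keys = ["center", "min", "max", "start_width", "propose_width"]
    return name, dict(zip(keys, values))


name, p = parse_param_line("param[omegak] = 0. -0.5 0.5 0.01 0.01")
prior_range = p["max"] - p["min"]
print(name, p)
print("prior range / propose width =", prior_range / p["propose_width"])
```

If I read it correctly, the allowed range is about a hundred times the propose width, which is what makes me suspect the prior.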
I attach the nohup.log from my last run. Here is the message about the reionization optical depth tau that appears repeatedly:
TTanhReionization_zreFromOptDepth: Did not converge to optical depth
tau = 0.404508363477996 optical_depth = 0.407329941538398
33.3251953125000 33.3190917968750
(If running a chain, have you put a constraint on tau?)
Should the constraint be on "tau" or on "optical_depth"?
2) More generally, among all the parameters in these files, which is the main one that controls when the MCMC run ends? There are so many parameters that may change things.
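For reference, this is how I understand the quantity compared against MPI_Converge_Stop, based on the comment in common.ini ("(variance of chain means)/(mean of variances)"). It is a one-parameter sketch in plain Python; I believe cosmomc works with the full parameter covariance, so the single-parameter version and the name `r_minus_one` are my own simplifying assumptions.

```python
from statistics import mean, variance

MPI_CONVERGE_STOP = 0.01  # value set in batch3/common.ini


def r_minus_one(chains):
    """(variance of chain means) / (mean of chain variances) for one parameter.

    chains: list of lists, each holding the samples of that parameter in one chain.
    """
    chain_means = [mean(c) for c in chains]
    chain_vars = [variance(c) for c in chains]
    return variance(chain_means) / mean(chain_vars)


# Two toy chains exploring the same region, then two that disagree.
agreeing = [[0, 1, 2, 3], [0, 1, 2, 3]]
disagreeing = [[0, 1, 2, 3], [10, 11, 12, 13]]

for label, chains in (("agreeing", agreeing), ("disagreeing", disagreeing)):
    r = r_minus_one(chains)
    print(label, r, "stop" if r < MPI_CONVERGE_STOP else "keep running")
```

If this reading is right, the run keeps going as long as the chains disagree on their means relative to their spread, and my question is whether MPI_Converge_Stop is the only setting that matters here or whether MPI_Check_Limit_Converge also keeps the run alive.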
Any clarification is welcome.
Best regards