COBAYA: accessing sampled dataset-specific parameters in Theory calculate() method
-
- Posts: 19
- Joined: June 08 2020
- Affiliation: Max Planck Institute for Extraterrestrial Physics
COBAYA: accessing sampled dataset-specific parameters in Theory calculate() method
Hello,
I am writing a likelihood in cobaya that uses an emulator to obtain the theoretical data vector prediction. Since the emulator also provides some derived parameters, I would like to implement this as a separate Theory class, so that I can directly assign these parameters to the emulator theory, rather than have them attached to a particular dataset likelihood. In order to get the theory prediction, the emulator needs cosmology parameters and a set of galaxy bias parameters.
I ultimately want to run this for several measurements at once, each with their own galaxy bias parameters (so, for example, I want to run this for two independent datasets A and B with b1 biases b1A and b1B). In the likelihood I dealt with this by reading in the b1 names with load_data for each of the datasets from their corresponding .yaml files (so that they can be called as params_values.get(self.b1_name)). However, I do not know how to pass this on to the theory code.
As I understand, in order to access a sampled parameter in calculate() via self.provider.get_param, this parameter first needs to be specified in get_requirements, which is called immediately after initialising the theory. Since in this case the bias parameters will have different names for each dataset, I am not sure how to specify the bias parameters that are required (their correct, dataset-specific, names). I have tried passing the renames from the likelihood by asking for them through must_provide, but that seems to be run after get_requirements. I have also tried implementing something similar to the “renames” that Cobaya’s Boltzmann codes have and that are read in during initialisation, but these seem to be global renames rather than dataset-specific ones (i.e. I can’t specify a theory with renames for each dataset in its .yaml file separately).
I would be very grateful for any tips on how to best go about this!
-
- Posts: 1984
- Joined: September 23 2004
- Affiliation: University of Sussex
- Contact:
Re: COBAYA: accessing sampled dataset-specific parameters in Theory calculate() method
If the parameters of each likelihood are independent, then what you need from the theory is something to calculate the results for the likelihood's bias parameters. You can do this with closures or objects. E.g. have your theory calculate an object that can calculate results based on input bias parameters. Each likelihood then requests this object from the theory, and calls it with its bias parameters. There's a CCL wrapper that works like this at:
https://github.com/simonsobs/SOLikeT/blob/36f8ca5f66cc30c90795c91cc13960f7ddb2a78e/soliket/ccl.py
The Theory should do as much of the calculation as possible when it initializes the object, but of course cannot do the bias-specific part until each likelihood is called.
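A minimal sketch of the calculator-object idea described above, in plain Python rather than actual Cobaya classes. All names here (BiasCalculator, theory_calculate, the toy linear-bias model and the placeholder numbers) are hypothetical and only illustrate the split between the cosmology-only work and the bias-specific step:

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class BiasCalculator:
    """Hypothetical object a theory could hand to likelihoods.

    It stores the expensive, cosmology-only part and finishes the
    calculation once a likelihood supplies its own bias value.
    """
    matter_power: Sequence[float]  # stand-in for the cosmology-dependent result

    def __call__(self, b1: float) -> list:
        # Toy linear-bias model (an assumption): P_gg = b1^2 * P_mm
        return [b1 ** 2 * p for p in self.matter_power]


def theory_calculate(omega_m: float) -> BiasCalculator:
    """Stand-in for the theory step: do all bias-independent work here."""
    matter_power = [omega_m * k for k in (1.0, 2.0, 3.0)]  # placeholder numbers
    return BiasCalculator(matter_power)


# One calculator per point in parameter space, shared by both datasets.
calc = theory_calculate(omega_m=0.5)
prediction_A = calc(b1=2.0)  # dataset A applies its own b1A
prediction_B = calc(b1=1.0)  # dataset B applies its own b1B
```

In a real setup the theory would store the calculator in its state and each likelihood would request it and call it with its own bias values; the linked CCL wrapper is the reference for how that looks in practice.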
-
- Posts: 19
- Joined: June 08 2020
- Affiliation: Max Planck Institute for Extraterrestrial Physics
Re: COBAYA: accessing sampled dataset-specific parameters in Theory calculate() method
Hi Antony,
Thanks a lot for the reply and the link! This looks nice, but I’m not quite sure how one would assign the derived parameters in this case. If calculate() produces some cosmology-only dependent object and then the final, bias-dependent part, is calculated in the likelihood, is there a way to call the Theory object again, after this final calculation is done, to get the derived parameter only, from the likelihood result? When I was playing around with this, it seemed like all of the states (including “derived”) need to be assigned in the calculate method (but perhaps I was doing something wrong…)
-
- Posts: 1984
- Joined: September 23 2004
- Affiliation: University of Sussex
- Contact:
Re: COBAYA: accessing sampled dataset-specific parameters in Theory calculate() method
I'm not sure what you're trying to do with derived parameters. If they are bias specific, they should be derived parameters of the likelihood (calculated by the Likelihood.calculate() using the calculator object from theory). If they are not bias-specific, they can be calculated by the Theory.
-
- Posts: 19
- Joined: June 08 2020
- Affiliation: Max Planck Institute for Extraterrestrial Physics
Re: COBAYA: accessing sampled dataset-specific parameters in Theory calculate() method
The derived parameters are not bias specific but, due to the way that the emulator is implemented, I cannot calculate them before providing the bias parameters.
What I'm trying to figure out is whether it is possible to update derived parameter states outside of theory.calculate() method. Since in my case, I would need to execute theory.calculate() to get the cosmology-dependent part and then use the result in the likelihood to add the appropriate bias, I would then want to call theory.get_derivedparam(emulated_theory_wbias) which would set state['derived'] = {'derivedparam': emulated_theory_wbias.params['derivedparam']}. Is this something that can be done in cobaya? If not, of course, I can just assign the parameters to the likelihood, but since it is not really dependent on a particular dataset but is a derived cosmology parameter, I thought it would be nicer to have it tied to the Theory, especially when doing combined runs for several datasets.
-
- Posts: 1984
- Joined: September 23 2004
- Affiliation: University of Sussex
- Contact:
Re: COBAYA: accessing sampled dataset-specific parameters in Theory calculate() method
I think you could propagate the bias names back to the theory by returning them as part of the requirements for the likelihood. E.g. the Likelihood can require "bias_results": {dict of bias parameters and any other information needed for the theory to calculate with those parameters}, and the Theory can list "bias_results" as one of the things it supports (e.g. by returning it from its get_can_provide() method).
The Theory would have to combine "bias_results" requests from any different Likelihood instances, as passed to the Theory's must_provide method.
-
- Posts: 19
- Joined: June 08 2020
- Affiliation: Max Planck Institute for Extraterrestrial Physics
Re: COBAYA: accessing sampled dataset-specific parameters in Theory calculate() method
Hm, I’m not sure if I follow this.
I’ve added “bias_results” as a requirement in the likelihood and as a list returned by get_can_provide in Theory. I’ve also created a Theory method get_bias_results which is called from the likelihood with a dictionary of galaxy bias parameters and which assigns this dictionary as a Theory object attribute, self.bias_results. This function is called by the likelihood before calculating the main Theory prediction to set self.bias_results to current bias values, which can then be accessed by Theory.calculate().
However, I now get an error “requirement bias_results is provided by more than one component”. It then lists my theory code twice… I am a bit confused about why that would be (I am only running one dataset likelihood instance for this test).
-
- Posts: 1984
- Joined: September 23 2004
- Affiliation: University of Sussex
- Contact:
Re: COBAYA: accessing sampled dataset-specific parameters in Theory calculate() method
Not sure you need the get_bias_results method. I was thinking you'd use the bias_results requirement to communicate just the names of the parameters from Likelihood to Theory ({'bias_results':['b1','b2'...]}). Theory would then read and store the names in Theory.must_provide and also return the names of the parameters from Theory.must_provide, so that they also become a requirement of Theory and are passed into the theory code for each model. Then calculate all results in Theory.calculate() as normal from the passed-in values of the bias parameters in each model.
If you have a problem, might be worth just making a simple toy test case with the different classes to test the logic (and if it doesn't work, then we'd have a new unit test for Cobaya..). There may be a bug there with Cobaya reporting two listings of the theory code.
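A toy test case along those lines might look like the sketch below. It uses plain-Python stand-ins (ToyTheory and ToyLikelihood are hypothetical names, not Cobaya's base classes) and wires the requirement negotiation by hand for a single likelihood, so it only illustrates the logic of storing the names in must_provide and returning them as further requirements:

```python
class ToyTheory:
    """Plain-Python stand-in for a Theory subclass (not the Cobaya base class)."""

    def __init__(self):
        self.bias_names = []

    def get_can_provide(self):
        return ["bias_results"]

    def must_provide(self, **requirements):
        # Store the names the likelihood sent and hand them back as this
        # component's own requirements, so their sampled values get passed in.
        names = requirements.get("bias_results", [])
        self.bias_names = list(names)
        return {name: None for name in names}

    def calculate(self, state, **params_values):
        # Toy prediction (an assumption): sum of the bias values received.
        state["bias_results"] = sum(params_values[n] for n in self.bias_names)


class ToyLikelihood:
    """Stand-in for a likelihood that knows its own bias parameter names."""

    def __init__(self, bias_names):
        self.bias_names = bias_names

    def get_requirements(self):
        return {"bias_results": self.bias_names}


like = ToyLikelihood(["b1A", "b2A"])
theory = ToyTheory()

# Hand-rolled version of the requirement negotiation.
extra_requirements = theory.must_provide(**like.get_requirements())

# Only the parameters the theory asked for are passed to calculate().
state = {}
sampled = {"b1A": 2.0, "b2A": 0.5, "omega_m": 0.3}
theory.calculate(state, **{k: sampled[k] for k in extra_requirements})
```

In an actual run the framework performs the negotiation and parameter passing itself; the point of the toy is just to check that the names reach calculate() as intended.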
-
- Posts: 19
- Joined: June 08 2020
- Affiliation: Max Planck Institute for Extraterrestrial Physics
Re: COBAYA: accessing sampled dataset-specific parameters in Theory calculate() method
Ah, I did not realise must_provide can also return something, thank you very much for all the help, this worked! I think the error I got before must have been due to some inconsistencies I introduced when adding this extra bias results function, now everything runs smoothly.
-
- Posts: 19
- Joined: June 08 2020
- Affiliation: Max Planck Institute for Extraterrestrial Physics
Re: COBAYA: accessing sampled dataset-specific parameters in Theory calculate() method
Hi again,
it seems that the above solution does not quite work after all - the Theory object does not use the bias names correctly when two independent datasets that share the same base likelihood with different bias names are run together.
In particular, while the bias names for both datasets are read in and passed to the theory without a problem, Theory.calculate() only ever uses the names of the dataset that is listed as the second one. I guess this is because the theory prediction is only calculated once for each point in the parameter space and is assumed to be shared by the two datasets, so it only uses the latest set of names provided.
I can definitely see how that makes sense in general. I guess the theory object is not really intended to deal with dataset specific parameters, as those calculations are meant to be done in the likelihood, but any pointers on whether there exists some workaround for this (or if I misinterpreted the solution you suggested/the origins of the problem), would be much appreciated!
-
- Posts: 1984
- Joined: September 23 2004
- Affiliation: University of Sussex
- Contact:
Re: COBAYA: accessing sampled dataset-specific parameters in Theory calculate() method
Your must_provide method may be called twice with both sets of bias parameters. Make sure you combine them self-consistently to then calculate using the full list?
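As a rough illustration of combining the requests, the sketch below accumulates names over repeated must_provide calls instead of overwriting them. AccumulatingTheory is a hypothetical plain-Python stand-in, and the two calls at the bottom simulate what two likelihoods with different bias names would trigger:

```python
class AccumulatingTheory:
    """Stand-in theory that merges bias names across must_provide calls."""

    def __init__(self):
        self.bias_names = []

    def must_provide(self, **requirements):
        new_names = requirements.get("bias_results", [])
        for name in new_names:
            # Append rather than assign, so earlier requests are kept.
            if name not in self.bias_names:
                self.bias_names.append(name)
        # Ask for the newly requested parameters as requirements too.
        return {name: None for name in new_names}


theory = AccumulatingTheory()
theory.must_provide(bias_results=["b1A"])  # request from dataset A
theory.must_provide(bias_results=["b1B"])  # request from dataset B
all_names = theory.bias_names
```

With a plain assignment in must_provide, only the names from the last call would survive, which would match the symptom of the second dataset's names always being used.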
-
- Posts: 19
- Joined: June 08 2020
- Affiliation: Max Planck Institute for Extraterrestrial Physics
Re: COBAYA: accessing sampled dataset-specific parameters in Theory calculate() method
Right, so I've changed the state that Theory.calculate() updates to be a dictionary of theory predictions for all the datasets involved. When this is called from the likelihood in logp, I take the dictionary entry that matches the name of the dataset. Within Theory.calculate() all seems to work okay, but when I try running some chains I get an error "Could not find random point giving finite likelihood after 600 tries".
I wanted to add some print statements to Likelihood.logp() to try and understand what is going wrong, but none of them is actually printed - it seems like the function is not actually called when evaluating the likelihood. Could you help me to understand how cobaya knows what the value of likelihood is in this case?
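For reference, the per-dataset dictionary described above can be sketched in plain Python as follows. The function calculate_all, the "predictions" key, and the toy linear-bias model are all hypothetical; this only shows the bookkeeping, not the cause of the error:

```python
def calculate_all(matter_power, bias_names_by_dataset, params_values):
    """Return {dataset name: prediction} using each dataset's own bias parameter."""
    predictions = {}
    for dataset, b1_name in bias_names_by_dataset.items():
        b1 = params_values[b1_name]
        # Toy linear-bias model (an assumption): P_gg = b1^2 * P_mm
        predictions[dataset] = [b1 ** 2 * p for p in matter_power]
    return predictions


bias_names_by_dataset = {"A": "b1A", "B": "b1B"}
params_values = {"b1A": 2.0, "b1B": 1.0}

# What the theory would store in its state for one point in parameter space.
state = {"predictions": calculate_all([1.0, 2.0], bias_names_by_dataset, params_values)}

# Inside each likelihood: pick the entry matching its own dataset name.
prediction_for_A = state["predictions"]["A"]
prediction_for_B = state["predictions"]["B"]
```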
-
- Posts: 1984
- Joined: September 23 2004
- Affiliation: University of Sussex
- Contact:
Re: COBAYA: accessing sampled dataset-specific parameters in Theory calculate() method
Run with the debug option to see more output, check assignment of parameters, etc.
-
- Posts: 19
- Joined: June 08 2020
- Affiliation: Max Planck Institute for Extraterrestrial Physics
Re: COBAYA: accessing sampled dataset-specific parameters in Theory calculate() method
I finally got it working now, thanks a lot for all the help, really appreciate it!