pypesto.profile

Profile

class pypesto.profile.ProfileOptions[source]

Bases: dict

Options for optimization based profiling.

default_step_size: Default step size of the profiling routine along the profile path (adaptive step lengths algorithms will only use this as a first guess and then refine the update).

min_step_size: Lower bound for the step size in adaptive methods.

max_step_size: Upper bound for the step size in adaptive methods.

step_size_factor: Adaptive methods recompute the likelihood at the predicted point and try to find a good step length by a sort of line search algorithm. This factor controls step handling in this line search.

delta_ratio_max: Maximum allowed drop of the posterior ratio between two profile steps.

ratio_min: Lower bound for likelihood ratio of the profile, based on inverse chi2-distribution. The default 0.145 is slightly lower than the 95% quantile 0.1465 of a chi2 distribution with one degree of freedom.

reg_points: Number of profile points used for regression in regression based adaptive profile points proposal.

reg_order: Maximum degree of regression polynomial used in regression based adaptive profile points proposal.

magic_factor_obj_value: There is this magic factor in the old profiling code which slows down profiling at small ratios (must be >= 0 and < 1).

whole_path: Whether to profile the whole bounds or only till we get below the ratio.

__init__(default_step_size=0.01, min_step_size=0.001, max_step_size=1.0, step_size_factor=1.25, delta_ratio_max=0.1, ratio_min=0.145, reg_points=10, reg_order=4, magic_factor_obj_value=0.5, whole_path=False)[source]

Parameters:

default_step_size (float)
min_step_size (float)
max_step_size (float)
step_size_factor (float)
delta_ratio_max (float)
ratio_min (float)
reg_points (int)
reg_order (int)
magic_factor_obj_value (float)
whole_path (bool)

static create_instance(maybe_options)[source]

Return a valid options object.

Parameters: maybe_options (ProfileOptions or dict)
Return type: ProfileOptions

validate()[source]

Check if options are valid.

Raises ValueError if current settings aren’t valid.

pypesto.profile.approximate_parameter_profile(problem, result, profile_index=None, profile_list=None, result_index=0, n_steps=100)[source]

Calculate profile approximation.

Based on an approximation via a normal likelihood centered at the chosen optimal parameter value, with the covariance matrix given by the inverse of the Hessian or FIM.

Parameters:

problem (Problem) – The problem to be solved.
result (Result) – A result object to initialize profiling and to append the profiling results to. For example, one might append more profiling runs to a previous profile, in order to merge these. The existence of an optimization result is obligatory.
profile_index (Iterable[int]) – List with the profile indices to be computed (by default all of the free parameters).
profile_list (int) – Integer which specifies whether a call to the profiler should create a new list of profiles (default) or add its profiles to a specific existing profile list.
result_index (int) – Index of the optimization result from which profiling should be started (default: global optimum, i.e., index = 0).
n_steps (int) – Number of profile steps in each dimension.

Return type:

Result

Returns:

The profile results are filled into result.profile_result.
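
The normal approximation described above can be illustrated with a small standard-library sketch. It is not the pypesto implementation: the helper names invert_2x2 and gaussian_profile_ratios, the 2x2 Hessian, and the grid are all made up for illustration. It assumes that, for a Gaussian likelihood, the profile along one parameter is again Gaussian with variance equal to the matching diagonal entry of the inverse Hessian.

```python
import math


def invert_2x2(h):
    """Invert a 2x2 matrix given as nested lists (hypothetical helper)."""
    (a, b), (c, d) = h
    det = a * d - b * c
    if det == 0:
        raise ValueError("Hessian is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def gaussian_profile_ratios(x_opt, hessian, index, xs):
    """Likelihood ratios along parameter `index` under a normal approximation."""
    sigma = invert_2x2(hessian)      # covariance = inverse Hessian (assumption)
    variance = sigma[index][index]   # variance of the profiled parameter
    return [math.exp(-0.5 * (x - x_opt[index]) ** 2 / variance) for x in xs]


# Illustrative optimum and Hessian of the negative log-likelihood.
x_opt = [1.0, -0.5]
hessian = [[2.0, 1.0], [1.0, 2.0]]

# Nine grid points centred on the optimum of parameter 0.
xs = [x_opt[0] + 0.5 * k for k in range(-4, 5)]
ratios = gaussian_profile_ratios(x_opt, hessian, 0, xs)
```

The ratio equals 1 at the optimum and decays symmetrically on both sides, which is the shape an approximate profile would take if the likelihood were exactly normal.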

pypesto.profile.calculate_approximate_ci(xs, ratios, confidence_ratio)[source]

Calculate approximate confidence interval based on profile.

Interval bounds are linearly interpolated.

Parameters:

xs (ndarray) – The ordered parameter values along the profile for the coordinate of interest.
ratios (ndarray) – The likelihood ratios corresponding to the parameter values.
confidence_ratio (float) – Minimum confidence ratio to base the confidence interval upon, as obtained via pypesto.profile.chi2_quantile_to_ratio().

Return type:

tuple[float, float]

Returns:

Bounds of the approximate confidence interval.
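
As a rough illustration of the linear interpolation mentioned above, the following standard-library sketch finds where a profile crosses a given ratio. The function name approximate_ci_sketch and the sample data are hypothetical, and the handling of profiles that never drop below the threshold (falling back to the outermost profile point) is an assumption, not a statement about pypesto's behavior.

```python
def approximate_ci_sketch(xs, ratios, confidence_ratio):
    """Linearly interpolate where the profile crosses `confidence_ratio`."""
    inside = [i for i, r in enumerate(ratios) if r >= confidence_ratio]
    if not inside:
        raise ValueError("no profile point reaches the confidence ratio")
    first, last = inside[0], inside[-1]

    def crossing(i_in, i_out):
        # Interpolate between a point inside the interval and its
        # neighbour just outside of it.
        weight = (ratios[i_in] - confidence_ratio) / (ratios[i_in] - ratios[i_out])
        return xs[i_in] + weight * (xs[i_out] - xs[i_in])

    # Fall back to the outermost point if the profile ends inside the interval.
    lower = xs[first] if first == 0 else crossing(first, first - 1)
    upper = xs[last] if last == len(xs) - 1 else crossing(last, last + 1)
    return lower, upper


# A symmetric, triangular toy profile.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ratios = [0.0, 0.5, 1.0, 0.5, 0.0]
lower, upper = approximate_ci_sketch(xs, ratios, 0.25)
```

For this toy profile the threshold 0.25 lies halfway between the ratios 0.0 and 0.5, so the interpolated bounds fall halfway between the neighbouring grid points.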

pypesto.profile.chi2_quantile_to_ratio(alpha=0.95, df=1)[source]

Compute profile likelihood threshold.

Transform lower tail probability alpha for a chi2 distribution with df degrees of freedom to a profile likelihood ratio threshold.

Parameters:

alpha (float) – Lower tail probability, defaults to 95% interval.
df (int) – Degrees of freedom.

Returns:

The computed likelihood ratio threshold.
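
To make the transformation concrete, here is a minimal standard-library sketch for the special case df=1. It assumes the usual likelihood-ratio relation ratio = exp(-quantile / 2) and uses the fact that a chi2 variable with one degree of freedom is the square of a standard normal variable. The function name chi2_quantile_to_ratio_df1 is made up for this example and is not part of pypesto.

```python
import math
from statistics import NormalDist


def chi2_quantile_to_ratio_df1(alpha=0.95):
    """Likelihood ratio threshold for a chi2 distribution with df=1."""
    # For df=1 the alpha-quantile of chi2 is the square of the
    # (1 + alpha) / 2 quantile of the standard normal distribution.
    z = NormalDist().inv_cdf((1.0 + alpha) / 2.0)
    quantile = z * z
    # Assumed relation between chi2 quantile and likelihood ratio.
    return math.exp(-quantile / 2.0)


ratio = chi2_quantile_to_ratio_df1(0.95)
```

Under these assumptions the 95% level gives a ratio of about 0.1465, which matches the value quoted for ratio_min in ProfileOptions.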

pypesto.profile.parameter_profile(problem, result, optimizer, engine=None, profile_index=None, profile_list=None, result_index=0, next_guess_method='adaptive_step_regression', profile_options=None, progress_bar=None, filename=None, overwrite=False)[source]

Compute parameter profiles.

Parameters:

problem (Problem) – The problem to be solved.
result (Result) – A result object to initialize profiling and to append the profiling results to. For example, one might append more profiling runs to a previous profile, in order to merge these. The existence of an optimization result is obligatory.
optimizer (Optimizer) – The optimizer to be used along each profile.
engine (Engine) – The engine to be used. Defaults to pypesto.engine.SingleCoreEngine.
profile_index (Iterable[int]) – List with the parameter indices to be profiled (by default all free indices).
profile_list (int) – Integer which specifies whether a call to the profiler should create a new list of profiles (default) or add its profiles to a specific existing profile list.
result_index (int) – Index of the optimization result from which profiling should be started (default: global optimum, i.e., index = 0).
next_guess_method (Union[Callable, str]) – Method that creates the next starting point for optimization in profiling. One of the update_type options supported by pypesto.profile.profile_next_guess.next_guess().
profile_options (ProfileOptions) – Various options applied to the profile optimization. See pypesto.profile.options.ProfileOptions.
progress_bar (bool) – Whether to display a progress bar.
filename (Union[str, Callable, None]) – Name of the hdf5 file in which the result will be saved. Default is None, which deactivates automatic saving. If set to Auto, a file named year_month_day_profiling_result.hdf5 is generated automatically. Optionally a method, see the docs for pypesto.store.auto.autosave().
overwrite (bool) – Whether to overwrite result/profiling in the autosave file if it already exists.

Return type:

Result

Returns:

The profile results are filled into result.profile_result.

pypesto.profile.validation_profile_significance(problem_full_data, result_training_data, result_full_data=None, n_starts=1, optimizer=None, engine=None, lsq_objective=False, return_significance=True)[source]

Compute significance of Validation Interval.

A validation interval is a confidence region/interval for a new validation experiment, see [1]. (By default, this method returns the significance = 1-alpha!)

The reasoning behind this approach is that a validation data set lies outside the validation interval if fitting the full data set would lead to a fit \(\theta_{new}\) whose (profile-likelihood based) parameter confidence intervals do not contain the old fit \(\theta_{train}\). (I.e. the old fit would be rejected by the fit of the full data.)

This method returns the significance of the validation data set (where problem_full_data provides the objective function for fitting both data sets), i.e. the largest alpha such that there is a validation region/interval for which the validation data set lies outside this validation interval with probability alpha. (If one is interested in the opposite, set return_significance=False.)

Parameters:

problem_full_data (Problem) – A pypesto problem whose objective is the negative log-likelihood of the training and validation data sets combined.
result_training_data (Result) – Result object from the fitting of the training data set only.
result_full_data (Optional[Result]) – Result object that contains the result of fitting training and validation data combined.
n_starts (Optional[int]) – Number of starts for fitting the full data set (if result_full_data is not provided).
optimizer (Optional[Optimizer]) – Optimizer used for refitting the data (if result_full_data is not provided).
engine (Optional[Engine]) – Engine for refitting (if result_full_data is not provided).
lsq_objective (bool) – Indicates whether the objective of problem_full_data corresponds to a nllh (False) or a \(\chi^2\) value (True).
return_significance (bool) – Indicates whether the function should return the significance (True), i.e. the probability that the new data set lies outside the confidence interval for the validation experiment as given by the method, or the largest alpha such that the validation experiment still lies within the confidence interval (False), i.e. \(\alpha = 1 - \text{significance}\).

Return type:

float
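
One plausible reading of the significance described above is a likelihood-ratio test with one degree of freedom, sketched below with the standard library only. This is an assumption about the underlying statistics, not the pypesto source: the function names, the argument names nllh_train_fit and nllh_full_fit (the full-data objective evaluated at the training-data optimum and at the full-data optimum, respectively), and the clipping at zero are all hypothetical.

```python
import math


def chi2_sf_df1(x):
    """Survival function of a chi2 distribution with one degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def validation_significance_sketch(
    nllh_train_fit, nllh_full_fit, lsq_objective=False, return_significance=True
):
    """Likelihood-ratio based significance (illustrative sketch only)."""
    # The full-data optimum can never be worse than the training optimum;
    # clip small negative differences caused by numerical noise.
    diff = max(nllh_train_fit - nllh_full_fit, 0.0)
    # A chi2-type objective already carries the factor 2 of the test statistic.
    statistic = diff if lsq_objective else 2.0 * diff
    significance = chi2_sf_df1(statistic)
    return significance if return_significance else 1.0 - significance


# A difference of about 1.92 in nllh corresponds to the 95% chi2 quantile.
significance = validation_significance_sketch(1.9207294, 0.0)
alpha = validation_significance_sketch(1.9207294, 0.0, return_significance=False)
```

With these inputs the sketch yields a significance of about 0.05, and the complementary value of about 0.95 when return_significance=False.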