Ensemble

class pypesto.ensemble.Ensemble(x_vectors: numpy.ndarray, x_names: Optional[Sequence[str]] = None, vector_tags: Optional[Sequence[Tuple[int, int]]] = None, ensemble_type: Optional[pypesto.ensemble.constants.EnsembleType] = None, predictions: Optional[Sequence[pypesto.ensemble.ensemble.EnsemblePrediction]] = None, lower_bound: Optional[numpy.ndarray] = None, upper_bound: Optional[numpy.ndarray] = None)[source]

Bases: object

A ensemble is a wrapper around an numpy array. It comes with some convenience functionality: It allows to map parameter values via identifiers to the correct parameters, it allows to compute summaries of the parameter vectors (mean, standard deviation, median, percentiles) more easily, and it can store predictions made by pyPESTO, such that the parameter ensemble and the predictions are linked to each other.

__init__(x_vectors: numpy.ndarray, x_names: Optional[Sequence[str]] = None, vector_tags: Optional[Sequence[Tuple[int, int]]] = None, ensemble_type: Optional[pypesto.ensemble.constants.EnsembleType] = None, predictions: Optional[Sequence[pypesto.ensemble.ensemble.EnsemblePrediction]] = None, lower_bound: Optional[numpy.ndarray] = None, upper_bound: Optional[numpy.ndarray] = None)[source]

Constructor.

Parameters
  • x_vectors – parameter vectors of the ensemble, in the format n_parameter x n_vectors

  • x_names – Names or identifiers of the parameters

  • vector_tags – Additional tag, which adds information about the the parameter vectors of the form (optimization_run, optimization_step) if the ensemble is created from an optimization result or (sampling_chain, sampling_step) if the ensemble is created from a sampling result.

  • ensemble_type – Type of ensemble: Ensemble (default), sample, or unprocessed_chain Samples are meant to be representative, ensembles can be any ensemble of parameters, and unprocessed chains still have burn-ins

  • predictions – List of EnsemblePrediction objects

  • lower_bound – array of potential lower bounds for the parameters

  • upper_bound – array of potential upper bounds for the parameters

check_identifiability()pandas.core.frame.DataFrame[source]

Use ensemble mean and standard deviation to assess (in a rudimentary way) whether or not parameters are identifiable. Returns a dataframe with tuples, which specify whether or not the lower and the upper bounds are violated.

Returns

DataFrame indicating parameter identifiability based on mean plus/minus standard deviations and parameter bounds

Return type

parameter_identifiability

compute_summary(percentiles_list: Sequence[int] = (5, 20, 80, 95))[source]

This function computes the mean, the median, the standard deviation and possibly percentiles for the parameters of the ensemble. Those summary results are added as a data member to the EnsemblePrediction object.

Parameters

percentiles_list – List or tuple of percent numbers for the percentiles

Returns

Dict with mean, std, median, and percentiles of parameter vectors

Return type

summary

static from_sample(result: pypesto.result.Result, remove_burn_in: bool = True, chain_slice: Optional[slice] = None, **kwargs)[source]

Construct an ensemble from a sample.

Parameters
  • result – A pyPESTO result that contains a sample result.

  • remove_burn_in – Exclude parameter vectors from the ensemble if they are in the “burn-in”.

  • chain_slice – Subset the chain with a slice. Any “burn-in” removal occurs first.

Returns

Return type

The ensemble.

predict(predictor: Callable, prediction_id: Optional[str] = None, sensi_orders: Tuple = (0), default_value: Optional[float] = None, mode: str = 'mode_fun', engine: Optional[pypesto.engine.base.Engine] = None)pypesto.ensemble.ensemble.EnsemblePrediction[source]

Convenience function to run predictions for a full ensemble: User needs to hand over a predictor function and settings, then all results are grouped as EnsemblePrediction for the whole ensemble

Parameters
  • predictor – Prediction function, e.g., an AmiciPredictor

  • prediction_id – Identifier for the predictions

  • sensi_orders – Specifies which sensitivities to compute, e.g. (0,1) -> fval, grad

  • default_value – If parameters are needed in the mapping, which are not found in the parameter source, it can make sense to fill them up with this default value (e.g. np.nan) in some cases (to be used with caution though).

  • mode – Whether to compute function values or residuals.

  • engine – Parallelization engine. Defaults to sequential execution on a SingleCoreEngine.

Returns

Return type

The prediction of the ensemble.

class pypesto.ensemble.EnsemblePrediction(predictor: Callable[[Sequence], pypesto.predict.result.PredictionResult], prediction_id: Optional[str] = None, prediction_results: Optional[Sequence[pypesto.predict.result.PredictionResult]] = None, lower_bound: Optional[Sequence[numpy.ndarray]] = None, upper_bound: Optional[Sequence[numpy.ndarray]] = None)[source]

Bases: object

A ensemble prediction consists of an ensemble, i.e., a set of parameter vectors and their identifiers such as a sample, and a prediction function. It can be attached to a ensemble-type object

__init__(predictor: Callable[[Sequence], pypesto.predict.result.PredictionResult], prediction_id: Optional[str] = None, prediction_results: Optional[Sequence[pypesto.predict.result.PredictionResult]] = None, lower_bound: Optional[Sequence[numpy.ndarray]] = None, upper_bound: Optional[Sequence[numpy.ndarray]] = None)[source]

Constructor.

Parameters
  • predictor – Prediction function, e.g., an AmiciPredictor, which takes a parameter vector as input and outputs a PredictionResult object

  • prediction_id – Identifier for the predictions

  • prediction_results – List of Prediction results

  • lower_bound – Array of potential lower bounds for the predictions, should have the same shape as the output of the predictions, i.e., a list of numpy array (one list entry per condition), with the arrays having the shape of n_timepoints x n_outputs for each condition.

  • upper_bound – array of potential upper bounds for the parameters

compute_summary(percentiles_list: Sequence[int] = (5, 20, 80, 95))Dict[source]

Compute the mean, the median, the standard deviation and possibly percentiles from the ensemble prediction results. Those summary results are added as a data member to the EnsemblePrediction object.

Parameters

percentiles_list – List or tuple of percent numbers for the percentiles

Returns

dictionary of predictions results with the keys mean, std, median, percentiles, …

Return type

summary

condense_to_arrays()[source]

This functions reshapes the predictions results to an array and adds them as a member to the EnsemblePrediction objects. It’s meant to be used only if all conditions of a prediction have the same observables, as this is often the case for large-scale data sets taken from online databases or similar.

class pypesto.ensemble.EnsembleType(value)[source]

Bases: enum.Enum

An enumeration.

ensemble = 1
sample = 2
unprocessed_chain = 3
class pypesto.ensemble.Enum(value)[source]

Bases: object

Generic enumeration.

Derive from this class to define new enumerations.

name

The name of the Enum member.

value

The value of the Enum member.

pypesto.ensemble.get_covariance_matrix_parameters(ens: pypesto.ensemble.ensemble.Ensemble)numpy.ndarray[source]

Compute the covariance of ensemble parameters.

Parameters

ens – Ensemble object containing a set of parameter vectors

Returns

covariance matrix of ensemble parameters

Return type

covariance_matrix

pypesto.ensemble.get_covariance_matrix_predictions(ens: Union[pypesto.ensemble.ensemble.Ensemble, pypesto.ensemble.ensemble.EnsemblePrediction], prediction_index: int = 0)numpy.ndarray[source]

Compute the covariance of ensemble predictions.

Parameters
  • ens – Ensemble object containing a set of parameter vectors and a set of predictions or EnsemblePrediction object containing only predictions

  • prediction_index – index telling which prediction from the list should be analyzed

Returns

covariance matrix of ensemble predictions

Return type

covariance_matrix

pypesto.ensemble.get_pca_representation_parameters(ens: pypesto.ensemble.ensemble.Ensemble, n_components: int = 2, rescale_data: bool = True, rescaler: Optional[Callable] = None)Tuple[source]

Compute the representation with reduced dimensionality via principal component analysis (with a given number of principal components) of the parameter ensemble.

Parameters
  • ens – Ensemble objects containing a set of parameter vectors

  • n_components – number of components for the dimension reduction

  • rescale_data – flag indicating whether the principal components should be rescaled using a rescaler function (e.g., an arcsinh function)

  • rescaler – callable function to rescale the output of the PCA (defaults to numpy.arcsinh)

Returns

  • principal_components – principal components of the parameter vector ensemble

  • pca_object – returned fitted pca object from sklearn.decomposition.PCA()

pypesto.ensemble.get_pca_representation_predictions(ens: Union[pypesto.ensemble.ensemble.Ensemble, pypesto.ensemble.ensemble.EnsemblePrediction], prediction_index: int = 0, n_components: int = 2, rescale_data: bool = True, rescaler: Optional[Callable] = None)Tuple[source]

Compute the representation with reduced dimensionality via principal component analysis (with a given number of principal components) of the ensemble prediction.

Parameters
  • ens – Ensemble objects containing a set of parameter vectors and a set of predictions or EnsemblePrediction object containing only predictions

  • prediction_index – index telling which prediction from the list should be analyzed

  • n_components – number of components for the dimension reduction

  • rescale_data – flag indicating whether the principal components should be rescaled using a rescaler function (e.g., an arcsinh function)

  • rescaler – callable function to rescale the output of the PCA (defaults to numpy.arcsinh)

Returns

  • principal_components – principal components of the parameter vector ensemble

  • pca_object – returned fitted pca object from sklearn.decomposition.PCA()

pypesto.ensemble.get_percentile_label(percentile: Union[float, int, str])str[source]

Convert a percentile to a label.

Labels for percentiles are used at different locations (e.g. ensemble prediction code, and visualization code). This method ensures that the same percentile is labeled identically everywhere.

The percentile is rounded to two decimal places in the label representation if it is specified to more decimal places. This is for readability in plotting routines, and to avoid float to string conversion issues related to float precision.

Parameters

percentile – The percentile value that will be used to generate a label.

Returns

Return type

The label of the (possibly rounded) percentile.

pypesto.ensemble.get_spectral_decomposition_lowlevel(matrix: numpy.ndarray, normalize: bool = False, only_separable_directions: bool = False, cutoff_absolute_separable: float = 1e-16, cutoff_relative_separable: float = 1e-16, only_identifiable_directions: bool = False, cutoff_absolute_identifiable: float = 1e-16, cutoff_relative_identifiable: float = 1e-16)Tuple[numpy.ndarray, numpy.ndarray][source]

Compute the spectral decomposition of ensemble parameters or predictions.

Parameters
  • matrix – symmetric matrix (typically a covariance matrix) of parameters or predictions

  • normalize – flag indicating whether the returned Eigenvalues should be normalized with respect to the largest Eigenvalue

  • only_separable_directions – return only separable directions according to cutoff_[absolute/relative]_separable

  • cutoff_absolute_separable – Consider only eigenvalues of the covariance matrix above this cutoff (only applied when only_separable_directions is True)

  • cutoff_relative_separable – Consider only eigenvalues of the covariance matrix above this cutoff, when rescaled with the largest eigenvalue (only applied when only_separable_directions is True)

  • only_identifiable_directions – return only identifiable directions according to cutoff_[absolute/relative]_identifiable

  • cutoff_absolute_identifiable – Consider only low eigenvalues of the covariance matrix with inverses above of this cutoff (only applied when only_identifiable_directions is True)

  • cutoff_relative_identifiable – Consider only low eigenvalues of the covariance matrix when rescaled with the largest eigenvalue with inverses above of this cutoff (only applied when only_identifiable_directions is True)

Returns

  • eigenvalues – Eigenvalues of the covariance matrix

  • eigenvectors – Eigenvectors of the covariance matrix

pypesto.ensemble.get_spectral_decomposition_parameters(ens: pypesto.ensemble.ensemble.Ensemble, normalize: bool = False, only_separable_directions: bool = False, cutoff_absolute_separable: float = 1e-16, cutoff_relative_separable: float = 1e-16, only_identifiable_directions: bool = False, cutoff_absolute_identifiable: float = 1e-16, cutoff_relative_identifiable: float = 1e-16)Tuple[numpy.ndarray, numpy.ndarray][source]

Compute the spectral decomposition of ensemble parameters.

Parameters
  • ens – Ensemble object containing a set of parameter vectors

  • normalize – flag indicating whether the returned Eigenvalues should be normalized with respect to the largest Eigenvalue

  • only_separable_directions – return only separable directions according to cutoff_[absolute/relative]_separable

  • cutoff_absolute_separable – Consider only eigenvalues of the covariance matrix above this cutoff (only applied when only_separable_directions is True)

  • cutoff_relative_separable – Consider only eigenvalues of the covariance matrix above this cutoff, when rescaled with the largest eigenvalue (only applied when only_separable_directions is True)

  • only_identifiable_directions – return only identifiable directions according to cutoff_[absolute/relative]_identifiable

  • cutoff_absolute_identifiable – Consider only low eigenvalues of the covariance matrix with inverses above of this cutoff (only applied when only_identifiable_directions is True)

  • cutoff_relative_identifiable – Consider only low eigenvalues of the covariance matrix when rescaled with the largest eigenvalue with inverses above of this cutoff (only applied when only_identifiable_directions is True)

Returns

  • eigenvalues – Eigenvalues of the covariance matrix

  • eigenvectors – Eigenvectors of the covariance matrix

pypesto.ensemble.get_spectral_decomposition_predictions(ens: pypesto.ensemble.ensemble.Ensemble, normalize: bool = False, only_separable_directions: bool = False, cutoff_absolute_separable: float = 1e-16, cutoff_relative_separable: float = 1e-16, only_identifiable_directions: bool = False, cutoff_absolute_identifiable: float = 1e-16, cutoff_relative_identifiable: float = 1e-16)Tuple[numpy.ndarray, numpy.ndarray][source]

Compute the spectral decomposition of ensemble predictions.

Parameters
  • ens – Ensemble object containing a set of parameter vectors and a set of predictions or EnsemblePrediction object containing only predictions

  • normalize – flag indicating whether the returned Eigenvalues should be normalized with respect to the largest Eigenvalue

  • only_separable_directions – return only separable directions according to cutoff_[absolute/relative]_separable

  • cutoff_absolute_separable – Consider only eigenvalues of the covariance matrix above this cutoff (only applied when only_separable_directions is True)

  • cutoff_relative_separable – Consider only eigenvalues of the covariance matrix above this cutoff, when rescaled with the largest eigenvalue (only applied when only_separable_directions is True)

  • only_identifiable_directions – return only identifiable directions according to cutoff_[absolute/relative]_identifiable

  • cutoff_absolute_identifiable – Consider only low eigenvalues of the covariance matrix with inverses above of this cutoff (only applied when only_identifiable_directions is True)

  • cutoff_relative_identifiable – Consider only low eigenvalues of the covariance matrix when rescaled with the largest eigenvalue with inverses above of this cutoff (only applied when only_identifiable_directions is True)

Returns

  • eigenvalues – Eigenvalues of the covariance matrix

  • eigenvectors – Eigenvectors of the covariance matrix

pypesto.ensemble.get_umap_representation_parameters(ens: pypesto.ensemble.ensemble.Ensemble, n_components: int = 2, normalize_data: bool = False, **kwargs)Tuple[source]

Compute the representation with reduced dimensionality via umap (with a given number of umap components) of the parameter ensemble. Allows to pass on additional keyword arguments to the umap routine.

Parameters
  • ens – Ensemble objects containing a set of parameter vectors

  • n_components – number of components for the dimension reduction

  • normalize_data – flag indicating whether the parameter ensemble should be rescaled with mean and standard deviation

Returns

  • umap_components – first components of the umap embedding

  • umap_object – returned fitted umap object from umap.UMAP()

pypesto.ensemble.get_umap_representation_predictions(ens: Union[pypesto.ensemble.ensemble.Ensemble, pypesto.ensemble.ensemble.EnsemblePrediction], prediction_index: int = 0, n_components: int = 2, normalize_data: bool = False, **kwargs)Tuple[source]

Compute the representation with reduced dimensionality via umap (with a given number of umap components) of the ensemble predictions. Allows to pass on additional keyword arguments to the umap routine.

Parameters
  • ens – Ensemble objects containing a set of parameter vectors and a set of predictions or EnsemblePrediction object containing only predictions

  • prediction_index – index telling which prediction from the list should be analyzed

  • n_components – number of components for the dimension reduction

  • normalize_data – flag indicating whether the parameter ensemble should be rescaled with mean and standard deviation

Returns

  • umap_components – first components of the umap embedding

  • umap_object – returned fitted umap object from umap.UMAP()

pypesto.ensemble.read_from_csv(path: str, sep: str = '\t', index_col: int = 0, headline_parser: Optional[Callable] = None, ensemble_type: Optional[pypesto.ensemble.constants.EnsembleType] = None, lower_bound: Optional[numpy.ndarray] = None, upper_bound: Optional[numpy.ndarray] = None)[source]

function for creating an ensemble from a csv file

Parameters
  • path – path to csv file to read in parameter ensemble

  • sep – separator in csv file

  • index_col – index column in csv file

  • headline_parser – A function which reads in the headline of the csv file and converts it into vector_tags (see constructor of Ensemble for more details)

  • ensemble_type – Ensemble type: representative sample or random ensemble

  • lower_bound – array of potential lower bounds for the parameters

  • upper_bound – array of potential upper bounds for the parameters

Returns

Ensemble object of parameter vectors

Return type

result

pypesto.ensemble.read_from_df(dataframe: pandas.core.frame.DataFrame, headline_parser: Optional[Callable] = None, ensemble_type: Optional[pypesto.ensemble.constants.EnsembleType] = None, lower_bound: Optional[numpy.ndarray] = None, upper_bound: Optional[numpy.ndarray] = None)[source]

function for creating an ensemble from a csv file

Parameters
  • dataframe – pandas.DataFrame to read in parameter ensemble

  • headline_parser – A function which reads in the headline of the csv file and converts it into vector_tags (see constructor of Ensemble for more details)

  • ensemble_type – Ensemble type: representative sample or random ensemble

  • lower_bound – array of potential lower bounds for the parameters

  • upper_bound – array of potential upper bounds for the parameters

Returns

Ensemble object of parameter vectors

Return type

result

pypesto.ensemble.write_ensemble_prediction_to_h5(ensemble_prediction: pypesto.ensemble.ensemble.EnsemblePrediction, output_file: str, base_path: Optional[str] = None)[source]