Storage

This notebook illustrates how simulations and results can be saved to file.

[1]:
import pypesto
import pypesto.optimize as optimize
import pypesto.visualize as visualize
from pypesto.store import (save_to_hdf5, read_from_hdf5)

import numpy as np
import scipy as sp
import scipy.optimize  # make sure the sp.optimize submodule is loaded
import matplotlib.pyplot as plt
import tempfile

%matplotlib inline

Define the objective and problem

[2]:
objective = pypesto.Objective(fun=sp.optimize.rosen,
                              grad=sp.optimize.rosen_der,
                              hess=sp.optimize.rosen_hess)

dim_full = 10
lb = -3 * np.ones((dim_full, 1))
ub = 3 * np.ones((dim_full, 1))

problem = pypesto.Problem(objective=objective, lb=lb, ub=ub)

# create the optimizer
optimizer = optimize.ScipyOptimizer(method='l-bfgs-b')

# set number of starts
n_starts = 20

Objective function traces

During optimization, the objective function trace can be written to file at regular intervals. This is useful, e.g., when runs fail, or for various diagnostics. Currently, pyPESTO can save histories to three backends: in memory, as CSV files, or as HDF5 files.
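To make the idea of a trace concrete, here is a minimal sketch of an in-memory recorder. It assumes nothing about pyPESTO internals: `TraceRecorder` and `rosenbrock` are hypothetical names used only for illustration and are not part of the pyPESTO API. The wrapper simply appends every evaluated point and its function value to a list.

```python
from typing import Callable, List, Sequence, Tuple


def rosenbrock(x: Sequence[float]) -> float:
    """Plain-Python Rosenbrock function (same formula as scipy.optimize.rosen)."""
    return sum(
        100.0 * (x[i + 1] - x[i] ** 2) ** 2 + (1.0 - x[i]) ** 2
        for i in range(len(x) - 1)
    )


class TraceRecorder:
    """Hypothetical wrapper that records every function evaluation in memory."""

    def __init__(self, fun: Callable[[Sequence[float]], float]) -> None:
        self.fun = fun
        self.trace: List[Tuple[Tuple[float, ...], float]] = []

    def __call__(self, x: Sequence[float]) -> float:
        fval = self.fun(x)
        # store a copy of the point together with its function value
        self.trace.append((tuple(x), fval))
        return fval

    def fval_trace(self) -> List[float]:
        """Return only the recorded function values, in call order."""
        return [fval for _, fval in self.trace]


recorded = TraceRecorder(rosenbrock)
for point in ([0.0, 0.0], [0.5, 0.5], [1.0, 1.0]):
    recorded(point)

print(recorded.fval_trace())  # [1.0, 6.5, 0.0]
```

A CSV or HDF5 backend follows the same pattern, except that each recorded entry is written to disk instead of, or in addition to, being kept in a list.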

Memory History

To record the history in-memory, just set trace_record=True in the pypesto.HistoryOptions. Then, the optimization result contains those histories:

[3]:
# record the history
history_options = pypesto.HistoryOptions(trace_record=True)

# Run optimizations
result = optimize.minimize(
    problem=problem, optimizer=optimizer,
    n_starts=n_starts, history_options=history_options)

Now, in addition to queries on the result, we can also access the recorded history of each start:

[4]:
print("History type: ", type(result.optimize_result.list[0].history))
# print("Function value trace of best run: ", result.optimize_result.list[0].history.get_fval_trace())

fig, ax = plt.subplots(1, 2)
visualize.waterfall(result, ax=ax[0])
visualize.optimizer_history(result, ax=ax[1])
fig.set_size_inches((15, 5))
History type:  <class 'pypesto.objective.history.MemoryHistory'>
../_images/example_store_9_1.png

CSV History

The in-memory history is lost once the Python session ends. To keep it, the history can be written to CSV or HDF5 instead. This is specified via the storage_file option. If it ends in .csv, a pypesto.objective.history.CsvHistory will be employed; if it ends in .hdf5, a pypesto.objective.history.Hdf5History. Occurrences of the substring {id} in the filename are replaced by the multistart id, which makes it possible to maintain a separate file per run (this is necessary for CSV, as otherwise runs are overwritten).

[5]:
# record the history and store to CSV
history_options = pypesto.HistoryOptions(trace_record=True, storage_file='history_{id}.csv')

# Run optimizations
result = optimize.minimize(
    problem=problem, optimizer=optimizer,
    n_starts=n_starts, history_options=history_options)

Note that for this simple cost function, saving to CSV takes a considerable amount of time. This overhead decreases for more costly simulators, e.g. using ODE simulations via AMICI.
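The effect of the {id} placeholder in the storage filename can be pictured with a small standard-library sketch. This is not pyPESTO's implementation: the trace values are made-up numbers, and the column names `iteration` and `fval` are assumptions chosen for illustration, not the actual CSV layout that pyPESTO writes.

```python
import csv
import tempfile
from pathlib import Path

# filename template as used in the text; "{id}" marks the multistart id
template = "history_{id}.csv"

# illustrative traces for two starts (made-up numbers, not pyPESTO output)
traces = {"0": [24.2, 3.1, 0.4], "1": [101.0, 7.5]}

out_dir = Path(tempfile.mkdtemp())
for start_id, fvals in traces.items():
    # substitute the placeholder so that each start gets its own file
    path = out_dir / template.replace("{id}", start_id)
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["iteration", "fval"])  # assumed column names
        writer.writerows(enumerate(fvals))

written = sorted(p.name for p in out_dir.iterdir())
print(written)  # ['history_0.csv', 'history_1.csv']
```

Without the placeholder, both starts would open the same path in write mode, and the second start would replace the file written by the first.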

[6]:
print("History type: ", type(result.optimize_result.list[0].history))
# print("Function value trace of best run: ", result.optimize_result.list[0].history.get_fval_trace())

fig, ax = plt.subplots(1, 2)
visualize.waterfall(result, ax=ax[0])
visualize.optimizer_history(result, ax=ax[1])
fig.set_size_inches((15, 5))
History type:  <class 'pypesto.objective.history.CsvHistory'>
../_images/example_store_14_1.png

HDF5 History

TODO: This is not fully implemented yet (it’s on the way …).

Result storage

Result objects can be stored to HDF5 files. When applicable, this is preferable to just pickling results, which is not guaranteed to be reproducible in the future.

[7]:
# Run optimizations
result = optimize.minimize(
    problem=problem, optimizer=optimizer,
    n_starts=n_starts)
[8]:
result.optimize_result.list[0:2]
[8]:
[{'id': '17',
  'x': array([0.99999994, 0.99999994, 1.        , 1.00000003, 1.00000011,
         1.00000009, 1.00000002, 0.99999991, 0.99999978, 0.99999952]),
  'fval': 8.707800564711112e-12,
  'grad': array([-2.31616041e-05, -3.81308795e-05,  1.32978065e-05, -1.23392144e-05,
          6.52303854e-05,  3.58850228e-05,  1.86401788e-05, -7.46042767e-06,
          8.02520832e-06, -8.71388750e-06]),
  'hess': None,
  'res': None,
  'sres': None,
  'n_fval': 87,
  'n_grad': 87,
  'n_hess': 0,
  'n_res': 0,
  'n_sres': 0,
  'x0': array([ 1.45114268,  2.06074379,  1.64058197,  0.6213187 ,  2.28867279,
          0.20877178,  1.83054994, -0.35049857, -2.66672642, -2.75180939]),
  'fval0': 16215.296810239959,
  'history': <pypesto.objective.history.History at 0x7f8dbd6ae070>,
  'exitflag': 0,
  'time': 0.020003557205200195,
  'message': b'CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH'},
 {'id': '7',
  'x': array([0.99999998, 0.99999991, 0.99999996, 1.00000002, 1.00000011,
         1.00000024, 1.00000032, 1.00000046, 1.00000083, 1.00000177]),
  'fval': 1.2244681497661217e-11,
  'grad': array([ 1.82728495e-05, -6.74518178e-05, -1.27149830e-05, -2.05128948e-06,
          3.27446361e-06,  6.39483721e-05,  4.57675698e-05, -4.81356983e-06,
         -5.53900259e-05,  2.06167771e-05]),
  'hess': None,
  'res': None,
  'sres': None,
  'n_fval': 91,
  'n_grad': 91,
  'n_hess': 0,
  'n_res': 0,
  'n_sres': 0,
  'x0': array([ 0.80798177, -0.91430344, -2.6742686 , -1.76685642,  0.16784518,
          1.70273894,  0.03732323,  2.71928657,  1.29546904, -2.9200907 ]),
  'fval0': 18006.95154502575,
  'history': <pypesto.objective.history.History at 0x7f8dbd6ae6a0>,
  'exitflag': 0,
  'time': 0.024848461151123047,
  'message': b'CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH'}]

As usual, having obtained our result, we can directly perform some plots:

[9]:
# plot waterfalls
visualize.waterfall(result, size=(15,6))
[9]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f8dbd812d00>
../_images/example_store_21_1.png

Save optimization result as HDF5 file

The optimization result can be saved via a pypesto.store.OptimizationResultHDF5Writer.

[10]:
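import os

# create the file inside a fresh temporary directory
# (tempfile.mktemp is deprecated and prone to race conditions)
fn = os.path.join(tempfile.mkdtemp(), "result.hdf5")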

# Write result
hdf5_writer = save_to_hdf5.OptimizationResultHDF5Writer(fn)
hdf5_writer.write(result)

# Write problem
hdf5_writer = save_to_hdf5.ProblemHDF5Writer(fn)
hdf5_writer.write(problem)

Read optimization result from HDF5 file

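When reading the stored result back in, we recover the values of the original optimization result. As the output below shows, only the history entries are not restored (they are None after reading):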

[11]:
# Read result
hdf5_reader = read_from_hdf5.OptimizationResultHDF5Reader(fn)
result = hdf5_reader.read()
[12]:
result.optimize_result.list[0:2]
[12]:
[{'id': '17',
  'x': array([0.99999994, 0.99999994, 1.        , 1.00000003, 1.00000011,
         1.00000009, 1.00000002, 0.99999991, 0.99999978, 0.99999952]),
  'fval': 8.707800564711112e-12,
  'grad': array([-2.31616041e-05, -3.81308795e-05,  1.32978065e-05, -1.23392144e-05,
          6.52303854e-05,  3.58850228e-05,  1.86401788e-05, -7.46042767e-06,
          8.02520832e-06, -8.71388750e-06]),
  'hess': None,
  'res': None,
  'sres': None,
  'n_fval': 87,
  'n_grad': 87,
  'n_hess': 0,
  'n_res': 0,
  'n_sres': 0,
  'x0': array([ 1.45114268,  2.06074379,  1.64058197,  0.6213187 ,  2.28867279,
          0.20877178,  1.83054994, -0.35049857, -2.66672642, -2.75180939]),
  'fval0': 16215.296810239959,
  'history': None,
  'exitflag': 0,
  'time': 0.020003557205200195,
  'message': b'CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH'},
 {'id': '7',
  'x': array([0.99999998, 0.99999991, 0.99999996, 1.00000002, 1.00000011,
         1.00000024, 1.00000032, 1.00000046, 1.00000083, 1.00000177]),
  'fval': 1.2244681497661217e-11,
  'grad': array([ 1.82728495e-05, -6.74518178e-05, -1.27149830e-05, -2.05128948e-06,
          3.27446361e-06,  6.39483721e-05,  4.57675698e-05, -4.81356983e-06,
         -5.53900259e-05,  2.06167771e-05]),
  'hess': None,
  'res': None,
  'sres': None,
  'n_fval': 91,
  'n_grad': 91,
  'n_hess': 0,
  'n_res': 0,
  'n_sres': 0,
  'x0': array([ 0.80798177, -0.91430344, -2.6742686 , -1.76685642,  0.16784518,
          1.70273894,  0.03732323,  2.71928657,  1.29546904, -2.9200907 ]),
  'fval0': 18006.95154502575,
  'history': None,
  'exitflag': 0,
  'time': 0.024848461151123047,
  'message': b'CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH'}]
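One way to check such a round trip programmatically is to compare the per-start entries field by field. The following is a minimal sketch: `same_start` is a hypothetical helper, not a pyPESTO function, and the two dictionaries are stand-ins for entries of result.optimize_result.list, with x abbreviated to its first three components from the output above.

```python
import math
from typing import Mapping


def same_start(a: Mapping, b: Mapping, rel_tol: float = 1e-12) -> bool:
    """Hypothetical helper: compare id, fval and x of two per-start results."""
    return (
        a["id"] == b["id"]
        and math.isclose(a["fval"], b["fval"], rel_tol=rel_tol)
        and len(a["x"]) == len(b["x"])
        and all(
            math.isclose(p, q, rel_tol=rel_tol)
            for p, q in zip(a["x"], b["x"])
        )
    )


# stand-ins for one entry before and after the HDF5 round trip;
# the history field differs (object vs. None) and is deliberately ignored
before = {
    "id": "17",
    "fval": 8.707800564711112e-12,
    "x": [0.99999994, 0.99999994, 1.0],
    "history": object(),
}
after = {
    "id": "17",
    "fval": 8.707800564711112e-12,
    "x": [0.99999994, 0.99999994, 1.0],
    "history": None,
}

print(same_start(before, after))  # True
```

Note that in this notebook the name result is reassigned when reading from file, so the original result would have to be kept under a different name to run such a comparison.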
[13]:
# plot waterfalls
visualize.waterfall(result, size=(15,6))
[13]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f8dbd534a60>
../_images/example_store_27_1.png