Statistics / Writing results¶
Statistics are functors that are called at the end of each iteration of the Bayesian optimizer. Their job is to:
- write the results to files;
- write the current state of the optimization;
- write the data that are useful for your own analyses.
All the statistics are written in a directory called hostname_date_pid-number. For instance: wallepro-perso.loria.fr_2016-05-13_16_16_09_72226
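The writing of these files is controlled by the stats_enabled parameter of bayes_opt_bobase (it appears again in the full example at the end of this tutorial). As a minimal sketch, a Params structure that disables all result writing could look like this:
struct Params {
    // ... the other parameter structures (see the full example below) ...

    // disable the writing of the result files
    struct bayes_opt_bobase : public defaults::bayes_opt_bobase {
        BO_PARAM(int, stats_enabled, false);
    };
};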
Limbo provides a few classes for common uses (see Statistics (stats) for details):
ConsoleSummary
: writes a summary to std::cout at each iteration of the algorithm
AggregatedObservations
: records the value of each evaluation of the function (after aggregation) [filename: aggregated_observations.dat]
BestAggregatedObservations
: records the best value observed so far after each iteration [filename: best_aggregated_observations.dat]
Observations
: records the value of each evaluation of the function (before aggregation) [filename: observations.dat]
Samples
: records the position in the search space of the evaluated points [filename: samples.dat]
BestObservations
: records the best observation after each iteration [filename: best_observations.dat]
BestSamples
: records the position in the search space of the best observation after each iteration [filename: best_samples.dat]
These statistics are for “advanced users”:
- GPAcquisitions
- GPKernelHParams
- GPLikelihood
- GPMeanHParams
The default statistics list is:
boost::fusion::vector<stat::Samples<Params>, stat::AggregatedObservations<Params>,
    stat::ConsoleSummary<Params>>
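If you only need a different combination of the built-in classes, you can define your own list and pass it to the optimizer via statsfun<> (the mechanism detailed in the next section). A minimal sketch, assuming the usual Params structure, that also records the best observations and the best samples:
// custom statistics list built only from the classes listed above
using my_stat_t = boost::fusion::vector<stat::ConsoleSummary<Params>,
    stat::Samples<Params>,
    stat::AggregatedObservations<Params>,
    stat::BestObservations<Params>,
    stat::BestSamples<Params>>;

// pass the list to the optimizer via statsfun<>
bayes_opt::BOptimizer<Params, statsfun<my_stat_t>> opt;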
Writing your own statistics class¶
Limbo only provides generic statistics classes. However, it is often useful to add user-defined statistics classes that are specific to a particular experiment.
All the statistics functors follow the same template:
template <typename Params>
struct Samples : public limbo::stat::StatBase<Params> {
    template <typename BO, typename AggregatorFunction>
    void operator()(const BO& bo, const AggregatorFunction&)
    {
        // code
    }
};
In a few words, a statistics functor receives the BO object (an instance of the Bayesian optimizer) and is free to do whatever it wants with it: typically, it queries the optimizer's state (observations, samples, number of iterations, ...) and writes it to a file.
For instance, we could add a statistics class that writes the worst observation at each iteration. Here is how to write this functor:
template <typename Params>
struct WorstObservation : public stat::StatBase<Params> {
    template <typename BO, typename AggregatorFunction>
    void operator()(const BO& bo, const AggregatorFunction& afun)
    {
        // [optional] if statistics have been disabled or if there are no observations, we do not do anything
        if (!bo.stats_enabled() || bo.observations().empty())
            return;

        // [optional] we create a file to write / you can use your own file but remember that this method is called at each iteration (you need to create it in the constructor)
        this->_create_log_file(bo, "worst_observations.dat");

        // [optional] we add a header to the file to make it easier to read later
        if (bo.total_iterations() == 0)
            (*this->_log_file) << "#iteration worst_observation sample" << std::endl;

        // ----- search for the worst observation ----
        // 1. get the aggregated observations
        auto rewards = std::vector<double>(bo.observations().size());
        std::transform(bo.observations().begin(), bo.observations().end(), rewards.begin(), afun);
        // 2. search for the worst element
        auto min_e = std::min_element(rewards.begin(), rewards.end());
        auto min_obs = bo.observations()[std::distance(rewards.begin(), min_e)];
        auto min_sample = bo.samples()[std::distance(rewards.begin(), min_e)];

        // ----- write what we have found ------
        // the file is (*this->_log_file)
        (*this->_log_file) << bo.total_iterations() << " " << min_obs.transpose() << " " << min_sample.transpose() << std::endl;
    }
};
In order to configure the Bayesian optimizer to use our new statistics class, we first need to define a new statistics list which includes our new WorstObservation:
// define a special list of statistics which include our new statistics class
using stat_t = boost::fusion::vector<stat::ConsoleSummary<Params>,
    stat::Samples<Params>,
    stat::Observations<Params>,
    WorstObservation<Params>>;
Then, we use it to define the optimizer:
bayes_opt::BOptimizer<Params, statsfun<stat_t>> boptimizer;
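Running the optimization then works as usual: the WorstObservation functor is called at the end of each iteration, next to the built-in statistics, and writes one line per iteration to worst_observations.dat. A short usage sketch, assuming the Eval functor defined in the full example below:
// run the optimization; WorstObservation fills worst_observations.dat
boptimizer.optimize(Eval());

// the usual accessors are still available
std::cout << "Best sample: " << boptimizer.best_sample()(0)
          << " - Best observation: " << boptimizer.best_observation()(0) << std::endl;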
The full source code is available in src/tutorials/statistics.cpp and reproduced here:
#include <iostream>
#include <limbo/limbo.hpp>

using namespace limbo;

struct Params {
    struct bayes_opt_boptimizer : public defaults::bayes_opt_boptimizer {
    };

    // depending on which internal optimizer we use, we need to import different parameters
#ifdef USE_NLOPT
    struct opt_nloptnograd : public defaults::opt_nloptnograd {
    };
#elif defined(USE_LIBCMAES)
    struct opt_cmaes : public defaults::opt_cmaes {
    };
#else
    struct opt_gridsearch : public defaults::opt_gridsearch {
    };
#endif

    // enable / disable the writing of the result files
    struct bayes_opt_bobase : public defaults::bayes_opt_bobase {
        BO_PARAM(int, stats_enabled, true);
    };

    // no noise
    struct kernel : public defaults::kernel {
        BO_PARAM(double, noise, 1e-10);
    };

    struct kernel_maternfivehalves : public defaults::kernel_maternfivehalves {
    };

    // we use 10 random samples to initialize the algorithm
    struct init_randomsampling {
        BO_PARAM(int, samples, 10);
    };

    // we stop after 40 iterations
    struct stop_maxiterations {
        BO_PARAM(int, iterations, 40);
    };

    // we use the default parameters for acqui_ucb
    struct acqui_ucb : public defaults::acqui_ucb {
    };
};

struct Eval {
    // number of input dimensions (x.size())
    BO_PARAM(size_t, dim_in, 1);
    // number of dimensions of the result (res.size())
    BO_PARAM(size_t, dim_out, 1);

    // the function to be optimized
    Eigen::VectorXd operator()(const Eigen::VectorXd& x) const
    {
        double y = -((5 * x(0) - 2.5) * (5 * x(0) - 2.5)) + 5;
        // we return a 1-dimensional vector
        return tools::make_vector(y);
    }
};

template <typename Params>
struct WorstObservation : public stat::StatBase<Params> {
    template <typename BO, typename AggregatorFunction>
    void operator()(const BO& bo, const AggregatorFunction& afun)
    {
        // [optional] if statistics have been disabled or if there are no observations, we do not do anything
        if (!bo.stats_enabled() || bo.observations().empty())
            return;

        // [optional] we create a file to write / you can use your own file but remember that this method is called at each iteration (you need to create it in the constructor)
        this->_create_log_file(bo, "worst_observations.dat");

        // [optional] we add a header to the file to make it easier to read later
        if (bo.total_iterations() == 0)
            (*this->_log_file) << "#iteration worst_observation sample" << std::endl;

        // ----- search for the worst observation ----
        // 1. get the aggregated observations
        auto rewards = std::vector<double>(bo.observations().size());
        std::transform(bo.observations().begin(), bo.observations().end(), rewards.begin(), afun);
        // 2. search for the worst element
        auto min_e = std::min_element(rewards.begin(), rewards.end());
        auto min_obs = bo.observations()[std::distance(rewards.begin(), min_e)];
        auto min_sample = bo.samples()[std::distance(rewards.begin(), min_e)];

        // ----- write what we have found ------
        // the file is (*this->_log_file)
        (*this->_log_file) << bo.total_iterations() << " " << min_obs.transpose() << " " << min_sample.transpose() << std::endl;
    }
};

int main()
{
    // we use the default acquisition function / model / stat / etc.

    // define a special list of statistics which include our new statistics class
    using stat_t = boost::fusion::vector<stat::ConsoleSummary<Params>,
        stat::Samples<Params>,
        stat::Observations<Params>,
        WorstObservation<Params>>;

    /// remember to use the new statistics vector via statsfun<>!
    bayes_opt::BOptimizer<Params, statsfun<stat_t>> boptimizer;
    // run the evaluation
    boptimizer.optimize(Eval());
    // the best sample found
    std::cout << "Best sample: " << boptimizer.best_sample()(0) << " - Best observation: " << boptimizer.best_observation()(0) << std::endl;
    return 0;
}