gpp_test_utils¶
Contents:
gpp_test_utils.hpp¶
This file declares functions and classes that are useful for unit testing. This includes some relative/absolute precision checks and a few mathematical utilities.
The “big stuff” in this file is a class that defines the interface for a pingable function:
PingableMatrixInputVectorOutputInterface
Then there is a PingDerivative() function that can conduct ping testing on any implementer of this interface.
There’s also a mock environment class that sets up quantities commonly needed by tests of GP functionality. Similarly, there is a mock data class that builds up “history” (points_sampled, points_sampled_value) by constructing a GP with random hyperparameters on a random domain.
Test for gpp_test_utils.hpp: utilities that are useful in writing other tests, like pinging, error checking, etc.
namespace boost
class uniform_real
namespace optimal_learning
Macro to allow restrict as a keyword for C++ compilation and CUDA/nvcc compilation. See related entry in gpp_common.hpp for more details.
class MockExpectedImprovementEnvironment
Class to conveniently hold and generate random data that are commonly needed for testing functions in gpp_math.cpp. In particular, this mock is used for testing GP mean, GP variance, and expected improvement (and their gradients).
This class holds arrays: points_sampled, points_sampled_value, points_to_sample, and points_being_sampled which are sized according to the parameters specified in Initialize(), and filled with random numbers.
TODO(GH-125): we currently generate the point sets by repeated calls to rand(). This is generally unwise since the distribution of points is not particularly random. Additionally, our current covariance functions are all stationary, so we would rather generate a random base point x and then a random (direction, radius) pair so that y = x + direction*radius. This would better cover the different behavioral regimes of our code, since it is the radius value that actually correlates with results.
Public Types
typedef UniformRandomGenerator::EngineType EngineType
Public Functions
MockExpectedImprovementEnvironment()
Construct a MockExpectedImprovementEnvironment and set invalid values for all size parameters (so that Initialize() must be called to do anything useful) and pre-allocate some space.
void Initialize(int dim_in, int num_to_sample_in, int num_being_sampled_in, int num_sampled_in)
(Re-)initializes the data in this object: this includes space allocation and random number generation.
If any of the size parameters are changed from their current values, space will be realloc’d. Then it re-draws another set of uniform random points (in [-5, 5]) for the member arrays points_sampled, points_sampled_value, points_to_sample, and points_being_sampled.
- Parameters:
  - dim: the spatial dimension of a point (i.e., number of independent params in experiment)
  - num_to_sample: number of points to be sampled in future experiments
  - num_being_sampled: number of points being sampled concurrently
  - num_sampled: number of already-sampled points

void Initialize(int dim_in, int num_to_sample_in, int num_being_sampled_in, int num_sampled_in, UniformRandomGenerator * uniform_generator)
double * points_sampled()
double * points_sampled_value()
double * points_to_sample()
double * points_being_sampled()
OL_DISALLOW_COPY_AND_ASSIGN(MockExpectedImprovementEnvironment)
Public Members
int dim
spatial dimension (e.g., entries per point of points_sampled)
int num_sampled
number of points in points_sampled (history)
int num_to_sample
number of points to be sampled in future experiments (i.e., the q in q,p-EI)
int num_being_sampled
number of points currently being sampled (i.e., the p in q,p-EI)
Public Static Attributes
constexpr EngineType::result_type kDefaultSeed
default seed for repeatability in testing
constexpr double range_min
minimum coordinate value
constexpr double range_max
maximum coordinate value
Private Members
std::vector< double > points_sampled_
coordinates of already-sampled points, X
std::vector< double > points_sampled_value_
function values at points_sampled, y
std::vector< double > points_to_sample_
points to be sampled in experiments (i.e., the q in q,p-EI)
std::vector< double > points_being_sampled_
points being sampled in concurrent experiments (i.e., the p in q,p-EI)
UniformRandomGenerator uniform_generator_
uniform random number generator for generating coordinates
boost::uniform_real < double > uniform_double_
distribution over [min, max] that coordinate values lie in
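A rough usage sketch of this mock (illustrative only, not library code; the include path and the main() scaffolding are assumptions):

#include <cstdio>

#include "gpp_test_utils.hpp"

using optimal_learning::MockExpectedImprovementEnvironment;

int main() {
  MockExpectedImprovementEnvironment ei_environment;

  // dim = 3; q = 2 points to sample; p = 1 point being sampled; 10 history points.
  ei_environment.Initialize(3, 2, 1, 10);

  // points_sampled() now holds 10 random points in [-5, 5]^3 with matching
  // (random) values in points_sampled_value(); feed these to the GP code under test.
  std::printf("first coordinate: %f\n", ei_environment.points_sampled()[0]);

  // Re-Initialize with different sizes: space is realloc'd and fresh random data drawn.
  ei_environment.Initialize(2, 1, 0, 5);
  return 0;
}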
class MockGaussianProcessPriorData
Struct to generate data randomly from a GaussianProcess. This object contains the generated GaussianProcess as well as the inputs needed to generate it (e.g., hyperparameters, domain, etc).
This struct is intended for convenience (so that test writers do not need to repeat these lines in every test that builds its input data from a GP). It has a fire-and-forget constructor that builds all fields randomly, but it also exposes all internal state/functions used by that ctor so that power users can customize their test scenarios further.
Implementation: this object uses std::unique_ptr to hide complex object definitions in the corresponding cpp file
Public Functions
MockGaussianProcessPriorData(const CovarianceInterface & covariance, const std::vector< double > & noise_variance_in, int dim_in, int num_sampled_in)
Construct an empty MockGaussianProcessPriorData. Member int, double, and vector fields are initialized appropriately. covariance_ptr is cloned from covariance. BUT domain_ptr and gaussian_process_ptr ARE NOT initialized. Use this class's member functions to properly initialize these more complex data.
- Parameters:
  - covariance: the CovarianceInterface object encoding assumptions about the GP's behavior on our data
  - noise_variance: the \sigma_n^2 (noise variance) associated w/observation; the i-th entry will be associated with the i-th point generated by the GP
  - dim: the spatial dimension of a point (i.e., number of independent params in experiment)
  - num_sampled: number of already-sampled points (that we want the GP to hold)

MockGaussianProcessPriorData(const CovarianceInterface & covariance, const std::vector< double > & noise_variance_in, int dim_in, int num_sampled_in, const boost::uniform_real < double > & uniform_double_domain_lower, const boost::uniform_real < double > & uniform_double_domain_upper, const boost::uniform_real < double > & uniform_double_hyperparameters, UniformRandomGenerator * uniform_generator)
Completely constructs a MockGaussianProcessPriorData, initializing all fields. Builds a GP based on a randomly generated domain and hyperparameters.
- Parameters:
  - covariance: the CovarianceInterface object encoding assumptions about the GP's behavior on our data
  - noise_variance: the \sigma_n^2 (noise variance) associated w/observation; the i-th entry will be associated with the i-th point generated by the GP
  - dim: the spatial dimension of a point (i.e., number of independent params in experiment)
  - num_sampled: number of already-sampled points (that we want the GP to hold)
  - uniform_double_domain_lower: [min, max] range from which to draw domain lower bounds
  - uniform_double_domain_upper: [min, max] range from which to draw domain upper bounds
  - uniform_double_hyperparameters: [min, max] range from which to draw hyperparameters
  - uniform_generator[1]: a UniformRandomGenerator object providing the random engine for uniform random numbers
- Outputs:
  - uniform_generator[1]: UniformRandomGenerator object will have its state changed due to random draws

~MockGaussianProcessPriorData()
Prevent an inline destructor: the dtor of std::unique_ptr<T> needs access to T's dtor (b/c unique_ptr's dtor basically calls delete on T*). But we want to forward-declare all of our T objects, so the dtor must be defined in the cpp file where those definitions are visible.
void InitializeHyperparameters(const boost::uniform_real < double > & uniform_double_hyperparameters, UniformRandomGenerator * uniform_generator)
Sets hyperparameters of covariance with random draws from the specified interval. Modifies: hyperparameters, covariance_ptr.
- Parameters:
  - uniform_double_hyperparameters: [min, max] range from which to draw hyperparameters
  - uniform_generator[1]: a UniformRandomGenerator object providing the random engine for uniform random numbers
- Outputs:
  - uniform_generator[1]: UniformRandomGenerator object will have its state changed due to random draws

void InitializeDomain(const boost::uniform_real < double > & uniform_double_domain_lower, const boost::uniform_real < double > & uniform_double_domain_upper, UniformRandomGenerator * uniform_generator)
Sets the domain from which the GP's historical data will be generated. For each dimension, we draw [min, max] bounds (domain_bounds); then we construct a domain object. Modifies: domain_bounds, domain_ptr.
- Parameters:
  - uniform_double_domain_lower: [min, max] range from which to draw domain lower bounds
  - uniform_double_domain_upper: [min, max] range from which to draw domain upper bounds
  - uniform_generator[1]: a UniformRandomGenerator object providing the random engine for uniform random numbers
- Outputs:
  - uniform_generator[1]: UniformRandomGenerator object will have its state changed due to random draws

void InitializeGaussianProcess(UniformRandomGenerator * uniform_generator)
Builds a GaussianProcess with num_sampled points (drawn randomly from the domain) whose values are drawn randomly from the GP (sampled one at a time and added to the prior). Users MUST call InitializeHyperparameters() and InitializeDomain() (or otherwise initialize hyperparameters and domain_ptr) before calling this function. Modifies: covariance_ptr, best_so_far, gaussian_process_ptr.
- Parameters:
  - uniform_generator[1]: a UniformRandomGenerator object providing the random engine for uniform random numbers
- Outputs:
  - uniform_generator[1]: UniformRandomGenerator object will have its state changed due to random draws

Public Members
int dim
spatial dimension (e.g., entries per point of points_sampled)
int num_sampled
number of points in points_sampled (history)
std::vector< ClosedInterval > domain_bounds
set of [min, max] bounds for the coordinates of each dimension
std::unique_ptr< DomainType > domain_ptr
the domain the GP will live in
std::unique_ptr< CovarianceInterface > covariance_ptr
covariance class (for computing covariance and its gradients); encodes assumptions about GP’s behavior on data
std::vector< double > hyperparameters
hyperparameters of the covariance function
std::vector< double > noise_variance
\sigma_n^2, the noise variance
double best_so_far
lowest function value seen so far
std::unique_ptr< GaussianProcess > gaussian_process_ptr
the GP constructed by this object, using points sampled from domain and the stored covariance
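A sketch of the fire-and-forget construction path (assumptions: MockGaussianProcessPriorData is parameterized on a DomainType, as the domain_ptr member suggests; TensorProductDomain, SquareExponential, the include paths, and all numeric values here are illustrative and should be checked against the headers):

#include <vector>

#include "gpp_covariance.hpp"
#include "gpp_domain.hpp"
#include "gpp_random.hpp"
#include "gpp_test_utils.hpp"

using namespace optimal_learning;

int main() {
  const int dim = 3;
  const int num_sampled = 20;
  SquareExponential covariance(dim, 1.0, 1.0);  // assumed covariance class/ctor
  std::vector<double> noise_variance(num_sampled, 1.0e-2);

  // Ranges for the random domain bounds and hyperparameters; fixed seed for repeatability.
  boost::uniform_real<double> uniform_double_domain_lower(-2.0, -0.5);
  boost::uniform_real<double> uniform_double_domain_upper(0.5, 2.0);
  boost::uniform_real<double> uniform_double_hyperparameters(0.5, 1.5);
  UniformRandomGenerator uniform_generator(314);

  // Fire-and-forget ctor: draws hyperparameters and a domain, then builds a GP
  // holding num_sampled randomly generated points.
  MockGaussianProcessPriorData<TensorProductDomain> mock_gp_data(
      covariance, noise_variance, dim, num_sampled,
      uniform_double_domain_lower, uniform_double_domain_upper,
      uniform_double_hyperparameters, &uniform_generator);

  // mock_gp_data.gaussian_process_ptr now points to a ready-to-test GaussianProcess.
  return 0;
}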
class PingableMatrixInputVectorOutputInterface
Class to enable numerical and analytic differentiation of functions of the form:
f_{k} = f(X_{d,i})
with derivatives taken wrt each member of X_{d,i},
gradf_{k,d,i} = \frac{\partial f_k}{\partial X_{d,i}}
In the nomenclature used in the class:
- d indexes over num_rows (set in GetInputSizes())
- i indexes over num_cols (set in GetInputSizes())
- k indexes over GetOutputSize()
Typically d is the spatial_dimension of the problem. So if i ranges over 1 .. num_points, then X_{d,i} is a matrix of num_points points each with dimension spatial_dim. And k refers to num_outputs.
X_{d,i} can of course be any arbitrary matrix; d, i need not refer to spatial dimension and num_points. But that is currently the most common use case.
This class enables easy pinging of a multitude of f, X combinations. Since it abstracts away indexing, it does not limit how implementations store/compute f() and its gradient.
Generally, usage goes as follows:
- Use GetInputSizes(), GetOutputSize(), and possibly GetGradientsSize() to inspect the dimensions of the problem
- EvaluateAndStoreAnalyticGradient(): compute and internally store the gradient evaluated at a given input*
- GetAnalyticGradient: returns the value of the analytic gradient for a given output (k), wrt a given point (d,i)
- EvaluateFunction: returns all outputs of the function for a given input
* It is not necessary to fully evaluate the gradient here. Instead, the input point can be stored and evaluation can happen on-the-fly in GetAnalyticGradient() if desired.
So to ping a derivative, you can:
f_p = EvaluateFunction(X + h), f_m = EvaluateFunction(X - h)
Compare:
(f_p - f_m)/(2h)
to
GetAnalyticGradient
See PingDerivative() docs for more details.
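For concreteness, here is a hypothetical implementer of this interface for the simple map f_k(X) = \sum_d X_{d,k}^3, whose analytic gradient is 3*X_{d,i}^2 when i = k and 0 otherwise. This is a sketch: the exact virtual-function qualifiers (const/noexcept) and the gradient storage layout are assumptions to match against the header, and restrict here is the macro from gpp_common.hpp:

#include <algorithm>
#include <vector>

#include "gpp_test_utils.hpp"

namespace optimal_learning {

// Hypothetical pingable implementer: f_k(X) = sum_d X_{d,k}^3, with X stored
// column-major (row index d varies fastest), matching point_list[dim][num_points].
class PingCubicColumnSum final : public PingableMatrixInputVectorOutputInterface {
 public:
  PingCubicColumnSum(int num_rows, int num_cols)
      : num_rows_(num_rows), num_cols_(num_cols), input_(num_rows*num_cols) {}

  virtual void GetInputSizes(int * num_rows, int * num_cols) const noexcept override {
    *num_rows = num_rows_;
    *num_cols = num_cols_;
  }

  virtual int GetOutputSize() const noexcept override {
    return num_cols_;  // one output per column (point) of X
  }

  virtual int GetGradientsSize() const noexcept override {
    return num_rows_*num_cols_*GetOutputSize();
  }

  virtual void EvaluateAndStoreAnalyticGradient(double const * restrict input_matrix,
                                                double * restrict gradients) noexcept override {
    // Storing the input is sufficient; the gradient is formed on the fly
    // in GetAnalyticGradient().
    std::copy(input_matrix, input_matrix + num_rows_*num_cols_, input_.begin());
    if (gradients != nullptr) {
      for (int k = 0; k < num_cols_; ++k) {      // output index
        for (int i = 0; i < num_cols_; ++i) {    // input column
          for (int d = 0; d < num_rows_; ++d) {  // input row (fastest)
            gradients[d + num_rows_*(i + num_cols_*k)] = GetAnalyticGradient(d, i, k);
          }
        }
      }
    }
  }

  virtual double GetAnalyticGradient(int row_index, int column_index, int output_index) const override {
    if (column_index != output_index) {
      return 0.0;  // output k depends only on column k of X
    }
    const double x_di = input_[column_index*num_rows_ + row_index];
    return 3.0*x_di*x_di;
  }

  virtual void EvaluateFunction(double const * restrict input_matrix,
                                double * restrict function_values) const noexcept override {
    for (int i = 0; i < num_cols_; ++i) {
      double sum = 0.0;
      for (int d = 0; d < num_rows_; ++d) {
        const double x_di = input_matrix[i*num_rows_ + d];
        sum += x_di*x_di*x_di;
      }
      function_values[i] = sum;
    }
  }

 private:
  int num_rows_, num_cols_;
  std::vector<double> input_;
};

}  // namespace optimal_learning

A cubic is a convenient test subject: its centered difference error is exactly h^2, so the convergence rate computed by PingDerivative() (see below) is a clean 2.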
Public Functions
void GetInputSizes(int * num_rows, int * num_cols)
Number of rows and columns of the input, X_{d,i}, to f().
For example, the input might be a N_d X N_i matrix, points_to_sample, where N_d = spatial dimension (rows) and N_i = number of points (columns).
- Outputs:
  - num_rows[1]: the number of rows of the input matrix X
  - num_cols[1]: the number of columns of the input matrix X

int GetOutputSize()
Number of outputs of the function f_k = f(X_{d,i}); i.e., length(f_k).
- Returns:
- The number of entries in f_k aka number of outputs of f()
int GetGradientsSize()Number of entries in the gradient of the output wrt each entry of the input.
This should generally not be used unless you require direct access to the analytic gradient.
- Returns:
  - MUST be num_rows*num_cols*GetOutputSize(), or invalid memory reads/writes may occur.
void EvaluateAndStoreAnalyticGradient(double const *restrict input_matrix, double *restrict gradients)
Set up so that GetAnalyticGradient(row, column, output) will be able to return gradf[row][column][output] evaluated at X, "input_matrix."
Typically this will entail computing and storing the analytic gradient. But the only thing that needs to be saved is the contents of input_matrix for later access.
MUST BE CALLED before using GetAnalyticGradient!
- Parameters:
  - input_matrix[num_rows][num_cols]: the input, X_{d,i}
- Outputs:
  - gradients[num_rows][num_cols][num_outputs]: filled with the gradient evaluated at input_matrix. Ignored if nullptr. IMPLEMENTATION NOT REQUIRED.

double GetAnalyticGradient(int row_index, int column_index, int output_index)
The gradients are indexed by: dA[input_row][input_column][output_index], where row, column index the input matrix and output_index indexes the output.
Returns the gradient computed/stored by EvaluateAndStoreAnalyticGradient().
- Parameters:
  - row_index: row index (d) of the input to be differentiated with respect to
  - column_index: column index (i) of the input to be differentiated with respect to
  - output_index: (flat) index into the output
- Returns:
  - The [row_index][column_index][output_index]'th entry of the analytic gradient evaluated at input_matrix (where input_matrix was specified in EvaluateAndStoreAnalyticGradient()).
void EvaluateFunction(double const *restrict input_matrix, double *restrict function_values)
Evaluates f_k = f(X_{d,i}). X_{d,i} is the "input_matrix" and f_k is in "function_values."
- Parameters:
  - input_matrix[num_rows][num_cols]: the matrix of inputs
- Outputs:
  - function_values[num_outputs]: vector of outputs of f()

OL_DISALLOW_COPY_AND_ASSIGN(PingableMatrixInputVectorOutputInterface)
Protected Functions
PingableMatrixInputVectorOutputInterface()
~PingableMatrixInputVectorOutputInterface()
gpp_test_utils.cpp¶
Implementations of utilities useful for unit testing.
Defines
OL_PING_TEST_DEBUG_PRINTF(...)
namespace optimal_learning
Macro to allow restrict as a keyword for C++ compilation and CUDA/nvcc compilation. See related entry in gpp_common.hpp for more details.
Functions
bool CheckIntEquals(int64_t value, int64_t truth)
Checks if |value - truth| == 0.
- Parameters:
  - value: number to be tested
  - truth: the exact/desired result
- Returns:
  - true if value, truth are equal.
double ResidualNorm(double const *restrict A, double const *restrict x, double const *restrict b, int size)
Computes \|b - A*x\|_2.
The quantity b - A*x is called the "residual." This is meaningful when x is the (approximate) solution of the linear system A*x = b. Then a small residual norm is a NECESSARY but NOT SUFFICIENT indicator of accuracy in x: a small error forces a small residual, but a small residual does not force a small error. In particular, we know:
\|\delta x\| / \|x\| \le cond(A) * \|r\| / (\|A\| * \|x\|)
where x is the approximate solution and \delta x is the error. So \delta x = 0 <=> r = 0; BUT when r is not identically 0, the error \|\delta x\| can be much larger than \|r\| suggests (by up to a factor of cond(A)).
However, a numerically (backward) stable algorithm will compute solutions with small relative residual norms regardless of conditioning. Hence coupled with knowledge of the particular algorithm for solving A*x = b, residual norm is a valuable measure of correctness.
Suppose x (computed solution) satisfies: (A + \delta A)*x = b. Then:
\|r\| / (\|A\| * \|x\|) \le \|\delta A\| / \|A\|
So large \|r\| indicates a large backward error, implying the linear solver is not backward stable (and hence should not be used).
Computes ||b - A*x||_2
The quantity b - A*x is called the “residual.” This is meaningful when x is the solution of the linear system A*x = b.
Coupled with knowledge of the underlying algorithm, having a small residual norm is a useful measure of method correctness. See implementation documentation for more details.
This norm is what is minimized in least squares problems. However, here we are not working with least squares solutions and require that A is square.
- Parameters:
  - A[size][size]: the linear system
  - x[size]: the solution vector
  - b[size]: the RHS vector
  - size: the dimension of the problem
- Returns:
  - the 2-norm of b - A*x
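For reference, a self-contained sketch of the quantity being computed (assuming the column-major matrix storage used throughout this file; this is illustrative, not the library implementation):

#include <cmath>

// Computes ||b - A*x||_2 for a square, column-major matrix A.
double ResidualNormSketch(double const * A, double const * x,
                          double const * b, int size) {
  double norm_sq = 0.0;
  for (int i = 0; i < size; ++i) {
    double residual_i = b[i];
    for (int j = 0; j < size; ++j) {
      residual_i -= A[j*size + i]*x[j];  // A[j*size + i] is A_{i,j}, column-major
    }
    norm_sq += residual_i*residual_i;
  }
  return std::sqrt(norm_sq);
}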
bool CheckDoubleWithin(double value, double truth, double tolerance)
Checks if |value - truth| <= tolerance (absolute error).
- Parameters:
  - value: number to be tested
  - truth: the exact/desired result
  - tolerance: permissible difference
- Returns:
  - true if value, truth differ by no more than tolerance.
bool CheckDoubleWithinRelativeWithThreshold(double value, double truth, double tolerance, double threshold)
Checks if |value - truth| / |truth| <= tolerance (relative error).
If |truth| < threshold, CheckDoubleWithin() is performed instead.
- Parameters:
  - value: number to be tested
  - truth: the exact/desired result
  - tolerance: permissible relative difference
  - threshold: if |truth| < threshold, the absolute check |value - truth| <= tolerance is performed instead; this controls the unexpectedly large or undefined relative diff when truth is "too small" (0 for example)
- Returns:
  - true if value, truth differ relatively by no more than tolerance.
bool CheckDoubleWithinRelative(double value, double truth, double tolerance)
Checks if |value - truth| / |truth| <= tolerance (relative error).
If truth = 0.0, CheckDoubleWithin() is performed.
- Parameters:
  - value: number to be tested
  - truth: the exact/desired result
  - tolerance: permissible relative difference
- Returns:
  - true if value, truth differ relatively by no more than tolerance.
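The threshold fallback amounts to the following sketch (illustrative; the function name here is hypothetical):

#include <cmath>

// Relative check that falls back to an absolute check when |truth| is too
// small for a relative difference to be meaningful.
bool CheckDoubleWithinRelativeWithThresholdSketch(double value, double truth,
                                                  double tolerance, double threshold) {
  if (std::fabs(truth) < threshold) {
    return std::fabs(value - truth) <= tolerance;  // absolute fallback
  }
  return std::fabs(value - truth)/std::fabs(truth) <= tolerance;
}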
bool CheckMatrixNormWithin(double const *restrict matrix1, double const *restrict matrix2, int size_m, int size_n, double tolerance)
Checks that \|A - B\|_F <= tolerance. Uses the Frobenius norm for convenience; matrix 2-norms are expensive to compute.
- Parameters:
  - matrix1[size_m][size_n]: matrix A
  - matrix2[size_m][size_n]: matrix B
  - size_m: rows of A, B
  - size_n: columns of A, B
  - tolerance: largest permissible norm of the difference A - B
- Returns:
  - true if A and B are "close"
int CheckPointsAreDistinct(double const *restrict point_list, int num_points, int dim, double tolerance)
Checks whether the distance between every pair of points is larger than tolerance.
- Parameters:
  - point_list[dim][num_points]: list of points to check
  - dim: number of coordinates in each point
  - num_points: number of points
  - tolerance: the minimum allowed distance between points
- Returns:
  - the number of pairs of points whose inter-point distance is less than tolerance
int PingDerivative(const PingableMatrixInputVectorOutputInterface & function_and_derivative_evaluator, double const *restrict points, double epsilon, double rate_tolerance_fine, double rate_tolerance_relaxed, double input_output_ratio)
Pings the gradient \nabla f of a function f, using second order finite differences:
grad_f_approximate = ( f(x + h) - f(x - h) )/ (2 * h)
This converges as O(h^2) (Taylor series expansion) for twice-differentiable functions. Thus for h_1 != h_2, we can compute two grad_f_approximate results and thus two errors, e_1, e_2. Then (in exact precision), we would obtain:
log(e_1/e_2) / log(h_1/h_2) >= 2.
(> sign because superconvergence can happen.)
Proof: By Taylor-expanding grad_f_approximate, we see that:
e = grad_f_analytic - grad_f_approximate = O(h^2) = c*h^2 + H.O.T. (Higher Order Terms).
Assuming H.O.T. \approx 0, then
e_1/e_2 \approx c*h_1^2 / (c*h_2^2) = h_1^2 / h_2^2,
and
log(e_1/e_2) \approx log(h_1^2 / h_2^2) = 2 * log(h_1/h_2).
Hence
rate = log(e_1/e_2) / log(h_1/h_2) = 2, in exact precision.
If c = 0 (or is very small), then the H.O.T. matter: we could have e = 0*h^2 + c_1*h^3 + H.O.T. In that case we will compute rates larger than 2 (superconvergence). This is not true in general, but it can happen.
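This standalone sketch reproduces the rate computation for f(x) = sin(x), whose analytic derivative is cos(x) (step sizes are the suggested values from the parameter docs below):

#include <cmath>
#include <cstdio>

int main() {
  const double x = 0.7;
  const double h1 = 5.0e-3, h2 = 1.0e-3;

  // Second-order centered differences at two step sizes.
  const double approx1 = (std::sin(x + h1) - std::sin(x - h1))/(2.0*h1);
  const double approx2 = (std::sin(x + h2) - std::sin(x - h2))/(2.0*h2);

  // Errors against the analytic gradient, cos(x).
  const double e1 = std::fabs(std::cos(x) - approx1);
  const double e2 = std::fabs(std::cos(x) - approx2);

  // Observed convergence rate; should be very close to 2.
  const double rate = std::log(e1/e2)/std::log(h1/h2);
  std::printf("observed rate: %f\n", rate);
  return 0;
}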
This function verifies that the expected convergence rate is obtained with acceptable accuracy in the presence of floating point errors.
This function knows how to deal with vector-valued f() which accepts a matrix of inputs, X. Analytic gradients and finite difference approximations are computed for each output of f() with respect to each entry in the input, X. In particular, this function can handle f():
f_k = f(X_{d,i})
computing gradients:
gradf_{d,i,k} = \frac{\partial f_k}{\partial X_{d,i}}.
So d, i index the inputs and k indexes the outputs.
The basic structure is as follows:
for i
  for d
    # Compute function values and analytic/finite difference gradients for the
    # current input. Do this for each h value.
    # Step 1
    for each h = h_1, h_2
      X_p = X; X_p[d][i] += h
      X_m = X; X_m[d][i] -= h
      f_p = f(X_p)  # k values in f
      f_m = f(X_m)  # k values in f
      # Loop over outputs and compute difference between finite difference and analytic results.
      for k
        grad_f_analytic = get_grad_f[d][i][k](X)  # d,i,k entry of gradient evaluated at X
        grad_f_finitediff = (f_p[k] - f_m[k])/(2*h)
        error[k] = grad_f_analytic - grad_f_finitediff
      endfor
    endfor
    # Step 2
    for k
      check error, convergence
    endfor
  endfor
endfor

Hence all checks for a specific output (k) wrt a specific point (d,i) happen together, and all dimensions of a given point are grouped together. Keep this in mind when writing PingableMatrixInputVectorOutputInterface subclasses, as it can make reading debugging output easier.
See PingableMatrixInputVectorOutputInterface (and its subclasses) for more information/examples on how f and \nabla f can be structured.
Note that in some cases, this function will decide to skip some tests or run them under more relaxed tolerances. There are many reasons for this:
- When the exact gradient is near 0, finite differencing is simply trying to compute (x1 - x2) = 0, which has an infinite condition number. If our method is correct/accurate, we will be able to reasonably closely approximate 0, but we cannot expect convergence.
- Backward stability type error analysis deals with normed bounds; for example, see the discussion in ResidualNorm(). These normed estimates bound the error on the LARGEST entries. The error in the smaller entries can be much larger.
However, even in scenarios where the errors could be large or where we are trying to compute 0, we do not want to completely ignore the checks, since that could cause us to accept completely bogus outputs. Instead we try to compensate.
Implementer’s Note: This is an “expert tool.” I originally implemented it to automate my workflow when implementing and testing “math as code” for gradients. It is a common testing technique from the world of derivative-based optimization and nonlinear solvers.
Generally, this function is meant to be run at a large number of (random) points, in the hope that most of the points avoid ill-conditioning issues. Random points are chosen to make the implementer's life easier.
The typical workflow to implement f(x) and df(x)/dx might look like:
- Code f(x)
- Verify f(x)
- Analytically compute df/dx (on paper, with a computer algebra system, etc.)
- Check df/dx
- at some hand-evaluated points
- Ping testing (this function)
If errors arise, this function will output some information to provide further context on what input/output combination failed and how. At the head of this file, define OL_PING_TEST_DEBUG_PRINT to turn on super verbose printing, which will provide substantially more detail. This is generally useful to see if failing points are just barely on the wrong side of tolerance or if there is a larger problem afoot.
As it turns out, testing derivative code is no easy undertaking. It is often not a well-conditioned problem, and the conditioning is difficult to predict, being a function of the input point, function values, and gradient. The exact value is knowable from the analytic gradient, but here we are trying to ascertain whether the implementation of the analytic gradient is correct!
Since we do nothing to guarantee good conditioning (random inputs), sometimes our luck is bad. Thus, this function includes a number of heuristics to detect conditions in which numerical error is too high to make an informed decision on whether we are looking at noise or a genuine error.
These heuristics involve a number of “magic numbers” defined in the code. These depend heavily on the complexity of the gradient being tested (as well as the points at which gradients are being computed). Here, we are roughly assuming that the entries of “points” are in [1.0e-3, 1.0e1]. The magic numbers here are not perfect, but they appear to work in the current use cases. I have left them hard-coded for now in the interest of making ping test easier to set up. So, apologies in advance for the magic numbers.
We chose the bypasses to heavily favor eliminating false positives. Some true positives are also lost. This is why OL_PING_TEST_DEBUG_PRINTF is important when debugging. If a true positive comes up, then there are probably many more, which can be viewed by parsing the more detailed output.
Warning
This function is NOT a catch-all. With the heuristics in place, you could adversarially construct an implementation that is wrong but passes this test.
Warning
Additionally, this function cannot detect errors that are not in the gradient. If you intended to implement f = sin(x), f' = cos(x), but instead coded g = sin(x) + 1, this function will not find it. This cannot detect small amounts of noise in the gradient either (e.g., f' = cos(x) + 1.0e-15). There is no way to tell whether that is noise due to numerical error or noise due to incorrectness.
TODO(GH-162): thresholds are an imperfect tool for this task. Loss of precision is not a binary event; you are not certain at 2^{-52} but uncertain at 2^{-51}. It might be better to estimate the range over which we go from meaningful loss of precision to complete noise and have a linear ramp for the tolerance over that space. Maybe it should be done in log-space?
Checks the correctness of analytic gradient calculations using finite differences.
Since the exact level of error is virtually impossible to compute precisely, we use finite differences at two different h values (see implementation docs for details) and check the convergence rate.
Includes logic to skip tests or run them at relaxed tolerances when poor conditioning or loss of precision is detected.
In general, this function is meant to be used to test analytic gradient computations on a large number of random points. The logic to skip tests or relax tolerances is designed to decrease the false positive rate to 0. Some true positives are lost in the process. So we obtain “reasonable certainty” by testing a large number of points.
If you are implementing/testing new gradients code, please read the function comments for this as well. This is an “expert tool” and not necessarily the most user-friendly at that.
This function produces the most useful debugging output when, in the PingableMatrixInputVectorOutputInterface implementation,
num_rows = spatial dimension (d)
num_cols = num_points (i)
GetOutputSize() = num_outputs (k)
for functions f_k = f(X_{d,i})
Warning
this function generates ~10 lines of output (to stdout) PER FAILURE. If your implementation is incorrect, expect a large number of lines printed to stdout.
- Parameters:
  - function_and_derivative_evaluator: an object that inherits from PingableMatrixInputVectorOutputInterface. This object must define all virtual functions in that interface.
  - points[num_cols][num_rows]: num_cols points, each with dimension num_rows. Assumed that the coordinate-wise magnitudes are "around 1.0": say [1.0e-3, 1.0e1].
  - epsilon[2]: array [h1, h2] of step sizes to use for finite differencing. These should not be too small/too large; 5.0e-3, 1.0e-3 are suggested starting values. Note that the more ill-conditioned computing f_k is, the larger the tolerances will need to be. For example, "complex" functions (e.g., many math ops) may be ill-conditioned.
  - rate_tolerance_fine: desired amount of deviation from the exact rate
  - rate_tolerance_relaxed: maximum allowable amount of deviation from the exact rate
  - input_output_ratio: if ||analytic_gradient||/||input|| < input_output_ratio, ping testing is not performed. Suggested values are around 1.0e-15 to 1.0e-18 (around machine precision).
- Returns:
  - The number of gradient entries that failed pinging. Expected to be 0.
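Putting the pieces together with the hypothetical PingCubicColumnSum sketch from earlier (the tolerance values, the .engine member access, and passing epsilon as a 2-element array are assumptions to verify against the headers; note the signature above shows double epsilon while the parameter docs describe epsilon[2]):

#include <vector>

#include "gpp_random.hpp"
#include "gpp_test_utils.hpp"

using namespace optimal_learning;

int main() {
  const int num_rows = 3, num_cols = 4;  // spatial dim, number of points
  PingCubicColumnSum pingable(num_rows, num_cols);

  // Random input points with coordinate magnitudes "around 1.0", per the docs.
  std::vector<double> points(num_rows*num_cols);
  UniformRandomGenerator uniform_generator(314);
  boost::uniform_real<double> uniform_double(0.5, 2.5);
  for (auto& coord : points) {
    coord = uniform_double(uniform_generator.engine);
  }

  // Store the analytic gradient at X first; PingDerivative takes a const ref
  // and only calls the const accessors.
  pingable.EvaluateAndStoreAnalyticGradient(points.data(), nullptr);

  const double epsilon[2] = {5.0e-3, 1.0e-3};  // suggested starting step sizes
  const int num_failures = PingDerivative(pingable, points.data(), epsilon,
                                          2.0e-3, 2.0e-2, 1.0e-18);
  return num_failures;  // expected to be 0
}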
void ExpandDomainBounds(double scale_factor, std::vector< ClosedInterval > * domain_bounds)Expand each ClosedInterval by scale_factor (outward from their midpoints).
This is handy for testing the constrained EI optimizers. That is, we want a domain that contains the true optima (b/c it is easier to know that things worked if the gradient is 0) but not a domain so large that constraint satisfaction is irrelevant. Hence we scale the side-lengths by some "reasonable" amount. For example, with scale_factor = 1.5, the interval [0, 2] (midpoint 1, half-length 1) becomes [-0.5, 2.5].
- Parameters:
  - scale_factor: relative amount by which to expand each ClosedInterval
  - domain_bounds[1]: input list of valid ClosedInterval
- Outputs:
  - domain_bounds[1]: input ClosedIntervals expanded by scale_factor

void FillRandomCovarianceHyperparameters(const boost::uniform_real < double > & uniform_double_hyperparameter, UniformRandomGenerator * uniform_generator, std::vector< double > * hyperparameters, CovarianceInterface * covariance)
Generates random hyperparameters and sets a covariance object's hyperparameters to the new values.
Let n_hyper = covariance->GetNumberOfHyperparameters().
- Parameters:
  - uniform_double_hyperparameter: a uniform range, [min, max], from which to draw the hyperparameters
  - uniform_generator[1]: a UniformRandomGenerator object providing the random engine for uniform random numbers
  - hyperparameters[1]: vector initialized to hold n_hyper items
  - covariance[1]: appropriately initialized covariance object that can accept n_hyper hyperparameters
- Outputs:
  - uniform_generator[1]: UniformRandomGenerator object will have its state changed due to hyperparameters->size() random draws
  - hyperparameters[1]: vector set to the newly generated hyperparameters
  - covariance[1]: covariance object with its hyperparameters set to "hyperparameters"

void FillRandomDomainBounds(const boost::uniform_real < double > & uniform_double_lower_bound, const boost::uniform_real < double > & uniform_double_upper_bound, UniformRandomGenerator * uniform_generator, std::vector< ClosedInterval > * domain_bounds)
Generates a random list of domain_bounds->size() [min_i, max_i] pairs such that:
min_i \in [uniform_double_lower_bound.a(), uniform_double_lower_bound.b()]
max_i \in [uniform_double_upper_bound.a(), uniform_double_upper_bound.b()]
- Parameters:
  - uniform_double_lower_bound: a uniform range, [min, max], from which to draw the domain lower bounds, min_i
  - uniform_double_upper_bound: a uniform range, [min, max], from which to draw the domain upper bounds, max_i
  - uniform_generator[1]: a UniformRandomGenerator object providing the random engine for uniform random numbers
  - domain_bounds[1]: vector of ClosedInterval objects to initialize
- Outputs:
  - uniform_generator[1]: UniformRandomGenerator object will have its state changed due to 2*domain_bounds->size() random draws
  - domain_bounds[1]: vector of ClosedInterval objects with their min, max members initialized as described

void FillRandomGaussianProcess(double const *restrict points_to_sample, double const *restrict noise_variance, int dim, int num_to_sample, double *restrict points_to_sample_value, GaussianProcess * gaussian_process)
Utility to draw num_to_sample points from a GaussianProcess and add those values to the prior.
- Parameters:
  - points_to_sample[dim][num_to_sample]: points at which to draw/sample from the GP
  - noise_variance[num_to_sample]: the \sigma_n^2 (noise variance) associated w/the new observations, points_sampled_value
  - dim: the spatial dimension of a point (i.e., number of independent params in experiment)
  - num_to_sample: number of points to add to the GP
  - gaussian_process[1]: a properly initialized GaussianProcess
- Outputs:
  - points_to_sample_value[num_to_sample]: values of the newly sampled points
  - gaussian_process[1]: the input GP with num_to_sample additional points added to it; its internal state is modified to match the new data, and its internal PRNG state is modified