moe.bandit package

Submodules

moe.bandit.bandit_interface module

Interface for bandit functions; supports allocating arms and choosing an arm.

class moe.bandit.bandit_interface.BanditInterface[source]

Bases: object

Interface for a bandit algorithm.

Abstract class that enables bandit functions; supports allocating arms and choosing an arm. Implementers of this interface will never override the method choose_arm.

Implementers of this ABC are required to manage their own hyperparameters.
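The contract described above can be sketched as follows. This is a minimal, self-contained illustration and not MOE's code: BanditSketch and ToyEpsilonBandit are hypothetical names, and the epsilon-based allocation rule is a toy policy chosen only to show a subclass managing its own hyperparameter.

```python
import abc


class BanditSketch(abc.ABC):
    """Hypothetical stand-in for an abstract bandit interface (not MOE's class)."""

    @abc.abstractmethod
    def allocate_arms(self):
        """Return a dict mapping arm name (str) to allocation (float)."""


class ToyEpsilonBandit(BanditSketch):
    """Toy policy: the subclass stores and validates its own hyperparameter."""

    def __init__(self, arm_names, best_arm, epsilon=0.1):
        arm_names = list(arm_names)
        if not 0.0 <= epsilon <= 1.0:
            raise ValueError("epsilon must lie in [0, 1]")
        if best_arm not in arm_names:
            raise ValueError("best_arm must be one of arm_names")
        self._arm_names = arm_names
        self._best_arm = best_arm
        self._epsilon = epsilon

    def allocate_arms(self):
        # Spread epsilon evenly over all arms, then give the rest to the best arm.
        share = self._epsilon / len(self._arm_names)
        allocations = {name: share for name in self._arm_names}
        allocations[self._best_arm] += 1.0 - self._epsilon
        return allocations
```

Instantiating BanditSketch directly raises TypeError because allocate_arms is abstract; only concrete subclasses can be created.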

allocate_arms()[source]

Compute the allocation to each arm given historical_info, running the bandit subtype endpoint with hyperparameters in hyperparameter_info.

Computes the allocation to each arm based on the given subtype, historical info, and hyperparameter info.

Returns: the dictionary of (arm, allocation) key-value pairs
Return type: a dictionary of (str, float64) pairs
static choose_arm(arms_to_allocations)[source]

Choose the arm based on allocation information given in arms_to_allocations.

Throws an exception when ‘arms_to_allocations’ is empty. Implementers of this interface will never override this method.

Parameters: arms_to_allocations (a dictionary of (str, float64) pairs) – the dictionary of (arm, allocation) key-value pairs
Returns: name of the chosen arm
Return type: str
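The text above does not say how the allocations are turned into a choice. The sketch below assumes the simplest reading, that an arm is drawn at random with probability proportional to its allocation; choose_arm_sketch is a hypothetical name and MOE's own sampling rule may differ.

```python
import random


def choose_arm_sketch(arms_to_allocations, rng=random):
    """Pick one arm name at random, weighted by its allocation (assumed rule)."""
    if not arms_to_allocations:
        raise ValueError("arms_to_allocations is empty!")
    names = list(arms_to_allocations)
    weights = [arms_to_allocations[name] for name in names]
    # random.choices returns a list of k samples; take the single sample.
    return rng.choices(names, weights=weights, k=1)[0]
```

Passing a seeded random.Random instance as rng makes the choice reproducible in tests.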

moe.bandit.constant module

Some default configuration parameters for bandit components.

moe.bandit.constant.MAX_BERNOULLI_RANDOM_VARIABLE_VARIANCE = 0.25

Used in moe.bandit.ucb.ucb1_tuned.UCB1Tuned.get_ucb_payoff(). 0.25 is the maximum value that the variance of a Bernoulli random variable can take: the variance is \(p(1-p)\), where p is the probability of success, and it is largest at p = 0.5, where \(p(1-p) = 0.5(1-0.5) = 0.25\). See http://en.wikipedia.org/wiki/Bernoulli_distribution for more details.
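The bound can be checked numerically. The snippet below is an illustrative check only; bernoulli_variance is a hypothetical helper and not part of moe.bandit.constant.

```python
def bernoulli_variance(p):
    """Variance p * (1 - p) of a Bernoulli variable with success probability p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    return p * (1.0 - p)


# Evaluate the variance on a grid of probabilities 0.00, 0.01, ..., 1.00
# and keep the probability at which it is largest.
grid = [i / 100.0 for i in range(101)]
best_p = max(grid, key=bernoulli_variance)
max_variance = bernoulli_variance(best_p)
```

On this grid the maximum is attained at p = 0.5, matching the constant's value of 0.25.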

moe.bandit.data_containers module

Data containers used to interact with bandit members.

class moe.bandit.data_containers.BernoulliArm(win=0.0, loss=0.0, total=0, variance=None)[source]

Bases: moe.bandit.data_containers.SampleArm

A Bernoulli arm (name, win, loss, total, variance) sampled from the objective function we are modeling/optimizing.

A Bernoulli arm has payoff 1 for a success and 0 for a failure. See more details on Bernoulli distribution at http://en.wikipedia.org/wiki/Bernoulli_distribution

See superclass SampleArm for more details.

validate()[source]

Check this Bernoulli arm is a valid Bernoulli arm. Also check that this BernoulliArm passes basic validity checks: all values are finite.

A Bernoulli arm has payoff 1 for a success and 0 for a failure. See more details on Bernoulli distribution at http://en.wikipedia.org/wiki/Bernoulli_distribution

Raises ValueError: if any member data is non-finite or out of range, or if the arm is not a valid Bernoulli arm
class moe.bandit.data_containers.HistoricalData(sample_arms=None, validate=True)[source]

Bases: object

A data container for storing the historical data from an entire experiment in a layout convenient for this library.

Users will likely find it most convenient to store experiment historical data of arms in tuples of (win, loss, total, variance); for example, these could be the columns of a database row, part of an ORM, etc. The SampleArm class (above) provides a convenient representation of this input format, but users are not required to use it.

Variables: _arms_sampled – (dict) mapping of arm names to already-sampled arms
append_sample_arms(sample_arms, validate=True)[source]

Append the contents of sample_arms to the data members of this class.

This method first validates the arms and then updates the historical data. The result of combining two valid arms is always a valid arm.

Parameters:
  • sample_arms (a dictionary of (arm name, SampleArm) key-value pairs) – the already-sampled arms: wins, losses, and totals
  • validate (boolean) – whether to sanity-check the input sample_arms
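The append step can be pictured with plain tuples. This sketch assumes that combining two arms means adding their win, loss, and total fields, and it leaves variance out entirely; combine_arms and append_arms are hypothetical helpers, not MOE functions.

```python
def combine_arms(first, second):
    """Add two (win, loss, total) records field by field (assumed combination rule)."""
    return tuple(a + b for a, b in zip(first, second))


def append_arms(history, new_arms):
    """Merge new_arms into a copy of history; both map arm name -> (win, loss, total)."""
    merged = dict(history)
    for name, arm in new_arms.items():
        if name in merged:
            merged[name] = combine_arms(merged[name], arm)
        else:
            merged[name] = tuple(arm)
    return merged
```

Under this rule, sums of finite non-negative values stay finite and non-negative, which is consistent with the statement that combining two valid arms gives a valid arm.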
arms_sampled[source]

Return the arms_sampled, a dictionary of (arm name, SampleArm) key-value pairs.

json_payload()[source]

Construct a JSON-serializable and MOE REST-recognizable dictionary of the historical data.

num_arms[source]

Return the number of sampled arms.

static validate_sample_arms(sample_arms)[source]

Check that sample_arms passes basic validity checks: all values are finite.

Parameters: sample_arms (a dictionary of (arm name, SampleArm) key-value pairs) – already-sampled arms: names, wins, losses, and totals
Returns: True if inputs are valid
Return type: boolean
class moe.bandit.data_containers.SampleArm(win=0.0, loss=0.0, total=0, variance=None)[source]

Bases: object

An arm (name, win, loss, total, variance) sampled from the objective function we are modeling/optimizing.

This class is a representation of a “Sample Arm,” which is defined by the four data members listed here. SampleArm is a convenient way of communicating data to the rest of the bandit library (via the HistoricalData container); it also provides a convenient grouping for interactive introspection.

Users are not required to use SampleArm; iterables with the same data layout will suffice.

Variables:
  • win – (float64 >= 0.0) The amount won from playing this arm
  • loss – (float64 >= 0.0) The amount lost from playing this arm
  • total – (int >= 0) The number of times we have played this arm
  • variance – (float >= 0.0) The variance of this arm; None if there is no variance
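The data layout and the range constraints listed above can be sketched with a small record type. ArmRecord and validate_arm are hypothetical stand-ins written for illustration; they mirror the documented fields and ranges but are not MOE's SampleArm or its validate() method.

```python
import math
from typing import NamedTuple, Optional


class ArmRecord(NamedTuple):
    """Hypothetical record mirroring the (win, loss, total, variance) layout."""

    win: float = 0.0
    loss: float = 0.0
    total: int = 0
    variance: Optional[float] = None


def validate_arm(arm):
    """Raise ValueError if any field is non-finite or negative."""
    fields = {"win": arm.win, "loss": arm.loss, "total": arm.total}
    if arm.variance is not None:
        # A missing variance is allowed and is simply skipped.
        fields["variance"] = arm.variance
    for name, value in fields.items():
        if not math.isfinite(value) or value < 0:
            raise ValueError(f"{name} = {value} is non-finite or out of range")
```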
json_payload()[source]

Convert the sample_arm into a dict to be consumed by json for a REST request.

loss[source]

Return the amount lost, always greater than or equal to zero.

total[source]

Return the total number of tries, always a non-negative integer.

validate()[source]

Check this SampleArm passes basic validity checks: all values are finite.

Raises ValueError: if any member data is non-finite or out of range
variance[source]

Return the variance of sampled tries, always greater than or equal to zero; None if there is no variance.

win[source]

Return the amount won, always greater than or equal to zero.

moe.bandit.linkers module

Links between the implementations of bandit algorithms.

class moe.bandit.linkers.BanditMethod

Bases: tuple

BanditMethod(subtype, bandit_class)

bandit_class

Alias for field number 1

subtype

Alias for field number 0
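The tuple base and the field aliases above suggest a named tuple with two fields. The sketch below rebuilds an equivalent type locally with collections.namedtuple to show how the fields are accessed; PlaceholderBandit and the subtype string are made-up values for illustration.

```python
from collections import namedtuple

# Local re-creation of a two-field named tuple with the documented field order.
BanditMethod = namedtuple("BanditMethod", ["subtype", "bandit_class"])


class PlaceholderBandit:
    """Hypothetical bandit class used only to fill the second field."""


method = BanditMethod(subtype="example_subtype", bandit_class=PlaceholderBandit)

# Fields are reachable by name or by position (subtype is 0, bandit_class is 1).
subtype_by_name = method.subtype
subtype_by_index = method[0]
```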

moe.bandit.utils module

Utilities for bandit.

moe.bandit.utils.get_equal_arm_allocations(arms_sampled, winning_arm_names=None)[source]

Split allocations equally among the given winning_arm_names. If no winning_arm_names are given, split allocations equally among all arms in arms_sampled.

Throws an exception when arms_sampled is empty.

Parameters:
  • arms_sampled (dictionary of (str, SampleArm()) pairs) – a dictionary of arm name to moe.bandit.data_containers.SampleArm
  • winning_arm_names (frozenset(str)) – a set of names of the winning arms
Returns: the dictionary of (arm, allocation) key-value pairs
Return type: a dictionary of (str, float64) pairs
Raises: ValueError when arms_sampled is empty.
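An equal split of this kind can be sketched as follows. The sketch assumes that arms outside winning_arm_names receive an allocation of 0.0 and that an empty winning_arm_names is treated like None; equal_allocations_sketch is a hypothetical name and the arm values are placeholders.

```python
def equal_allocations_sketch(arms_sampled, winning_arm_names=None):
    """Give each winning arm an equal share and every other arm 0.0 (assumed)."""
    if not arms_sampled:
        raise ValueError("arms_sampled is empty!")
    if not winning_arm_names:
        # Assumption: with no winners given, every sampled arm counts as a winner.
        winning_arm_names = frozenset(arms_sampled)
    share = 1.0 / len(winning_arm_names)
    return {
        name: share if name in winning_arm_names else 0.0
        for name in arms_sampled
    }


# Placeholder arm values; only the dictionary keys matter for the split.
arms = {"arm1": None, "arm2": None, "arm3": None}
split_winners = equal_allocations_sketch(arms, frozenset(["arm1", "arm2"]))
split_all = equal_allocations_sketch(arms)
```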
moe.bandit.utils.get_winning_arm_names_from_payoff_arm_name_list(payoff_arm_name_list)[source]

Compute the set of winning arm names based on the given payoff_arm_name_list.

Throws an exception when payoff_arm_name_list is empty.

Parameters: payoff_arm_name_list (list of (float64, str) tuples) – a list of (payoff, arm name) tuples
Returns: the set of names of the winning arms
Return type: frozenset(str)
Raises: ValueError when payoff_arm_name_list is empty.
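The text above does not define "winning". The sketch below assumes that the winners are all arms tied for the highest payoff; winning_arm_names_sketch is a hypothetical name, not the MOE function.

```python
def winning_arm_names_sketch(payoff_arm_name_list):
    """Return the names of all arms tied for the highest payoff (assumed rule)."""
    if not payoff_arm_name_list:
        raise ValueError("payoff_arm_name_list is empty!")
    best_payoff = max(payoff for payoff, _ in payoff_arm_name_list)
    return frozenset(
        name for payoff, name in payoff_arm_name_list if payoff == best_payoff
    )


winners = winning_arm_names_sketch([(0.5, "arm1"), (0.8, "arm2"), (0.8, "arm3")])
```

Returning a frozenset keeps ties explicit, so a caller can split allocations among several equally good arms.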

Module contents

Bandit directory containing multi-armed bandit implementations in Python.

Files in this package

Bandit packages:
  • moe.bandit.epsilon: Epsilon bandit policies
  • moe.bandit.ucb: UCB bandit policies
  • moe.bandit.bla: BLA bandit policies

A set of abstract base classes (ABCs) defining an interface for interacting with bandits. These consist of composable functions and classes to allocate bandit arms and choose an arm.