moe.bandit package¶
Submodules¶
moe.bandit.bandit_interface module¶
Interface for bandit functions; supports allocating arms and choosing an arm.
- class moe.bandit.bandit_interface.BanditInterface[source]¶
Bases: object
Interface for a bandit algorithm.
Abstract class that enables bandit functions; supports allocating arms and choosing an arm. Implementers of this interface will never override the method choose_arm.
Implementers of this ABC are required to manage their own hyperparameters.
- allocate_arms()[source]¶
Compute the allocation to each arm given historical_info, running the bandit subtype endpoint with hyperparameters in hyperparameter_info.
Computes the allocation to each arm based on the given subtype, historical info, and hyperparameter info.
Returns: the dictionary of (arm, allocation) key-value pairs
Return type: a dictionary of (str, float64) pairs
- static choose_arm(arms_to_allocations)[source]¶
Choose the arm based on allocation information given in arms_to_allocations.
Throws an exception when ‘arms_to_allocations’ is empty. Implementers of this interface will never override this method.
Parameters: arms_to_allocations (a dictionary of (str, float64) pairs) – the dictionary of (arm, allocation) key-value pairs
Returns: name of the chosen arm
Return type: str
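The split between allocate_arms (supplied by each implementer) and choose_arm (shared and never overridden) can be sketched as follows. This is a minimal illustration, not the library's code: BanditSketch and UniformBandit are hypothetical names, and drawing the arm at random in proportion to its allocation is an assumption, since the text above does not state the selection rule.

```python
import random
from abc import ABC, abstractmethod


class BanditSketch(ABC):
    """Hypothetical stand-in for BanditInterface; not the library class."""

    @abstractmethod
    def allocate_arms(self):
        """Return a dict mapping arm name (str) to allocation (float)."""

    @staticmethod
    def choose_arm(arms_to_allocations, rng=random):
        """Pick one arm name; raise ValueError if the dict is empty."""
        if not arms_to_allocations:
            raise ValueError("arms_to_allocations is empty")
        names = list(arms_to_allocations)
        weights = [arms_to_allocations[name] for name in names]
        # Assumption: sample an arm with probability equal to its allocation.
        return rng.choices(names, weights=weights, k=1)[0]


class UniformBandit(BanditSketch):
    """Toy policy (hypothetical): every arm gets the same allocation."""

    def __init__(self, arm_names):
        self._arm_names = list(arm_names)

    def allocate_arms(self):
        share = 1.0 / len(self._arm_names)
        return {name: share for name in self._arm_names}


allocations = UniformBandit(["arm1", "arm2"]).allocate_arms()
chosen = BanditSketch.choose_arm(allocations)
```

Keeping choose_arm static means a caller can apply it to any (arm, allocation) dictionary, including one returned by a REST endpoint, without constructing a bandit object.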
moe.bandit.constant module¶
Some default configuration parameters for bandit components.
- moe.bandit.constant.MAX_BERNOULLI_RANDOM_VARIABLE_VARIANCE = 0.25¶
Used in moe.bandit.ucb.ucb1_tuned.UCB1Tuned.get_ucb_payoff(). 0.25 is the maximum value the variance of a Bernoulli random variable can take. The variance is \(p(1-p)\), where p is the probability of success, and it is maximized at p = 0.5, giving \(p(1-p) = 0.5(1-0.5) = 0.25\). See http://en.wikipedia.org/wiki/Bernoulli_distribution for more details.
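The bound can be checked numerically. The snippet below is only a sanity check of the arithmetic; bernoulli_variance and the grid are illustrative names, not part of moe.bandit.constant.

```python
def bernoulli_variance(p):
    """Variance of a Bernoulli random variable with success probability p."""
    return p * (1.0 - p)


# Evaluate the variance on a grid over [0, 1] and locate the maximizer.
grid = [i / 1000.0 for i in range(1001)]
best_p = max(grid, key=bernoulli_variance)
max_variance = bernoulli_variance(best_p)
```

On this grid the maximizer is p = 0.5 and the maximum is 0.25, which agrees with the constant.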
moe.bandit.data_containers module¶
Data containers convenient for/used to interact with bandit members.
- class moe.bandit.data_containers.BernoulliArm(win=0.0, loss=0.0, total=0, variance=None)[source]¶
Bases: moe.bandit.data_containers.SampleArm
A Bernoulli arm (name, win, loss, total, variance) sampled from the objective function we are modeling/optimizing.
A Bernoulli arm has payoff 1 for a success and 0 for a failure. See more details on Bernoulli distribution at http://en.wikipedia.org/wiki/Bernoulli_distribution
See superclass SampleArm for more details.
- validate()[source]¶
Check this Bernoulli arm is a valid Bernoulli arm. Also check that this BernoulliArm passes basic validity checks: all values are finite.
A Bernoulli arm has payoff 1 for a success and 0 for a failure. See more details on Bernoulli distribution at http://en.wikipedia.org/wiki/Bernoulli_distribution
Raises ValueError: if any member data is non-finite or out of range or the arm is not a valid Bernoulli arm
- class moe.bandit.data_containers.HistoricalData(sample_arms=None, validate=True)[source]¶
Bases: object
A data container for storing the historical data from an entire experiment in a layout convenient for this library.
Users will likely find it most convenient to store experiment historical data of arms in tuples of (win, loss, total, variance); for example, these could be the columns of a database row, part of an ORM, etc. The SampleArm class provides a convenient representation of this input format, but users are not required to use it.
Variables: _arms_sampled – (dict) mapping of arm names to already-sampled arms
- append_sample_arms(sample_arms, validate=True)[source]¶
Append the contents of sample_arms to the data members of this class.
This method first validates the arms and then updates the historical data. The result of combining two valid arms is always a valid arm.
Parameters: - sample_arms (a dictionary of (arm name, SampleArm) key-value pairs) – the already-sampled arms: wins, losses, and totals
- validate (boolean) – whether to sanity-check the input sample_arms
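A rough sketch of the validate-then-update behavior described above, using plain (win, loss, total) tuples rather than SampleArm objects. The helper merge_sample_arms is hypothetical, and combining arms by component-wise addition is an assumption consistent with the statement that combining two valid arms yields a valid arm; variance is left out for brevity.

```python
import math


def merge_sample_arms(history, sample_arms, validate=True):
    """Add (win, loss, total) tuples from sample_arms into history, in place."""
    if validate:
        # Check every incoming arm before touching history, so a bad
        # input leaves the existing data unchanged.
        for name, values in sample_arms.items():
            if any(not math.isfinite(v) or v < 0 for v in values):
                raise ValueError("invalid arm {!r}: {!r}".format(name, values))
    for name, (win, loss, total) in sample_arms.items():
        old_win, old_loss, old_total = history.get(name, (0.0, 0.0, 0))
        # Assumption: arms combine by summing each component.
        history[name] = (old_win + win, old_loss + loss, old_total + total)
    return history


history = {"arm1": (2.0, 1.0, 3)}
merge_sample_arms(history, {"arm1": (1.0, 0.0, 1), "arm2": (0.0, 2.0, 2)})
```

Sums of finite, non-negative numbers stay finite and non-negative, which is why the merged arm passes the same checks as its inputs.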
- arms_sampled[source]¶
Return the arms_sampled, a dictionary of (arm name, SampleArm) key-value pairs.
- json_payload()[source]¶
Construct a JSON-serializable, MOE REST-recognizable dictionary of the historical data.
- static validate_sample_arms(sample_arms)[source]¶
Check that sample_arms passes basic validity checks: all values are finite.
Parameters: sample_arms (a dictionary of (arm name, SampleArm) key-value pairs) – already-sampled arms: names, wins, losses, and totals
Returns: True if inputs are valid
Return type: boolean
- class moe.bandit.data_containers.SampleArm(win=0.0, loss=0.0, total=0, variance=None)[source]¶
Bases: object
An arm (name, win, loss, total, variance) sampled from the objective function we are modeling/optimizing.
This class is a representation of a “Sample Arm,” which is defined by the four data members listed here. SampleArm is a convenient way of communicating data to the rest of the bandit library (via the HistoricalData container); it also provides a convenient grouping for interactive introspection.
Users are not required to use SampleArm; iterables with the same data layout will suffice.
Variables: - win – (float64 >= 0.0) The amount won from playing this arm
- loss – (float64 >= 0.0) The amount lost from playing this arm
- total – (int >= 0) The number of times we have played this arm
- variance – (float >= 0.0) The variance of this arm; None if the arm has no variance
- json_payload()[source]¶
Convert the sample_arm into a dict to be consumed by json for a REST request.
- validate()[source]¶
Check this SampleArm passes basic validity checks: all values are finite.
Raises ValueError: if any member data is non-finite or out of range
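To make the data layout and the validity checks concrete, here is a small stand-in written as a dataclass. SampleArmSketch is a hypothetical name and this is not the library's implementation; in particular, the keys returned by json_payload are an assumption chosen to mirror the four documented members.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class SampleArmSketch:
    """Hypothetical mirror of the four documented SampleArm members."""

    win: float = 0.0
    loss: float = 0.0
    total: int = 0
    variance: Optional[float] = None

    def validate(self):
        """Raise ValueError if a member is non-finite or negative."""
        members = {"win": self.win, "loss": self.loss, "total": self.total}
        if self.variance is not None:
            members["variance"] = self.variance
        for member_name, value in members.items():
            if not math.isfinite(value) or value < 0:
                raise ValueError(
                    "{} = {!r} is non-finite or out of range".format(member_name, value)
                )

    def json_payload(self):
        """Return a plain dict; the key names here are an assumption."""
        return {
            "win": self.win,
            "loss": self.loss,
            "total": self.total,
            "variance": self.variance,
        }


arm = SampleArmSketch(win=2.0, loss=1.0, total=3)
arm.validate()
payload = arm.json_payload()
```

A variance of None is skipped by the check rather than rejected, matching the documented meaning of None as "no variance".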
moe.bandit.linkers module¶
Links between the implementations of bandit algorithms.
moe.bandit.utils module¶
Utilities for bandit.
- moe.bandit.utils.get_equal_arm_allocations(arms_sampled, winning_arm_names=None)[source]¶
Split allocations equally among the given winning_arm_names. If no winning_arm_names given, split allocations among arms_sampled.
Throws an exception when arms_sampled is empty.
Parameters: - arms_sampled (dictionary of (str, SampleArm()) pairs) – a dictionary of arm name to moe.bandit.data_containers.SampleArm
- winning_arm_names (frozenset(str)) – a set of names of the winning arms
Returns: the dictionary of (arm, allocation) key-value pairs
Return type: a dictionary of (str, float64) pairs
Raises ValueError: when arms_sampled is empty
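The equal split described above might look like the sketch below. equal_arm_allocations is a hypothetical helper that takes arm names only; giving non-winning arms an allocation of 0.0 is an assumption, since the description does not say what they receive.

```python
def equal_arm_allocations(arm_names, winning_arm_names=None):
    """Split a total allocation of 1.0 equally among the winning arms."""
    arm_names = list(arm_names)
    if not arm_names:
        raise ValueError("arms_sampled is empty")
    # With no winners given, every sampled arm shares the allocation.
    winners = frozenset(winning_arm_names) if winning_arm_names else frozenset(arm_names)
    share = 1.0 / len(winners)
    # Assumption: arms outside the winning set get 0.0.
    return {name: (share if name in winners else 0.0) for name in arm_names}


with_winners = equal_arm_allocations(["arm1", "arm2", "arm3"], frozenset(["arm1", "arm2"]))
without_winners = equal_arm_allocations(["arm1", "arm2"])
```

Under these assumptions, with_winners assigns 0.5 to arm1 and arm2 and 0.0 to arm3, while without_winners assigns 0.5 to each arm.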
- moe.bandit.utils.get_winning_arm_names_from_payoff_arm_name_list(payoff_arm_name_list)[source]¶
Compute the set of winning arm names based on the given payoff_arm_name_list.
Throws an exception when payoff_arm_name_list is empty.
Parameters: payoff_arm_name_list (list of (float64, str) tuples) – a list of (payoff, arm name) tuples
Returns: set of names of the winning arms
Return type: frozenset(str)
Raises ValueError: when payoff_arm_name_list is empty
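One way to read "winning arms" is the set of arms tied for the highest payoff. The sketch below follows that reading, which is an assumption; winning_arm_names_sketch is a hypothetical name and not the library function.

```python
def winning_arm_names_sketch(payoff_arm_name_list):
    """Return the frozenset of arm names that share the highest payoff."""
    if not payoff_arm_name_list:
        raise ValueError("payoff_arm_name_list is empty")
    best_payoff = max(payoff for payoff, _ in payoff_arm_name_list)
    # Keep every arm whose payoff equals the maximum, so ties are preserved.
    return frozenset(
        name for payoff, name in payoff_arm_name_list if payoff == best_payoff
    )


winners = winning_arm_names_sketch([(0.5, "arm1"), (0.9, "arm2"), (0.9, "arm3")])
```

Returning a frozenset matches the documented return type and lets the result be passed directly as the winning_arm_names argument of get_equal_arm_allocations.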
Module contents¶
Bandit directory containing multi-armed bandit implementation in python.
Files in this package
- moe.bandit.constant: some default configuration values for bandit components
- moe.bandit.data_containers: SampleArm and HistoricalData containers for passing data to the bandit library
- moe.bandit.linkers: linkers connecting bandit components.
- moe.bandit.bandit_interface: an interface into different bandit policies
Bandit packages
- moe.bandit.epsilon: Epsilon bandit policies
- moe.bandit.ucb: UCB bandit policies
- moe.bandit.bla: BLA bandit policies
A set of abstract base classes (ABCs) defining an interface for interacting with bandits. These consist of composable functions and classes to allocate bandit arms and choose an arm.