Module importing all policies.
Modules
bernoulli_thompson_sampling_policy
module: Policy for Bernoulli Thompson Sampling.
boltzmann_reward_prediction_policy
module: Policy for reward prediction and Boltzmann exploration.
categorical_policy
module: Policy that chooses actions based on a categorical distribution.
constraints
module: An API for representing constraints.
falcon_reward_prediction_policy
module: Policy that samples actions based on the FALCON algorithm.
greedy_multi_objective_neural_policy
module: Policy for greedy multi-objective prediction.
greedy_reward_prediction_policy
module: Policy for greedy reward prediction.
lin_ucb_policy
module: Linear UCB Policy.
linalg
module: Utility code for linear algebra functions.
linear_bandit_policy
module: Linear Bandit Policy.
linear_thompson_sampling_policy
module: Linear Thompson Sampling Policy.
loss_utils
module: Loss utility code.
mixture_policy
module: A policy class that chooses one policy from a set and takes its actions from the chosen policy.
neural_linucb_policy
module: Neural + LinUCB Policy.
ranking_policy
module: Ranking policy.
reward_prediction_base_policy
module: Base policy that samples actions based on predicted rewards.
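Several of the modules above choose actions from predicted rewards, either greedily or with Boltzmann exploration. As a rough illustration of the difference, here is a minimal standard-library sketch. It is not the code in these modules: the function names (`boltzmann_probabilities`, `sample_action`, `greedy_action`) and the `temperature` parameter are hypothetical, chosen only for this example.

```python
import math
import random


def boltzmann_probabilities(predicted_rewards, temperature=1.0):
    """Return softmax(predicted_rewards / temperature) as a list of probabilities.

    Hypothetical helper for illustration; not part of the listed modules.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [r / temperature for r in predicted_rewards]
    # Subtract the maximum before exponentiating for numerical stability.
    max_scaled = max(scaled)
    weights = [math.exp(s - max_scaled) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


def sample_action(predicted_rewards, temperature=1.0, rng=random):
    """Sample an action index in proportion to its Boltzmann probability."""
    probs = boltzmann_probabilities(predicted_rewards, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def greedy_action(predicted_rewards):
    """Return the index of the action with the highest predicted reward."""
    return max(range(len(predicted_rewards)), key=predicted_rewards.__getitem__)


if __name__ == "__main__":
    rewards = [1.0, 2.0, 3.0]  # made-up predicted rewards for three actions
    print(boltzmann_probabilities(rewards, temperature=0.5))
    print(sample_action(rewards, temperature=0.5, rng=random.Random(0)))
    print(greedy_action(rewards))
```

In this sketch, a lower temperature concentrates probability on the action with the highest predicted reward, so the sampled choice approaches the greedy one. A higher temperature spreads probability more evenly across actions and so explores more.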