tfp.substrates.numpy.optimizer.nelder_mead_minimize

Minimum of the objective function using the Nelder Mead simplex algorithm.

View aliases

Main aliases

tfp.experimental.substrates.numpy.optimizer.nelder_mead_minimize

tfp.substrates.numpy.optimizer.nelder_mead_minimize(
    objective_function,
    initial_simplex=None,
    initial_vertex=None,
    step_sizes=None,
    objective_at_initial_simplex=None,
    objective_at_initial_vertex=None,
    batch_evaluate_objective=False,
    func_tolerance=1e-08,
    position_tolerance=1e-08,
    parallel_iterations=1,
    max_iterations=None,
    reflection=None,
    expansion=None,
    contraction=None,
    shrinkage=None,
    name=None
)

Performs an unconstrained minimization of a (possibly non-smooth) function using the Nelder Mead simplex method. Nelder Mead method does not support univariate functions. Hence the dimensions of the domain must be 2 or greater. For details of the algorithm, see [Press, Teukolsky, Vetterling and Flannery(2007)][1].

Points in the domain of the objective function may be represented as a Tensor of general shape but with rank at least 1. The algorithm proceeds by modifying a full rank simplex in the domain. The initial simplex may either be specified by the user or can be constructed using a single vertex supplied by the user. In the latter case, if v0 is the supplied vertex, the simplex is the convex hull of the set:

S = {v0} + {v0 + step_i * e_i}

Here e_i is a vector which is 1 along the i-th axis and zero elsewhere and step_i is a characteristic length scale along the i-th axis. If the step size is not supplied by the user, a unit step size is used in every axis. Alternately, a single step size may be specified which is used for every axis. The most flexible option is to supply a bespoke step size for every axis.

Usage:

The following example demonstrates the usage of the Nelder Mead minimzation on a two dimensional problem with the minimum located at a non-differentiable point.

  # The objective function
  def sqrt_quadratic(x):
    return tf.sqrt(tf.reduce_sum(x ** 2, axis=-1))

  start = tf.constant([6.0, -21.0])  # Starting point for the search.
  optim_results = tfp.optimizer.nelder_mead_minimize(
      sqrt_quadratic, initial_vertex=start, func_tolerance=1e-8,
      batch_evaluate_objective=True)

  # Check that the search converged
  assert(optim_results.converged)
  # Check that the argmin is close to the actual value.
  np.testing.assert_allclose(optim_results.position, np.array([0.0, 0.0]),
                              atol=1e-7)
  # Print out the total number of function evaluations it took.
  print("Function evaluations: %d" % optim_results.num_objective_evaluations)

References:

[1]: William Press, Saul Teukolsky, William Vetterling and Brian Flannery. Numerical Recipes in C++, third edition. pp. 502-507. (2007). http://numerical.recipes/cpppages/chap0sel.pdf

[2]: Jeffrey Lagarias, James Reeds, Margaret Wright and Paul Wright. Convergence properties of the Nelder-Mead simplex method in low dimensions, Siam J. Optim., Vol 9, No. 1, pp. 112-147. (1998). http://www.math.kent.edu/~reichel/courses/Opt/reading.material.2/nelder.mead.pdf

[3]: Fuchang Gao and Lixing Han. Implementing the Nelder-Mead simplex algorithm with adaptive parameters. Computational Optimization and Applications, Vol 51, Issue 1, pp 259-277. (2012). https://pdfs.semanticscholar.org/15b4/c4aa7437df4d032c6ee6ce98d6030dd627be.pdf

Args
`objective_function`	A Python callable that accepts a point as a real `Tensor` and returns a `Tensor` of real dtype containing the value of the function at that point. The function to be minimized. If `batch_evaluate_objective` is `True`, the callable may be evaluated on a `Tensor` of shape `[n+1] + s` where `n` is the dimension of the problem and `s` is the shape of a single point in the domain (so `n` is the size of a `Tensor` representing a single point). In this case, the expected return value is a `Tensor` of shape `[n+1]`. Note that this method does not support univariate functions so the problem dimension `n` must be strictly greater than 1.
`initial_simplex`	(Optional) `Tensor` of real dtype. The initial simplex to start the search. If supplied, should be a `Tensor` of shape `[n+1] + s` where `n` is the dimension of the problem and `s` is the shape of a single point in the domain. Each row (i.e. the `Tensor` with a given value of the first index) is interpreted as a vertex of a simplex and hence the rows must be affinely independent. If not supplied, an axes aligned simplex is constructed using the `initial_vertex` and `step_sizes`. Only one and at least one of `initial_simplex` and `initial_vertex` must be supplied.
`initial_vertex`	(Optional) `Tensor` of real dtype and any shape that can be consumed by the `objective_function`. A single point in the domain that will be used to construct an axes aligned initial simplex.
`step_sizes`	(Optional) `Tensor` of real dtype and shape broadcasting compatible with `initial_vertex`. Supplies the simplex scale along each axes. Only used if `initial_simplex` is not supplied. See description above for details on how step sizes and initial vertex are used to construct the initial simplex.
`objective_at_initial_simplex`	(Optional) Rank `1` `Tensor` of real dtype of a rank `1` `Tensor`. The value of the objective function at the initial simplex. May be supplied only if `initial_simplex` is supplied. If not supplied, it will be computed.
`objective_at_initial_vertex`	(Optional) Scalar `Tensor` of real dtype. The value of the objective function at the initial vertex. May be supplied only if the `initial_vertex` is also supplied.
`batch_evaluate_objective`	(Optional) Python `bool`. If True, the objective function will be evaluated on all the vertices of the simplex packed into a single tensor. If False, the objective will be mapped across each vertex separately. Evaluating the objective function in a batch allows use of vectorization and should be preferred if the objective function allows it.
`func_tolerance`	(Optional) Scalar `Tensor` of real dtype. The algorithm stops if the absolute difference between the largest and the smallest function value on the vertices of the simplex is below this number.
`position_tolerance`	(Optional) Scalar `Tensor` of real dtype. The algorithm stops if the largest absolute difference between the coordinates of the vertices is below this threshold.
`parallel_iterations`	(Optional) Positive integer. The number of iterations allowed to run in parallel.
`max_iterations`	(Optional) Scalar positive `Tensor` of dtype `int32`. The maximum number of iterations allowed. If `None` then no limit is applied.
`reflection`	(Optional) Positive Scalar `Tensor` of same dtype as `initial_vertex`. This parameter controls the scaling of the reflected vertex. See, [Press et al(2007)][1] for details. If not specified, uses the dimension dependent prescription of [Gao and Han(2012)][3].
`expansion`	(Optional) Positive Scalar `Tensor` of same dtype as `initial_vertex`. Should be greater than `1` and `reflection`. This parameter controls the expanded scaling of a reflected vertex. See, [Press et al(2007)][1] for details. If not specified, uses the dimension dependent prescription of [Gao and Han(2012)][3].
`contraction`	(Optional) Positive scalar `Tensor` of same dtype as `initial_vertex`. Must be between `0` and `1`. This parameter controls the contraction of the reflected vertex when the objective function at the reflected point fails to show sufficient decrease. See, [Press et al(2007)][1] for more details. If not specified, uses the dimension dependent prescription of [Gao and Han(2012][3].
`shrinkage`	(Optional) Positive scalar `Tensor` of same dtype as `initial_vertex`. Must be between `0` and `1`. This parameter is the scale by which the simplex is shrunk around the best point when the other steps fail to produce improvements. See, [Press et al(2007)][1] for more details. If not specified, uses the dimension dependent prescription of [Gao and Han(2012][3].
`name`	(Optional) Python str. The name prefixed to the ops created by this function. If not supplied, the default name 'minimize' is used.

Returns
`optimizer_results`	A namedtuple containing the following items: converged: Scalar boolean tensor indicating whether the minimum was found within tolerance. num_objective_evaluations: The total number of objective evaluations performed. position: A `Tensor` containing the last argument value found during the search. If the search converged, then this value is the argmin of the objective function. objective_value: A tensor containing the value of the objective function at the `position`. If the search converged, then this is the (local) minimum of the objective function. final_simplex: The last simplex constructed before stopping. final_objective_values: The objective function evaluated at the vertices of the final simplex. initial_simplex: The starting simplex. initial_objective_values: The objective function evaluated at the vertices of the initial simplex. num_iterations: The number of iterations of the main algorithm body.

Returns

optimizer_results A namedtuple containing the following items: converged: Scalar boolean tensor indicating whether the minimum was found within tolerance. num_objective_evaluations: The total number of objective evaluations performed. position: A Tensor containing the last argument value found during the search. If the search converged, then this value is the argmin of the objective function. objective_value: A tensor containing the value of the objective function at the position. If the search converged, then this is the (local) minimum of the objective function. final_simplex: The last simplex constructed before stopping. final_objective_values: The objective function evaluated at the vertices of the final simplex. initial_simplex: The starting simplex. initial_objective_values: The objective function evaluated at the vertices of the initial simplex. num_iterations: The number of iterations of the main algorithm body.

Raises
`ValueError`	If any of the following conditions hold If none or more than one of `initial_simplex` and `initial_vertex` are supplied. If `initial_simplex` and `step_sizes` are both specified.

Raises

ValueError

If any of the following conditions hold

If none or more than one of initial_simplex and initial_vertex are supplied.
If initial_simplex and step_sizes are both specified.