tf.nn.max_pool2d

Performs max pooling on 2D spatial data such as images.

This is a more specific version of tf.nn.max_pool where the input tensor is 4D, representing 2D spatial data such as images. For 4D input, the two APIs are equivalent.

Downsamples the input images along their spatial dimensions (height and width) by taking the maximum over an input window defined by ksize. The window is shifted by strides along each dimension.
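
As a rough illustration of the windowing described above, the following pure-Python sketch pools a single 2D grid of numbers without any padding. It does not call TensorFlow, and the function name max_pool2d_valid is hypothetical, introduced only for this example; it is not how the library implements the op.

```python
def max_pool2d_valid(image, ksize, strides):
    """Max-pool a 2D list of numbers, keeping only windows that fit fully."""
    k_h, k_w = ksize
    s_h, s_w = strides
    in_h, in_w = len(image), len(image[0])
    # Number of window positions that fit entirely inside the input.
    out_h = (in_h - k_h) // s_h + 1
    out_w = (in_w - k_w) // s_w + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            top, left = i * s_h, j * s_w  # window origin, shifted by the strides
            window = [image[r][c]
                      for r in range(top, top + k_h)
                      for c in range(left, left + k_w)]
            row.append(max(window))
        output.append(row)
    return output


image = [[1., 2., 3., 4.],
         [5., 6., 7., 8.],
         [9., 10., 11., 12.]]
pooled = max_pool2d_valid(image, ksize=(2, 2), strides=(2, 2))
print(pooled)  # [[6.0, 8.0]]
```

The third row of the input never appears in the result because a 2x2 window starting at row 2 would extend past the bottom edge.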

For example, with strides=(2, 2) and padding="VALID", windows that extend outside of the input are not included in the output:

import tensorflow as tf

x = tf.constant([[1., 2., 3., 4.],
                 [5., 6., 7., 8.],
                 [9., 10., 11., 12.]])
# Add the `batch` and `channels` dimensions.
x = x[tf.newaxis, :, :, tf.newaxis]
result = tf.nn.max_pool2d(x, ksize=(2, 2), strides=(2, 2),
                          padding="VALID")
result[0, :, :, 0]
<tf.Tensor: shape=(1, 2), dtype=float32, numpy=
array([[6., 8.]], dtype=float32)>

With padding="SAME", the input is padded as needed so that the trailing partial window is kept, and we get:

x = tf.constant([[1., 2., 3., 4.],
                 [5., 6., 7., 8.],
                 [9., 10., 11., 12.]])
x = x[tf.newaxis, :, :, tf.newaxis]
result = tf.nn.max_pool2d(x, ksize=(2, 2), strides=(2, 2),
                          padding='SAME')
result[0, :, :, 0]
<tf.Tensor: shape=(2, 2), dtype=float32, numpy=
array([[ 6.,  8.],
       [10., 12.]], dtype=float32)>
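
The difference between the two output shapes above can be checked with a small calculation. The sketch below is plain Python, not a TensorFlow API; the helper names pooled_size and same_padding are hypothetical, and the formulas are the usual ones for "VALID" and "SAME" padding, stated here as an assumption about how each spatial dimension is handled.

```python
import math


def pooled_size(in_size, k, s, padding):
    """Output length along one spatial dimension."""
    if padding == "VALID":
        # Only windows that fit entirely inside the input are counted.
        return (in_size - k) // s + 1
    if padding == "SAME":
        # The input is padded so that every stride position yields an output.
        return math.ceil(in_size / s)
    raise ValueError("padding must be 'VALID' or 'SAME'")


def same_padding(in_size, k, s):
    """(before, after) padding along one dimension for 'SAME'."""
    out = math.ceil(in_size / s)
    total = max((out - 1) * s + k - in_size, 0)
    before = total // 2
    return before, total - before  # any odd leftover goes at the end


# Height 3 and width 4, with a 2x2 window and stride 2, as in the examples.
print(pooled_size(3, 2, 2, "VALID"), pooled_size(4, 2, 2, "VALID"))  # 1 2
print(pooled_size(3, 2, 2, "SAME"), pooled_size(4, 2, 2, "SAME"))    # 2 2
print(same_padding(3, 2, 2), same_padding(4, 2, 2))                  # (0, 1) (0, 0)
```

Under these assumptions, "SAME" adds one row of padding below the input, which is why the row [9, 10, 11, 12] contributes the second output row.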

We can also specify padding explicitly. The following example adds width-1 padding on all sides (top, bottom, left, right):

x = tf.constant([[1., 2., 3., 4.],
                 [5., 6., 7., 8.],
                 [9., 10., 11., 12.]])
x = x[tf.newaxis, :, :, tf.newaxis]
result = tf.nn.max_pool2d(x, ksize=(2, 2), strides=(2, 2),
                          padding=[[0, 0], [1, 1], [1, 1], [0, 0]])
result[0, :, :, 0]
<tf.Tensor: shape=(2, 3), dtype=float32, numpy=
array([[ 1.,  3.,  4.],
       [ 9., 11., 12.]], dtype=float32)>
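
To see where these values come from, the explicit-padding case can be reproduced by hand. The following pure-Python sketch assumes that padded cells behave like negative infinity, so they never win the maximum; the function name max_pool2d_explicit is hypothetical, and the code is an illustration rather than the library's implementation.

```python
NEG_INF = float("-inf")


def max_pool2d_explicit(image, ksize, strides, pad_rows, pad_cols):
    """Max-pool a 2D list after padding it explicitly on each side."""
    k_h, k_w = ksize
    s_h, s_w = strides
    pad_top, pad_bottom = pad_rows
    pad_left, pad_right = pad_cols
    padded_w = len(image[0]) + pad_left + pad_right
    # Assumption: padding is filled with -inf so it cannot be the maximum.
    padded = [[NEG_INF] * padded_w for _ in range(pad_top)]
    for row in image:
        padded.append([NEG_INF] * pad_left + list(row) + [NEG_INF] * pad_right)
    padded += [[NEG_INF] * padded_w for _ in range(pad_bottom)]
    # After padding, pool using only windows that fit entirely.
    out_h = (len(padded) - k_h) // s_h + 1
    out_w = (padded_w - k_w) // s_w + 1
    return [[max(padded[i * s_h + r][j * s_w + c]
                 for r in range(k_h) for c in range(k_w))
             for j in range(out_w)]
            for i in range(out_h)]


image = [[1., 2., 3., 4.],
         [5., 6., 7., 8.],
         [9., 10., 11., 12.]]
result = max_pool2d_explicit(image, (2, 2), (2, 2),
                             pad_rows=(1, 1), pad_cols=(1, 1))
print(result)  # [[1.0, 3.0, 4.0], [9.0, 11.0, 12.0]]
```

The padded grid is 5x6, so a 2x2 window with stride 2 fits at two row positions and three column positions, matching the (2, 3) output shape.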

For more examples and detail, see tf.nn.max_pool.

input A 4-D Tensor of the format specified by data_format.
ksize An int or list of ints that has length 1, 2 or 4. The size of the window for each dimension of the input tensor. If only one integer is specified, we apply the same window for all 4 dims. If two are provided, we use them for the H and W dimensions and keep a window size of 1 for the N and C dimensions.
strides An int or list of ints that has length 1, 2 or 4. The stride of the sliding window for each dimension of the input tensor. If only one integer is specified, we apply the same stride to all 4 dims. If two are provided, we use them for the H and W dimensions and keep a stride of 1 for the N and C dimensions.
padding Either the string "SAME" or "VALID" indicating the type of padding algorithm to use, or a list indicating the explicit paddings at the start and end of each dimension. See the notes on padding in the tf.nn module documentation for more information. When explicit padding is used and data_format is "NHWC", this should be in the form [[0, 0], [pad_top, pad_bottom], [pad_left, pad_right], [0, 0]]. When explicit padding is used and data_format is "NCHW", this should be in the form [[0, 0], [0, 0], [pad_top, pad_bottom], [pad_left, pad_right]]. When using explicit padding, the size of the paddings cannot be greater than the sliding window size.
data_format A string. 'NHWC', 'NCHW' and 'NCHW_VECT_C' are supported.
name Optional name for the operation.
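
Because the explicit padding list changes layout with data_format, it can help to build it from the four pad amounts. The helper below is a hypothetical convenience function, not part of TensorFlow; it only assembles the two list forms given in the padding description above and does not cover 'NCHW_VECT_C'.

```python
def explicit_padding(pad_top, pad_bottom, pad_left, pad_right,
                     data_format="NHWC"):
    """Build the explicit padding list for a 4-D input tensor."""
    spatial = [[pad_top, pad_bottom], [pad_left, pad_right]]
    if data_format == "NHWC":
        # Batch first, then height and width, then channels.
        return [[0, 0]] + spatial + [[0, 0]]
    if data_format == "NCHW":
        # Batch and channels first, then height and width.
        return [[0, 0], [0, 0]] + spatial
    raise ValueError("this sketch only handles 'NHWC' and 'NCHW'")


print(explicit_padding(1, 1, 1, 1))
# [[0, 0], [1, 1], [1, 1], [0, 0]]
print(explicit_padding(1, 2, 3, 4, data_format="NCHW"))
# [[0, 0], [0, 0], [1, 2], [3, 4]]
```

The returned list could then be passed as the padding argument; the batch and channel entries stay at [0, 0] in both layouts.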

Returns a Tensor of the format specified by data_format: the max-pooled output tensor.