Generate a randomly distorted bounding box for an image deterministically.
tf.raw_ops.StatelessSampleDistortedBoundingBox(
    image_size,
    bounding_boxes,
    min_object_covered,
    seed,
    aspect_ratio_range=[0.75, 1.33],
    area_range=[0.05, 1],
    max_attempts=100,
    use_image_if_no_bounding_boxes=False,
    name=None
)
Bounding box annotations are often supplied in addition to ground-truth labels
in image recognition or object localization tasks. A common technique for
training such a system is to randomly distort an image while preserving its
content, i.e. data augmentation. Given the same seed, this Op deterministically
outputs a randomly distorted localization of an object, i.e. a bounding box,
given an image_size, bounding_boxes and a series of constraints.
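For reference, here is a minimal sketch of calling the raw op directly with the signature above. The 3x3 single-channel image, the single whole-image box, and the min_object_covered value are illustrative assumptions mirroring the stateless example further below, not values prescribed by the op.

```python
import numpy as np
import tensorflow as tf

# Illustrative inputs: a 3x3 single-channel image and one box covering it.
image = np.array([[[1], [2], [3]], [[4], [5], [6]], [[7], [8], [9]]])
bbox = tf.constant([0.0, 0.0, 1.0, 1.0], dtype=tf.float32, shape=[1, 1, 4])

# Raw ops take keyword arguments only; the names follow the signature above.
begin, size, bboxes = tf.raw_ops.StatelessSampleDistortedBoundingBox(
    image_size=tf.shape(image),
    bounding_boxes=bbox,
    min_object_covered=0.1,        # assumed constraint for this sketch
    seed=(1, 2),
    aspect_ratio_range=[0.75, 1.33],
    area_range=[0.05, 1],
    max_attempts=100,
    use_image_if_no_bounding_boxes=False)

# begin and size can be fed straight into tf.slice to crop the image.
cropped = tf.slice(image, begin, size)
```

Most code goes through the tf.image.stateless_sample_distorted_bounding_box wrapper instead, which is what the example further below uses.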
The output of this Op is a single bounding box that may be used to crop the
original image. The output is returned as 3 tensors: begin, size and bboxes.
The first 2 tensors can be fed directly into tf.slice to crop the image. The
latter may be supplied to tf.image.draw_bounding_boxes to visualize what the
bounding box looks like.
Bounding boxes are supplied and returned as [y_min, x_min, y_max, x_max]. The
bounding box coordinates are floats in [0.0, 1.0] relative to the width and
the height of the underlying image.
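As a quick illustration of this coordinate convention, the sketch below converts a normalized box back to pixel offsets; the box values and the 480x640 image size are made-up numbers, not part of the op.

```python
import tensorflow as tf

# A normalized box in [y_min, x_min, y_max, x_max] order.
box = tf.constant([0.25, 0.5, 0.75, 1.0])

# Assumed image size for the illustration.
height, width = 480, 640

# y coordinates scale with the height, x coordinates with the width.
scale = tf.constant([height, width, height, width], dtype=tf.float32)
pixel_box = box * scale  # -> [120., 320., 360., 640.]
```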
The output of this Op is guaranteed to be the same given the same seed, is
independent of how many times the function is called, and is independent of
global seed settings (e.g. tf.random.set_seed).
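A minimal sketch of that guarantee, reusing the inputs from the example below: two calls with the same seed return identical tensors even after the global seed is changed.

```python
import numpy as np
import tensorflow as tf

image = np.array([[[1], [2], [3]], [[4], [5], [6]], [[7], [8], [9]]])
bbox = tf.constant([0.0, 0.0, 1.0, 1.0], dtype=tf.float32, shape=[1, 1, 4])
seed = (1, 2)

first = tf.image.stateless_sample_distorted_bounding_box(
    tf.shape(image), bounding_boxes=bbox, seed=seed)

tf.random.set_seed(42)  # changing the global seed has no effect here

second = tf.image.stateless_sample_distorted_bounding_box(
    tf.shape(image), bounding_boxes=bbox, seed=seed)

# begin, size and bboxes all match between the two calls.
for a, b in zip(first, second):
    assert bool(tf.reduce_all(tf.equal(a, b)))
```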
Example usage:
import numpy as np
import tensorflow as tf

image = np.array([[[1], [2], [3]], [[4], [5], [6]], [[7], [8], [9]]])
bbox = tf.constant(
    [0.0, 0.0, 1.0, 1.0], dtype=tf.float32, shape=[1, 1, 4])
seed = (1, 2)
# Generate a single distorted bounding box.
bbox_begin, bbox_size, bbox_draw = (
    tf.image.stateless_sample_distorted_bounding_box(
        tf.shape(image), bounding_boxes=bbox, seed=seed))
# Employ the bounding box to distort the image.
tf.slice(image, bbox_begin, bbox_size)
<tf.Tensor: shape=(2, 2, 1), dtype=int64, numpy=
array([[[1],
        [2]],
       [[4],
        [5]]])>
# Draw the bounding box in an image summary.
colors = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
tf.image.draw_bounding_boxes(
    tf.expand_dims(tf.cast(image, tf.float32), 0), bbox_draw, colors)
<tf.Tensor: shape=(1, 3, 3, 1), dtype=float32, numpy=
array([[[[1.],
         [1.],
         [3.]],
        [[1.],
         [1.],
         [6.]],
        [[7.],
         [8.],
         [9.]]]], dtype=float32)>
Note that if no bounding box information is available, setting
use_image_if_no_bounding_boxes=True will assume there is a single implicit
bounding box covering the whole image. If use_image_if_no_bounding_boxes is
False and no bounding boxes are supplied, an error is raised.
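A sketch of that no-annotation case, under the assumption that an empty bounding-box tensor of shape [1, 0, 4] is an acceptable way to signal "no boxes"; with the flag enabled, the sampled crop is constrained only by the whole image. The 100x100x3 image and the seed are illustrative.

```python
import numpy as np
import tensorflow as tf

image = np.ones((100, 100, 3), dtype=np.uint8)

# Assumption: an empty [1, 0, 4] tensor represents "no bounding boxes".
no_boxes = tf.zeros([1, 0, 4], dtype=tf.float32)

begin, size, _ = tf.image.stateless_sample_distorted_bounding_box(
    tf.shape(image),
    bounding_boxes=no_boxes,
    seed=(3, 4),
    use_image_if_no_bounding_boxes=True)

crop = tf.slice(image, begin, size)
```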
Returns |
---|---
A tuple of Tensor objects (begin, size, bboxes). |
begin | A Tensor. Has the same type as image_size.
size | A Tensor. Has the same type as image_size.
bboxes | A Tensor of type float32.