text.Trimmer

Truncates a list of segments using a pre-determined truncation strategy.

Methods

generate_mask

Generates a boolean mask specifying which portions of segments to drop.

The result of generate_mask() can be used to drop items from each segment by calling tf.ragged.boolean_mask(seg, mask) on a segment and its corresponding mask.

Args
segments A list of RaggedTensors, each with shape [num_batch, (num_items)].

Returns
A list of len(segments) items, where each item is a boolean RaggedTensor with the same shape as its counterpart in segments. Each value is True if the corresponding value in segments should be kept and False if it should be dropped.
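
The mask contract described above can be illustrated without TensorFlow. The following is a minimal sketch in plain Python, with nested lists standing in for RaggedTensors of shape [num_batch, (num_items)]. The helper names first_n_mask and apply_mask, and the rule "keep the first max_items items of each row", are hypothetical and chosen only to show the shape of the result; they are not part of the library.

```python
def first_n_mask(segments, max_items):
    """Return one boolean mask per segment; True marks an item to keep.

    Hypothetical rule for illustration: keep the first `max_items`
    items of every row, independently for each segment.
    """
    return [
        [[i < max_items for i in range(len(row))] for row in segment]
        for segment in segments
    ]


def apply_mask(segment, mask):
    """Plain-Python stand-in for tf.ragged.boolean_mask(segment, mask)."""
    return [
        [item for item, keep in zip(row, row_mask) if keep]
        for row, row_mask in zip(segment, mask)
    ]


# Two segments, each with two batch rows of varying length.
segments = [
    [[1, 2, 3], [4]],
    [[10, 20], [30, 40, 50, 60]],
]
masks = first_n_mask(segments, max_items=2)
trimmed = [apply_mask(seg, m) for seg, m in zip(segments, masks)]
```

Each mask has the same nesting as its segment, so a segment and its mask can be walked in lockstep, which is what makes the boolean-mask step a purely element-wise operation.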

trim

Truncate the list of segments.

The segments are truncated using the truncation strategy defined by generate_mask().

Args
segments A list of RaggedTensors, each with shape [num_batch, (num_items)].

Returns
A list of len(segments) RaggedTensors, where each item has the same shape as its counterpart in segments, with unwanted values dropped. Values are dropped according to the defined truncation strategy.
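
The relationship between the two methods can be sketched as a small class in which trim() simply applies the masks from generate_mask(). This is an illustrative plain-Python sketch, not the library's implementation: nested lists stand in for RaggedTensors, and the class name BudgetTrimmer, its max_seq_length parameter, and its strategy (fill a per-row budget starting from the first segment) are assumptions made for the example.

```python
class BudgetTrimmer:
    """Hypothetical trimmer: each batch row has a total item budget that is
    consumed by the segments in order, first segment first."""

    def __init__(self, max_seq_length):
        self.max_seq_length = max_seq_length

    def generate_mask(self, segments):
        # One remaining-budget counter per batch row.
        remaining = [self.max_seq_length] * len(segments[0])
        masks = []
        for segment in segments:
            mask = []
            for b, row in enumerate(segment):
                keep = min(len(row), remaining[b])
                mask.append([i < keep for i in range(len(row))])
                remaining[b] -= keep
            masks.append(mask)
        return masks

    def trim(self, segments):
        # trim() is generate_mask() followed by an element-wise filter.
        masks = self.generate_mask(segments)
        return [
            [[x for x, keep in zip(row, row_mask) if keep]
             for row, row_mask in zip(segment, mask)]
            for segment, mask in zip(segments, masks)
        ]


segments = [
    [[1, 2, 3], [4]],
    [[10, 20], [30, 40, 50, 60]],
]
trimmer = BudgetTrimmer(max_seq_length=4)
result = trimmer.trim(segments)
```

Keeping the strategy inside generate_mask() means a different truncation rule only needs to change how the masks are built; the trimming step itself stays the same.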