Decodes the output of a CTC model.
tf.keras.ops.ctc_decode(
inputs,
sequence_lengths,
strategy='greedy',
beam_width=100,
top_paths=1,
merge_repeated=True,
mask_index=None
)
Args |
inputs
|
A tensor of shape (batch_size, max_length, num_classes)
containing the logits (the output of the model).
They should not be normalized via softmax.
|
sequence_lengths
|
A tensor of shape (batch_size,) containing the
sequence lengths for the batch.
|
strategy
|
A string for the decoding strategy. Supported values are
"greedy" and "beam_search" .
|
beam_width
|
An integer scalar beam width used in beam search.
Defaults to 100.
|
top_paths
|
An integer scalar, the number of top paths to return.
Defaults to 1.
|
merge_repeated
|
A boolean scalar, whether to merge repeated
labels in the output. Defaults to True .
|
mask_index
|
An integer scalar, the index of the mask character in
the vocabulary. Defaults to None .
|
Returns |
A tuple containing:
- The tensor representing the list of decoded sequences. If
strategy="greedy" , the shape is (1, batch_size, max_length) . If
strategy="beam_seatch" , the shape is
(top_paths, batch_size, max_length) . Note that: -1 indicates the
blank label.
- If
strategy="greedy" , a tensor of shape (batch_size, 1)
representing the negative of the sum of the probability logits for
each sequence. If strategy="beam_seatch" , a tensor of shape
(batch_size, top_paths) representing the log probability for each
sequence.
|