Find the token count of each document/row.
tft.word_count(
tokens: Union[tf.SparseTensor, tf.RaggedTensor], name: Optional[str] = None
) -> tf.Tensor
tokens is either a RaggedTensor or a SparseTensor representing tokenized strings. This function simply returns the size of each row, so the dtype is not constrained to string.
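To make "size of each row" concrete, the following is a minimal sketch in plain TensorFlow of a row-size computation that matches the description above. The helper name row_token_counts is hypothetical and is not part of tft; it is an illustration under the assumption that the count is just the number of stored values per row, and it should not be read as the library's actual implementation.

```python
import tensorflow as tf


def row_token_counts(tokens):
    """Hypothetical helper: number of stored values in each row of `tokens`."""
    if isinstance(tokens, tf.RaggedTensor):
        # With ragged rank 1, the length of each row is its token count.
        return tokens.row_lengths()
    if isinstance(tokens, tf.SparseTensor):
        # Each stored entry contributes 1 to the row named by its first index.
        row_ids = tokens.indices[:, 0]
        num_rows = tokens.dense_shape[0]
        return tf.math.unsorted_segment_sum(
            tf.ones_like(row_ids), row_ids, num_segments=num_rows)
    raise ValueError("tokens must be a SparseTensor or a RaggedTensor")


# String tokens in a SparseTensor; rows 1 and 3 hold no tokens.
sparse = tf.SparseTensor(indices=[[0, 0], [0, 1], [2, 2]],
                         values=['a', 'b', 'c'], dense_shape=(4, 4))
print(row_token_counts(sparse).numpy().tolist())      # [2, 0, 1, 0]

# Integer token IDs in a RaggedTensor; the values are never inspected.
ragged_ids = tf.ragged.constant([[101, 7, 42], [], [5]])
print(row_token_counts(ragged_ids).numpy().tolist())  # [3, 0, 1]
```

Because only row membership is examined, the sketch gives the same kind of result for string tokens and integer IDs, which is the sense in which the dtype is not constrained.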
Example:
sparse = tf.SparseTensor(indices=[[0, 0], [0, 1], [2, 2]],
                         values=['a', 'b', 'c'], dense_shape=(4, 4))
tft.word_count(sparse)
<tf.Tensor: shape=(4,), dtype=int64, numpy=array([2, 0, 1, 0])>
Args |
tokens | Either (1) a SparseTensor, or (2) a RaggedTensor with ragged rank of 1 and non-ragged rank of 1, of dtype tf.string, containing the tokens to be counted. |
name | (Optional) A name for this operation. |
Returns |
A one-dimensional Tensor containing the token counts of each row. |
Raises |
ValueError | If tokens is neither sparse nor ragged. |