Gets unique elements and their counts from the input `dataset`.
tff.analytics.data_processing.get_unique_elements_with_counts(
dataset: tf.data.Dataset, string_max_bytes: Optional[int] = None
) -> tuple[tf.Tensor, tf.Tensor]
This method returns a tuple of `elements` and `counts`, where `elements` are the unique elements in the dataset and `counts` is the number of times each one appears.
The input `dataset` must yield batched rank-1 tensors. This function reads each coordinate of the tensor as an individual element and caps the total number of elements to return.
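
The counting behavior described above can be modeled in plain Python. This is a minimal sketch, not the library's implementation: `unique_elements_with_counts_sketch` is a hypothetical helper, batches are represented as lists of `bytes` rather than `tf.string` tensors, and the order of the returned elements (first occurrence here) is an assumption that the documentation does not state.

```python
from collections import Counter
from typing import Iterable, Optional, Sequence


def unique_elements_with_counts_sketch(
    batches: Iterable[Sequence[bytes]],
    string_max_bytes: Optional[int] = None,
) -> tuple[list[bytes], list[int]]:
    """Hypothetical pure-Python model of the documented behavior."""
    counter: Counter = Counter()
    for batch in batches:
        # Each coordinate of a rank-1 batch is treated as one element.
        for element in batch:
            if string_max_bytes is not None:
                # Strings longer than the limit are truncated before counting.
                element = element[:string_max_bytes]
            counter[element] += 1
    # Assumption: elements are returned in order of first occurrence.
    elements = list(counter)
    counts = [counter[element] for element in elements]
    return elements, counts


batches = [[b"a", b"b", b"a"], [b"c", b"bb"]]
elements, counts = unique_elements_with_counts_sketch(batches)
print(dict(zip(elements, counts)))

# With a 1-byte limit, b"bb" is truncated to b"b" and counted with it.
elements, counts = unique_elements_with_counts_sketch(batches, string_max_bytes=1)
print(dict(zip(elements, counts)))
```

Note that truncation happens before counting in this sketch, so distinct strings that share a prefix of `string_max_bytes` bytes are merged into one element.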
| Args | |
|---|---|
| `dataset` | A `tf.data.Dataset` to read elements from. Element type must be `tf.string`. |
| `string_max_bytes` | The maximum length (in bytes) of strings in the dataset. Strings longer than `string_max_bytes` will be truncated. Defaults to `None`, which means there is no limit on string length. |
| Returns | |
|---|---|
| `elements` | A rank-1 Tensor containing all the unique elements of the input `dataset`. |
| `counts` | A rank-1 Tensor containing the counts for each of the elements in `elements`. |
| Raises | |
|---|---|
| `ValueError` | If the shape of elements in `dataset` is not rank 1, or if `string_max_bytes` is not `None` and is less than 1. |
| `TypeError` | If `dataset.element_spec.dtype` is not `tf.string`. |
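
The error conditions listed above can be summarized as a set of precondition checks. The following is a hedged sketch only: `check_arguments_sketch` is a hypothetical helper, the element rank and dtype are passed in as plain values rather than read from a real `tf.data.Dataset`, and the order in which the library performs its checks is an assumption.

```python
from typing import Optional


def check_arguments_sketch(
    element_rank: int,
    element_dtype: str,
    string_max_bytes: Optional[int],
) -> None:
    """Hypothetical mirror of the documented ValueError/TypeError conditions."""
    # Assumption: the check order below is illustrative, not the library's.
    if element_rank != 1:
        raise ValueError(f"Expected rank-1 elements, got rank {element_rank}.")
    if string_max_bytes is not None and string_max_bytes < 1:
        raise ValueError(
            f"string_max_bytes must be at least 1, got {string_max_bytes}."
        )
    if element_dtype != "string":
        raise TypeError(f"Expected string elements, got {element_dtype}.")


cases = [
    (1, "string", None),  # valid: rank 1, string dtype, no length limit
    (2, "string", None),  # invalid rank
    (1, "string", 0),     # invalid string_max_bytes
    (1, "int32", None),   # invalid dtype
]
for case in cases:
    try:
        check_arguments_sketch(*case)
        print(case, "ok")
    except (ValueError, TypeError) as err:
        print(case, type(err).__name__, err)
```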