Factory for aggregating strings and their values using an Invertible Bloom Lookup Table (IBLT).
Inherits From: UnweightedAggregationFactory
tff.analytics.IbltFactory(
*,
capacity: int,
string_max_bytes: int,
encoding: tff.analytics.heavy_hitters.iblt.CharacterEncoding = tff.analytics.heavy_hitters.iblt.CharacterEncoding.UTF8,
repetitions: int,
seed: int = 0,
sketch_agg_factory: Optional[tff.aggregators.UnweightedAggregationFactory] = None,
value_tensor_agg_factory: Optional[tff.aggregators.UnweightedAggregationFactory] = None
) -> None
Args |
capacity
|
The capacity of the IBLT sketch. Must be positive.
|
string_max_bytes
|
The maximum length in bytes of a string in the IBLT.
Must be positive.
|
encoding
|
The character encoding of the string data to encode. For
non-character binary data or strings with unknown encoding, specify
CharacterEncoding.UNKNOWN.
|
repetitions
|
The number of repetitions in the IBLT data structure.
Must be at least 3.
|
seed
|
An integer seed for hash functions. Defaults to 0.
|
sketch_agg_factory
|
(Optional) An UnweightedAggregationFactory specifying
the value aggregation to sum IBLT sketches. Defaults to
tff.aggregators.SumFactory. If sketch_agg_factory is set to a
tff.aggregators.SecureSumFactory, then the upper_bound_threshold
should be at least 2 ** 32 - 1.
|
value_tensor_agg_factory
|
(Optional) An UnweightedAggregationFactory
specifying the value aggregation to sum value tensors. Defaults to
tff.aggregators.SumFactory. Note that when sketch_agg_factory
is set to a tff.aggregators.SecureSumFactory, the values to be summed
might be clipped, depending on the choice of the upper_bound_threshold and
lower_bound_threshold parameters in SecureSumFactory.
|
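Because string_max_bytes is measured in bytes rather than characters, a string containing non-ASCII characters can use more of that budget than its character count suggests. The following is a minimal sketch in plain Python, with no TFF dependency, of how one might measure the UTF-8 byte length of candidate strings before choosing string_max_bytes. The helper name utf8_byte_length and the sample strings are hypothetical, and the sketch makes no claim about how the library treats strings that exceed the limit.

```python
def utf8_byte_length(text: str) -> int:
    """Returns the number of bytes `text` occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


# Hypothetical sample strings, written with escapes so the source stays ASCII:
# "hello", "hello" with an accented e (U+00E9), and a three-character CJK word.
samples = ["hello", "h\u00e9llo", "\u65e5\u672c\u8a9e"]

# Character counts and byte counts diverge once non-ASCII characters appear.
char_lengths = [len(s) for s in samples]
byte_lengths = [utf8_byte_length(s) for s in samples]

# Smallest string_max_bytes that would cover every sample string.
required_max_bytes = max(byte_lengths)
```

Under these assumptions, a value of string_max_bytes chosen from character counts alone could be too small for the last sample, which has three characters but a longer byte encoding.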
Raises |
ValueError
|
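If the parameters don't meet the expectations listed above (for example, a non-positive capacity or string_max_bytes, or repetitions below 3).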
|
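The constraints listed in the Args table can be checked before the factory is constructed. The sketch below is a hypothetical pre-check in plain Python that mirrors only the documented rules (positive capacity, positive string_max_bytes, repetitions of at least 3). The function name check_iblt_params and the error messages are assumptions for illustration, not the library's own validation code.

```python
def check_iblt_params(capacity: int, string_max_bytes: int, repetitions: int) -> None:
    """Hypothetical pre-check; raises ValueError on a documented violation."""
    if capacity < 1:
        raise ValueError(f"capacity must be positive, got {capacity}")
    if string_max_bytes < 1:
        raise ValueError(f"string_max_bytes must be positive, got {string_max_bytes}")
    if repetitions < 3:
        raise ValueError(f"repetitions must be at least 3, got {repetitions}")


# A configuration that satisfies every documented constraint passes silently.
check_iblt_params(capacity=1000, string_max_bytes=10, repetitions=3)

# A configuration with too few repetitions is rejected.
try:
    check_iblt_params(capacity=1000, string_max_bytes=10, repetitions=2)
    rejected_message = None
except ValueError as err:
    rejected_message = str(err)
```

Running a check like this on configuration values loaded from a file or command line can surface a bad setting with a specific message before any TFF computation is built.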
Methods
create
create(
value_type: tff.types.SequenceType
) -> tff.templates.AggregationProcess
Creates an AggregationProcess using IBLT to aggregate strings.
Args |
value_type
|
A tff.SequenceType representing the type of the input
dataset. Must be compatible with the following tff.Type:
tff.SequenceType(collections.OrderedDict([ (DATASET_KEY, np.str_),
(DATASET_VALUE, tff.TensorType(shape=[None], dtype=np.int64)), ]))
|
Raises |
ValueError
|
If value_type is not as expected.
|