make_accumulators_ptransform: this is a beam.PTransform which maps data
to a more compact mergeable representation (accumulator). Mergeable here
means that it is possible to combine multiple representations produced from
a partition of the dataset into a representation of the entire dataset.
merge_accumulators_ptransform: this is a beam.PTransform which operates
on a collection of accumulators, i.e. the results of both the
make_accumulators_ptransform and merge_accumulators_ptransform stages,
and produces a single reduced accumulator. This operation must be
associative and commutative in order to have reliably reproducible results.
extract_output: this is a beam.PTransform which operates on the result of
the merge_accumulators_ptransform stage, and produces the outputs of the
analyzer. These outputs must be consistent with the output_dtypes and
output_shapes provided to ptransform_analyzer.
This container also holds a cache_coder (PTransformAnalyzerCacheCoder)
which can encode outputs and decode the inputs of the
merge_accumulators_ptransform stage.
In many cases, SimpleJsonPTransformAnalyzerCacheCoder would be sufficient.
To ensure the correctness of this analyzer, the following must hold:
merge(make({D1, ..., Dn})) == merge({make(D1), ..., make(Dn)})