Calculate the sufficient statistics (average & second moment) of two sets.
tf_agents.utils.tensor_normalizer.parallel_variance_calculation(
    n_a: tf_agents.typing.types.Int,
    avg_a: tf_agents.typing.types.Float,
    m2_a: tf_agents.typing.types.Float,
    n_b: tf_agents.typing.types.Int,
    avg_b: tf_agents.typing.types.Float,
    m2_b: tf_agents.typing.types.Float,
    m2_b_c: tf_agents.typing.types.Float
) -> Tuple[tf_agents.typing.types.Int, tf_agents.typing.types.Float, tf_agents.typing.types.Float, tf_agents.typing.types.Float]
For better precision if sets are of different sizes, `a` should be the smaller and `b` the bigger.
For more details, see the parallel algorithm of Chan et al. at: https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Parallel_algorithm
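
The merge step of the parallel algorithm can be sketched in plain Python as follows. This is a minimal illustration of the formulas in the linked article, not the TF-Agents implementation: the helper names `merge_moments` and `moments` are hypothetical, the code works on Python floats rather than tensors, and it leaves out the Kahan compensation term.

```python
def merge_moments(n_a, avg_a, m2_a, n_b, avg_b, m2_b):
    """Combine (count, mean, sum of squared deviations) of two sets."""
    n_ab = n_a + n_b
    delta = avg_b - avg_a
    # Mean of the union: shift avg_a towards avg_b in proportion to n_b.
    avg_ab = avg_a + delta * n_b / n_ab
    # Second moment of the union: both parts plus a between-set correction.
    m2_ab = m2_a + m2_b + delta * delta * n_a * n_b / n_ab
    return n_ab, avg_ab, m2_ab


def moments(xs):
    """Sufficient statistics of a single list, computed directly."""
    n = len(xs)
    avg = sum(xs) / n
    m2 = sum((x - avg) ** 2 for x in xs)
    return n, avg, m2


a = [1.0, 2.0, 3.0]            # smaller set first
b = [4.0, 5.0, 6.0, 7.0, 8.0]  # larger set second

merged = merge_moments(*moments(a), *moments(b))
direct = moments(a + b)  # same statistics computed over the concatenation
```

Merging the statistics of `a` and `b` should agree with `direct` up to floating-point rounding.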
For numerical stability we use `kahan_summation` to accumulate second moments.
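
As a rough illustration of why compensated summation helps, the sketch below implements the textbook Kahan step in plain Python. The function name `kahan_add` is hypothetical, and the way TF-Agents threads its compensation term (`m2_b_c`, `m2_ab_c`) through the calculation may differ from this sketch.

```python
def kahan_add(total, compensation, value):
    """Add `value` to `total` while tracking the lost low-order bits."""
    y = value - compensation        # re-inject the error from the last step
    t = total + y                   # low-order bits of y may be lost here
    compensation = (t - total) - y  # recover what was lost
    return t, compensation


naive = 1.0
total, comp = 1.0, 0.0
for _ in range(10000):
    # Each addend is below half an ulp of 1.0, so the naive sum never moves.
    naive += 1e-16
    total, comp = kahan_add(total, comp, 1e-16)
```

Here `naive` stays at exactly `1.0`, while `total` ends up close to `1.0 + 1e-12`.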
Takes in the sufficient statistics for sets `A` and `B` and calculates the variance and sufficient statistics for the union of `A` and `B`.
If, for example, `B` is a single observation `x_b`, use `n_b = 1`, `avg_b = x_b`, and `m2_b = 0`.
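
With those single-observation values substituted, the merge formulas reduce to a simple streaming update. The sketch below shows that special case in plain Python. The name `add_observation` is hypothetical, starting from an empty set (`n = 0`) is an assumption made for the example, and Kahan compensation is omitted.

```python
def add_observation(n_a, avg_a, m2_a, x_b):
    """Merge one observation x_b (n_b = 1, avg_b = x_b, m2_b = 0) into A."""
    n_ab = n_a + 1
    delta = x_b - avg_a
    avg_ab = avg_a + delta / n_ab
    m2_ab = m2_a + delta * delta * n_a / n_ab
    return n_ab, avg_ab, m2_ab


# Assumed starting point: an empty set with zeroed statistics.
n, avg, m2 = 0, 0.0, 0.0
for x in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    n, avg, m2 = add_observation(n, avg, m2, x)
```

After the loop, `n`, `avg`, and `m2` hold the sufficient statistics of all eight observations.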
To get `avg_a` and `m2_a` from a tensor `x` of shape `[n_a, ...]`, use:
n_a = tf.shape(x)[0]
avg_a = tf.math.reduce_mean(x, axis=[0])
m2_a = tf.math.reduce_sum(tf.math.squared_difference(x, avg_a), axis=[0])
Returns | |
---|---|
A tuple `(n_ab, avg_ab, m2_ab, m2_ab_c)` such that `var_ab`, the variance of `A|B`, may be calculated via `var_ab = m2_ab / n_ab`, and the sample variance as `sample_var_ab = m2_ab / (n_ab - 1)`. |
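
The two variance formulas can be checked against Python's `statistics` module. The sketch below is a plain-Python illustration with made-up data. It computes `n_ab`, `avg_ab`, and `m2_ab` directly from a list rather than by calling `parallel_variance_calculation`.

```python
import statistics

x = [1.0, 2.0, 3.0, 4.0]  # illustrative data, standing in for the union of A and B

n_ab = len(x)
avg_ab = sum(x) / n_ab
m2_ab = sum((v - avg_ab) ** 2 for v in x)

var_ab = m2_ab / n_ab               # population variance
sample_var_ab = m2_ab / (n_ab - 1)  # sample variance (Bessel's correction)

# Reference values from the standard library.
expected_var = statistics.pvariance(x)
expected_sample_var = statistics.variance(x)
```

The only difference between the two is the divisor: `n_ab` for the population variance and `n_ab - 1` for the sample variance.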