tfp.experimental.stats.RunningVariance

A running variance computation.

Inherits From: RunningCovariance, AutoCompositeTensor

tfp.experimental.stats.RunningVariance(
    num_samples,
    mean,
    sum_squared_residuals,
    event_ndims,
    name='RunningCovariance'
)

This is an alias for `RunningCovariance` with `event_ndims` set to 0, so that it computes elementwise variances rather than covariance matrices.

RunningVariance is meant to serve general streaming variance needs. For a specialized version that fits streaming over MCMC samples, see VarianceReducer in tfp.experimental.mcmc.
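The constructor arguments `num_samples`, `mean`, and `sum_squared_residuals` are the statistics a streaming variance carries between updates. As a rough illustration of how such statistics can be maintained one sample at a time, here is a pure-Python sketch using Welford's recurrence. It is not TFP's implementation: `RunningStats` and `update_one` are hypothetical names introduced only for this example, and it handles scalars rather than tensors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunningStats:
    """Hypothetical stand-in for the statistics carried between updates."""
    num_samples: float = 0.0
    mean: float = 0.0
    sum_squared_residuals: float = 0.0


def update_one(stats: RunningStats, x: float) -> RunningStats:
    """Return new statistics that also account for sample x."""
    n = stats.num_samples + 1.0
    delta = x - stats.mean
    mean = stats.mean + delta / n
    # (x - old_mean) * (x - new_mean) is the increment to the residual sum.
    ssr = stats.sum_squared_residuals + delta * (x - mean)
    return RunningStats(n, mean, ssr)


stats = RunningStats()
for x in [1.0, 2.0, 3.0, 4.0]:
    stats = update_one(stats, x)

# Population variance: residual sum divided by the sample count.
population_variance = stats.sum_squared_residuals / stats.num_samples
print(stats.mean, population_variance)
```

Like `update` in this API, the sketch returns a new object instead of mutating the old one, which is the pattern that lets the state be threaded through a loop.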

Methods

`covariance`

covariance(
    ddof=0
)

Returns the covariance accumulated so far.

Args
`ddof`	Requested delta degrees of freedom for the covariance calculation. For example, use `ddof=0` for population covariance and `ddof=1` for sample covariance. Defaults to the population covariance.

Returns
`covariance`	An estimate of the covariance.

`from_example`

@classmethod
from_example(
    example
)

Starts a RunningVariance from an example.

Args
`example`	A `Tensor`. The `RunningVariance` will accept samples of the same dtype and broadcast-compatible shape as the example.

Returns
`var`	An empty `RunningVariance`, ready for incoming samples. Note that by convention, the supplied example is used only for initialization, but not counted as a sample.

`from_shape`

@classmethod
from_shape(
    shape=(), dtype=tf.float32
)

Starts a RunningVariance from shape and dtype metadata.

Args
`shape`	Python `Tuple` or `TensorShape` representing the shape of incoming samples. This is useful to supply if the `RunningVariance` will be carried by a `tf.while_loop`, so that broadcasting does not change the shape across loop iterations.
`dtype`	Dtype of incoming samples and the resulting statistics. By default, the dtype is `tf.float32`. Any integer dtype will be cast to the corresponding float dtype (e.g. `tf.int32` will be cast to `tf.float32`), because intermediate calculations perform floating-point division.

Returns
`var`	An empty `RunningVariance`, ready for incoming samples.

`from_stats`

@classmethod
from_stats(
    num_samples, mean, variance
)

Initialize a RunningVariance object with given stats.

This allows the user to initialize the object from a known mean, variance, and number of samples seen so far.

Args
`num_samples`	Scalar `float` `Tensor`, for number of examples already seen.
`mean`	`float` `Tensor`, for starting mean of estimate.
`variance`	`float` `Tensor`, for starting estimate of the variance.

Returns
`RunningVariance` object, with given mean and variance estimate.
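To make the idea of resuming from known statistics concrete, the pure-Python sketch below converts a count, mean, and variance into a running state and then keeps streaming. It assumes the supplied variance is a population variance (`ddof=0`), so the residual sum is the count times the variance. Whether TFP performs exactly this conversion is an assumption here, and `stats_to_state` and `update_one` are hypothetical helpers, not part of the library.

```python
def stats_to_state(num_samples, mean, variance):
    """Turn (count, mean, population variance) into (count, mean, residual sum).

    Assumes `variance` was computed with ddof=0.
    """
    return num_samples, mean, num_samples * variance


def update_one(state, x):
    """Fold one new scalar sample into the state (Welford's recurrence)."""
    n, mean, ssr = state
    n += 1.0
    delta = x - mean
    mean += delta / n
    ssr += delta * (x - mean)
    return n, mean, ssr


# Statistics already known for the samples [1, 2, 3, 4].
state = stats_to_state(num_samples=4.0, mean=2.5, variance=1.25)

# Continue streaming with two more samples.
for x in [5.0, 6.0]:
    state = update_one(state, x)

n, mean, ssr = state
resumed_variance = ssr / n
print(n, mean, resumed_variance)
```

The resumed result matches the population variance of all six samples computed from scratch, which is the property that makes this kind of initialization useful.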

`tree_flatten`

tree_flatten()

`tree_unflatten`

@classmethod
tree_unflatten(
    metadata, tensors
)

`update`

update(
    new_sample, axis=None
)

Update the RunningCovariance with a new sample.

The update formula is from Philippe Pebay (2008) [1]. This implementation supports both batched and chunked covariance computation. A "batch" is the usual parallel computation: a batch of size N implies N independent covariance computations, each stepping one sample (or chunk) at a time. A "chunk" of size M implies incorporating M samples into a single covariance computation at once, which is more efficient than incorporating them one at a time.

To further illustrate the difference between batching and chunking, consider the following example:

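import tensorflow as tf
import tensorflow_probability as tfp

# Treat `sample` as a chunk of 3 samples from each of 5 independent
# (batched) vector random variables of shape (2,).
sample = tf.ones((3, 5, 2))
running_cov = tfp.experimental.stats.RunningCovariance.from_shape(
    (5, 2), event_ndims=1)
# axis=0 marks the chunk axis, so all 3 samples are folded in at once.
running_cov = running_cov.update(sample, axis=0)
final_cov = running_cov.covariance()
final_cov.shape # (5, 2, 2)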

Args
`new_sample`	Incoming sample with shape and dtype compatible with those used to form this `RunningCovariance`.
`axis`	If chunking is desired, this is an integer that specifies the axis with chunked samples. For individual samples, set this to `None`. By default, samples are not chunked (`axis` is None).

Returns
`cov`	Newly allocated `RunningCovariance` updated to include `new_sample`.
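The chunked update described above can be sketched in pure Python for the scalar variance case. The pairwise combination used here is the standard one-pass formula for merging two sets of statistics, of the kind covered in [1]. This is an illustrative sketch, not TFP's code, and `chunk_stats` and `merge` are hypothetical names.

```python
def chunk_stats(chunk):
    """Compute (count, mean, residual sum) for a whole chunk at once."""
    n = float(len(chunk))
    mean = sum(chunk) / n
    ssr = sum((x - mean) ** 2 for x in chunk)
    return n, mean, ssr


def merge(a, b):
    """Combine two (count, mean, residual sum) triples in a single step."""
    n_a, mean_a, ssr_a = a
    n_b, mean_b, ssr_b = b
    n = n_a + n_b
    if n == 0.0:
        return 0.0, 0.0, 0.0
    delta = mean_b - mean_a
    mean = mean_a + delta * n_b / n
    # The cross term corrects for the two groups having different means.
    ssr = ssr_a + ssr_b + delta * delta * n_a * n_b / n
    return n, mean, ssr


state = (0.0, 0.0, 0.0)  # empty running state
for chunk in ([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]):
    state = merge(state, chunk_stats(chunk))

n, mean, ssr = state
chunked_variance = ssr / n
print(n, mean, chunked_variance)
```

Each chunk costs one merge instead of one update per sample. A batch, by contrast, would correspond to keeping several such states side by side and updating each independently.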

References

[1]: Philippe Pebay. Formulas for Robust, One-Pass Parallel Computation of Covariances and Arbitrary-Order Statistical Moments. Technical Report SAND2008-6212, 2008. https://prod-ng.sandia.gov/techlib-noauth/access-control.cgi/2008/086212.pdf

`variance`

variance(
    ddof=0
)

Returns the variance accumulated so far.

Args
`ddof`	Requested delta degrees of freedom for the variance calculation. For example, use `ddof=0` for population variance and `ddof=1` for sample variance. Defaults to the population variance.

Returns
`variance`	An estimate of the variance.
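The effect of `ddof` can be checked with a small pure-Python sketch. It assumes the estimate is the accumulated sum of squared residuals divided by `num_samples - ddof`, which is consistent with the population and sample cases named above. `variance_from_state` is a hypothetical helper, not a TFP function.

```python
import statistics


def variance_from_state(num_samples, sum_squared_residuals, ddof=0):
    """Variance estimate from accumulated statistics (assumed divisor: n - ddof)."""
    return sum_squared_residuals / (num_samples - ddof)


samples = [1.0, 2.0, 3.0, 4.0]
n = float(len(samples))
mean = sum(samples) / n
ssr = sum((x - mean) ** 2 for x in samples)

population = variance_from_state(n, ssr, ddof=0)  # divides by n
sample = variance_from_state(n, ssr, ddof=1)      # divides by n - 1

# Cross-check against the standard library's definitions.
matches_population = abs(population - statistics.pvariance(samples)) < 1e-12
matches_sample = abs(sample - statistics.variance(samples)) < 1e-12
print(population, sample, matches_population, matches_sample)
```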