A running variance computation.
Inherits From: RunningCovariance, AutoCompositeTensor
tfp.experimental.stats.RunningVariance(
num_samples,
mean,
sum_squared_residuals,
event_ndims,
name='RunningCovariance'
)
This is an alias for RunningCovariance with event_ndims set to 0, so that it computes variances rather than covariances.
RunningVariance is meant to serve general streaming variance needs. For a specialized version that fits streaming over MCMC samples, see VarianceReducer in tfp.experimental.mcmc.
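The constructor arguments above (num_samples, mean, sum_squared_residuals) are the state a streaming variance needs to carry. As a rough illustration of how such a state can be updated one sample at a time, here is a minimal pure-Python sketch using a Welford-style update. The class `RunningVarianceSketch` is hypothetical and is not the TFP implementation; it handles only scalar samples and assumes the variance is `sum_squared_residuals / (num_samples - ddof)`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunningVarianceSketch:
    """Hypothetical scalar stand-in for illustration; not the TFP class."""
    num_samples: float = 0.0
    mean: float = 0.0
    sum_squared_residuals: float = 0.0

    def update(self, new_sample: float) -> "RunningVarianceSketch":
        # Welford-style single-sample update; returns a new state object
        # instead of mutating the current one.
        n = self.num_samples + 1.0
        delta = new_sample - self.mean
        mean = self.mean + delta / n
        # Product of the residuals against the old mean and the new mean.
        ssr = self.sum_squared_residuals + delta * (new_sample - mean)
        return RunningVarianceSketch(n, mean, ssr)

    def variance(self, ddof: int = 0) -> float:
        # Assumed divisor: num_samples - ddof.
        return self.sum_squared_residuals / (self.num_samples - ddof)


state = RunningVarianceSketch()
for x in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    state = state.update(x)

print(state.mean, state.variance())
```

Each call to `update` returns a fresh state, which mirrors the functional style of the documented `update` method (it returns a newly allocated object).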
Methods
covariance
covariance(
ddof=0
)
Returns the covariance accumulated so far.
Args | |
---|---|
ddof | Requested delta degrees of freedom for the covariance calculation. For example, use ddof=0 for population covariance and ddof=1 for sample covariance. Defaults to 0 (population covariance). |
Returns | |
---|---|
covariance | An estimate of the covariance. |
from_example
@classmethod
from_example( example )
Starts a RunningVariance from an example.
Args | |
---|---|
example | A Tensor. The RunningVariance will accept samples of the same dtype and broadcast-compatible shape as the example. |
Returns | |
---|---|
var | An empty RunningVariance, ready for incoming samples. Note that by convention, the supplied example is used only for initialization and is not counted as a sample. |
from_shape
@classmethod
from_shape( shape=(), dtype=tf.float32 )
Starts a RunningVariance from shape and dtype metadata.
Args | |
---|---|
shape | Python tuple or TensorShape representing the shape of incoming samples. This is useful to supply if the RunningVariance will be carried by a tf.while_loop, so that broadcasting does not change the shape across loop iterations. |
dtype | Dtype of incoming samples and the resulting statistics. By default, the dtype is tf.float32. Any integer dtype will be cast to the corresponding float dtype (e.g. tf.int32 will be cast to tf.float32), because intermediate calculations require floating-point division. |
Returns | |
---|---|
var | An empty RunningVariance, ready for incoming samples. |
from_stats
@classmethod
from_stats( num_samples, mean, variance )
Initialize a RunningVariance object with given stats.
This allows the user to initialize knowing the mean, variance, and number of samples seen so far.
Args | |
---|---|
num_samples | Scalar float Tensor, for number of examples already seen. |
mean | float Tensor, for starting mean of estimate. |
variance | float Tensor, for starting estimate of the variance. |
Returns | |
---|---|
RunningVariance object, with given mean and variance estimate.
|
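To make the idea of resuming from known statistics concrete, the following pure-Python sketch rebuilds a running state from (num_samples, mean, variance) and keeps streaming. It is an illustration only, not the library's code: it assumes the supplied variance is a population variance (ddof=0), so that the sum of squared residuals can be recovered as `variance * num_samples`. The page does not state which convention from_stats expects.

```python
import statistics

# Statistics of samples that were seen earlier (computed here for the demo).
seen = [1.0, 2.0, 3.0, 4.0]
num_samples = float(len(seen))
mean = statistics.fmean(seen)
variance = statistics.pvariance(seen)

# Assumption: `variance` is a population variance, so the accumulated
# sum of squared residuals is variance * num_samples.
sum_squared_residuals = variance * num_samples

# Continue streaming new samples with a Welford-style update.
new_samples = [5.0, 6.0]
for x in new_samples:
    num_samples += 1.0
    delta = x - mean
    mean += delta / num_samples
    sum_squared_residuals += delta * (x - mean)

resumed_variance = sum_squared_residuals / num_samples
full_variance = statistics.pvariance(seen + new_samples)
print(resumed_variance, full_variance)
```

Under that assumption, the resumed estimate agrees (up to floating-point rounding) with the variance computed over all samples at once.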
tree_flatten
tree_flatten()
tree_unflatten
@classmethod
tree_unflatten( metadata, tensors )
update
update(
new_sample, axis=None
)
Update the RunningCovariance with a new sample.
The update formula is from Philippe Pebay (2008) [1]. This implementation supports both batched and chunked covariance computation. A "batch" is the usual parallel computation: a batch of size N implies N independent covariance computations, each stepping one sample (or chunk) at a time. A "chunk" of size M implies incorporating M samples into a single covariance computation at once, which is more efficient than adding them one at a time.
To further illustrate the difference between batching and chunking, consider the following example:
import tensorflow as tf
import tensorflow_probability as tfp

# Treat as 3 samples from each of 5 independent vector random variables of
# shape (2,): axis 0 is the chunk axis, axis 1 is the batch axis.
sample = tf.ones((3, 5, 2))
running_cov = tfp.experimental.stats.RunningCovariance.from_shape(
    (5, 2), event_ndims=1)
running_cov = running_cov.update(sample, axis=0)
final_cov = running_cov.covariance()
final_cov.shape # (5, 2, 2)
Args | |
---|---|
new_sample | Incoming sample with shape and dtype compatible with those used to form this RunningCovariance. |
axis | If chunking is desired, this is an integer that specifies the axis with chunked samples. For individual samples, set this to None. By default, samples are not chunked (axis is None). |
Returns | |
---|---|
cov | Newly allocated RunningCovariance updated to include new_sample. |
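The chunked update described above can be sketched in pure Python for the scalar (variance) case. This uses the standard pairwise merge of two sets of statistics, which is the kind of formula given in the cited report [1]; it is not taken from the TFP source. The helper `merge_chunk` is a hypothetical name introduced for this illustration.

```python
import statistics


def merge_chunk(num_samples, mean, ssr, chunk):
    """Fold a whole chunk of scalar samples into the running state at once."""
    m = float(len(chunk))
    chunk_mean = statistics.fmean(chunk)
    chunk_ssr = sum((x - chunk_mean) ** 2 for x in chunk)

    total = num_samples + m
    delta = chunk_mean - mean
    new_mean = mean + delta * m / total
    # Residuals of both parts, plus a correction for the shift in means.
    new_ssr = ssr + chunk_ssr + delta ** 2 * num_samples * m / total
    return total, new_mean, new_ssr


chunks = ([1.0, 2.0, 3.0], [10.0, 20.0], [4.0, 5.0, 6.0, 7.0])
state = (0.0, 0.0, 0.0)  # (num_samples, mean, sum of squared residuals)
for chunk in chunks:
    state = merge_chunk(*state, chunk)

num_samples, mean, ssr = state
chunked_variance = ssr / num_samples
reference = statistics.pvariance([x for chunk in chunks for x in chunk])
print(chunked_variance, reference)
```

Each chunk costs one merge regardless of its length, which is where the efficiency gain over sample-by-sample updates comes from.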
References
[1]: Philippe Pebay. Formulas for Robust, One-Pass Parallel Computation of Covariances and Arbitrary-Order Statistical Moments. Technical Report SAND2008-6212, 2008. https://prod-ng.sandia.gov/techlib-noauth/access-control.cgi/2008/086212.pdf
variance
variance(
ddof=0
)
Returns the variance accumulated so far.
Args | |
---|---|
ddof | Requested delta degrees of freedom for the variance calculation. For example, use ddof=0 for population variance and ddof=1 for sample variance. Defaults to 0 (population variance). |
Returns | |
---|---|
variance | An estimate of the variance. |
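The effect of ddof can be illustrated with a small pure-Python sketch. It assumes the common convention (as in NumPy) that the divisor is `num_samples - ddof`; the helper `variance_from_residuals` is hypothetical and only demonstrates that convention.

```python
import statistics

samples = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
num_samples = len(samples)
mean = statistics.fmean(samples)
sum_squared_residuals = sum((x - mean) ** 2 for x in samples)


def variance_from_residuals(ssr, n, ddof=0):
    # Assumed convention: divide by (n - ddof).
    return ssr / (n - ddof)


population = variance_from_residuals(sum_squared_residuals, num_samples, ddof=0)
sample = variance_from_residuals(sum_squared_residuals, num_samples, ddof=1)
print(population, sample)
```

With ddof=0 the result matches `statistics.pvariance`, and with ddof=1 it matches `statistics.variance`, the unbiased sample estimate.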