Applies Batch Normalization [(Ioffe and Szegedy, 2015)][1] to samples from a
data distribution. This can be used to stabilize training of normalizing
flows ([Papamakarios et al., 2017][3]; [Dinh et al., 2017][2]).
When training Deep Neural Networks (DNNs), it is common practice to
normalize or whiten features by shifting them to have zero mean and
scaling them to have unit variance.
The inverse() method of the BatchNormalization bijector, which is used in
the log-likelihood computation of data samples, implements the normalization
procedure (shift-and-scale) using the mean and standard deviation of the
current minibatch.
Conversely, the forward() method of the bijector de-normalizes samples (e.g.
X*std(Y) + mean(Y)) using the running-average mean and standard deviation
computed at training time. De-normalization is useful for sampling.
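import tensorflow_probability as tfp
tfd = tfp.distributions
tfb = tfp.bijectors

dist = tfd.TransformedDistribution(
    distribution=tfd.Normal(loc=0., scale=1.),
    bijector=tfb.BatchNormalization())
# Samples of shape [100, 1]: a batch of 100 with one feature.
y = tfd.MultivariateNormalDiag(loc=[1.], scale_diag=[2.]).sample(100)  # ~ N(1, 2)
x = dist.bijector.inverse(y)  # ~ N(0, 1)
y = dist.sample()  # ~ N(1, 2)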
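The shift-and-scale arithmetic described above can be sketched in plain Python. This is only an illustration of the normalization step, not the bijector's implementation: the helper names `normalize` and `denormalize` and the `eps` stabilizer are assumptions, and the learned scale and offset parameters of a real batch-normalization layer are left out.

```python
import math
import statistics

def normalize(batch, eps=1e-6):
    """Shift and scale a list of floats using the batch's own statistics."""
    mean = statistics.fmean(batch)
    var = statistics.pvariance(batch, mu=mean)
    std = math.sqrt(var + eps)  # eps is an assumed stabilizer against zero variance
    return [(y - mean) / std for y in batch], mean, std

def denormalize(xs, mean, std):
    """Undo the shift and scale with the supplied statistics."""
    return [x * std + mean for x in xs]

batch = [1.0, 3.0, 5.0, 7.0]
x, mean, std = normalize(batch)       # roughly zero mean, unit variance
y_back = denormalize(x, mean, std)    # recovers the batch when the same statistics are used
```

The round trip is exact here only because `denormalize` receives the same statistics that `normalize` computed.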
During training, BatchNormalization.inverse and BatchNormalization.forward are
not guaranteed to be inverses of each other because inverse(y) uses statistics
of the current minibatch, while forward(x) uses running-average statistics
accumulated from training. In other words, the round trips
BatchNormalization.inverse(BatchNormalization.forward(...)) and
BatchNormalization.forward(BatchNormalization.inverse(...)) recover their
input when training=False but may not when training=True.
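A toy model makes the mismatch concrete. The class below is a hypothetical scalar stand-in, not the TFP bijector: the name `ToyBatchNorm`, the `momentum` update rule, and passing `training` to `inverse` (rather than to the constructor) are all assumptions made to keep the sketch short.

```python
import math
import statistics

class ToyBatchNorm:
    """Hypothetical scalar batch norm keeping running-average statistics."""

    def __init__(self, momentum=0.9, eps=1e-6):
        self.momentum = momentum
        self.eps = eps
        self.running_mean = 0.0
        self.running_var = 1.0

    def inverse(self, ys, training):
        if training:
            # Normalize with the statistics of the current minibatch.
            mean = statistics.fmean(ys)
            var = statistics.pvariance(ys, mu=mean)
            # Move the running averages toward the minibatch statistics.
            m = self.momentum
            self.running_mean = m * self.running_mean + (1 - m) * mean
            self.running_var = m * self.running_var + (1 - m) * var
        else:
            mean, var = self.running_mean, self.running_var
        std = math.sqrt(var + self.eps)
        return [(y - mean) / std for y in ys]

    def forward(self, xs):
        # Always de-normalize with the running averages.
        std = math.sqrt(self.running_var + self.eps)
        return [x * std + self.running_mean for x in xs]

def max_round_trip_error(bn, ys, training):
    back = bn.forward(bn.inverse(ys, training=training))
    return max(abs(a - b) for a, b in zip(ys, back))

bn = ToyBatchNorm()
ys = [1.0, 3.0, 5.0, 7.0]
train_err = max_round_trip_error(bn, ys, training=True)   # large: statistics differ
eval_err = max_round_trip_error(bn, ys, training=False)   # near zero: same statistics
```

In this sketch the training-mode round trip is off by several units, because the running averages have moved only a fraction of the way toward the minibatch statistics, while the evaluation-mode round trip is exact up to floating-point error.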
References
[1]: Sergey Ioffe and Christian Szegedy. Batch Normalization: Accelerating
Deep Network Training by Reducing Internal Covariate Shift. In
International Conference on Machine Learning, 2015.
https://arxiv.org/abs/1502.03167
[2]: Laurent Dinh, Jascha Sohl-Dickstein, and Samy Bengio. Density Estimation
using Real NVP. In International Conference on Learning
Representations, 2017. https://arxiv.org/abs/1605.08803
[3]: George Papamakarios, Theo Pavlakou, and Iain Murray. Masked
Autoregressive Flow for Density Estimation. In Neural Information
Processing Systems, 2017. https://arxiv.org/abs/1705.07057
1e-6). This ensures positivity of the scale variable.
</td>
</tr><tr>
<td>training</td>
<td>
If True, updates running-average statistics during call to inverse().
</td>
</tr><tr>
<td>validate_args</td>
<td>
Python bool indicating whether arguments should be
checked for correctness.
</td>
</tr><tr>
<td>name</td>
<td>
Python str name given to ops managed by this object.
Tensor. The input to the "forward" Jacobian determinant evaluation.
event_ndims
Number of dimensions in the probabilistic events being
transformed. Must be greater than or equal to
self.forward_min_event_ndims. The result is summed over the final
dimensions to produce a scalar Jacobian determinant for each event,
i.e. the result has x.shape.ndims - event_ndims dimensions.
name
The name to give this op.
Returns
Tensor, if this bijector is injective.
If not injective this is not implemented.
Raises
TypeError
if self.dtype is specified and x.dtype is not
self.dtype.
NotImplementedError
if neither _forward_log_det_jacobian
nor {_inverse, _inverse_log_det_jacobian} are implemented, or
this is a non-injective bijector.
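The effect of event_ndims on the result's shape can be illustrated with nested lists. This is a sketch under the assumption of an elementwise transformation whose minimum event rank is 0; the helper `sum_over_event_dims` and the sample values are hypothetical and are not part of the library.

```python
def sum_over_event_dims(elementwise_ldj, event_ndims):
    """Sum a nested list over its final `event_ndims` dimensions."""
    def ndims(a):
        return 1 + ndims(a[0]) if isinstance(a, list) else 0

    def total(a):
        return sum(total(v) for v in a) if isinstance(a, list) else a

    def reduce(a, depth):
        # `depth` counts the leading (batch) dimensions that are kept.
        if depth == 0:
            return total(a)
        return [reduce(v, depth - 1) for v in a]

    return reduce(elementwise_ldj, ndims(elementwise_ldj) - event_ndims)

# Hypothetical elementwise log-determinants for an input of shape [2, 3].
elementwise = [[0.1, 0.2, 0.3],
               [1.0, 2.0, 3.0]]
per_element = sum_over_event_dims(elementwise, event_ndims=0)  # keeps shape [2, 3]
per_event = sum_over_event_dims(elementwise, event_ndims=1)    # shape [2]
per_batch = sum_over_event_dims(elementwise, event_ndims=2)    # scalar
```

With a rank-2 input, event_ndims=1 leaves 2 - 1 = 1 dimension: one log-determinant per row.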
Note that forward_log_det_jacobian is the negative of this function,
evaluated at g^{-1}(y).
Args
y
Tensor. The input to the "inverse" Jacobian determinant evaluation.
event_ndims
Number of dimensions in the probabilistic events being
transformed. Must be greater than or equal to
self.inverse_min_event_ndims. The result is summed over the final
dimensions to produce a scalar Jacobian determinant for each event,
i.e. the result has y.shape.ndims - event_ndims dimensions.
name
The name to give this op.
Returns
Tensor, if this bijector is injective.
If not injective, returns the tuple of local log det
Jacobians, log(det(Dg_i^{-1}(y))), where g_i is the restriction
of g to the ith partition Di.
Raises
TypeError
if self.dtype is specified and y.dtype is not
self.dtype.
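The relationship between the two log-determinants (the forward one is the negative of the inverse one, evaluated at g^{-1}(y)) can be checked on a simple case. The scalar affine map below is a hypothetical example chosen for clarity; the function names mirror the bijector methods but are plain Python, not library calls.

```python
import math

# Hypothetical scalar affine bijector g(x) = scale * x + shift.
scale, shift = 2.0, 1.0

def forward(x):
    return scale * x + shift

def inverse(y):
    return (y - shift) / scale

def forward_log_det_jacobian(x):
    # dg/dx = scale everywhere, so the value does not depend on x.
    return math.log(abs(scale))

def inverse_log_det_jacobian(y):
    # d(g^{-1})/dy = 1 / scale.
    return math.log(abs(1.0 / scale))

y = 5.0
ildj = inverse_log_det_jacobian(y)
fldj = forward_log_det_jacobian(inverse(y))  # evaluated at g^{-1}(y)
```

Here `ildj + fldj` is zero up to floating-point error, which is the identity stated in the note.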
Last updated 2020-10-01 UTC.