A preprocessing layer which normalizes continuous features.
Inherits From: PreprocessingLayer, Layer, Module
tf.keras.layers.Normalization(
axis=-1, mean=None, variance=None, invert=False, **kwargs
)
This layer shifts and scales inputs into a distribution centered around
0 with standard deviation 1. It accomplishes this by precomputing the mean
and variance of the data, and calling (input - mean) / sqrt(var) at runtime.
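The arithmetic described above can be sketched in plain NumPy. This is only an illustration of the formula, not the layer's implementation; the variable names are chosen for the example, and the population variance (NumPy's default) is an assumption that matches the outputs shown in the examples on this page.

```python
import numpy as np

# Data used to estimate the statistics (stands in for what adapt() would see).
adapt_data = np.array([1., 2., 3., 4., 5.], dtype="float32")
input_data = np.array([1., 2., 3.], dtype="float32")

# Precompute the mean and the population variance once.
mean = adapt_data.mean()  # 3.0
var = adapt_data.var()    # 2.0

# Apply the shift-and-scale step: (input - mean) / sqrt(var).
normalized = (input_data - mean) / np.sqrt(var)
print(normalized)  # approximately [-1.4142, -0.7071, 0.0]
```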
The mean and variance values for the layer must be either supplied on
construction or learned via adapt(). adapt() will compute the mean and
variance of the data and store them as the layer's weights. adapt() should
be called before fit(), evaluate(), or predict().
For an overview and full list of preprocessing layers, see the preprocessing guide.
Examples:
Calculate a global mean and variance by analyzing the dataset in adapt().
adapt_data = np.array([1., 2., 3., 4., 5.], dtype='float32')
input_data = np.array([1., 2., 3.], dtype='float32')
layer = tf.keras.layers.Normalization(axis=None)
layer.adapt(adapt_data)
layer(input_data)
<tf.Tensor: shape=(3,), dtype=float32, numpy=
array([-1.4142135, -0.70710677, 0.], dtype=float32)>
Calculate a mean and variance for each index on the last axis.
adapt_data = np.array([[0., 7., 4.],
                       [2., 9., 6.],
                       [0., 7., 4.],
                       [2., 9., 6.]], dtype='float32')
input_data = np.array([[0., 7., 4.]], dtype='float32')
layer = tf.keras.layers.Normalization(axis=-1)
layer.adapt(adapt_data)
layer(input_data)
<tf.Tensor: shape=(1, 3), dtype=float32, numpy=
array([[-1., -1., -1.]], dtype=float32)>
Pass the mean and variance directly.
input_data = np.array([[1.], [2.], [3.]], dtype='float32')
layer = tf.keras.layers.Normalization(mean=3., variance=2.)
layer(input_data)
<tf.Tensor: shape=(3, 1), dtype=float32, numpy=
array([[-1.4142135 ],
[-0.70710677],
[ 0. ]], dtype=float32)>
Use the layer to de-normalize inputs (after adapting the layer).
adapt_data = np.array([[0., 7., 4.],
                       [2., 9., 6.],
                       [0., 7., 4.],
                       [2., 9., 6.]], dtype='float32')
input_data = np.array([[1., 2., 3.]], dtype='float32')
layer = tf.keras.layers.Normalization(axis=-1, invert=True)
layer.adapt(adapt_data)
layer(input_data)
<tf.Tensor: shape=(1, 3), dtype=float32, numpy=
array([[2., 10., 8.]], dtype=float32)>
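The de-normalized values above are consistent with reversing the forward formula, that is, input * sqrt(var) + mean. A minimal NumPy sketch under that assumption (it reproduces the numbers only and is not the layer's own code):

```python
import numpy as np

adapt_data = np.array([[0., 7., 4.],
                       [2., 9., 6.],
                       [0., 7., 4.],
                       [2., 9., 6.]], dtype="float32")
input_data = np.array([[1., 2., 3.]], dtype="float32")

# Per-feature statistics over the batch axis, mirroring axis=-1.
mean = adapt_data.mean(axis=0)  # [1., 8., 5.]
var = adapt_data.var(axis=0)    # [1., 1., 1.]

# Assumed inverse of (x - mean) / sqrt(var).
denormalized = input_data * np.sqrt(var) + mean
print(denormalized)  # values: [[2., 10., 8.]]

# Round trip: normalizing the result recovers the original input.
restored = (denormalized - mean) / np.sqrt(var)
```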
Attributes | |
---|---|
is_adapted | Whether the layer has been fit to data already. |
Methods
adapt
adapt(
data, batch_size=None, steps=None
)
Computes the mean and variance of values in a dataset.
Calling adapt() on a Normalization layer is an alternative to
passing in mean and variance arguments during layer construction. A
Normalization layer should always either be adapted over a dataset or
passed mean and variance.
During adapt(), the layer will compute a mean and variance
separately for each position in each axis specified by the axis
argument. To calculate a single mean and variance over the input
data, pass axis=None.
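One way to picture the role of the axis argument is as a reduction over every axis that is not listed. The NumPy sketch below uses a hypothetical helper, stats_for_axis, which is not part of the Keras API, to show the shapes of the resulting statistics; it is an illustration under that assumption, not the layer's implementation.

```python
import numpy as np

def stats_for_axis(data, axis):
    """Hypothetical helper: keep statistics separate along `axis`, reduce the rest."""
    if axis is None:
        # A single scalar mean and variance over all values.
        return data.mean(), data.var()
    axes = (axis,) if isinstance(axis, int) else tuple(axis)
    keep = {a % data.ndim for a in axes}  # resolve negative indices
    reduce_axes = tuple(a for a in range(data.ndim) if a not in keep)
    return data.mean(axis=reduce_axes), data.var(axis=reduce_axes)

data = np.arange(24, dtype="float32").reshape(2, 3, 4)

mean_last, var_last = stats_for_axis(data, -1)    # one value per last-axis index
mean_all, var_all = stats_for_axis(data, None)    # one global value
mean_two, var_two = stats_for_axis(data, (1, 2))  # one value per (row, column) position

print(mean_last.shape, np.ndim(mean_all), mean_two.shape)  # (4,) 0 (3, 4)
```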
In order to make Normalization efficient in any distribution context,
the computed mean and variance are kept static with respect to any
compiled tf.Graphs that call the layer. As a consequence, if the layer
is adapted a second time, any models using the layer should be
re-compiled. For more information see
tf.keras.layers.experimental.preprocessing.PreprocessingLayer.adapt.
adapt() is meant only as a single machine utility to compute layer
state. To analyze a dataset that cannot fit on a single machine, see
TensorFlow Transform for a multi-machine, map-reduce solution.
Arguments | |
---|---|
data | The data to train on. It can be passed either as a tf.data.Dataset, or as a numpy array. |
batch_size | Integer or None. Number of samples per state update. If unspecified, batch_size will default to 32. Do not specify the batch_size if your data is in the form of datasets, generators, or keras.utils.Sequence instances (since they generate batches). |
steps | Integer or None. Total number of steps (batches of samples). When training with input tensors such as TensorFlow data tensors, the default None is equal to the number of samples in your dataset divided by the batch size, or 1 if that cannot be determined. If data is a tf.data dataset and steps is None, the epoch will run until the input dataset is exhausted. When passing an infinitely repeating dataset, you must specify the steps argument. This argument is not supported with array inputs. |
compile
compile(
run_eagerly=None, steps_per_execution=None
)
Configures the layer for adapt.
Arguments | |
---|---|
run_eagerly | Bool. If True, this Model's logic will not be wrapped in a tf.function. Recommended to leave this as None unless your Model cannot be run inside a tf.function. Defaults to False. |
steps_per_execution | Int. The number of batches to run during each tf.function call. Running multiple batches inside a single tf.function call can greatly improve performance on TPUs or small models with a large Python overhead. Defaults to 1. |
reset_state
reset_state()
Resets the statistics of the preprocessing layer.
update_state
update_state(
data
)
Accumulates statistics for the preprocessing layer.
Arguments | |
---|---|
data | A mini-batch of inputs to the layer. |
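To give a sense of what accumulating statistics over mini-batches can look like, here is a NumPy sketch that merges per-batch means and variances into running values. The helper merge_stats is hypothetical and the merging formula is a standard streaming technique chosen for illustration; this page does not state how the layer accumulates its state internally.

```python
import numpy as np

def merge_stats(count, mean, var, batch):
    """Hypothetical helper: fold one mini-batch into running statistics."""
    batch = np.asarray(batch, dtype="float64")
    n_b = batch.size
    total = count + n_b
    delta = batch.mean() - mean
    # Shift the running mean toward the batch mean, weighted by batch size.
    new_mean = mean + delta * n_b / total
    # Combine sums of squared deviations, then convert back to a variance.
    m2 = var * count + batch.var() * n_b + delta ** 2 * count * n_b / total
    return total, new_mean, m2 / total

# Start from an empty state, similar in spirit to calling reset_state().
count, mean, var = 0, 0.0, 0.0
for batch in ([1., 2.], [3., 4., 5.]):
    count, mean, var = merge_stats(count, mean, var, batch)

print(count, mean, var)  # 5 samples, mean close to 3.0, variance close to 2.0
```

The merged result matches the mean and population variance of the full array [1., 2., 3., 4., 5.], which is the property a batch-wise accumulator needs.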