tf.keras.metrics.CategoricalCrossentropy

Computes the crossentropy metric between the labels and predictions.

Inherits From: MeanMetricWrapper, Mean, Metric, Layer, Module

This is the crossentropy metric class to be used when there are multiple label classes (2 or more). Labels are assumed to be given in a one-hot representation. For example, when the label values are [2, 0, 1], y_true = [[0, 0, 1], [1, 0, 0], [0, 1, 0]].

Args
name (Optional) String name of the metric instance.
dtype (Optional) Data type of the metric result.
from_logits (Optional) Whether y_pred is expected to be a logits tensor. By default (False), y_pred is assumed to encode a probability distribution.
label_smoothing (Optional) Float in [0, 1]. When > 0, label values are smoothed, meaning the confidence in the label values is relaxed. For example, with two classes, label_smoothing=0.2 means a value of 0.1 is used for label 0 and 0.9 for label 1.
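
The effect of from_logits and label_smoothing can be sketched in plain Python. This is only an illustration, not the library's implementation: the helper names softmax and smooth_labels are hypothetical, and the smoothing rule y * (1 - s) + s / num_classes is an assumption chosen because it reproduces the 0.1 / 0.9 example above for two classes.

```python
import math

def softmax(logits):
    """Turn one row of raw scores (logits) into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def smooth_labels(one_hot, label_smoothing):
    """Assumed smoothing rule: y * (1 - s) + s / num_classes."""
    num_classes = len(one_hot)
    return [y * (1.0 - label_smoothing) + label_smoothing / num_classes
            for y in one_hot]

# What from_logits=True conceptually implies: scores are normalized first.
probs = softmax([2.0, 1.0, 0.1])

# Two classes with label_smoothing=0.2: label 0 -> about 0.1, label 1 -> about 0.9.
smoothed = smooth_labels([0.0, 1.0], 0.2)

print(probs)
print(smoothed)
```

With more than two classes, the same assumed rule spreads the smoothing mass evenly over all classes, so the smoothed values differ from 0.1 and 0.9.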

Standalone usage:

# EPSILON = 1e-7, y = y_true, y' = y_pred
# y' = tf.clip_by_value(y_pred, EPSILON, 1. - EPSILON)
# y' = [[0.05, 0.95, EPSILON], [0.1, 0.8, 0.1]]
# xent = -sum(y * log(y'), axis=-1)
#      = -[log(0.95), log(0.1)]
#      = [0.051, 2.302]
# Reduced xent = (0.051 + 2.302) / 2 ≈ 1.177
m = tf.keras.metrics.CategoricalCrossentropy()
m.update_state([[0, 1, 0], [0, 0, 1]],
               [[0.05, 0.95, 0], [0.1, 0.8, 0.1]])
m.result().numpy()
1.1769392
m.reset_state()
m.update_state([[0, 1, 0], [0, 0, 1]],
               [[0.05, 0.95, 0], [0.1, 0.8, 0.1]],
               sample_weight=tf.constant([0.3, 0.7]))
m.result().numpy()
1.6271976
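
The arithmetic in the comments above can be checked without TensorFlow. The following is a minimal plain-Python sketch, assuming only the clipping and reduction steps described in those comments. The function names categorical_crossentropy and weighted_mean are hypothetical helpers for this illustration, not library calls.

```python
import math

EPSILON = 1e-7  # clipping constant taken from the comments above

def categorical_crossentropy(y_true, y_pred):
    """Per-sample -sum(y * log(y')), with y' clipped to [EPSILON, 1 - EPSILON]."""
    losses = []
    for true_row, pred_row in zip(y_true, y_pred):
        clipped = [min(max(p, EPSILON), 1.0 - EPSILON) for p in pred_row]
        losses.append(-sum(t * math.log(p) for t, p in zip(true_row, clipped)))
    return losses

def weighted_mean(values, weights=None):
    """Sum of weighted values divided by the sum of weights."""
    if weights is None:
        weights = [1.0] * len(values)
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

y_true = [[0, 1, 0], [0, 0, 1]]
y_pred = [[0.05, 0.95, 0], [0.1, 0.8, 0.1]]

per_sample = categorical_crossentropy(y_true, y_pred)
unweighted = weighted_mean(per_sample)             # compare with 1.1769392
weighted = weighted_mean(per_sample, [0.3, 0.7])   # compare with 1.6271976
print(per_sample, unweighted, weighted)
```

The second sample dominates the weighted result because its per-sample loss (about 2.302) is much larger and it carries the larger weight.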

Usage with compile() API:

model.compile(
  optimizer='sgd',
  loss='mse',
  metrics=[tf.keras.metrics.CategoricalCrossentropy()])

Methods

merge_state

Merges the state from one or more metrics.

This method can be used by distributed systems to merge the state computed by different metric instances. Typically the state is stored in the form of the metric's weights. For example, a tf.keras.metrics.Mean metric contains a list of two weight values: a total and a count. If two instances of tf.keras.metrics.Accuracy each independently aggregated partial state for an overall accuracy calculation, the two metrics' states could be combined as follows:

m1 = tf.keras.metrics.Accuracy()
_ = m1.update_state([[1], [2]], [[0], [2]])
m2 = tf.keras.metrics.Accuracy()
_ = m2.update_state([[3], [4]], [[3], [4]])
m2.merge_state([m1])
m2.result().numpy()
0.75
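
Why the merged result is 0.75 can be sketched in plain Python, assuming each accuracy metric keeps a (total, count) pair as described above. The helpers accuracy_state and merge_states are hypothetical names for this illustration and are not part of the library.

```python
def accuracy_state(y_true, y_pred):
    """Return (total, count): number of matches and number of examples seen."""
    matches = sum(1.0 for t, p in zip(y_true, y_pred) if t == p)
    return matches, float(len(y_true))

def merge_states(states):
    """Merge partial states by adding totals and counts element-wise."""
    total = sum(s[0] for s in states)
    count = sum(s[1] for s in states)
    return total, count

state1 = accuracy_state([1, 2], [0, 2])  # 1 match out of 2
state2 = accuracy_state([3, 4], [3, 4])  # 2 matches out of 2

total, count = merge_states([state2, state1])
merged_accuracy = total / count          # 3 matches out of 4
print(merged_accuracy)  # 0.75
```

Adding totals and counts, rather than averaging the two partial accuracies, keeps the result correct even when the instances have seen different numbers of examples.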

Args
metrics an iterable of metrics. The metrics must have compatible state.

Raises
ValueError If the provided iterable does not contain metrics matching the metric's required specifications.

reset_state

Resets all of the metric state variables.

This function is called between epochs/steps, when a metric is evaluated during training.

result

Computes and returns the metric value tensor.

Result computation is an idempotent operation that simply calculates the metric value using the state variables.
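
The relationship between update_state, result, and reset_state can be illustrated with a small stand-in class. RunningMean below is hypothetical and only mimics a total/count state. Returning 0.0 when no values have been seen is a choice made for this sketch.

```python
class RunningMean:
    """Hypothetical stand-in for a metric that keeps a total and a count."""

    def __init__(self):
        self.reset_state()

    def reset_state(self):
        # Clear all accumulated state.
        self.total = 0.0
        self.count = 0.0

    def update_state(self, values):
        # Accumulate statistics; nothing is computed yet.
        self.total += sum(values)
        self.count += len(values)

    def result(self):
        # Only reads the state, so repeated calls return the same value.
        return self.total / self.count if self.count else 0.0

m = RunningMean()
m.update_state([1.0, 3.0])
first = m.result()
second = m.result()        # same as first: result() does not alter state
m.reset_state()
after_reset = m.result()   # state cleared, so the sketch returns 0.0
print(first, second, after_reset)
```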

update_state

Accumulates metric statistics.

For sparse categorical metrics, the shapes of y_true and y_pred are different.

Args
y_true Ground truth label values. shape = [batch_size, d0, .. dN-1] or shape = [batch_size, d0, .. dN-1, 1].
y_pred The predicted probability values. shape = [batch_size, d0, .. dN].
sample_weight Optional. Acts as a coefficient for the metric. If a scalar is provided, the metric is simply scaled by the given value. If sample_weight is a tensor of size [batch_size], the metric for each sample of the batch is rescaled by the corresponding element of the sample_weight vector. If the shape of sample_weight is [batch_size, d0, .. dN-1] (or can be broadcast to this shape), each metric element of y_pred is scaled by the corresponding value of sample_weight. (Note on dN-1: all metric functions reduce by 1 dimension, usually the last axis (-1).)

Returns
Update op.