Cross layer from the Deep & Cross Network (DCN), used to learn explicit feature interactions.
tfrs.layers.dcn.Cross(
    projection_dim: Optional[int] = None,
    diag_scale: Optional[float] = 0.0,
    use_bias: bool = True,
    preactivation: Optional[Union[str, tf.keras.layers.Activation]] = None,
    kernel_initializer: Union[Text, tf.keras.initializers.Initializer] = 'truncated_normal',
    bias_initializer: Union[Text, tf.keras.initializers.Initializer] = 'zeros',
    kernel_regularizer: Union[Text, None, tf.keras.regularizers.Regularizer] = None,
    bias_regularizer: Union[Text, None, tf.keras.regularizers.Regularizer] = None,
    **kwargs
)
A layer that creates explicit and bounded-degree feature interactions
efficiently. The call method accepts two tensors as inputs. The first input, x0,
is the base layer that contains the original features (usually the embedding
layer); the second input, xi, is the output of the previous Cross layer in the
stack, i.e., the i-th Cross layer. For the first Cross layer in the stack,
x0 = xi.
The output is x_{i+1} = x0 .* (W * xi + bias + diag_scale * xi) + xi,
where .* designates elementwise multiplication, W could be a full-rank
matrix, or a low-rank matrix U*V to reduce the computational cost, and
diag_scale increases the diagonal of W to improve training stability
(especially for the low-rank case).
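The update rule above can be checked by hand on a small case. The following is a minimal plain-Python sketch for a single example vector with a full-rank W. It is not the library's implementation: the helper name `cross_step` and all numeric values are made up for illustration.

```python
def cross_step(x0, xi, W, bias, diag_scale=0.0):
    """One cross step for a single example: x0 .* (W * xi + bias + diag_scale * xi) + xi."""
    d = len(x0)
    out = []
    for r in range(d):
        # Row r of the matrix-vector product W * xi.
        wx = sum(W[r][c] * xi[c] for c in range(d))
        inner = wx + bias[r] + diag_scale * xi[r]
        # Elementwise multiply with x0, then add the residual xi.
        out.append(x0[r] * inner + xi[r])
    return out


# Illustrative values only (hypothetical, not taken from any model).
x0 = [1.0, 2.0]
W = [[0.5, 0.0],
     [1.0, -1.0]]
bias = [0.0, 1.0]

# First layer in a stack: xi = x0.
x1 = cross_step(x0, x0, W, bias)   # [1.5, 2.0]
# Second layer reuses the same base x0 together with the previous output x1.
x2 = cross_step(x0, x1, W, bias)   # [2.25, 3.0]
```

Passing `diag_scale=1.0` here gives the same result as replacing W with W + I, which is the sense in which diag_scale "increases the diagonal of W".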
Example |
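# After the embedding layer in a functional model:
import tensorflow as tf
import tensorflow_recommenders as tfrs

Cross = tfrs.layers.dcn.Cross

inputs = tf.keras.Input(shape=(None,), name='index', dtype=tf.int64)
# Apply the embedding layer to the input so that x0 is a tensor, not a layer.
x0 = tf.keras.layers.Embedding(input_dim=32, output_dim=6)(inputs)
x1 = Cross()(x0, x0)
x2 = Cross()(x0, x1)
logits = tf.keras.layers.Dense(units=10)(x2)
model = tf.keras.Model(inputs, logits)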
|
Args |
projection_dim
|
Projection dimension used to reduce the computational cost.
Default is None, in which case a full (input_dim by input_dim) matrix
W is used. If set, a low-rank matrix W = U*V is used, where U
is of size input_dim by projection_dim and V is of size
projection_dim by input_dim. projection_dim needs to be smaller
than input_dim/2 to improve the model efficiency. In practice, we've
observed that projection_dim = input_dim/4 consistently preserved the
accuracy of a full-rank version.
|
diag_scale
|
A non-negative float used to increase the diagonal of the
kernel W by diag_scale, that is, W + diag_scale * I, where I is an
identity matrix.
|
use_bias
|
Whether to add a bias term to this layer. If set to False,
no bias term is used.
|
preactivation
|
Activation applied to output matrix of the layer, before
multiplication with the input. Can be used to control the scale of the
layer's outputs and improve stability.
|
kernel_initializer
|
Initializer to use on the kernel matrix.
|
bias_initializer
|
Initializer to use on the bias vector.
|
kernel_regularizer
|
Regularizer to use on the kernel matrix.
|
bias_regularizer
|
Regularizer to use on the bias vector.
|
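The input_dim/2 threshold mentioned for projection_dim follows from counting kernel weights. The sketch below assumes the sizes stated above (a full input_dim by input_dim W, or U and V for the low-rank case) and ignores the bias. The helper `cross_kernel_params` and the value of `d` are hypothetical, chosen only to make the arithmetic concrete.

```python
def cross_kernel_params(input_dim, projection_dim=None):
    """Number of kernel weights in one cross layer (bias excluded)."""
    if projection_dim is None:
        # Full-rank W: input_dim by input_dim.
        return input_dim * input_dim
    # Low-rank W = U * V: U is input_dim by projection_dim,
    # V is projection_dim by input_dim.
    return 2 * input_dim * projection_dim


d = 64  # hypothetical input_dim
full = cross_kernel_params(d)             # 4096
quarter = cross_kernel_params(d, d // 4)  # 2048
half = cross_kernel_params(d, d // 2)     # 4096, no saving over full rank
```

At projection_dim = input_dim/2 the two counts coincide, so the low-rank form only saves weights (and the matching multiply-adds) below that value.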
Input shape: A tuple of two tensors, each of shape (batch_size, input_dim).
Output shape: A single tensor of shape (batch_size, input_dim).
Methods
call
call(
    x0: tf.Tensor, x: Optional[tf.Tensor] = None
) -> tf.Tensor
Computes the feature cross.
Args |
x0
|
The base input tensor, containing the original features.
|
x
|
Optional second input tensor. If provided, the layer will compute
crosses between x0 and x; if not provided, the layer will compute
crosses between x0 and itself.
|
Returns |
Tensor of crosses.
|
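The optional second argument makes it possible to drive a whole stack from one loop: the first layer receives only x0, and each later layer receives x0 plus the previous output. The sketch below uses a hypothetical stand-in class, `ToyCross`, with fixed weights and plain Python lists for a single example. It only mirrors the documented call semantics and is not the library layer.

```python
class ToyCross:
    """Hypothetical stand-in for a cross layer with fixed weights (single example, no training)."""

    def __init__(self, W, bias):
        self.W = W
        self.bias = bias

    def __call__(self, x0, x=None):
        # Mirror the documented default: cross x0 with itself when x is omitted.
        if x is None:
            x = x0
        d = len(x0)
        return [
            x0[r] * (sum(self.W[r][c] * x[c] for c in range(d)) + self.bias[r]) + x[r]
            for r in range(d)
        ]


# Three layers with the same illustrative weights (made-up values).
layers = [ToyCross([[0.5, 0.0], [1.0, -1.0]], [0.0, 1.0]) for _ in range(3)]
x0 = [1.0, 2.0]

x = None
for layer in layers:
    # The base x0 is passed to every layer; x carries the previous output.
    x = layer(x0, x)
```

With these values the loop ends with `x == [3.375, 3.5]`, and calling a layer as `layer(x0)` gives the same result as `layer(x0, x0)`.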