Additive attention layer, a.k.a. Bahdanau-style attention.
Inherits From: `Attention`, `Layer`, `Operation`
`tf.keras.layers.AdditiveAttention(use_scale=True, dropout=0.0, **kwargs)`
Inputs are a list with 2 or 3 elements:

- A `query` tensor of shape `(batch_size, Tq, dim)`.
- A `value` tensor of shape `(batch_size, Tv, dim)`.
- An optional `key` tensor of shape `(batch_size, Tv, dim)`. If none is supplied, `value` will be used as `key`.
The calculation follows the steps:

- Calculate attention scores using `query` and `key` with shape `(batch_size, Tq, Tv)` as a non-linear sum: `scores = reduce_sum(tanh(query + key), axis=-1)` (illustrated in the sketch below).
- Use scores to calculate a softmax distribution with shape `(batch_size, Tq, Tv)`.
- Use the softmax distribution to create a linear combination of `value` with shape `(batch_size, Tq, dim)`.
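For example, a minimal sketch of the score computation on illustrative shapes (`batch_size=2`, `Tq=3`, `Tv=4`, `dim=8`); when `use_scale=True` the layer additionally multiplies the `tanh` term by a learned scalar, which is omitted here:

```python
import tensorflow as tf

query = tf.random.normal((2, 3, 8))  # (batch_size, Tq, dim)
key = tf.random.normal((2, 4, 8))    # (batch_size, Tv, dim)

# Broadcast query against key so every (query, key) pair is summed:
# (2, 3, 1, 8) + (2, 1, 4, 8) -> (2, 3, 4, 8).
q = tf.expand_dims(query, axis=-2)
k = tf.expand_dims(key, axis=-3)

# Non-linear sum reduced over the feature axis -> (batch_size, Tq, Tv).
scores = tf.reduce_sum(tf.tanh(q + k), axis=-1)

# Softmax over the Tv axis gives the attention distribution.
distribution = tf.nn.softmax(scores, axis=-1)
print(scores.shape, distribution.shape)  # (2, 3, 4) (2, 3, 4)
```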
Args | |
---|---|
`use_scale` | If `True`, will create a scalar variable to scale the attention scores.
`dropout` | Float between 0 and 1. Fraction of the units to drop for the attention scores. Defaults to `0.0`.
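A short usage sketch of these constructor arguments (tensor shapes are illustrative; dropout on the attention scores is only active when the layer is called with `training=True`):

```python
import tensorflow as tf

attention = tf.keras.layers.AdditiveAttention(use_scale=True, dropout=0.1)

query = tf.random.normal((2, 3, 8))  # (batch_size, Tq, dim)
value = tf.random.normal((2, 4, 8))  # (batch_size, Tv, dim)

# Dropout on the scores is applied only in training mode.
train_out = attention([query, value], training=True)
infer_out = attention([query, value], training=False)
print(train_out.shape, infer_out.shape)  # (2, 3, 8) (2, 3, 8)
```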
Output | |
---|---|
Attention outputs of shape `(batch_size, Tq, dim)`. |
(Optional) Attention scores after masking and softmax with shape `(batch_size, Tq, Tv)`. |
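A sketch of retrieving both outputs; `return_attention_scores` is a call argument inherited from the parent `Attention` layer (not a constructor argument), so the exact call signature should be checked against the version in use:

```python
import tensorflow as tf

attention = tf.keras.layers.AdditiveAttention()

query = tf.random.normal((2, 3, 8))  # (batch_size, Tq, dim)
value = tf.random.normal((2, 4, 8))  # (batch_size, Tv, dim)

# Ask the layer for the post-softmax scores alongside the outputs.
outputs, scores = attention([query, value], return_attention_scores=True)
print(outputs.shape)  # (2, 3, 8) -> (batch_size, Tq, dim)
print(scores.shape)   # (2, 3, 4) -> (batch_size, Tq, Tv)
```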
Methods
from_config
@classmethod
`from_config(config)`

Creates a layer from its config.

This method is the reverse of `get_config`, capable of instantiating the same layer from the config dictionary. It does not handle layer connectivity (handled by `Network`), nor weights (handled by `set_weights`).
Args | |
---|---|
`config` | A Python dictionary, typically the output of `get_config`.
Returns | |
---|---|
A layer instance. |
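A minimal round-trip sketch, assuming the config dictionary exposes the constructor arguments shown above:

```python
import tensorflow as tf

original = tf.keras.layers.AdditiveAttention(use_scale=True, dropout=0.1)

# Serialize the constructor arguments, then rebuild an equivalent layer.
config = original.get_config()
restored = tf.keras.layers.AdditiveAttention.from_config(config)

# The restored layer shares the configuration but not any built weights;
# weights would be transferred separately (e.g. via set_weights).
print(config["use_scale"], config["dropout"])  # True 0.1
```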
symbolic_call
`symbolic_call(*args, **kwargs)`