Update relevant entries in 'var' and 'accum' according to the momentum scheme.
tf.raw_ops.SparseApplyMomentum(
var, accum, lr, grad, indices, momentum, use_locking=False, use_nesterov=False,
name=None
)
Set use_nesterov = True if you want to use Nesterov momentum.
That is for rows we have grad for, we update var and accum as follows:
$$accum = accum * momentum + grad$$
$$var -= lr * accum$$
Args | |
---|---|
var
|
A mutable Tensor . Must be one of the following types: float32 , float64 , int32 , uint8 , int16 , int8 , complex64 , int64 , qint8 , quint8 , qint32 , bfloat16 , uint16 , complex128 , half , uint32 , uint64 .
Should be from a Variable().
|
accum
|
A mutable Tensor . Must have the same type as var .
Should be from a Variable().
|
lr
|
A Tensor . Must have the same type as var .
Learning rate. Must be a scalar.
|
grad
|
A Tensor . Must have the same type as var . The gradient.
|
indices
|
A Tensor . Must be one of the following types: int32 , int64 .
A vector of indices into the first dimension of var and accum.
|
momentum
|
A Tensor . Must have the same type as var .
Momentum. Must be a scalar.
|
use_locking
|
An optional bool . Defaults to False .
If True , updating of the var and accum tensors will be protected
by a lock; otherwise the behavior is undefined, but may exhibit less
contention.
|
use_nesterov
|
An optional bool . Defaults to False .
If True , the tensor passed to compute grad will be
var - lr * momentum * accum, so in the end, the var you get is actually
var - lr * momentum * accum.
|
name
|
A name for the operation (optional). |
Returns | |
---|---|
A mutable Tensor . Has the same type as var .
|