That is, for the rows we have grad for, we update var, accum, and linear as follows:
accum_new = accum + grad * grad
linear += grad - (accum_new^(-lr_power) - accum^(-lr_power)) / lr * var
quadratic = 1.0 / (accum_new^(lr_power) * lr) + 2 * l2
var = (sign(linear) * l1 - linear) / quadratic if |linear| > l1 else 0.0
accum = accum_new
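The update rule above can be sketched in plain Python to make the per-row behavior concrete. This is only an illustrative sketch, not the op's actual kernel: the function name `sparse_apply_ftrl` is hypothetical, nested lists stand in for the var, accum, and linear tensors, and repeated entries in indices are assumed to be applied one after another.

```python
import math


def sparse_apply_ftrl(var, accum, linear, grad, indices, lr, l1, l2, lr_power):
    """Hypothetical pure-Python sketch of the update rule, applied in place.

    var, accum, linear: lists of rows (lists of floats) with the same shape.
    grad: one row of gradients per entry in indices.
    Rows not named in indices are left untouched.
    """
    for row, grad_row in zip(indices, grad):
        for j, g in enumerate(grad_row):
            a = accum[row][j]
            a_new = a + g * g
            # linear += grad - (accum_new^(-lr_power) - accum^(-lr_power)) / lr * var
            linear[row][j] += g - (a_new ** (-lr_power) - a ** (-lr_power)) / lr * var[row][j]
            # quadratic = 1.0 / (accum_new^(lr_power) * lr) + 2 * l2
            # (assumes a_new > 0 whenever lr_power is negative)
            quadratic = 1.0 / (a_new ** lr_power * lr) + 2.0 * l2
            lin = linear[row][j]
            if abs(lin) > l1:
                # copysign(l1, lin) is sign(linear) * l1
                var[row][j] = (math.copysign(l1, lin) - lin) / quadratic
            else:
                var[row][j] = 0.0
            accum[row][j] = a_new


# Example values chosen for illustration: only row 1 receives a gradient.
var = [[1.0], [1.0], [1.0]]
accum = [[0.0], [0.0], [0.0]]
linear = [[0.0], [0.0], [0.0]]
sparse_apply_ftrl(var, accum, linear, grad=[[2.0]], indices=[1],
                  lr=0.5, l1=0.0, l2=0.0, lr_power=-0.5)
print(var, accum, linear)
```

With l1 = l2 = 0 and lr_power = -0.5, working the example by hand gives accum_new = 4, linear = -2, quadratic = 4, and therefore var = 0.5 for row 1, which matches a plain adaptive step of 1 - (lr / sqrt(accum_new)) * grad.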
Args
var
A Tensor of type resource. Should be from a Variable().
accum
A Tensor of type resource. Should be from a Variable().
linear
A Tensor of type resource. Should be from a Variable().
grad
A Tensor. Must be one of the following types: float32, float64, int32, uint8, int16, int8, complex64, int64, qint8, quint8, qint32, bfloat16, qint16, quint16, uint16, complex128, half, uint32, uint64.
The gradient.
indices
A Tensor. Must be one of the following types: int32, int64.
A vector of indices into the first dimension of var and accum.
lr
A Tensor. Must have the same type as grad.
Learning rate (scaling factor). Must be a scalar.
l1
A Tensor. Must have the same type as grad.
L1 regularization. Must be a scalar.
l2
A Tensor. Must have the same type as grad.
L2 regularization. Must be a scalar.
lr_power
A Tensor. Must have the same type as grad.
Exponent applied to accum in the update rule. Must be a scalar.
use_locking
An optional bool. Defaults to False.
If True, updating of the var and accum tensors will be protected
by a lock; otherwise the behavior is undefined, but may exhibit less
contention.
Last updated 2024-01-23 UTC.