Hints for collective operations like AllReduce.
tf.distribute.experimental.CollectiveHints(
    bytes_per_pack=0
)
This can be passed to methods like tf.distribute.get_replica_context().all_reduce() to optimize collective operation performance. Note that these are only hints, which may or may not change the actual behavior. Some options only apply to certain strategies and are ignored by others.
One common optimization is to break a gradient all-reduce into multiple packs so that weight updates can overlap with the gradient all-reduce.
Example:

hints = tf.distribute.experimental.CollectiveHints(
    bytes_per_pack=50 * 1024 * 1024)
grads = tf.distribute.get_replica_context().all_reduce(
    'sum', grads, experimental_hints=hints)
optimizer.apply_gradients(zip(grads, vars),
                          experimental_aggregate_gradients=False)
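The snippet above assumes it already runs inside a replica context. A minimal sketch of where it could sit in a custom training loop under MultiWorkerMirroredStrategy is shown below; the model, optimizer, loss, and variable names are illustrative placeholders, not part of this API.

# A minimal sketch, assuming TF 2.x where MultiWorkerMirroredStrategy is
# available under tf.distribute.experimental; model and optimizer here are
# placeholders for illustration only.
import tensorflow as tf

strategy = tf.distribute.experimental.MultiWorkerMirroredStrategy()
hints = tf.distribute.experimental.CollectiveHints(
    bytes_per_pack=50 * 1024 * 1024)  # request ~50MB gradient packs

with strategy.scope():
  model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
  optimizer = tf.keras.optimizers.SGD(0.01)

@tf.function
def train_step(features, labels):
  def step_fn(features, labels):
    with tf.GradientTape() as tape:
      predictions = model(features, training=True)
      loss = tf.reduce_mean(tf.square(predictions - labels))
    grads = tape.gradient(loss, model.trainable_variables)
    # All-reduce the gradients explicitly, packed per the hints, so that
    # the subsequent weight updates can overlap with communication.
    grads = tf.distribute.get_replica_context().all_reduce(
        'sum', grads, experimental_hints=hints)
    # Gradients are already aggregated above; skip the optimizer's own
    # cross-replica aggregation.
    optimizer.apply_gradients(
        zip(grads, model.trainable_variables),
        experimental_aggregate_gradients=False)
    return loss
  return strategy.run(step_fn, args=(features, labels))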
Args | |
---|---|
`bytes_per_pack` | A non-negative integer. Breaks collective operations into packs of a certain size. If it is zero, the pack size is determined automatically. Currently this only applies to all-reduce with MultiWorkerMirroredStrategy. |
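For illustration only, the two construction modes described in the table look like this; the values chosen here are arbitrary.

auto_hints = tf.distribute.experimental.CollectiveHints()  # bytes_per_pack=0: pack size chosen automatically
packed_hints = tf.distribute.experimental.CollectiveHints(
    bytes_per_pack=32 * 1024 * 1024)  # request packs of roughly 32MB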
Raises | |
---|---|
`ValueError` | When arguments have invalid values. |
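Based on the table above, an invalid argument such as a negative bytes_per_pack should raise ValueError; the exact error message is not specified here.

try:
  tf.distribute.experimental.CollectiveHints(bytes_per_pack=-1)
except ValueError as err:
  print("Rejected invalid hints:", err)  # bytes_per_pack must be non-negative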