Registers a dataset with the tf.data service.
tf.data.experimental.service.register_dataset(
    service, dataset, compression='AUTO'
)
register_dataset registers a dataset with the tf.data service so that datasets can be created later with tf.data.experimental.service.from_dataset_id. This is useful when the dataset is registered by one process, then used in another process. When the same process is both registering and reading from the dataset, it is simpler to use tf.data.experimental.service.distribute instead.

If the dataset is already registered with the tf.data service, register_dataset returns the already-registered dataset's id.
dispatcher = tf.data.experimental.service.DispatchServer()
dispatcher_address = dispatcher.target.split("://")[1]
worker = tf.data.experimental.service.WorkerServer(
    tf.data.experimental.service.WorkerConfig(
        dispatcher_address=dispatcher_address))
dataset = tf.data.Dataset.range(10)
dataset_id = tf.data.experimental.service.register_dataset(
    dispatcher.target, dataset)
dataset = tf.data.experimental.service.from_dataset_id(
    processing_mode="parallel_epochs",
    service=dispatcher.target,
    dataset_id=dataset_id,
    element_spec=dataset.element_spec)
print(list(dataset.as_numpy_iterator()))
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
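
For comparison, when a single process both registers and reads from the dataset (as noted above), tf.data.experimental.service.distribute can be applied to the dataset directly instead of registering it explicitly. A minimal sketch, assuming the same in-process dispatcher and worker as in the example above:

dataset = tf.data.Dataset.range(10)
# Hand processing of this dataset off to the tf.data service and stream the
# resulting elements back into this process.
dataset = dataset.apply(tf.data.experimental.service.distribute(
    processing_mode="parallel_epochs",
    service=dispatcher.target))
print(list(dataset.as_numpy_iterator()))

This form avoids passing dataset ids and element specs between processes.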
Args | |
---|---|
service | A string or a tuple indicating how to connect to the tf.data service. If it's a string, it should be in the format [<protocol>://]<address>, where <address> identifies the dispatcher address and <protocol> can optionally be used to override the default protocol to use. If it's a tuple, it should be (protocol, address). |
dataset | A tf.data.Dataset to register with the tf.data service. |
compression | (Optional.) How to compress the dataset's elements before transferring them over the network. "AUTO" leaves the decision of how to compress up to the tf.data service runtime. None indicates not to compress. |
Returns | |
---|---|
A scalar int64 tensor of the registered dataset's id. |
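
The service argument accepts either of the forms described in the table above. A short sketch, where localhost:5000 is a hypothetical dispatcher address used only for illustration and grpc names the protocol:

# Hypothetical dispatcher address, for illustration only.
address = "localhost:5000"
dataset = tf.data.Dataset.range(10)

# service as a string, with an optional protocol prefix:
dataset_id = tf.data.experimental.service.register_dataset(
    "grpc://" + address, dataset)

# service as a (protocol, address) tuple:
dataset_id = tf.data.experimental.service.register_dataset(
    ("grpc", address), dataset)

# compression=None disables compressing elements before they are sent
# over the network:
dataset_id = tf.data.experimental.service.register_dataset(
    "grpc://" + address, dataset, compression=None)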