tf.data.experimental.service.register_dataset

Registers a dataset with the tf.data service.

register_dataset registers a dataset with the tf.data service so that datasets can be created later with tf.data.experimental.service.from_dataset_id. This is useful when the dataset is registered by one process, then used in another process. When the same process is both registering and reading from the dataset, it is simpler to use tf.data.experimental.service.distribute instead.

If the dataset is already registered with the tf.data service, register_dataset returns the already-registered dataset's id.

import tensorflow as tf

# Start an in-process dispatcher and one worker that connects to it.
dispatcher = tf.data.experimental.service.DispatchServer()
dispatcher_address = dispatcher.target.split("://")[1]
worker = tf.data.experimental.service.WorkerServer(
    tf.data.experimental.service.WorkerConfig(
        dispatcher_address=dispatcher_address))

# Register the dataset; the returned id is what other processes need.
dataset = tf.data.Dataset.range(10)
dataset_id = tf.data.experimental.service.register_dataset(
    dispatcher.target, dataset)

# Build a dataset that reads the registered dataset through the service.
dataset = tf.data.experimental.service.from_dataset_id(
    processing_mode="parallel_epochs",
    service=dispatcher.target,
    dataset_id=dataset_id,
    element_spec=dataset.element_spec)
print(list(dataset.as_numpy_iterator()))
# Output: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
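
For comparison, the single-process case mentioned above can be sketched with tf.data.experimental.service.distribute, which registers and reads the dataset in one step. This is a minimal sketch assuming an in-process dispatcher and a single worker; the variable names `distributed` and `elements` are illustrative only.

```python
import tensorflow as tf

# Start an in-process dispatcher and one worker (same setup as the example above).
dispatcher = tf.data.experimental.service.DispatchServer()
dispatcher_address = dispatcher.target.split("://")[1]
worker = tf.data.experimental.service.WorkerServer(
    tf.data.experimental.service.WorkerConfig(
        dispatcher_address=dispatcher_address))

# distribute() handles registration and reading together, so no dataset id
# or element_spec has to be passed around by hand.
dataset = tf.data.Dataset.range(10)
distributed = dataset.apply(
    tf.data.experimental.service.distribute(
        processing_mode="parallel_epochs",
        service=dispatcher.target))

elements = list(distributed.as_numpy_iterator())
print(elements)
```

With register_dataset, the dataset id (and the element spec) must be communicated to the consuming process by some means outside the tf.data service; distribute avoids that step when a single process does both jobs.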

service: A string or a tuple indicating how to connect to the tf.data service. If it's a string, it should be in the format [<protocol>://]<address>, where <address> identifies the dispatcher address and <protocol> can optionally be used to override the default protocol to use. If it's a tuple, it should be (protocol, address).
dataset: A tf.data.Dataset to register with the tf.data service.
compression: (Optional.) How to compress the dataset's elements before transferring them over the network. "AUTO" leaves the decision of how to compress up to the tf.data service runtime. None indicates not to compress.

Returns: A scalar int64 tensor containing the registered dataset's id.
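
The two accepted forms of the service argument can be illustrated with a small normalizer. This is only a sketch: `parse_service` is a hypothetical helper, not part of the tf.data API, and the "grpc" fallback is an assumed default chosen for illustration; the actual default protocol is decided by the tf.data service runtime.

```python
from typing import Tuple, Union

# Assumption for illustration only; the real default comes from the runtime.
ASSUMED_DEFAULT_PROTOCOL = "grpc"


def parse_service(
    service: Union[str, Tuple[str, str]],
    default_protocol: str = ASSUMED_DEFAULT_PROTOCOL,
) -> Tuple[str, str]:
    """Normalizes a `service` value into a (protocol, address) pair.

    Hypothetical helper: accepts "[<protocol>://]<address>" or a
    (protocol, address) tuple, as described for the `service` argument.
    """
    if isinstance(service, tuple):
        if len(service) != 2:
            raise ValueError("tuple form must be (protocol, address)")
        protocol, address = service
        return protocol, address

    if not isinstance(service, str):
        raise TypeError("service must be a string or a tuple")
    if not service:
        raise ValueError("service must not be empty")

    parts = service.split("://")
    if len(parts) == 1:
        # No protocol given: fall back to the (assumed) default.
        return default_protocol, parts[0]
    if len(parts) == 2:
        return parts[0], parts[1]
    raise ValueError("expected the format [<protocol>://]<address>")


print(parse_service("grpc://localhost:5000"))
print(parse_service("localhost:5000"))
print(parse_service(("grpc", "localhost:5000")))
```

All three calls describe the same dispatcher under the assumed default, which is why the earlier example can pass dispatcher.target directly as the service.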