Creates a baseline task for next-character prediction on Shakespeare.
tff.simulation.baselines.shakespeare.create_character_prediction_task(
    train_client_spec: tff.simulation.baselines.ClientSpec,
    eval_client_spec: Optional[tff.simulation.baselines.ClientSpec] = None,
    sequence_length: int = DEFAULT_SEQUENCE_LENGTH,
    cache_dir: Optional[str] = None,
    use_synthetic_data: bool = False
) -> tff.simulation.baselines.BaselineTask
The goal of the task is to take sequence_length characters (e.g., alphanumeric
characters and punctuation characters) and predict the next character.
Here, all sentences are drawn from the collected works of William Shakespeare,
and a client corresponds to a role in a play.
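To make the prediction target concrete, here is a minimal pure-Python sketch of how a piece of text could be cut into (context, next character) pairs for a given sequence_length. The helper name next_char_examples and the sliding-window scheme are illustrative assumptions; this is not the preprocessing the library itself applies.

```python
def next_char_examples(text, sequence_length):
    """Yield (context, next_char) pairs from `text`.

    Hypothetical helper for illustration only: each context holds
    `sequence_length` characters, and the target is the character
    that immediately follows it.
    """
    if sequence_length <= 0:
        raise ValueError("sequence_length must be a positive integer")
    # Slide a window over the text, stopping while a next character still exists.
    for start in range(len(text) - sequence_length):
        context = text[start:start + sequence_length]
        target = text[start + sequence_length]
        yield context, target


pairs = list(next_char_examples("To be, or not", sequence_length=5))
print(pairs[0])  # ('To be', ',')
```

In this sketch a longer sequence_length gives the model more context per example but yields fewer examples from the same text.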
Args |
train_client_spec | A tff.simulation.baselines.ClientSpec specifying how to preprocess train client data. |
eval_client_spec | An optional tff.simulation.baselines.ClientSpec specifying how to preprocess evaluation client data. If set to None, the evaluation datasets will use a batch size of 64 with no extra preprocessing. |
sequence_length | A positive integer dictating the length of each example in a client's dataset. By default, this is set to tff.simulation.baselines.shakespeare.DEFAULT_SEQUENCE_LENGTH. |
cache_dir | An optional directory in which to cache the downloaded datasets. If None, they will be cached to ~/.tff/. |
use_synthetic_data | A boolean indicating whether to use synthetic Shakespeare data. This option should only be used for testing purposes, in order to avoid downloading the entire Shakespeare dataset. |
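A call that exercises the arguments above might look like the following sketch. It assumes that tensorflow_federated is installed in a version that still ships tff.simulation.baselines, and that ClientSpec accepts num_epochs and batch_size keyword arguments. The wrapper name build_synthetic_shakespeare_task and the values 20 and 8 are arbitrary choices for illustration.

```python
def build_synthetic_shakespeare_task(sequence_length=20, batch_size=8):
    """Build the character-prediction baseline task on synthetic data.

    Hypothetical wrapper for illustration; the default values are arbitrary.
    """
    # Imported lazily so that defining this helper does not require TFF.
    import tensorflow_federated as tff

    # Assumption: ClientSpec takes `num_epochs` and `batch_size` keywords.
    train_client_spec = tff.simulation.baselines.ClientSpec(
        num_epochs=1, batch_size=batch_size)

    return tff.simulation.baselines.shakespeare.create_character_prediction_task(
        train_client_spec=train_client_spec,
        eval_client_spec=None,    # evaluation falls back to batch size 64
        sequence_length=sequence_length,
        cache_dir=None,           # cache location falls back to ~/.tff/
        use_synthetic_data=True,  # testing only: skips the full download
    )
```

Calling build_synthetic_shakespeare_task() should return a tff.simulation.baselines.BaselineTask. For a real experiment, use_synthetic_data would be left at its default of False.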