Abstract base class for converting between text and integers.
A note on padding:
Because text data is typically variable length and nearly always requires
padding during training, ID 0 is always reserved for padding. To accommodate
this, all TextEncoders behave in certain ways:

- encode: never returns id 0 (all ids are 1+)
- decode: drops 0 in the input ids
- vocab_size: includes ID 0

New subclasses should be careful to match this behavior.
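The padding conventions above can be illustrated with a small sketch. It does not use the library class itself: `TextEncoderSketch` is a stand-in for the abstract interface, and `WhitespaceEncoder` is a hypothetical word-level subclass. Both names are assumptions made for illustration.

```python
import abc


class TextEncoderSketch(abc.ABC):
    """Stand-in for the abstract interface (hypothetical, not the library class)."""

    @property
    @abc.abstractmethod
    def vocab_size(self):
        """Size of the vocabulary, including the reserved padding ID 0."""

    @abc.abstractmethod
    def encode(self, s):
        """Encodes text into a list of integers."""

    @abc.abstractmethod
    def decode(self, ids):
        """Decodes a list of integers into text."""


class WhitespaceEncoder(TextEncoderSketch):
    """Hypothetical word-level encoder that follows the padding conventions."""

    def __init__(self, vocab_list):
        self._tokens = list(vocab_list)
        # Shift every ID up by one so that 0 stays free for padding.
        self._token_to_id = {tok: i + 1 for i, tok in enumerate(self._tokens)}

    @property
    def vocab_size(self):
        # The +1 accounts for the reserved padding ID 0.
        return len(self._tokens) + 1

    def encode(self, s):
        # Never returns 0. Unknown words raise KeyError in this sketch.
        return [self._token_to_id[tok] for tok in s.split()]

    def decode(self, ids):
        # Drops padding (ID 0) before mapping IDs back to words.
        return " ".join(self._tokens[i - 1] for i in ids if i != 0)


encoder = WhitespaceEncoder(["hello", "world"])
ids = encoder.encode("hello world")   # [1, 2]
padded = ids + [0, 0]                 # padded to length 4
text = encoder.decode(padded)         # "hello world"
```

Here `encoder.vocab_size` is 3 even though only two words are known, because ID 0 counts toward the vocabulary size.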
Attributes | |
---|---|
vocab_size | Size of the vocabulary, including the reserved padding ID 0. Encode produces ints in [1, vocab_size). |
Methods
decode
@abc.abstractmethod
decode( ids )
Decodes a list of integers into text.
encode
@abc.abstractmethod
encode( s )
Encodes text into a list of integers.
load_from_file
@classmethod
@abc.abstractmethod
load_from_file( filename_prefix )
Load from file. Inverse of save_to_file.
save_to_file
@abc.abstractmethod
save_to_file( filename_prefix )
Store to file. Inverse of load_from_file.
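As a rough sketch of the round trip that save_to_file and load_from_file are meant to form, the example below writes a vocabulary to disk and reads it back. The class `FileBackedEncoder`, its `tokens` attribute, and the `.vocab` suffix are all assumptions for illustration; the document does not specify the on-disk format that real subclasses use.

```python
import os
import tempfile


class FileBackedEncoder:
    """Hypothetical encoder that persists its vocabulary, one token per line."""

    _SUFFIX = ".vocab"  # assumed suffix, not taken from the library

    def __init__(self, vocab_list):
        self.tokens = list(vocab_list)

    def save_to_file(self, filename_prefix):
        # Store to file. Tokens must not contain newlines in this sketch.
        with open(filename_prefix + self._SUFFIX, "w", encoding="utf-8") as f:
            f.write("\n".join(self.tokens))

    @classmethod
    def load_from_file(cls, filename_prefix):
        # Load from file. Inverse of save_to_file.
        with open(filename_prefix + cls._SUFFIX, encoding="utf-8") as f:
            return cls(f.read().splitlines())


with tempfile.TemporaryDirectory() as tmp_dir:
    prefix = os.path.join(tmp_dir, "encoder")
    FileBackedEncoder(["hello", "world"]).save_to_file(prefix)
    restored = FileBackedEncoder.load_from_file(prefix)
```

Because load_from_file is a classmethod, it can rebuild an encoder without an existing instance, which is why only the filename prefix is needed.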