An `Example` is a mostly-normalized data format for storing data for training and inference.
It contains a key-value store `features` where each key (string) maps to a `tf.train.Feature`
message. This flexible and compact format allows the
storage of large amounts of typed data, but requires that the data shape
and use be determined by the configuration files and parsers that are used to
read and write this format.
In TensorFlow, `Example`s are read in row-major
format, so any configuration that describes data with rank-2 or above
should keep this in mind. For example, to store an `M x N` matrix of bytes,
the `tf.train.BytesList` must contain M*N bytes, with `M` rows of `N` contiguous values
each. That is, the `BytesList` value must store the matrix as:
.... row 0 .... // .... row 1 .... // ........... // ... row M-1 ....
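The row-major layout above can be illustrated without TensorFlow. The following is a minimal sketch in plain Python; `flatten_row_major` and `element_at` are hypothetical helper names used only for illustration. The flattened bytes are what a writer would place in the `BytesList` value, and a reader that knows `N` finds element (i, j) at offset `i * N + j`.

```python
def flatten_row_major(matrix):
    """Concatenate the rows of an M x N matrix of byte values (ints 0-255)."""
    n_cols = len(matrix[0])
    if any(len(row) != n_cols for row in matrix):
        raise ValueError("all rows must have the same length")
    # Row 0 comes first, then row 1, and so on up to row M-1.
    return b"".join(bytes(row) for row in matrix)


def element_at(flat, n_cols, i, j):
    """Read element (i, j) back from the row-major buffer."""
    return flat[i * n_cols + j]


matrix = [
    [1, 2, 3],  # row 0
    [4, 5, 6],  # row 1
]
flat = flatten_row_major(matrix)  # M*N = 6 bytes in total
```

Note that the shape (`M`, `N`) is not stored in the bytes themselves; as the text says, it has to come from the configuration used by the reader.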
An `Example` for a movie recommendation application:
features {
  feature {
    key: "age"
    value { float_list {
      value: 29.0
    } }
  }
  feature {
    key: "movie"
    value { bytes_list {
      value: "The Shawshank Redemption"
      value: "Fight Club"
    } }
  }
  feature {
    key: "movie_ratings"
    value { float_list {
      value: 9.0
      value: 9.7
    } }
  }
  feature {
    key: "suggestion"
    value { bytes_list {
      value: "Inception"
    } }
  }
  # Note that this feature exists to be used as a label in training.
  # E.g., if training a logistic regression model to predict purchase
  # probability in our learning tool we would set the label feature to
  # "suggestion_purchased".
  feature {
    key: "suggestion_purchased"
    value { float_list {
      value: 1.0
    } }
  }
  # Similar to "suggestion_purchased" above this feature exists to be used
  # as a label in training.
  # E.g., if training a linear regression model to predict purchase
  # price in our learning tool we would set the label feature to
  # "purchase_price".
  feature {
    key: "purchase_price"
    value { float_list {
      value: 9.99
    } }
  }
}
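The same message can also be built programmatically. The sketch below assumes TensorFlow 2.x is installed and imported as `tf`; `float_feature` and `bytes_feature` are hypothetical convenience helpers, not part of the TensorFlow API.

```python
import tensorflow as tf


def float_feature(values):
    """Wrap a list of floats in a tf.train.Feature (hypothetical helper)."""
    return tf.train.Feature(float_list=tf.train.FloatList(value=values))


def bytes_feature(strings):
    """Wrap a list of str in a tf.train.Feature, UTF-8 encoded (hypothetical helper)."""
    encoded = [s.encode("utf-8") for s in strings]
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=encoded))


example = tf.train.Example(
    features=tf.train.Features(
        feature={
            "age": float_feature([29.0]),
            "movie": bytes_feature(["The Shawshank Redemption", "Fight Club"]),
            "movie_ratings": float_feature([9.0, 9.7]),
            "suggestion": bytes_feature(["Inception"]),
            # Label features, as described in the comments of the text example.
            "suggestion_purchased": float_feature([1.0]),
            "purchase_price": float_feature([9.99]),
        }
    )
)

# Serialized form, e.g. for writing to a TFRecord file.
serialized = example.SerializeToString()
```

Float values are stored as 32-bit floats, so a value such as 9.7 may not read back as exactly 9.7.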
A conformant `Example` dataset obeys the following conventions:

- If a Feature `K` exists in one example with data type `T`, it must be of type `T` in all other examples when present. It may be omitted.
- The number of instances of Feature `K` list data may vary across examples, depending on the requirements of the model.
- If a Feature `K` doesn't exist in an example, a `K`-specific default will be used, if configured.
- If a Feature `K` exists in an example but contains no items, the intent is considered to be an empty tensor and no default will be used.
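Two of these conventions can be sketched with `tf.io.parse_single_example`, assuming TensorFlow 2.x with eager execution. The parsing spec, the default value of `-1.0`, and the `make_example` helper are assumptions chosen for illustration only. Here "age" is omitted, so its configured default is used, while "movie_ratings" is present but empty and parses as an empty tensor.

```python
import tensorflow as tf

# Hypothetical parsing configuration for two features of the movie example.
feature_spec = {
    # Scalar feature with a default that applies only when "age" is absent.
    "age": tf.io.FixedLenFeature([], tf.float32, default_value=-1.0),
    # The number of values may vary from one example to the next.
    "movie_ratings": tf.io.VarLenFeature(tf.float32),
}


def make_example(feature):
    """Serialize a dict of tf.train.Feature into an Example (hypothetical helper)."""
    features = tf.train.Features(feature=feature)
    return tf.train.Example(features=features).SerializeToString()


# "age" is omitted entirely; "movie_ratings" exists but contains no items.
serialized = make_example({
    "movie_ratings": tf.train.Feature(float_list=tf.train.FloatList(value=[])),
})

parsed = tf.io.parse_single_example(serialized, feature_spec)
age = float(parsed["age"])  # falls back to the configured default
ratings = tf.sparse.to_dense(parsed["movie_ratings"]).numpy()  # empty array
```

The defaults live in the parsing configuration, not in the `Example` itself, which is why the same dataset can be read with different defaults by different models.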
Attributes | |
---|---|
`features` | `Features features` |