Parses SequenceExample to feature maps.

    tfr.data.parse_from_sequence_example(
        serialized,
        list_size=None,
        context_feature_spec=None,
        example_feature_spec=None,
        size_feature_name=None,
        mask_feature_name=None,
        shuffle_examples=False,
        seed=None
    )

The `FixedLenFeature` in `example_feature_spec` is converted to `FixedLenSequenceFeature` to parse `feature_list` in SequenceExample. We keep track of the non-trivial default_values (e.g., -1 for labels) for features in `example_feature_spec` and use them to replace the parsing defaults of the SequenceExample (i.e., 0 for numbers and "" for strings). Due to this complexity, we only allow scalar non-trivial default values for numbers.
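
The default-value replacement described above can be pictured with a small pure-Python sketch. It is not the library's implementation: the helper names `pad_with_parsing_default` and `apply_default_value` are hypothetical, and the sketch assumes that padded slots are identified by comparing each position with the original list length.

```python
from typing import List

PARSING_DEFAULT = 0.0  # numeric default that SequenceExample parsing pads with


def pad_with_parsing_default(lists: List[List[float]]) -> List[List[float]]:
    """Pads every list to the longest length using the parsing default (0)."""
    width = max((len(xs) for xs in lists), default=0)
    return [xs + [PARSING_DEFAULT] * (width - len(xs)) for xs in lists]


def apply_default_value(
    padded: List[List[float]], lengths: List[int], default_value: float
) -> List[List[float]]:
    """Overwrites padded slots (index >= original length) with default_value."""
    return [
        [x if i < n else default_value for i, x in enumerate(row)]
        for row, n in zip(padded, lengths)
    ]


labels = [[0.0, 1.0, 0.0], [1.0]]
lengths = [len(xs) for xs in labels]
padded = pad_with_parsing_default(labels)
replaced = apply_default_value(padded, lengths, default_value=-1.0)

print(padded)    # [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]]
print(replaced)  # [[0.0, 1.0, 0.0], [1.0, -1.0, -1.0]]
```

Real labels equal to 0 inside the original length are left untouched; only the slots introduced by padding receive the scalar default.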
When `list_size` is None, the 2nd dim of the output Tensors is not fixed and varies from batch to batch. When `list_size` is specified as a positive integer, truncation or padding is applied so that the 2nd dim of the output Tensors is the specified `list_size`.
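
As a rough illustration of the fixed `list_size` behavior, the sketch below pads or truncates a single list of frames in plain Python. The helper `pad_or_truncate` is hypothetical, and the sketch assumes that truncation keeps the leading frames when no shuffling is requested.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def pad_or_truncate(frames: Sequence[T], list_size: int, pad_value: T) -> List[T]:
    """Returns exactly list_size items: extra frames dropped, missing ones padded."""
    if list_size <= 0:
        raise ValueError("list_size must be a positive integer")
    kept = list(frames[:list_size])  # assumption: the leading frames are kept
    return kept + [pad_value] * (list_size - len(kept))


print(pad_or_truncate([0.0, 1.0], 3, 0.0))       # [0.0, 1.0, 0.0]
print(pad_or_truncate([0.0, 1.0, 2.0], 2, 0.0))  # [0.0, 1.0]
```

Applied to every list in a batch, this gives all rows the same length, which is why the 2nd dim becomes static once `list_size` is set.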
Example:

    serialized = [
      sequence_example {
        context {
          feature {
            key: "query_length"
            value { int64_list { value: 3 } }
          }
        }
        feature_lists {
          feature_list {
            key: "unigrams"
            value {
              feature { bytes_list { value: "tensorflow" } }
              feature { bytes_list { value: ["learning" "to" "rank"] } }
            }
          }
          feature_list {
            key: "utility"
            value {
              feature { float_list { value: 0.0 } }
              feature { float_list { value: 1.0 } }
            }
          }
        }
      }
      sequence_example {
        context {
          feature {
            key: "query_length"
            value { int64_list { value: 2 } }
          }
        }
        feature_lists {
          feature_list {
            key: "unigrams"
            value {
              feature { bytes_list { value: "gbdt" } }
              feature { }
            }
          }
          feature_list {
            key: "utility"
            value {
              feature { float_list { value: 0.0 } }
              feature { float_list { value: 0.0 } }
            }
          }
        }
      }
    ]

We can use arguments:

    context_feature_spec: {
      "query_length": tf.io.FixedLenFeature([1], tf.int64)
    }
    example_feature_spec: {
      "unigrams": tf.io.VarLenFeature(tf.string),
      "utility": tf.io.FixedLenFeature([1], tf.float32,
                                       default_value=[0.])
    }

And the expected output is:

    {
      "unigrams": SparseTensor(
        indices=array([[0, 0, 0], [0, 1, 0], [0, 1, 1], [0, 1, 2], [1, 0, 0]]),
        values=["tensorflow", "learning", "to", "rank", "gbdt"],
        dense_shape=array([2, 2, 3])),
      "utility": [[[ 0.], [ 1.]], [[ 0.], [ 0.]]],
      "query_length": [[3], [2]],
    }

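
To see how the sparse "unigrams" output relates to the input, the following pure-Python sketch derives coordinate-format indices, values, and a dense shape from nested lists laid out as [sequence][frame][value]. The helper `to_coo` is hypothetical and only mirrors the general SparseTensor convention of one index row per stored value; it does not call TensorFlow.

```python
from typing import List, Tuple

Batch = List[List[List[str]]]  # [sequence][frame][value]


def to_coo(batch: Batch) -> Tuple[List[List[int]], List[str], List[int]]:
    """Flattens nested lists into (indices, values, dense_shape)."""
    indices: List[List[int]] = []
    values: List[str] = []
    max_frames = 0
    max_values = 0
    for b, frames in enumerate(batch):
        max_frames = max(max_frames, len(frames))
        for f, frame in enumerate(frames):
            max_values = max(max_values, len(frame))
            for v, value in enumerate(frame):
                indices.append([b, f, v])  # one index triple per stored value
                values.append(value)
    return indices, values, [len(batch), max_frames, max_values]


# The "unigrams" feature lists from the two SequenceExamples above;
# the empty frame in the second example contributes no entries.
unigrams = [
    [["tensorflow"], ["learning", "to", "rank"]],
    [["gbdt"], []],
]
indices, values, dense_shape = to_coo(unigrams)
print(indices)      # [[0, 0, 0], [0, 1, 0], [0, 1, 1], [0, 1, 2], [1, 0, 0]]
print(values)       # ['tensorflow', 'learning', 'to', 'rank', 'gbdt']
print(dense_shape)  # [2, 2, 3]
```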
Args | |
---|---|
`serialized` | (Tensor) A string Tensor for a batch of serialized SequenceExample. |
`list_size` | (int) The number of frames to keep for a SequenceExample. If specified, truncation or padding may happen. Otherwise, the output Tensors have a dynamic list size. |
`context_feature_spec` | (dict) A mapping from feature keys to FixedLenFeature or VarLenFeature values for context. |
`example_feature_spec` | (dict) A mapping from feature keys to FixedLenFeature or VarLenFeature values for the list of examples. These features are stored in the feature_lists field in SequenceExample. FixedLenFeature is translated to FixedLenSequenceFeature to parse SequenceExample. Note that no missing value in the middle of a feature_list is allowed for frames. |
`size_feature_name` | (str) Name of the feature for example list sizes. Populates the feature dictionary with a tf.int32 Tensor of shape [batch_size] for this feature name. If None (the default), this feature is not generated. |
`mask_feature_name` | (str) Name of the feature for example list masks. Populates the feature dictionary with a tf.bool Tensor of shape [batch_size, list_size] for this feature name. If None (the default), this feature is not generated. |
`shuffle_examples` | (bool) Whether examples within a list are shuffled before the list is trimmed down to list_size elements (when the list has more than list_size elements). |
`seed` | (int) A seed passed on to random_ops.uniform() to shuffle examples. |
Returns | |
---|---|
A mapping from feature keys to `Tensor` or `SparseTensor`. |
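
The shapes of the optional size and mask features named by `size_feature_name` and `mask_feature_name` can be sketched in plain Python. The helper `sizes_and_mask` is hypothetical, and the sketch assumes that reported sizes are capped at `list_size` and that the mask marks real (non-padded) positions with True; neither detail is confirmed by the text above.

```python
from typing import List, Tuple


def sizes_and_mask(
    frame_counts: List[int], list_size: int
) -> Tuple[List[int], List[List[bool]]]:
    """Builds a [batch_size] size list and a [batch_size, list_size] mask."""
    # Assumption: a list longer than list_size reports list_size as its size.
    sizes = [min(n, list_size) for n in frame_counts]
    # Assumption: True marks a real example, False marks a padded slot.
    mask = [[i < n for i in range(list_size)] for n in sizes]
    return sizes, mask


sizes, mask = sizes_and_mask([2, 1], list_size=3)
print(sizes)  # [2, 1]
print(mask)   # [[True, True, False], [True, False, False]]
```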