tfds.features.FeatureConnector for audio.
Inherits From: Tensor, FeatureConnector
tfds.features.Audio(
*,
file_format: Optional[str] = None,
shape: utils.Shape = (None,),
dtype: type_utils.TfdsDType = np.int64,
sample_rate: Optional[int] = None,
encoding: Union[str, Encoding] = tfds.features.Encoding.NONE,
doc: feature_lib.DocArg = None,
lazy_decode: bool = False
)
In _generate_examples, Audio accepts:
- A np.ndarray of shape (length,) or (length, channels)
- A path to a .mp3, .wav, ... file.
- A file-object (e.g. with path.open('rb') as fobj:)
By default, Audio features are decoded as the raw integer wave form tf.Tensor(shape=(None,), dtype=tf.int64).
When encoding audio whose number of channels differs from what the feature expects, TFDS automatically tries to correct the number of channels.
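To make the "raw integer wave form" concrete, the following is a minimal sketch using only the Python standard library. It is not how TFDS decodes audio; the helper name read_wav_as_ints is hypothetical, and it assumes 16-bit PCM WAV input. It shows the two array layouts mentioned above: (length,) for mono and (length, channels) for multi-channel audio.

```python
import io
import struct
import wave


def read_wav_as_ints(fobj):
    """Read a 16-bit PCM WAV file-object into integer samples (illustrative helper)."""
    with wave.open(fobj, "rb") as wav:
        n_channels = wav.getnchannels()
        n_frames = wav.getnframes()
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        raw = wav.readframes(n_frames)
    # Samples are little-endian signed 16-bit integers, interleaved by channel.
    flat = struct.unpack("<%dh" % (n_frames * n_channels), raw)
    if n_channels == 1:
        return list(flat)  # layout (length,)
    # layout (length, channels)
    return [list(flat[i:i + n_channels]) for i in range(0, len(flat), n_channels)]


# Build a tiny stereo WAV in memory so the example needs no files on disk.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as wav:
    wav.setnchannels(2)
    wav.setsampwidth(2)
    wav.setframerate(16000)
    wav.writeframes(struct.pack("<6h", 0, 1, 100, -100, 32767, -32768))
buffer.seek(0)

samples = read_wav_as_ints(buffer)
print(samples)  # three frames, two channels each
```

An array with either of these layouts, a path, or an open binary file-object is what _generate_examples would yield for an Audio feature.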
Args | |
---|---|
file_format | str, the audio file format. Can be any format ffmpeg understands. If None, will attempt to infer from the file extension. |
shape | tuple, shape of the data. |
dtype | The dtype of the data. |
sample_rate | int, additional metadata exposed to the user through info.features['audio'].sample_rate. This value is used in neither encoding nor decoding. |
encoding | Internal encoding. See tfds.features.Encoding for available values. |
doc | Documentation of this feature (e.g. description). |
lazy_decode | bool, if set True then stores audio as is and decodes it to numpy array when loaded. Otherwise saves decoded audio. |
Methods
catalog_documentation
catalog_documentation() -> List[CatalogFeatureDocumentation]
Returns the feature documentation to be shown in the catalog.
cls_from_name
@classmethod
cls_from_name( python_class_name: str ) -> Type['FeatureConnector']
Returns the feature class for the given Python class.
decode_batch_example
decode_batch_example(
example_data
)
See base class for details.
decode_example
decode_example(
tfexample_data
)
See base class for details.
decode_example_np
decode_example_np(
example_data: type_utils.NpArrayOrScalar
) -> type_utils.NpArrayOrScalar
Decode the example data into a NumPy-compatible value.
Args | |
---|---|
example_data | Value to convert to NumPy. |
Returns | |
---|---|
np_data | Data as NumPy-compatible type: either a Python primitive (bytes, int, etc) or a NumPy array. |
decode_ragged_example
decode_ragged_example(
example_data
)
See base class for details.
encode_example
encode_example(
audio_or_path_or_fobj
)
Convert the given audio into a dict convertible to tf example.
from_config
@classmethod
from_config( root_dir: str ) -> FeatureConnector
Reconstructs the FeatureConnector from the config file.
Usage:
features = FeatureConnector.from_config('path/to/dir')
Args | |
---|---|
root_dir | Directory containing the features.json file. |
Returns | |
---|---|
The reconstructed feature instance. |
from_json
@classmethod
from_json( value: Json ) -> FeatureConnector
FeatureConnector factory.
This function should be called from the tfds.features.FeatureConnector base class. Subclasses should implement from_json_content.
Example:
feature = tfds.features.FeatureConnector.from_json(
{'type': 'Image', 'content': {'shape': [32, 32, 3], 'dtype': 'uint8'} }
)
assert isinstance(feature, tfds.features.Image)
Args | |
---|---|
value | dict(type=, content=) containing the feature to restore. Matches the dict returned by to_json. |
Returns | |
---|---|
The reconstructed FeatureConnector. |
from_json_content
@classmethod
from_json_content( value: Union[Json, feature_pb2.AudioFeature] ) -> 'Audio'
FeatureConnector factory (to overwrite).
Subclasses should overwrite this method. This method is used when importing the feature connector from the config.
This function should not be called directly. FeatureConnector.from_json should be called instead.
See existing FeatureConnectors for implementation examples.
Args | |
---|---|
value | FeatureConnector information represented as either Json or a Feature proto. The content must match what is returned by to_json_content. |
doc | Documentation of this feature (e.g. description). |
Returns | |
---|---|
The reconstructed FeatureConnector. |
from_proto
@classmethod
from_proto( feature_proto: feature_pb2.Feature ) -> T
Instantiates a feature from its proto representation.
get_serialized_info
get_serialized_info()
See base class for details.
get_tensor_info
get_tensor_info() -> feature_lib.TensorInfo
See base class for details.
get_tensor_spec
get_tensor_spec() -> TreeDict[tf.TensorSpec]
Returns the tf.TensorSpec of this feature (not the element spec!).
Note that the output of this method may not correspond to the element spec of the dataset. For example, currently this method does not support RaggedTensorSpec.
load_metadata
load_metadata(
data_dir: epath.PathLike, feature_name: Optional[str]
)
Restore the feature metadata from disk.
If a dataset is re-loaded and the generated files exist on disk, this function will restore the feature metadata from the saved file.
Args | |
---|---|
data_dir | path to the dataset folder from which to restore the info (e.g. ~/datasets/cifar10/1.2.0/) |
feature_name | the name of the feature (from the FeaturesDict key) |
repr_html
repr_html(
ex: np.ndarray
) -> str
Audio is displayed in an audio player.
repr_html_batch
repr_html_batch(
ex: np.ndarray
) -> str
Returns the HTML str representation of the object (Sequence).
repr_html_ragged
repr_html_ragged(
ex: np.ndarray
) -> str
Returns the HTML str representation of the object (Nested sequence).
save_config
save_config(
root_dir: str
) -> None
Exports the FeatureConnector to a file.
Args | |
---|---|
root_dir | path/to/dir containing the features.json |
save_metadata
save_metadata(
data_dir: epath.PathLike, feature_name: Optional[str]
) -> None
Save the feature metadata on disk.
This function is called after the data has been generated (by _download_and_prepare) to save the feature connector info with the generated dataset.
Some datasets/features dynamically compute info during _download_and_prepare. For instance:
- Labels are loaded from the downloaded data
- Vocabulary is created from the downloaded data
- ImageLabelFolder computes the image dtypes/shape from the manual_dir
After the info has been added to the feature, this function saves that additional info so it can be restored the next time the data is loaded.
By default, this function does not save anything, but subclasses can override it.
Args | |
---|---|
data_dir | path to the dataset folder in which to save the info (e.g. ~/datasets/cifar10/1.2.0/) |
feature_name | the name of the feature (from the FeaturesDict key) |
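The save/restore cycle described for save_metadata and load_metadata can be sketched with a toy class. Everything here is hypothetical (the ToyLabelFeature class, the .labels.txt file name, and the storage format); it only illustrates the pattern of persisting dynamically computed info next to the generated data and reading it back on reload.

```python
import pathlib
import tempfile


class ToyLabelFeature:
    """Hypothetical feature whose label names are only known after generation."""

    def __init__(self):
        self.names = None

    def _metadata_path(self, data_dir, feature_name):
        # One metadata file per feature, stored inside the dataset folder.
        return pathlib.Path(data_dir) / f"{feature_name}.labels.txt"

    def save_metadata(self, data_dir, feature_name):
        path = self._metadata_path(data_dir, feature_name)
        path.write_text("\n".join(self.names))

    def load_metadata(self, data_dir, feature_name):
        path = self._metadata_path(data_dir, feature_name)
        if path.exists():  # only restore if the dataset was generated before
            self.names = path.read_text().split("\n")


with tempfile.TemporaryDirectory() as data_dir:
    # After generation: the labels were discovered from the downloaded data.
    generated = ToyLabelFeature()
    generated.names = ["cat", "dog"]
    generated.save_metadata(data_dir, "label")

    # On a later load: a fresh instance restores the saved info.
    reloaded = ToyLabelFeature()
    reloaded.load_metadata(data_dir, "label")

print(reloaded.names)
```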
to_json
to_json() -> Json
Exports the FeatureConnector to Json.
Each feature is serialized as a dict(type=..., content=...).
- type: The canonical name of the feature (module.FeatureName).
- content: Specific to each feature connector and defined in to_json_content. Can contain nested sub-features (like for tfds.features.FeaturesDict and tfds.features.Sequence).
For example:
tfds.features.FeaturesDict({
'input': tfds.features.Image(),
'target': tfds.features.ClassLabel(num_classes=10),
})
Is serialized as:
{
"type": "tensorflow_datasets.core.features.features_dict.FeaturesDict",
"content": {
"input": {
"type": "tensorflow_datasets.core.features.image_feature.Image",
"content": {
"shape": [null, null, 3],
"dtype": "uint8",
"encoding_format": "png"
}
},
"target": {
"type":
"tensorflow_datasets.core.features.class_label_feature.ClassLabel",
"content": {
"num_classes": 10
}
}
}
}
Returns | |
---|---|
A dict(type=, content=). Will be forwarded to from_json when reconstructing the feature. |
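The relationship between to_json, from_json, to_json_content and from_json_content can be sketched with a toy registry. This is not the TFDS implementation: the ToyFeature and ToyAudio classes and the registry mechanism are assumptions made for illustration, showing only how a dict(type=..., content=...) lets the base class dispatch to the right subclass.

```python
import json


class ToyFeature:
    """Hypothetical stand-in for a feature connector base class."""

    _registry = {}  # maps serialized type name -> subclass

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        ToyFeature._registry[f"{cls.__module__}.{cls.__name__}"] = cls

    def to_json(self):
        name = f"{type(self).__module__}.{type(self).__name__}"
        return {"type": name, "content": self.to_json_content()}

    @classmethod
    def from_json(cls, value):
        # Called on the base class; dispatches to the subclass named in "type".
        subclass = ToyFeature._registry[value["type"]]
        return subclass.from_json_content(value["content"])


class ToyAudio(ToyFeature):
    def __init__(self, sample_rate=None):
        self.sample_rate = sample_rate

    def to_json_content(self):
        # Feature-specific part of the serialized dict.
        return {"sample_rate": self.sample_rate}

    @classmethod
    def from_json_content(cls, value):
        return cls(sample_rate=value["sample_rate"])


serialized = json.dumps(ToyAudio(sample_rate=16000).to_json())
restored = ToyFeature.from_json(json.loads(serialized))
print(type(restored).__name__, restored.sample_rate)
```

The point of the split is that the base class owns the type/content envelope, while each subclass only describes its own content.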
to_json_content
to_json_content() -> feature_pb2.AudioFeature
FeatureConnector factory (to overwrite).
This function should be overwritten by the subclass to allow re-importing the feature connector from the config. See existing FeatureConnectors for implementation examples.
Returns | |
---|---|
The FeatureConnector metadata in either a dict, or a Feature proto. This output is used in from_json_content when reconstructing the feature. |
to_proto
to_proto() -> feature_pb2.Feature
Exports the FeatureConnector to the Feature proto.
For features that have a specific schema defined in a proto, this function needs to be overridden. If there's no specific proto schema, then the feature will be represented using JSON.
Returns | |
---|---|
The feature proto describing this feature. |
Class Variables | |
---|---|
ALIASES | ['tensorflow_datasets.core.features.feature.Tensor'] |