yes_no

Description:

Sixty recordings of one individual saying yes or no in Hebrew; each recording is eight words long.

The dataset's main purpose is to provide a quick, free way to test the Kaldi scripts.

The archive "waves_yesno.tar.gz" contains 60 .wav files, sampled at 8 kHz. All were recorded by the same male speaker, in Hebrew. In each file, the speaker says 8 words; each word is the Hebrew for either "yes" or "no", so each file is a random sequence of 8 yeses and noes. No separate transcription is provided; the sequence is encoded in the filename, with 1 for yes and 0 for no.
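
Because the transcription lives only in the filename, a small helper can recover it. The sketch below assumes the eight digits are separated by underscores (for example "0_1_1_0_0_1_0_1.wav"); the description above states only that the sequence is encoded in the filename, so the separator and the helper name `labels_from_filename` are assumptions made for illustration.

```python
from pathlib import PurePosixPath


def labels_from_filename(filename):
    """Decode the eight 0/1 labels from a name such as '0_1_1_0_0_1_0_1.wav'.

    Assumes underscore-separated digits; 1 means "yes" and 0 means "no".
    """
    stem = PurePosixPath(filename).stem  # drop any directory and the .wav suffix
    labels = [int(token) for token in stem.split("_")]
    if len(labels) != 8 or any(label not in (0, 1) for label in labels):
        raise ValueError(f"unexpected filename: {filename!r}")
    return labels
```

Calling `labels_from_filename("waves_yesno/0_1_1_0_0_1_0_1.wav")` would return the list of eight integers in spoken order; names that do not match the assumed pattern raise `ValueError` rather than being silently mislabeled.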

Additional Documentation: Explore on Papers With Code
Homepage: https://www.openslr.org/1/
Source code: tfds.audio.yesno.YesNo
Versions:
- 1.0.0 (default): No release notes.
Download size: 4.49 MiB
Dataset size: 16.27 MiB
Auto-cached (documentation): Yes
Splits:

Split	Examples
`'train'`	60

Feature structure:

FeaturesDict({
    'audio': Audio(shape=(None,), dtype=int64),
    'audio/filename': Text(shape=(), dtype=string),
    'label': Sequence(ClassLabel(shape=(), dtype=int64, num_classes=2)),
})
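
The feature structure above can be explored with a short loading sketch. It assumes `tensorflow_datasets` is installed and that the dataset is registered under the name "yes_no" (the title of this page); `load_yes_no` and `summarize` are hypothetical helper names, and the 8 kHz rate is taken from the description rather than read from the data.

```python
SAMPLE_RATE_HZ = 8000  # from the description: files are sampled at 8 kHz


def load_yes_no():
    """Return an iterable of NumPy example dicts for the 'train' split."""
    import tensorflow_datasets as tfds  # imported lazily so summarize() works without it

    ds = tfds.load("yes_no", split="train")  # downloads the data on first call
    return tfds.as_numpy(ds)


def summarize(example):
    """Reduce one example dict to its filename, duration in seconds, and labels."""
    filename = example["audio/filename"]
    if isinstance(filename, bytes):  # tfds.as_numpy yields Text features as bytes
        filename = filename.decode("utf-8")
    return {
        "filename": filename,
        "duration_s": len(example["audio"]) / SAMPLE_RATE_HZ,
        "labels": [int(x) for x in example["label"]],
    }
```

A typical use would be `for example in load_yes_no(): print(summarize(example))`, which should print one summary per recording in the 'train' split.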

Feature documentation:

Feature	Class	Shape	Dtype
	FeaturesDict
audio	Audio	(None,)	int64
audio/filename	Text		string
label	Sequence(ClassLabel)	(None,)	int64

Supervised keys (See as_supervised doc): ('audio', 'label')
Figure (tfds.show_examples): Not supported.
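
With the supervised keys listed above, passing `as_supervised=True` to `tfds.load` yields `(audio, label)` pairs instead of feature dictionaries. The sketch below is illustrative only: `load_supervised` and `label_words` are hypothetical helpers, and the class names ("no" for 0, "yes" for 1) are an assumption based on the encoding given in the description.

```python
def load_supervised():
    """Return an iterable of (audio, label) NumPy pairs for the 'train' split."""
    import tensorflow_datasets as tfds  # imported lazily; requires the package

    ds = tfds.load("yes_no", split="train", as_supervised=True)
    return tfds.as_numpy(ds)


def label_words(label, names=("no", "yes")):
    """Map a sequence of class ids to words, assuming 0 is "no" and 1 is "yes"."""
    return [names[int(class_id)] for class_id in label]
```

For instance, `for audio, label in load_supervised(): print(len(audio), label_words(label))` would print the sample count and the spoken sequence for each recording.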

Citation:

@ONLINE {YesNo,
    author = "Created for the Kaldi Project",
    title  = "YesNo",
    url    = "http://www.openslr.org/1/"
}