common_voice

References:

ab

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:common_voice/ab')
  • Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but were always adding more voices and languages.
Split Examples
'invalidated' 8
'other' 752
'test' 9
'train' 22
'validated' 31
'validation' 0
  • Features:
{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

ar

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:common_voice/ar')
  • Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but were always adding more voices and languages.
Split Examples
'invalidated' 6333
'other' 18283
'test' 7622
'train' 14227
'validated' 43291
'validation' 7517
  • Features:
{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

as

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:common_voice/as')
  • Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but were always adding more voices and languages.
Split Examples
'invalidated' 31
'other' 0
'test' 110
'train' 270
'validated' 504
'validation' 124
  • Features:
{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

br

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:common_voice/br')
  • Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but were always adding more voices and languages.
Split Examples
'invalidated' 623
'other' 10912
'test' 2087
'train' 2780
'validated' 8560
'validation' 1997
  • Features:
{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

ca

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:common_voice/ca')
  • Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but were always adding more voices and languages.
Split Examples
'invalidated' 18846
'other' 64446
'test' 15724
'train' 285584
'validated' 416701
'validation' 15724
  • Features:
{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

cnh

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:common_voice/cnh')
  • Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but were always adding more voices and languages.
Split Examples
'invalidated' 433
'other' 2934
'test' 752
'train' 807
'validated' 2432
'validation' 756
  • Features:
{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

cs

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:common_voice/cs')
  • Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but were always adding more voices and languages.
Split Examples
'invalidated' 685
'other' 7475
'test' 4144
'train' 5655
'validated' 30431
'validation' 4118
  • Features:
{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

cv

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:common_voice/cv')
  • Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but were always adding more voices and languages.
Split Examples
'invalidated' 1282
'other' 6927
'test' 788
'train' 931
'validated' 3496
'validation' 818
  • Features:
{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

cy

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:common_voice/cy')
  • Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but were always adding more voices and languages.
Split Examples
'invalidated' 3648
'other' 17919
'test' 4820
'train' 6839
'validated' 72984
'validation' 4776
  • Features:
{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

de

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:common_voice/de')
  • Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but were always adding more voices and languages.
Split Examples
'invalidated' 32789
'other' 10095
'test' 15588
'train' 246525
'validated' 565186
'validation' 15588
  • Features:
{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

dv

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:common_voice/dv')
  • Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but were always adding more voices and languages.
Split Examples
'invalidated' 840
'other' 0
'test' 2202
'train' 2680
'validated' 11866
'validation' 2077
  • Features:
{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

el

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:common_voice/el')
  • Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but were always adding more voices and languages.
Split Examples
'invalidated' 185
'other' 5659
'test' 1522
'train' 2316
'validated' 5996
'validation' 1401
  • Features:
{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

en

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:common_voice/en')
  • Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but were always adding more voices and languages.
Split Examples
'invalidated' 189562
'other' 169895
'test' 16164
'train' 564337
'validated' 1224864
'validation' 16164
  • Features:
{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

eo

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:common_voice/eo')
  • Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but were always adding more voices and languages.
Split Examples
'invalidated' 4736
'other' 2946
'test' 8969
'train' 19587
'validated' 58094
'validation' 8987
  • Features:
{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

es

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:common_voice/es')
  • Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but were always adding more voices and languages.
Split Examples
'invalidated' 40640
'other' 144791
'test' 15089
'train' 161813
'validated' 236314
'validation' 15089
  • Features:
{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

et

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:common_voice/et')
  • Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but were always adding more voices and languages.
Split Examples
'invalidated' 3557
'other' 569
'test' 2509
'train' 2966
'validated' 10683
'validation' 2507
  • Features:
{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

eu

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:common_voice/eu')
  • Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but were always adding more voices and languages.
Split Examples
'invalidated' 5387
'other' 23570
'test' 5172
'train' 7505
'validated' 63009
'validation' 5172
  • Features:
{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

fa

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:common_voice/fa')
  • Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but were always adding more voices and languages.
Split Examples
'invalidated' 11698
'other' 22510
'test' 5213
'train' 7593
'validated' 251659
'validation' 5213
  • Features:
{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

fi

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:common_voice/fi')
  • Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but were always adding more voices and languages.
Split Examples
'invalidated' 59
'other' 149
'test' 428
'train' 460
'validated' 1305
'validation' 415
  • Features:
{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

fr

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:common_voice/fr')
  • Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but were always adding more voices and languages.
Split Examples
'invalidated' 40351
'other' 3222
'test' 15763
'train' 298982
'validated' 461004
'validation' 15763
  • Features:
{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

fy-NL

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:common_voice/fy-NL')
  • Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but were always adding more voices and languages.
Split Examples
'invalidated' 1031
'other' 21569
'test' 3020
'train' 3927
'validated' 10495
'validation' 2790
  • Features:
{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

ga-IE

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:common_voice/ga-IE')
  • Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but were always adding more voices and languages.
Split Examples
'invalidated' 409
'other' 2130
'test' 506
'train' 541
'validated' 3352
'validation' 497
  • Features:
{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

hi

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:common_voice/hi')
  • Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but were always adding more voices and languages.
Split Examples
'invalidated' 60
'other' 139
'test' 127
'train' 157
'validated' 419
'validation' 135
  • Features:
{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

hsb

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:common_voice/hsb')
  • Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but were always adding more voices and languages.
Split Examples
'invalidated' 227
'other' 62
'test' 387
'train' 808
'validated' 1367
'validation' 172
  • Features:
{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

hu

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:common_voice/hu')
  • Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but were always adding more voices and languages.
Split Examples
'invalidated' 169
'other' 295
'test' 1649
'train' 3348
'validated' 6457
'validation' 1434
  • Features:
{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

ia

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:common_voice/ia')
  • Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but were always adding more voices and languages.
Split Examples
'invalidated' 192
'other' 1095
'test' 899
'train' 3477
'validated' 5978
'validation' 1601
  • Features:
{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

id

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:common_voice/id')
  • Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but were always adding more voices and languages.
Split Examples
'invalidated' 470
'other' 6782
'test' 1844
'train' 2130
'validated' 8696
'validation' 1835
  • Features:
{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

it

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:common_voice/it')
  • Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but were always adding more voices and languages.
Split Examples
'invalidated' 12189
'other' 14549
'test' 12928
'train' 58015
'validated' 102579
'validation' 12928
  • Features:
{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

ja

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:common_voice/ja')
  • Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but were always adding more voices and languages.
Split Examples
'invalidated' 504
'other' 885
'test' 632
'train' 722
'validated' 3072
'validation' 586
  • Features:
{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

ka

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:common_voice/ka')
  • Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but were always adding more voices and languages.
Split Examples
'invalidated' 139
'other' 44
'test' 656
'train' 1058
'validated' 2275
'validation' 527
  • Features:
{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

kab

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:common_voice/kab')
  • Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but were always adding more voices and languages.
Split Examples
'invalidated' 18134
'other' 88021
'test' 14622
'train' 120530
'validated' 573718
'validation' 14622
  • Features:
{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

ky

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:common_voice/ky')
  • Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but were always adding more voices and languages.
Split Examples
'invalidated' 926
'other' 7223
'test' 1503
'train' 1955
'validated' 9236
'validation' 1511
  • Features:
{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

lg

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:common_voice/lg')
  • Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but were always adding more voices and languages.
Split Examples
'invalidated' 290
'other' 3110
'test' 584
'train' 1250
'validated' 2220
'validation' 384
  • Features:
{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

lt

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:common_voice/lt')
  • Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but were always adding more voices and languages.
Split Examples
'invalidated' 102
'other' 1629
'test' 466
'train' 931
'validated' 1644
'validation' 244
  • Features:
{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

lv

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:common_voice/lv')
  • Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but were always adding more voices and languages.
Split Examples
'invalidated' 143
'other' 1560
'test' 1882
'train' 2552
'validated' 6444
'validation' 2002
  • Features:
{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

mn

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:common_voice/mn')
  • Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but were always adding more voices and languages.
Split Examples
'invalidated' 667
'other' 3272
'test' 1862
'train' 2183
'validated' 7487
'validation' 1837
  • Features:
{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

mt

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:common_voice/mt')
  • Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but were always adding more voices and languages.
Split Examples
'invalidated' 314
'other' 5714
'test' 1617
'train' 2036
'validated' 5747
'validation' 1516
  • Features:
{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

nl

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:common_voice/nl')
  • Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but were always adding more voices and languages.
Split Examples
'invalidated' 3308
'other' 27
'test' 5708
'train' 9460
'validated' 52488
'validation' 4938
  • Features:
{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

or

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:common_voice/or')
  • Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but were always adding more voices and languages.
Split Examples
'invalidated' 62
'other' 4302
'test' 98
'train' 388
'validated' 615
'validation' 129
  • Features:
{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

pa-IN

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:common_voice/pa-IN')
  • Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but were always adding more voices and languages.
Split Examples
'invalidated' 43
'other' 1411
'test' 116
'train' 211
'validated' 371
'validation' 44
  • Features:
{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

pl

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:common_voice/pl')
  • Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but were always adding more voices and languages.
Split Examples
'invalidated' 4601
'other' 12848
'test' 5153
'train' 7468
'validated' 90791
'validation' 5153
  • Features:
{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

pt

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:common_voice/pt')
  • Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but were always adding more voices and languages.
Split Examples
'invalidated' 1740
'other' 8390
'test' 4641
'train' 6514
'validated' 41584
'validation' 4592
  • Features:
{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

rm-sursilv

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:common_voice/rm-sursilv')
  • Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but were always adding more voices and languages.
Split Examples
'invalidated' 639
'other' 2102
'test' 1194
'train' 1384
'validated' 3783
'validation' 1205
  • Features:
{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

rm-vallader

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:common_voice/rm-vallader')
  • Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but were always adding more voices and languages.
Split Examples
'invalidated' 374
'other' 727
'test' 378
'train' 574
'validated' 1316
'validation' 357
  • Features:
{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

ro

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:common_voice/ro')
  • Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but were always adding more voices and languages.
Split Examples
'invalidated' 485
'other' 1945
'test' 1778
'train' 3399
'validated' 6039
'validation' 858
  • Features:
{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

ru

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:common_voice/ru')
  • Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but were always adding more voices and languages.
Split Examples
'invalidated' 3056
'other' 10247
'test' 8007
'train' 15481
'validated' 74256
'validation' 7963
  • Features:
{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

rw

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:common_voice/rw')
  • Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but were always adding more voices and languages.
Split Examples
'invalidated' 206790
'other' 22923
'test' 15724
'train' 515197
'validated' 832929
'validation' 15032
  • Features:
{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

sah

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:common_voice/sah')
  • Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but were always adding more voices and languages.
Split Examples
'invalidated' 66
'other' 1275
'test' 757
'train' 1442
'validated' 2606
'validation' 405
  • Features:
{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

sl

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:common_voice/sl')
  • Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but were always adding more voices and languages.
Split Examples
'invalidated' 92
'other' 2502
'test' 881
'train' 2038
'validated' 4669
'validation' 556
  • Features:
{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

sv-SE

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:common_voice/sv-SE')
  • Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but were always adding more voices and languages.
Split Examples
'invalidated' 462
'other' 3043
'test' 2027
'train' 2331
'validated' 12552
'validation' 2019
  • Features:
{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

ta

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:common_voice/ta')
  • Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but were always adding more voices and languages.
Split Examples
'invalidated' 594
'other' 7428
'test' 1781
'train' 2009
'validated' 12652
'validation' 1779
  • Features:
{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

th

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:common_voice/th')
  • Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but were always adding more voices and languages.
Split Examples
'invalidated' 467
'other' 2671
'test' 2188
'train' 2917
'validated' 7028
'validation' 1922
  • Features:
{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

tr

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:common_voice/tr')
  • Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but were always adding more voices and languages.
Split Examples
'invalidated' 1726
'other' 325
'test' 1647
'train' 1831
'validated' 18685
'validation' 1647
  • Features:
{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

tt

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:common_voice/tt')
  • Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but were always adding more voices and languages.
Split Examples
'invalidated' 287
'other' 1798
'test' 4485
'train' 11211
'validated' 25781
'validation' 2127
  • Features:
{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

uk

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:common_voice/uk')
  • Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but were always adding more voices and languages.
Split Examples
'invalidated' 1255
'other' 8161
'test' 3235
'train' 4035
'validated' 22337
'validation' 3236
  • Features:
{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

vi

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:common_voice/vi')
  • Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but were always adding more voices and languages.
Split Examples
'invalidated' 78
'other' 870
'test' 198
'train' 221
'validated' 619
'validation' 200
  • Features:
{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

vot

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:common_voice/vot')
  • Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but were always adding more voices and languages.
Split Examples
'invalidated' 6
'other' 411
'test' 0
'train' 3
'validated' 3
'validation' 0
  • Features:
{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

zh-CN

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:common_voice/zh-CN')
  • Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but were always adding more voices and languages.
Split Examples
'invalidated' 5305
'other' 8948
'test' 8760
'train' 18541
'validated' 36405
'validation' 8743
  • Features:
{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

zh-HK

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:common_voice/zh-HK')
  • Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but were always adding more voices and languages.
Split Examples
'invalidated' 2999
'other' 38830
'test' 5172
'train' 7506
'validated' 41835
'validation' 5172
  • Features:
{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

zh-TW

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:common_voice/zh-TW')
  • Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but were always adding more voices and languages.
Split Examples
'invalidated' 3584
'other' 22477
'test' 2895
'train' 3507
'validated' 61232
'validation' 2895
  • Features:
{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}