Références :
ab
Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :
ds = tfds.load('huggingface:common_voice/ab')
- Description :
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version : 6.1.0
- Divisions :
Diviser | Exemples |
---|---|
'invalidated' | 8 |
'other' | 752 |
'test' | 9 |
'train' | 22 |
'validated' | 31 |
'validation' | 0 |
- Caractéristiques :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
ar
Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :
ds = tfds.load('huggingface:common_voice/ar')
- Description :
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version : 6.1.0
- Divisions :
Diviser | Exemples |
---|---|
'invalidated' | 6333 |
'other' | 18283 |
'test' | 7622 |
'train' | 14227 |
'validated' | 43291 |
'validation' | 7517 |
- Caractéristiques :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
comme
Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :
ds = tfds.load('huggingface:common_voice/as')
- Description :
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version : 6.1.0
- Divisions :
Diviser | Exemples |
---|---|
'invalidated' | 31 |
'other' | 0 |
'test' | 110 |
'train' | 270 |
'validated' | 504 |
'validation' | 124 |
- Caractéristiques :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
br
Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :
ds = tfds.load('huggingface:common_voice/br')
- Description :
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version : 6.1.0
- Divisions :
Diviser | Exemples |
---|---|
'invalidated' | 623 |
'other' | 10912 |
'test' | 2087 |
'train' | 2780 |
'validated' | 8560 |
'validation' | 1997 |
- Caractéristiques :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
Californie
Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :
ds = tfds.load('huggingface:common_voice/ca')
- Description :
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version : 6.1.0
- Divisions :
Diviser | Exemples |
---|---|
'invalidated' | 18846 |
'other' | 64446 |
'test' | 15724 |
'train' | 285584 |
'validated' | 416701 |
'validation' | 15724 |
- Caractéristiques :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
cnh
Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :
ds = tfds.load('huggingface:common_voice/cnh')
- Description :
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version : 6.1.0
- Divisions :
Diviser | Exemples |
---|---|
'invalidated' | 433 |
'other' | 2934 |
'test' | 752 |
'train' | 807 |
'validated' | 2432 |
'validation' | 756 |
- Caractéristiques :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
cs
Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :
ds = tfds.load('huggingface:common_voice/cs')
- Description :
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version : 6.1.0
- Divisions :
Diviser | Exemples |
---|---|
'invalidated' | 685 |
'other' | 7475 |
'test' | 4144 |
'train' | 5655 |
'validated' | 30431 |
'validation' | 4118 |
- Caractéristiques :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
cv
Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :
ds = tfds.load('huggingface:common_voice/cv')
- Description :
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version : 6.1.0
- Divisions :
Diviser | Exemples |
---|---|
'invalidated' | 1282 |
'other' | 6927 |
'test' | 788 |
'train' | 931 |
'validated' | 3496 |
'validation' | 818 |
- Caractéristiques :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
cy
Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :
ds = tfds.load('huggingface:common_voice/cy')
- Description :
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version : 6.1.0
- Divisions :
Diviser | Exemples |
---|---|
'invalidated' | 3648 |
'other' | 17919 |
'test' | 4820 |
'train' | 6839 |
'validated' | 72984 |
'validation' | 4776 |
- Caractéristiques :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
de
Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :
ds = tfds.load('huggingface:common_voice/de')
- Description :
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version : 6.1.0
- Divisions :
Diviser | Exemples |
---|---|
'invalidated' | 32789 |
'other' | 10095 |
'test' | 15588 |
'train' | 246525 |
'validated' | 565186 |
'validation' | 15588 |
- Caractéristiques :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
dv
Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :
ds = tfds.load('huggingface:common_voice/dv')
- Description :
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version : 6.1.0
- Divisions :
Diviser | Exemples |
---|---|
'invalidated' | 840 |
'other' | 0 |
'test' | 2202 |
'train' | 2680 |
'validated' | 11866 |
'validation' | 2077 |
- Caractéristiques :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
el
Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :
ds = tfds.load('huggingface:common_voice/el')
- Description :
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version : 6.1.0
- Divisions :
Diviser | Exemples |
---|---|
'invalidated' | 185 |
'other' | 5659 |
'test' | 1522 |
'train' | 2316 |
'validated' | 5996 |
'validation' | 1401 |
- Caractéristiques :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
fr
Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :
ds = tfds.load('huggingface:common_voice/en')
- Description :
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version : 6.1.0
- Divisions :
Diviser | Exemples |
---|---|
'invalidated' | 189562 |
'other' | 169895 |
'test' | 16164 |
'train' | 564337 |
'validated' | 1224864 |
'validation' | 16164 |
- Caractéristiques :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
eo
Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :
ds = tfds.load('huggingface:common_voice/eo')
- Description :
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version : 6.1.0
- Divisions :
Diviser | Exemples |
---|---|
'invalidated' | 4736 |
'other' | 2946 |
'test' | 8969 |
'train' | 19587 |
'validated' | 58094 |
'validation' | 8987 |
- Caractéristiques :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
es
Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :
ds = tfds.load('huggingface:common_voice/es')
- Description :
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version : 6.1.0
- Divisions :
Diviser | Exemples |
---|---|
'invalidated' | 40640 |
'other' | 144791 |
'test' | 15089 |
'train' | 161813 |
'validated' | 236314 |
'validation' | 15089 |
- Caractéristiques :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
et
Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :
ds = tfds.load('huggingface:common_voice/et')
- Description :
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version : 6.1.0
- Divisions :
Diviser | Exemples |
---|---|
'invalidated' | 3557 |
'other' | 569 |
'test' | 2509 |
'train' | 2966 |
'validated' | 10683 |
'validation' | 2507 |
- Caractéristiques :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
UE
Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :
ds = tfds.load('huggingface:common_voice/eu')
- Description :
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version : 6.1.0
- Divisions :
Diviser | Exemples |
---|---|
'invalidated' | 5387 |
'other' | 23570 |
'test' | 5172 |
'train' | 7505 |
'validated' | 63009 |
'validation' | 5172 |
- Caractéristiques :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
fa
Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :
ds = tfds.load('huggingface:common_voice/fa')
- Description :
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version : 6.1.0
- Divisions :
Diviser | Exemples |
---|---|
'invalidated' | 11698 |
'other' | 22510 |
'test' | 5213 |
'train' | 7593 |
'validated' | 251659 |
'validation' | 5213 |
- Caractéristiques :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
fi
Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :
ds = tfds.load('huggingface:common_voice/fi')
- Description :
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version : 6.1.0
- Divisions :
Diviser | Exemples |
---|---|
'invalidated' | 59 |
'other' | 149 |
'test' | 428 |
'train' | 460 |
'validated' | 1305 |
'validation' | 415 |
- Caractéristiques :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
fr
Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :
ds = tfds.load('huggingface:common_voice/fr')
- Description :
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version : 6.1.0
- Divisions :
Diviser | Exemples |
---|---|
'invalidated' | 40351 |
'other' | 3222 |
'test' | 15763 |
'train' | 298982 |
'validated' | 461004 |
'validation' | 15763 |
- Caractéristiques :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
fy-NL
Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :
ds = tfds.load('huggingface:common_voice/fy-NL')
- Description :
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version : 6.1.0
- Divisions :
Diviser | Exemples |
---|---|
'invalidated' | 1031 |
'other' | 21569 |
'test' | 3020 |
'train' | 3927 |
'validated' | 10495 |
'validation' | 2790 |
- Caractéristiques :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
ga-IE
Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :
ds = tfds.load('huggingface:common_voice/ga-IE')
- Description :
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version : 6.1.0
- Divisions :
Diviser | Exemples |
---|---|
'invalidated' | 409 |
'other' | 2130 |
'test' | 506 |
'train' | 541 |
'validated' | 3352 |
'validation' | 497 |
- Caractéristiques :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
Salut
Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :
ds = tfds.load('huggingface:common_voice/hi')
- Description :
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version : 6.1.0
- Divisions :
Diviser | Exemples |
---|---|
'invalidated' | 60 |
'other' | 139 |
'test' | 127 |
'train' | 157 |
'validated' | 419 |
'validation' | 135 |
- Caractéristiques :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
HSB
Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :
ds = tfds.load('huggingface:common_voice/hsb')
- Description :
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version : 6.1.0
- Divisions :
Diviser | Exemples |
---|---|
'invalidated' | 227 |
'other' | 62 |
'test' | 387 |
'train' | 808 |
'validated' | 1367 |
'validation' | 172 |
- Caractéristiques :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
hein
Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :
ds = tfds.load('huggingface:common_voice/hu')
- Description :
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version : 6.1.0
- Divisions :
Diviser | Exemples |
---|---|
'invalidated' | 169 |
'other' | 295 |
'test' | 1649 |
'train' | 3348 |
'validated' | 6457 |
'validation' | 1434 |
- Caractéristiques :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
je
Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :
ds = tfds.load('huggingface:common_voice/ia')
- Description :
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version : 6.1.0
- Divisions :
Diviser | Exemples |
---|---|
'invalidated' | 192 |
'other' | 1095 |
'test' | 899 |
'train' | 3477 |
'validated' | 5978 |
'validation' | 1601 |
- Caractéristiques :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
identifiant
Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :
ds = tfds.load('huggingface:common_voice/id')
- Description :
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version : 6.1.0
- Divisions :
Diviser | Exemples |
---|---|
'invalidated' | 470 |
'other' | 6782 |
'test' | 1844 |
'train' | 2130 |
'validated' | 8696 |
'validation' | 1835 |
- Caractéristiques :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
il
Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :
ds = tfds.load('huggingface:common_voice/it')
- Description :
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version : 6.1.0
- Divisions :
Diviser | Exemples |
---|---|
'invalidated' | 12189 |
'other' | 14549 |
'test' | 12928 |
'train' | 58015 |
'validated' | 102579 |
'validation' | 12928 |
- Caractéristiques :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
oui
Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :
ds = tfds.load('huggingface:common_voice/ja')
- Description :
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version : 6.1.0
- Divisions :
Diviser | Exemples |
---|---|
'invalidated' | 504 |
'other' | 885 |
'test' | 632 |
'train' | 722 |
'validated' | 3072 |
'validation' | 586 |
- Caractéristiques :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
ka
Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :
ds = tfds.load('huggingface:common_voice/ka')
- Description :
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version : 6.1.0
- Divisions :
Diviser | Exemples |
---|---|
'invalidated' | 139 |
'other' | 44 |
'test' | 656 |
'train' | 1058 |
'validated' | 2275 |
'validation' | 527 |
- Caractéristiques :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
kab
Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :
ds = tfds.load('huggingface:common_voice/kab')
- Description :
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version : 6.1.0
- Divisions :
Diviser | Exemples |
---|---|
'invalidated' | 18134 |
'other' | 88021 |
'test' | 14622 |
'train' | 120530 |
'validated' | 573718 |
'validation' | 14622 |
- Caractéristiques :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
ky
Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :
ds = tfds.load('huggingface:common_voice/ky')
- Description :
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version : 6.1.0
- Divisions :
Diviser | Exemples |
---|---|
'invalidated' | 926 |
'other' | 7223 |
'test' | 1503 |
'train' | 1955 |
'validated' | 9236 |
'validation' | 1511 |
- Caractéristiques :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
LG
Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :
ds = tfds.load('huggingface:common_voice/lg')
- Description :
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version : 6.1.0
- Divisions :
Diviser | Exemples |
---|---|
'invalidated' | 290 |
'other' | 3110 |
'test' | 584 |
'train' | 1250 |
'validated' | 2220 |
'validation' | 384 |
- Caractéristiques :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
lt
Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :
ds = tfds.load('huggingface:common_voice/lt')
- Description :
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version : 6.1.0
- Divisions :
Diviser | Exemples |
---|---|
'invalidated' | 102 |
'other' | 1629 |
'test' | 466 |
'train' | 931 |
'validated' | 1644 |
'validation' | 244 |
- Caractéristiques :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
lv
Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :
ds = tfds.load('huggingface:common_voice/lv')
- Description :
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version : 6.1.0
- Divisions :
Diviser | Exemples |
---|---|
'invalidated' | 143 |
'other' | 1560 |
'test' | 1882 |
'train' | 2552 |
'validated' | 6444 |
'validation' | 2002 |
- Caractéristiques :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
minute
Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :
ds = tfds.load('huggingface:common_voice/mn')
- Description :
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version : 6.1.0
- Divisions :
Diviser | Exemples |
---|---|
'invalidated' | 667 |
'other' | 3272 |
'test' | 1862 |
'train' | 2183 |
'validated' | 7487 |
'validation' | 1837 |
- Caractéristiques :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
mont
Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :
ds = tfds.load('huggingface:common_voice/mt')
- Description :
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version : 6.1.0
- Divisions :
Diviser | Exemples |
---|---|
'invalidated' | 314 |
'other' | 5714 |
'test' | 1617 |
'train' | 2036 |
'validated' | 5747 |
'validation' | 1516 |
- Caractéristiques :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
nl
Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :
ds = tfds.load('huggingface:common_voice/nl')
- Description :
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version : 6.1.0
- Divisions :
Diviser | Exemples |
---|---|
'invalidated' | 3308 |
'other' | 27 |
'test' | 5708 |
'train' | 9460 |
'validated' | 52488 |
'validation' | 4938 |
- Caractéristiques :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
ou
Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :
ds = tfds.load('huggingface:common_voice/or')
- Description :
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version : 6.1.0
- Divisions :
Diviser | Exemples |
---|---|
'invalidated' | 62 |
'other' | 4302 |
'test' | 98 |
'train' | 388 |
'validated' | 615 |
'validation' | 129 |
- Caractéristiques :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
douleur
Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :
ds = tfds.load('huggingface:common_voice/pa-IN')
- Description :
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version : 6.1.0
- Divisions :
Diviser | Exemples |
---|---|
'invalidated' | 43 |
'other' | 1411 |
'test' | 116 |
'train' | 211 |
'validated' | 371 |
'validation' | 44 |
- Caractéristiques :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
svp
Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :
ds = tfds.load('huggingface:common_voice/pl')
- Description :
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version : 6.1.0
- Divisions :
Diviser | Exemples |
---|---|
'invalidated' | 4601 |
'other' | 12848 |
'test' | 5153 |
'train' | 7468 |
'validated' | 90791 |
'validation' | 5153 |
- Caractéristiques :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
pt
Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :
ds = tfds.load('huggingface:common_voice/pt')
- Description :
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version : 6.1.0
- Divisions :
Diviser | Exemples |
---|---|
'invalidated' | 1740 |
'other' | 8390 |
'test' | 4641 |
'train' | 6514 |
'validated' | 41584 |
'validation' | 4592 |
- Caractéristiques :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
rm-sursilv
Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :
ds = tfds.load('huggingface:common_voice/rm-sursilv')
- Description :
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version : 6.1.0
- Divisions :
Diviser | Exemples |
---|---|
'invalidated' | 639 |
'other' | 2102 |
'test' | 1194 |
'train' | 1384 |
'validated' | 3783 |
'validation' | 1205 |
- Caractéristiques :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
rm-vallader
Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :
ds = tfds.load('huggingface:common_voice/rm-vallader')
- Description :
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version : 6.1.0
- Divisions :
Diviser | Exemples |
---|---|
'invalidated' | 374 |
'other' | 727 |
'test' | 378 |
'train' | 574 |
'validated' | 1316 |
'validation' | 357 |
- Caractéristiques :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
ro
Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :
ds = tfds.load('huggingface:common_voice/ro')
- Description :
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version : 6.1.0
- Divisions :
Diviser | Exemples |
---|---|
'invalidated' | 485 |
'other' | 1945 |
'test' | 1778 |
'train' | 3399 |
'validated' | 6039 |
'validation' | 858 |
- Caractéristiques :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
ru
Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :
ds = tfds.load('huggingface:common_voice/ru')
- Description :
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version : 6.1.0
- Divisions :
Diviser | Exemples |
---|---|
'invalidated' | 3056 |
'other' | 10247 |
'test' | 8007 |
'train' | 15481 |
'validated' | 74256 |
'validation' | 7963 |
- Caractéristiques :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
rw
Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :
ds = tfds.load('huggingface:common_voice/rw')
- Description :
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version : 6.1.0
- Divisions :
Diviser | Exemples |
---|---|
'invalidated' | 206790 |
'other' | 22923 |
'test' | 15724 |
'train' | 515197 |
'validated' | 832929 |
'validation' | 15032 |
- Caractéristiques :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
sah
Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :
ds = tfds.load('huggingface:common_voice/sah')
- Description :
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version : 6.1.0
- Divisions :
Diviser | Exemples |
---|---|
'invalidated' | 66 |
'other' | 1275 |
'test' | 757 |
'train' | 1442 |
'validated' | 2606 |
'validation' | 405 |
- Caractéristiques :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
sl
Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :
ds = tfds.load('huggingface:common_voice/sl')
- Description :
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version : 6.1.0
- Divisions :
Diviser | Exemples |
---|---|
'invalidated' | 92 |
'other' | 2502 |
'test' | 881 |
'train' | 2038 |
'validated' | 4669 |
'validation' | 556 |
- Caractéristiques :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
sv-SE
Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :
ds = tfds.load('huggingface:common_voice/sv-SE')
- Description :
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version : 6.1.0
- Divisions :
Diviser | Exemples |
---|---|
'invalidated' | 462 |
'other' | 3043 |
'test' | 2027 |
'train' | 2331 |
'validated' | 12552 |
'validation' | 2019 |
- Caractéristiques :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
ta
Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :
ds = tfds.load('huggingface:common_voice/ta')
- Description :
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version : 6.1.0
- Divisions :
Diviser | Exemples |
---|---|
'invalidated' | 594 |
'other' | 7428 |
'test' | 1781 |
'train' | 2009 |
'validated' | 12652 |
'validation' | 1779 |
- Caractéristiques :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
ème
Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :
ds = tfds.load('huggingface:common_voice/th')
- Description :
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version : 6.1.0
- Divisions :
Diviser | Exemples |
---|---|
'invalidated' | 467 |
'other' | 2671 |
'test' | 2188 |
'train' | 2917 |
'validated' | 7028 |
'validation' | 1922 |
- Caractéristiques :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
tr
Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :
ds = tfds.load('huggingface:common_voice/tr')
- Description :
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version : 6.1.0
- Divisions :
Diviser | Exemples |
---|---|
'invalidated' | 1726 |
'other' | 325 |
'test' | 1647 |
'train' | 1831 |
'validated' | 18685 |
'validation' | 1647 |
- Caractéristiques :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
tt
Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :
ds = tfds.load('huggingface:common_voice/tt')
- Description :
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version : 6.1.0
- Divisions :
Diviser | Exemples |
---|---|
'invalidated' | 287 |
'other' | 1798 |
'test' | 4485 |
'train' | 11211 |
'validated' | 25781 |
'validation' | 2127 |
- Caractéristiques :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
Royaume-Uni
Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :
ds = tfds.load('huggingface:common_voice/uk')
- Description :
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version : 6.1.0
- Divisions :
Diviser | Exemples |
---|---|
'invalidated' | 1255 |
'other' | 8161 |
'test' | 3235 |
'train' | 4035 |
'validated' | 22337 |
'validation' | 3236 |
- Caractéristiques :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
vi
Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :
ds = tfds.load('huggingface:common_voice/vi')
- Description :
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version : 6.1.0
- Divisions :
Diviser | Exemples |
---|---|
'invalidated' | 78 |
'other' | 870 |
'test' | 198 |
'train' | 221 |
'validated' | 619 |
'validation' | 200 |
- Caractéristiques :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
voter
Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :
ds = tfds.load('huggingface:common_voice/vot')
- Description :
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version : 6.1.0
- Divisions :
Diviser | Exemples |
---|---|
'invalidated' | 6 |
'other' | 411 |
'test' | 0 |
'train' | 3 |
'validated' | 3 |
'validation' | 0 |
- Caractéristiques :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
zh-CN
Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :
ds = tfds.load('huggingface:common_voice/zh-CN')
- Description :
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version : 6.1.0
- Divisions :
Diviser | Exemples |
---|---|
'invalidated' | 5305 |
'other' | 8948 |
'test' | 8760 |
'train' | 18541 |
'validated' | 36405 |
'validation' | 8743 |
- Caractéristiques :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
zh-HK
Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :
ds = tfds.load('huggingface:common_voice/zh-HK')
- Description :
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version : 6.1.0
- Divisions :
Diviser | Exemples |
---|---|
'invalidated' | 2999 |
'other' | 38830 |
'test' | 5172 |
'train' | 7506 |
'validated' | 41835 |
'validation' | 5172 |
- Caractéristiques :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
zh-TW
Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :
ds = tfds.load('huggingface:common_voice/zh-TW')
- Description :
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version : 6.1.0
- Divisions :
Diviser | Exemples |
---|---|
'invalidated' | 3584 |
'other' | 22477 |
'test' | 2895 |
'train' | 3507 |
'validated' | 61232 |
'validation' | 2895 |
- Caractéristiques :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}