References:
ab
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/ab')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 8 |
'other' | 752 |
'test' | 9 |
'train' | 22 |
'validated' | 31 |
'validation' | 0 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
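The feature spec above is plain JSON, so it can be inspected with the standard library. The following is a minimal sketch, assuming an abbreviated copy of the spec (three of the eleven fields); `summarize_features` is a hypothetical helper, not part of TFDS or Hugging Face Datasets.

```python
import json

# Abbreviated copy of the feature spec listed above (three of the eleven fields).
FEATURES_JSON = """
{
  "client_id": {"dtype": "string", "id": null, "_type": "Value"},
  "audio": {"sampling_rate": 48000, "mono": true, "decode": true,
            "id": null, "_type": "Audio"},
  "up_votes": {"dtype": "int64", "id": null, "_type": "Value"}
}
"""


def summarize_features(spec_json: str) -> dict:
    """Map each field name to a short type label (hypothetical helper)."""
    spec = json.loads(spec_json)
    summary = {}
    for name, info in spec.items():
        if info["_type"] == "Audio":
            # Audio entries carry a sampling rate instead of a dtype.
            summary[name] = f"audio@{info['sampling_rate']}Hz"
        else:
            summary[name] = info["dtype"]
    return summary


print(summarize_features(FEATURES_JSON))
```

The same spec is repeated for every language config on this page, so one summary covers all of them.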
ar
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/ar')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 6333 |
'other' | 18283 |
'test' | 7622 |
'train' | 14227 |
'validated' | 43291 |
'validation' | 7517 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
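The load commands on this page differ only in the language config that follows `common_voice/`. A minimal sketch of wrapping that pattern is shown below; `tfds_name` and `load_config` are hypothetical helper names, and the sketch assumes `tensorflow_datasets` is installed and that the requested split exists for the chosen config.

```python
def tfds_name(config: str) -> str:
    """Build the TFDS dataset name for one Common Voice language config."""
    return f"huggingface:common_voice/{config}"


def load_config(config: str, split: str = "train"):
    """Load one split of one language config (hypothetical wrapper).

    The import is done lazily so the name helper above stays usable
    even where tensorflow_datasets is not installed.
    """
    import tensorflow_datasets as tfds

    return tfds.load(tfds_name(config), split=split)


# Config codes taken from the section headings on this page.
names = [tfds_name(code) for code in ("ab", "ar", "fy-NL")]
print(names)
```

Calling `load_config("ar", split="test")` would then request the `test` split listed in the table above.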
as
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/as')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 31 |
'other' | 0 |
'test' | 110 |
'train' | 270 |
'validated' | 504 |
'validation' | 124 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
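The `audio` feature declares a sampling rate of 48000 Hz and mono audio, so the length of a decoded clip in seconds follows from its sample count. A minimal sketch, assuming the decoded audio is available as a one-dimensional sequence of samples; `duration_seconds` is a hypothetical helper.

```python
SAMPLING_RATE_HZ = 48000  # value of "sampling_rate" in the feature spec above


def duration_seconds(num_samples: int, sampling_rate: int = SAMPLING_RATE_HZ) -> float:
    """Return the clip length in seconds for a mono signal."""
    if sampling_rate <= 0:
        raise ValueError("sampling_rate must be positive")
    return num_samples / sampling_rate


# A clip with 96,000 samples at 48 kHz is two seconds long.
print(duration_seconds(96_000))
```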
br
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/br')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 623 |
'other' | 10912 |
'test' | 2087 |
'train' | 2780 |
'validated' | 8560 |
'validation' | 1997 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
ca
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/ca')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 18846 |
'other' | 64446 |
'test' | 15724 |
'train' | 285584 |
'validated' | 416701 |
'validation' | 15724 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
cnh
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/cnh')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 433 |
'other' | 2934 |
'test' | 752 |
'train' | 807 |
'validated' | 2432 |
'validation' | 756 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
cs
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/cs')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 685 |
'other' | 7475 |
'test' | 4144 |
'train' | 5655 |
'validated' | 30431 |
'validation' | 4118 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
cv
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/cv')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 1282 |
'other' | 6927 |
'test' | 788 |
'train' | 931 |
'validated' | 3496 |
'validation' | 818 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
cy
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/cy')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 3648 |
'other' | 17919 |
'test' | 4820 |
'train' | 6839 |
'validated' | 72984 |
'validation' | 4776 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
de
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/de')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 32789 |
'other' | 10095 |
'test' | 15588 |
'train' | 246525 |
'validated' | 565186 |
'validation' | 15588 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
dv
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/dv')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 840 |
'other' | 0 |
'test' | 2202 |
'train' | 2680 |
'validated' | 11866 |
'validation' | 2077 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
el
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/el')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 185 |
'other' | 5659 |
'test' | 1522 |
'train' | 2316 |
'validated' | 5996 |
'validation' | 1401 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
en
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/en')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 189562 |
'other' | 169895 |
'test' | 16164 |
'train' | 564337 |
'validated' | 1224864 |
'validation' | 16164 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
eo
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/eo')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 4736 |
'other' | 2946 |
'test' | 8969 |
'train' | 19587 |
'validated' | 58094 |
'validation' | 8987 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
es
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/es')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 40640 |
'other' | 144791 |
'test' | 15089 |
'train' | 161813 |
'validated' | 236314 |
'validation' | 15089 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
et
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/et')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 3557 |
'other' | 569 |
'test' | 2509 |
'train' | 2966 |
'validated' | 10683 |
'validation' | 2507 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
eu
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/eu')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 5387 |
'other' | 23570 |
'test' | 5172 |
'train' | 7505 |
'validated' | 63009 |
'validation' | 5172 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
fa
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/fa')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 11698 |
'other' | 22510 |
'test' | 5213 |
'train' | 7593 |
'validated' | 251659 |
'validation' | 5213 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
fi
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/fi')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 59 |
'other' | 149 |
'test' | 428 |
'train' | 460 |
'validated' | 1305 |
'validation' | 415 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
fr
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/fr')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 40351 |
'other' | 3222 |
'test' | 15763 |
'train' | 298982 |
'validated' | 461004 |
'validation' | 15763 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
fy-NL
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/fy-NL')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 1031 |
'other' | 21569 |
'test' | 3020 |
'train' | 3927 |
'validated' | 10495 |
'validation' | 2790 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
ga-IE
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/ga-IE')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 409 |
'other' | 2130 |
'test' | 506 |
'train' | 541 |
'validated' | 3352 |
'validation' | 497 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
hi
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/hi')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 60 |
'other' | 139 |
'test' | 127 |
'train' | 157 |
'validated' | 419 |
'validation' | 135 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
hsb
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/hsb')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 227 |
'other' | 62 |
'test' | 387 |
'train' | 808 |
'validated' | 1367 |
'validation' | 172 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
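Every config on this page is loaded with the same `tfds.load` pattern, differing only in the config suffix. The sketch below wraps that pattern; `dataset_name` and `load_split` are hypothetical helpers introduced here, and `load_split` assumes that `tensorflow_datasets` is installed, that network access is available, and that the `huggingface:` namespace resolves in the installed TFDS version.

```python
def dataset_name(config):
    """Build the TFDS name for a Common Voice config, following the pattern on this page."""
    return f"huggingface:common_voice/{config}"


def load_split(config, split="train"):
    """Load a single split (hypothetical helper; needs tensorflow_datasets and network access)."""
    # Imported lazily so that dataset_name() can be used without TFDS installed.
    import tensorflow_datasets as tfds

    return tfds.load(dataset_name(config), split=split)


print(dataset_name("hsb"))
```

For example, `load_split("hsb", split="validation")` would request only the validation split rather than all six splits listed in the table.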
hu
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/hu')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 169 |
'other' | 295 |
'test' | 1649 |
'train' | 3348 |
'validated' | 6457 |
'validation' | 1434 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
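The split tables can be used to check how the clips are distributed before training. The sketch below uses the 'hu' counts from the table above; `split_fractions` is a hypothetical helper, and treating 'validated' as a superset of the train, validation, and test splits is an assumption, not something this page states.

```python
# Example counts copied from the 'hu' split table above.
HU_SPLITS = {"train": 3348, "validation": 1434, "test": 1649}
HU_VALIDATED = 6457


def split_fractions(counts):
    """Return each split's share of the combined example count (hypothetical helper)."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


fractions = split_fractions(HU_SPLITS)
# Assumption: 'validated' contains the three splits, so the remainder is unassigned.
unassigned = HU_VALIDATED - sum(HU_SPLITS.values())

for name, share in fractions.items():
    print(f"{name}: {share:.1%}")
print(f"validated clips outside train/validation/test: {unassigned}")
```

The same check can be repeated for any other config by substituting the counts from its table.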
ia
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/ia')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 192 |
'other' | 1095 |
'test' | 899 |
'train' | 3477 |
'validated' | 5978 |
'validation' | 1601 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
id
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/id')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 470 |
'other' | 6782 |
'test' | 1844 |
'train' | 2130 |
'validated' | 8696 |
'validation' | 1835 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
it
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/it')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 12189 |
'other' | 14549 |
'test' | 12928 |
'train' | 58015 |
'validated' | 102579 |
'validation' | 12928 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
ja
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/ja')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 504 |
'other' | 885 |
'test' | 632 |
'train' | 722 |
'validated' | 3072 |
'validation' | 586 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
ka
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/ka')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 139 |
'other' | 44 |
'test' | 656 |
'train' | 1058 |
'validated' | 2275 |
'validation' | 527 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
kab
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/kab')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 18134 |
'other' | 88021 |
'test' | 14622 |
'train' | 120530 |
'validated' | 573718 |
'validation' | 14622 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
ky
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/ky')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 926 |
'other' | 7223 |
'test' | 1503 |
'train' | 1955 |
'validated' | 9236 |
'validation' | 1511 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
lg
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/lg')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 290 |
'other' | 3110 |
'test' | 584 |
'train' | 1250 |
'validated' | 2220 |
'validation' | 384 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
lt
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/lt')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 102 |
'other' | 1629 |
'test' | 466 |
'train' | 931 |
'validated' | 1644 |
'validation' | 244 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
lv
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/lv')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 143 |
'other' | 1560 |
'test' | 1882 |
'train' | 2552 |
'validated' | 6444 |
'validation' | 2002 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
mn
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/mn')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 667 |
'other' | 3272 |
'test' | 1862 |
'train' | 2183 |
'validated' | 7487 |
'validation' | 1837 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
mt
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/mt')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 314 |
'other' | 5714 |
'test' | 1617 |
'train' | 2036 |
'validated' | 5747 |
'validation' | 1516 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
nl
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/nl')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 3308 |
'other' | 27 |
'test' | 5708 |
'train' | 9460 |
'validated' | 52488 |
'validation' | 4938 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
or
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/or')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 62 |
'other' | 4302 |
'test' | 98 |
'train' | 388 |
'validated' | 615 |
'validation' | 129 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
pa-IN
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/pa-IN')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 43 |
'other' | 1411 |
'test' | 116 |
'train' | 211 |
'validated' | 371 |
'validation' | 44 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
pl
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/pl')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 4601 |
'other' | 12848 |
'test' | 5153 |
'train' | 7468 |
'validated' | 90791 |
'validation' | 5153 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
pt
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/pt')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 1740 |
'other' | 8390 |
'test' | 4641 |
'train' | 6514 |
'validated' | 41584 |
'validation' | 4592 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
rm-sursilv
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/rm-sursilv')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 639 |
'other' | 2102 |
'test' | 1194 |
'train' | 1384 |
'validated' | 3783 |
'validation' | 1205 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
rm-vallader
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/rm-vallader')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 374 |
'other' | 727 |
'test' | 378 |
'train' | 574 |
'validated' | 1316 |
'validation' | 357 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
ro
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/ro')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 485 |
'other' | 1945 |
'test' | 1778 |
'train' | 3399 |
'validated' | 6039 |
'validation' | 858 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
ru
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/ru')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 3056 |
'other' | 10247 |
'test' | 8007 |
'train' | 15481 |
'validated' | 74256 |
'validation' | 7963 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
rw
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/rw')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 206790 |
'other' | 22923 |
'test' | 15724 |
'train' | 515197 |
'validated' | 832929 |
'validation' | 15032 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
sah
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/sah')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- Лицензия : https://github.com/common-voice/common-voice/blob/main/LICENSE .
- Версия : 6.1.0
- Расколы :
Расколоть | Примеры |
---|---|
'invalidated' | 66 |
'other' | 1275 |
'test' | 757 |
'train' | 1442 |
'validated' | 2606 |
'validation' | 405 |
- Функции :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
sl
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/sl')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 92 |
'other' | 2502 |
'test' | 881 |
'train' | 2038 |
'validated' | 4669 |
'validation' | 556 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
sv-SE
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/sv-SE')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 462 |
'other' | 3043 |
'test' | 2027 |
'train' | 2331 |
'validated' | 12552 |
'validation' | 2019 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
ta
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/ta')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 594 |
'other' | 7428 |
'test' | 1781 |
'train' | 2009 |
'validated' | 12652 |
'validation' | 1779 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
th
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/th')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 467 |
'other' | 2671 |
'test' | 2188 |
'train' | 2917 |
'validated' | 7028 |
'validation' | 1922 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
tr
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/tr')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 1726 |
'other' | 325 |
'test' | 1647 |
'train' | 1831 |
'validated' | 18685 |
'validation' | 1647 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
tt
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/tt')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 287 |
'other' | 1798 |
'test' | 4485 |
'train' | 11211 |
'validated' | 25781 |
'validation' | 2127 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
uk
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/uk')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 1255 |
'other' | 8161 |
'test' | 3235 |
'train' | 4035 |
'validated' | 22337 |
'validation' | 3236 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
vi
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/vi')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 78 |
'other' | 870 |
'test' | 198 |
'train' | 221 |
'validated' | 619 |
'validation' | 200 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
vot
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/vot')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 6 |
'other' | 411 |
'test' | 0 |
'train' | 3 |
'validated' | 3 |
'validation' | 0 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
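The split table for this configuration lists zero examples for 'test' and 'validation', so it may be worth checking split sizes before requesting a split. The following is a minimal sketch, assuming the counts have been copied by hand from the table above; `VOT_SPLITS` and `non_empty_splits` are hypothetical names for illustration, not TFDS APIs.

```python
# Example counts copied from the 'vot' split table above.
VOT_SPLITS = {
    "invalidated": 6,
    "other": 411,
    "test": 0,
    "train": 3,
    "validated": 3,
    "validation": 0,
}


def non_empty_splits(counts):
    """Return the names of splits with at least one example, sorted by name."""
    return sorted(name for name, n in counts.items() if n > 0)


print(non_empty_splits(VOT_SPLITS))
```

The same check can be applied to any other configuration by substituting the counts from its own table.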
zh-CN
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/zh-CN')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 5305 |
'other' | 8948 |
'test' | 8760 |
'train' | 18541 |
'validated' | 36405 |
'validation' | 8743 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
zh-HK
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/zh-HK')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 2999 |
'other' | 38830 |
'test' | 5172 |
'train' | 7506 |
'validated' | 41835 |
'validation' | 5172 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
zh-TW
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:common_voice/zh-TW')
- Description:
Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.
- License: https://github.com/common-voice/common-voice/blob/main/LICENSE
- Version: 6.1.0
- Splits:
Split | Examples |
---|---|
'invalidated' | 3584 |
'other' | 22477 |
'test' | 2895 |
'train' | 3507 |
'validated' | 61232 |
'validation' | 2895 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"path": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"audio": {
"sampling_rate": 48000,
"mono": true,
"decode": true,
"id": null,
"_type": "Audio"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"up_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"down_votes": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"age": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"gender": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"accent": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"locale": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"segment": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
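Every load command on this page follows the same naming pattern, with only the language configuration changing. The following is a minimal sketch of building those names programmatically; `CONFIG_PREFIX`, `LOCALES`, and `tfds_name` are hypothetical names for illustration, and the list covers only a few of the configurations shown above.

```python
# Prefix shared by every load command on this page.
CONFIG_PREFIX = "huggingface:common_voice"

# A few of the configuration names listed above (not exhaustive).
LOCALES = ["rw", "sah", "zh-CN", "zh-HK", "zh-TW"]


def tfds_name(locale):
    """Build the dataset name string used in the load commands above."""
    return f"{CONFIG_PREFIX}/{locale}"


names = [tfds_name(code) for code in LOCALES]
for name in names:
    print(name)

# With tensorflow_datasets installed, each name could then be loaded
# in the same way as the commands shown on this page:
#   import tensorflow_datasets as tfds
#   ds = tfds.load(names[0])
```

The load call is left as a comment so that the sketch runs without downloading any audio data.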