TFDS supporte désormais le format Croissant 🥐 ! Lisez la documentation pour en savoir plus.

Cette page a été traduite par l'API Cloud Translation.

voix_commune

Références :

ab

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:common_voice/ab')

Description :

Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.

Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
Version : 6.1.0
Divisions :

Diviser	Exemples
`'invalidated'`	8
`'other'`	752
`'test'`	9
`'train'`	22
`'validated'`	31
`'validation'`	0

Caractéristiques :

{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

ar

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:common_voice/ar')

Description :

Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.

Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
Version : 6.1.0
Divisions :

Diviser	Exemples
`'invalidated'`	6333
`'other'`	18283
`'test'`	7622
`'train'`	14227
`'validated'`	43291
`'validation'`	7517

Caractéristiques :

{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

comme

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:common_voice/as')

Description :

Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.

Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
Version : 6.1.0
Divisions :

Diviser	Exemples
`'invalidated'`	31
`'other'`	0
`'test'`	110
`'train'`	270
`'validated'`	504
`'validation'`	124

Caractéristiques :

{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

br

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:common_voice/br')

Description :

Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.

Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
Version : 6.1.0
Divisions :

Diviser	Exemples
`'invalidated'`	623
`'other'`	10912
`'test'`	2087
`'train'`	2780
`'validated'`	8560
`'validation'`	1997

Caractéristiques :

{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

Californie

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:common_voice/ca')

Description :

Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.

Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
Version : 6.1.0
Divisions :

Diviser	Exemples
`'invalidated'`	18846
`'other'`	64446
`'test'`	15724
`'train'`	285584
`'validated'`	416701
`'validation'`	15724

Caractéristiques :

{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

cnh

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:common_voice/cnh')

Description :

Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.

Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
Version : 6.1.0
Divisions :

Diviser	Exemples
`'invalidated'`	433
`'other'`	2934
`'test'`	752
`'train'`	807
`'validated'`	2432
`'validation'`	756

Caractéristiques :

{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

cs

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:common_voice/cs')

Description :

Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.

Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
Version : 6.1.0
Divisions :

Diviser	Exemples
`'invalidated'`	685
`'other'`	7475
`'test'`	4144
`'train'`	5655
`'validated'`	30431
`'validation'`	4118

Caractéristiques :

{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

cv

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:common_voice/cv')

Description :

Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.

Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
Version : 6.1.0
Divisions :

Diviser	Exemples
`'invalidated'`	1282
`'other'`	6927
`'test'`	788
`'train'`	931
`'validated'`	3496
`'validation'`	818

Caractéristiques :

{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

cy

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:common_voice/cy')

Description :

Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.

Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
Version : 6.1.0
Divisions :

Diviser	Exemples
`'invalidated'`	3648
`'other'`	17919
`'test'`	4820
`'train'`	6839
`'validated'`	72984
`'validation'`	4776

Caractéristiques :

{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

de

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:common_voice/de')

Description :

Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.

Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
Version : 6.1.0
Divisions :

Diviser	Exemples
`'invalidated'`	32789
`'other'`	10095
`'test'`	15588
`'train'`	246525
`'validated'`	565186
`'validation'`	15588

Caractéristiques :

{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

dv

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:common_voice/dv')

Description :

Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.

Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
Version : 6.1.0
Divisions :

Diviser	Exemples
`'invalidated'`	840
`'other'`	0
`'test'`	2202
`'train'`	2680
`'validated'`	11866
`'validation'`	2077

Caractéristiques :

{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

el

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:common_voice/el')

Description :

Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.

Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
Version : 6.1.0
Divisions :

Diviser	Exemples
`'invalidated'`	185
`'other'`	5659
`'test'`	1522
`'train'`	2316
`'validated'`	5996
`'validation'`	1401

Caractéristiques :

{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

fr

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:common_voice/en')

Description :

Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.

Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
Version : 6.1.0
Divisions :

Diviser	Exemples
`'invalidated'`	189562
`'other'`	169895
`'test'`	16164
`'train'`	564337
`'validated'`	1224864
`'validation'`	16164

Caractéristiques :

{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

eo

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:common_voice/eo')

Description :

Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.

Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
Version : 6.1.0
Divisions :

Diviser	Exemples
`'invalidated'`	4736
`'other'`	2946
`'test'`	8969
`'train'`	19587
`'validated'`	58094
`'validation'`	8987

Caractéristiques :

{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

es

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:common_voice/es')

Description :

Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.

Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
Version : 6.1.0
Divisions :

Diviser	Exemples
`'invalidated'`	40640
`'other'`	144791
`'test'`	15089
`'train'`	161813
`'validated'`	236314
`'validation'`	15089

Caractéristiques :

{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

et

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:common_voice/et')

Description :

Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.

Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
Version : 6.1.0
Divisions :

Diviser	Exemples
`'invalidated'`	3557
`'other'`	569
`'test'`	2509
`'train'`	2966
`'validated'`	10683
`'validation'`	2507

Caractéristiques :

{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

UE

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:common_voice/eu')

Description :

Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.

Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
Version : 6.1.0
Divisions :

Diviser	Exemples
`'invalidated'`	5387
`'other'`	23570
`'test'`	5172
`'train'`	7505
`'validated'`	63009
`'validation'`	5172

Caractéristiques :

{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

fa

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:common_voice/fa')

Description :

Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.

Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
Version : 6.1.0
Divisions :

Diviser	Exemples
`'invalidated'`	11698
`'other'`	22510
`'test'`	5213
`'train'`	7593
`'validated'`	251659
`'validation'`	5213

Caractéristiques :

{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

fi

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:common_voice/fi')

Description :

Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.

Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
Version : 6.1.0
Divisions :

Diviser	Exemples
`'invalidated'`	59
`'other'`	149
`'test'`	428
`'train'`	460
`'validated'`	1305
`'validation'`	415

Caractéristiques :

{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

fr

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:common_voice/fr')

Description :

Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.

Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
Version : 6.1.0
Divisions :

Diviser	Exemples
`'invalidated'`	40351
`'other'`	3222
`'test'`	15763
`'train'`	298982
`'validated'`	461004
`'validation'`	15763

Caractéristiques :

{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

fy-NL

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:common_voice/fy-NL')

Description :

Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.

Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
Version : 6.1.0
Divisions :

Diviser	Exemples
`'invalidated'`	1031
`'other'`	21569
`'test'`	3020
`'train'`	3927
`'validated'`	10495
`'validation'`	2790

Caractéristiques :

{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

ga-IE

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:common_voice/ga-IE')

Description :

Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.

Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
Version : 6.1.0
Divisions :

Diviser	Exemples
`'invalidated'`	409
`'other'`	2130
`'test'`	506
`'train'`	541
`'validated'`	3352
`'validation'`	497

Caractéristiques :

{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

Salut

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:common_voice/hi')

Description :

Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.

Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
Version : 6.1.0
Divisions :

Diviser	Exemples
`'invalidated'`	60
`'other'`	139
`'test'`	127
`'train'`	157
`'validated'`	419
`'validation'`	135

Caractéristiques :

{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

HSB

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:common_voice/hsb')

Description :

Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.

Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
Version : 6.1.0
Divisions :

Diviser	Exemples
`'invalidated'`	227
`'other'`	62
`'test'`	387
`'train'`	808
`'validated'`	1367
`'validation'`	172

Caractéristiques :

{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

hein

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:common_voice/hu')

Description :

Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.

Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
Version : 6.1.0
Divisions :

Diviser	Exemples
`'invalidated'`	169
`'other'`	295
`'test'`	1649
`'train'`	3348
`'validated'`	6457
`'validation'`	1434

Caractéristiques :

{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

je

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:common_voice/ia')

Description :

Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.

Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
Version : 6.1.0
Divisions :

Diviser	Exemples
`'invalidated'`	192
`'other'`	1095
`'test'`	899
`'train'`	3477
`'validated'`	5978
`'validation'`	1601

Caractéristiques :

{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

identifiant

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:common_voice/id')

Description :

Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.

Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
Version : 6.1.0
Divisions :

Diviser	Exemples
`'invalidated'`	470
`'other'`	6782
`'test'`	1844
`'train'`	2130
`'validated'`	8696
`'validation'`	1835

Caractéristiques :

{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

il

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:common_voice/it')

Description :

Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.

Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
Version : 6.1.0
Divisions :

Diviser	Exemples
`'invalidated'`	12189
`'other'`	14549
`'test'`	12928
`'train'`	58015
`'validated'`	102579
`'validation'`	12928

Caractéristiques :

{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

oui

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:common_voice/ja')

Description :

Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.

Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
Version : 6.1.0
Divisions :

Diviser	Exemples
`'invalidated'`	504
`'other'`	885
`'test'`	632
`'train'`	722
`'validated'`	3072
`'validation'`	586

Caractéristiques :

{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

ka

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:common_voice/ka')

Description :

Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.

Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
Version : 6.1.0
Divisions :

Diviser	Exemples
`'invalidated'`	139
`'other'`	44
`'test'`	656
`'train'`	1058
`'validated'`	2275
`'validation'`	527

Caractéristiques :

{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

kab

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:common_voice/kab')

Description :

Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.

Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
Version : 6.1.0
Divisions :

Diviser	Exemples
`'invalidated'`	18134
`'other'`	88021
`'test'`	14622
`'train'`	120530
`'validated'`	573718
`'validation'`	14622

Caractéristiques :

{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

ky

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:common_voice/ky')

Description :

Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.

Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
Version : 6.1.0
Divisions :

Diviser	Exemples
`'invalidated'`	926
`'other'`	7223
`'test'`	1503
`'train'`	1955
`'validated'`	9236
`'validation'`	1511

Caractéristiques :

{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

LG

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:common_voice/lg')

Description :

Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.

Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
Version : 6.1.0
Divisions :

Diviser	Exemples
`'invalidated'`	290
`'other'`	3110
`'test'`	584
`'train'`	1250
`'validated'`	2220
`'validation'`	384

Caractéristiques :

{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

lt

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:common_voice/lt')

Description :

Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.

Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
Version : 6.1.0
Divisions :

Diviser	Exemples
`'invalidated'`	102
`'other'`	1629
`'test'`	466
`'train'`	931
`'validated'`	1644
`'validation'`	244

Caractéristiques :

{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

lv

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:common_voice/lv')

Description :

Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.

Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
Version : 6.1.0
Divisions :

Diviser	Exemples
`'invalidated'`	143
`'other'`	1560
`'test'`	1882
`'train'`	2552
`'validated'`	6444
`'validation'`	2002

Caractéristiques :

{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

minute

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:common_voice/mn')

Description :

Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.

Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
Version : 6.1.0
Divisions :

Diviser	Exemples
`'invalidated'`	667
`'other'`	3272
`'test'`	1862
`'train'`	2183
`'validated'`	7487
`'validation'`	1837

Caractéristiques :

{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

mont

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:common_voice/mt')

Description :

Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.

Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
Version : 6.1.0
Divisions :

Diviser	Exemples
`'invalidated'`	314
`'other'`	5714
`'test'`	1617
`'train'`	2036
`'validated'`	5747
`'validation'`	1516

Caractéristiques :

{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

nl

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:common_voice/nl')

Description :

Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.

Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
Version : 6.1.0
Divisions :

Diviser	Exemples
`'invalidated'`	3308
`'other'`	27
`'test'`	5708
`'train'`	9460
`'validated'`	52488
`'validation'`	4938

Caractéristiques :

{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

ou

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:common_voice/or')

Description :

Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.

Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
Version : 6.1.0
Divisions :

Diviser	Exemples
`'invalidated'`	62
`'other'`	4302
`'test'`	98
`'train'`	388
`'validated'`	615
`'validation'`	129

Caractéristiques :

{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

douleur

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:common_voice/pa-IN')

Description :

Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.

Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
Version : 6.1.0
Divisions :

Diviser	Exemples
`'invalidated'`	43
`'other'`	1411
`'test'`	116
`'train'`	211
`'validated'`	371
`'validation'`	44

Caractéristiques :

{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

svp

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:common_voice/pl')

Description :

Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.

Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
Version : 6.1.0
Divisions :

Diviser	Exemples
`'invalidated'`	4601
`'other'`	12848
`'test'`	5153
`'train'`	7468
`'validated'`	90791
`'validation'`	5153

Caractéristiques :

{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

pt

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:common_voice/pt')

Description :

Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.

Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
Version : 6.1.0
Divisions :

Diviser	Exemples
`'invalidated'`	1740
`'other'`	8390
`'test'`	4641
`'train'`	6514
`'validated'`	41584
`'validation'`	4592

Caractéristiques :

{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

rm-sursilv

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:common_voice/rm-sursilv')

Description :

Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.

Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
Version : 6.1.0
Divisions :

Diviser	Exemples
`'invalidated'`	639
`'other'`	2102
`'test'`	1194
`'train'`	1384
`'validated'`	3783
`'validation'`	1205

Caractéristiques :

{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

rm-vallader

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:common_voice/rm-vallader')

Description :

Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.

Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
Version : 6.1.0
Divisions :

Diviser	Exemples
`'invalidated'`	374
`'other'`	727
`'test'`	378
`'train'`	574
`'validated'`	1316
`'validation'`	357

Caractéristiques :

{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

ro

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:common_voice/ro')

Description :

Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.

Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
Version : 6.1.0
Divisions :

Diviser	Exemples
`'invalidated'`	485
`'other'`	1945
`'test'`	1778
`'train'`	3399
`'validated'`	6039
`'validation'`	858

Caractéristiques :

{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

ru

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:common_voice/ru')

Description :

Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.

Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
Version : 6.1.0
Divisions :

Diviser	Exemples
`'invalidated'`	3056
`'other'`	10247
`'test'`	8007
`'train'`	15481
`'validated'`	74256
`'validation'`	7963

Caractéristiques :

{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

rw

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:common_voice/rw')

Description :

Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.

Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
Version : 6.1.0
Divisions :

Diviser	Exemples
`'invalidated'`	206790
`'other'`	22923
`'test'`	15724
`'train'`	515197
`'validated'`	832929
`'validation'`	15032

Caractéristiques :

{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

sah

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:common_voice/sah')

Description :

Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.

Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
Version : 6.1.0
Divisions :

Diviser	Exemples
`'invalidated'`	66
`'other'`	1275
`'test'`	757
`'train'`	1442
`'validated'`	2606
`'validation'`	405

Caractéristiques :

{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

sl

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:common_voice/sl')

Description :

Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.

Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
Version : 6.1.0
Divisions :

Diviser	Exemples
`'invalidated'`	92
`'other'`	2502
`'test'`	881
`'train'`	2038
`'validated'`	4669
`'validation'`	556

Caractéristiques :

{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

sv-SE

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:common_voice/sv-SE')

Description :

Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.

Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
Version : 6.1.0
Divisions :

Diviser	Exemples
`'invalidated'`	462
`'other'`	3043
`'test'`	2027
`'train'`	2331
`'validated'`	12552
`'validation'`	2019

Caractéristiques :

{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

ta

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:common_voice/ta')

Description :

Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.

Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
Version : 6.1.0
Divisions :

Diviser	Exemples
`'invalidated'`	594
`'other'`	7428
`'test'`	1781
`'train'`	2009
`'validated'`	12652
`'validation'`	1779

Caractéristiques :

{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

ème

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:common_voice/th')

Description :

Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.

Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
Version : 6.1.0
Divisions :

Diviser	Exemples
`'invalidated'`	467
`'other'`	2671
`'test'`	2188
`'train'`	2917
`'validated'`	7028
`'validation'`	1922

Caractéristiques :

{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

tr

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:common_voice/tr')

Description :

Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.

Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
Version : 6.1.0
Divisions :

Diviser	Exemples
`'invalidated'`	1726
`'other'`	325
`'test'`	1647
`'train'`	1831
`'validated'`	18685
`'validation'`	1647

Caractéristiques :

{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

tt

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:common_voice/tt')

Description :

Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.

Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
Version : 6.1.0
Divisions :

Diviser	Exemples
`'invalidated'`	287
`'other'`	1798
`'test'`	4485
`'train'`	11211
`'validated'`	25781
`'validation'`	2127

Caractéristiques :

{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

Royaume-Uni

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:common_voice/uk')

Description :

Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.

Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
Version : 6.1.0
Divisions :

Diviser	Exemples
`'invalidated'`	1255
`'other'`	8161
`'test'`	3235
`'train'`	4035
`'validated'`	22337
`'validation'`	3236

Caractéristiques :

{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

vi

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:common_voice/vi')

Description :

Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.

Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
Version : 6.1.0
Divisions :

Diviser	Exemples
`'invalidated'`	78
`'other'`	870
`'test'`	198
`'train'`	221
`'validated'`	619
`'validation'`	200

Caractéristiques :

{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

voter

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:common_voice/vot')

Description :

Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.

Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
Version : 6.1.0
Divisions :

Diviser	Exemples
`'invalidated'`	6
`'other'`	411
`'test'`	0
`'train'`	3
`'validated'`	3
`'validation'`	0

Caractéristiques :

{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

zh-CN

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:common_voice/zh-CN')

Description :

Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.

Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
Version : 6.1.0
Divisions :

Diviser	Exemples
`'invalidated'`	5305
`'other'`	8948
`'test'`	8760
`'train'`	18541
`'validated'`	36405
`'validation'`	8743

Caractéristiques :

{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

zh-HK

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:common_voice/zh-HK')

Description :

Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.

Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
Version : 6.1.0
Divisions :

Diviser	Exemples
`'invalidated'`	2999
`'other'`	38830
`'test'`	5172
`'train'`	7506
`'validated'`	41835
`'validation'`	5172

Caractéristiques :

{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

zh-TW

Utilisez la commande suivante pour charger cet ensemble de données dans TFDS :

ds = tfds.load('huggingface:common_voice/zh-TW')

Description :

Common Voice is Mozilla's initiative to help teach machines how real people speak.
The dataset currently consists of 7,335 validated hours of speech in 60 languages, but we’re always adding more voices and languages.

Licence : https://github.com/common-voice/common-voice/blob/main/LICENSE
Version : 6.1.0
Divisions :

Diviser	Exemples
`'invalidated'`	3584
`'other'`	22477
`'test'`	2895
`'train'`	3507
`'validated'`	61232
`'validation'`	2895

Caractéristiques :

{
    "client_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "path": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "audio": {
        "sampling_rate": 48000,
        "mono": true,
        "decode": true,
        "id": null,
        "_type": "Audio"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "up_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "down_votes": {
        "dtype": "int64",
        "id": null,
        "_type": "Value"
    },
    "age": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "gender": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "accent": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "locale": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "segment": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}