References:
en_de
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:covost2/en_de')
- Description:
CoVoST 2, a large-scale multilingual speech translation corpus covering translations from 21 languages into English and from English into 15 languages. The dataset is created using Mozilla’s open source Common Voice database of crowdsourced voice recordings.
Note that, to limit the storage required for preparing this dataset, the audio
is stored in .mp3 format and is not converted to a float32 array. To convert the audio
file to a float32 array, use the `.map()` function as follows:

    import torchaudio

    def map_to_array(batch):
        # Decode the audio file; the returned sample rate is discarded.
        speech_array, _ = torchaudio.load(batch["file"])
        batch["speech"] = speech_array.numpy()
        return batch

    dataset = dataset.map(map_to_array, remove_columns=["file"])
- License: No known license.
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 15531 |
'train' | 289430 |
'validation' | 15531 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"file": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
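The feature specification above lists five fields, all of dtype string. As a minimal sketch, assuming each example has already been decoded into a plain Python dict of `str` values, a record could be checked against that schema as follows. `CovostExample` and `check_example` are hypothetical names introduced for illustration; they are not part of TFDS or the Hugging Face loader.

```python
from typing import TypedDict

# Field names taken from the feature specification above; every field is a string.
FEATURE_NAMES = ("client_id", "file", "sentence", "translation", "id")


class CovostExample(TypedDict):
    """Hypothetical typed view of one example (illustration only)."""
    client_id: str
    file: str
    sentence: str
    translation: str
    id: str


def check_example(record: dict) -> list:
    """Return a list of problems found in `record` (empty if it matches the schema)."""
    problems = []
    for name in FEATURE_NAMES:
        if name not in record:
            problems.append(f"missing field: {name}")
        elif not isinstance(record[name], str):
            problems.append(f"field {name} is not a string")
    return problems
```

A check like this may be useful before mapping the `file` column to audio arrays, since a missing or non-string path would otherwise only fail inside the decoding step.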
en_tr
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:covost2/en_tr')
- Description:
CoVoST 2, a large-scale multilingual speech translation corpus covering translations from 21 languages into English and from English into 15 languages. The dataset is created using Mozilla’s open source Common Voice database of crowdsourced voice recordings.
Note that, to limit the storage required for preparing this dataset, the audio
is stored in .mp3 format and is not converted to a float32 array. To convert the audio
file to a float32 array, use the `.map()` function as follows:

    import torchaudio

    def map_to_array(batch):
        # Decode the audio file; the returned sample rate is discarded.
        speech_array, _ = torchaudio.load(batch["file"])
        batch["speech"] = speech_array.numpy()
        return batch

    dataset = dataset.map(map_to_array, remove_columns=["file"])
- License: No known license.
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 15531 |
'train' | 289430 |
'validation' | 15531 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"file": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
en_fa
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:covost2/en_fa')
- Description:
CoVoST 2, a large-scale multilingual speech translation corpus covering translations from 21 languages into English and from English into 15 languages. The dataset is created using Mozilla’s open source Common Voice database of crowdsourced voice recordings.
Note that, to limit the storage required for preparing this dataset, the audio
is stored in .mp3 format and is not converted to a float32 array. To convert the audio
file to a float32 array, use the `.map()` function as follows:

    import torchaudio

    def map_to_array(batch):
        # Decode the audio file; the returned sample rate is discarded.
        speech_array, _ = torchaudio.load(batch["file"])
        batch["speech"] = speech_array.numpy()
        return batch

    dataset = dataset.map(map_to_array, remove_columns=["file"])
- License: No known license.
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 15531 |
'train' | 289430 |
'validation' | 15531 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"file": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
en_sv-SE
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:covost2/en_sv-SE')
- Description:
CoVoST 2, a large-scale multilingual speech translation corpus covering translations from 21 languages into English and from English into 15 languages. The dataset is created using Mozilla’s open source Common Voice database of crowdsourced voice recordings.
Note that, to limit the storage required for preparing this dataset, the audio
is stored in .mp3 format and is not converted to a float32 array. To convert the audio
file to a float32 array, use the `.map()` function as follows:

    import torchaudio

    def map_to_array(batch):
        # Decode the audio file; the returned sample rate is discarded.
        speech_array, _ = torchaudio.load(batch["file"])
        batch["speech"] = speech_array.numpy()
        return batch

    dataset = dataset.map(map_to_array, remove_columns=["file"])
- License: No known license.
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 15531 |
'train' | 289430 |
'validation' | 15531 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"file": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
en_mn
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:covost2/en_mn')
- Description:
CoVoST 2, a large-scale multilingual speech translation corpus covering translations from 21 languages into English and from English into 15 languages. The dataset is created using Mozilla’s open source Common Voice database of crowdsourced voice recordings.
Note that, to limit the storage required for preparing this dataset, the audio
is stored in .mp3 format and is not converted to a float32 array. To convert the audio
file to a float32 array, use the `.map()` function as follows:

    import torchaudio

    def map_to_array(batch):
        # Decode the audio file; the returned sample rate is discarded.
        speech_array, _ = torchaudio.load(batch["file"])
        batch["speech"] = speech_array.numpy()
        return batch

    dataset = dataset.map(map_to_array, remove_columns=["file"])
- License: No known license.
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 15531 |
'train' | 289430 |
'validation' | 15531 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"file": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
en_zh-CN
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:covost2/en_zh-CN')
- Description:
CoVoST 2, a large-scale multilingual speech translation corpus covering translations from 21 languages into English and from English into 15 languages. The dataset is created using Mozilla’s open source Common Voice database of crowdsourced voice recordings.
Note that, to limit the storage required for preparing this dataset, the audio
is stored in .mp3 format and is not converted to a float32 array. To convert the audio
file to a float32 array, use the `.map()` function as follows:

    import torchaudio

    def map_to_array(batch):
        # Decode the audio file; the returned sample rate is discarded.
        speech_array, _ = torchaudio.load(batch["file"])
        batch["speech"] = speech_array.numpy()
        return batch

    dataset = dataset.map(map_to_array, remove_columns=["file"])
- License: No known license.
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 15531 |
'train' | 289430 |
'validation' | 15531 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"file": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
en_cy
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:covost2/en_cy')
- Description:
CoVoST 2, a large-scale multilingual speech translation corpus covering translations from 21 languages into English and from English into 15 languages. The dataset is created using Mozilla’s open source Common Voice database of crowdsourced voice recordings.
Note that, to limit the storage required for preparing this dataset, the audio
is stored in .mp3 format and is not converted to a float32 array. To convert the audio
file to a float32 array, use the `.map()` function as follows:

    import torchaudio

    def map_to_array(batch):
        # Decode the audio file; the returned sample rate is discarded.
        speech_array, _ = torchaudio.load(batch["file"])
        batch["speech"] = speech_array.numpy()
        return batch

    dataset = dataset.map(map_to_array, remove_columns=["file"])
- License: No known license.
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 15531 |
'train' | 289430 |
'validation' | 15531 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"file": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
en_ca
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:covost2/en_ca')
- Description:
CoVoST 2, a large-scale multilingual speech translation corpus covering translations from 21 languages into English and from English into 15 languages. The dataset is created using Mozilla’s open source Common Voice database of crowdsourced voice recordings.
Note that, to limit the storage required for preparing this dataset, the audio
is stored in .mp3 format and is not converted to a float32 array. To convert the audio
file to a float32 array, use the `.map()` function as follows:

    import torchaudio

    def map_to_array(batch):
        # Decode the audio file; the returned sample rate is discarded.
        speech_array, _ = torchaudio.load(batch["file"])
        batch["speech"] = speech_array.numpy()
        return batch

    dataset = dataset.map(map_to_array, remove_columns=["file"])
- License: No known license.
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 15531 |
'train' | 289430 |
'validation' | 15531 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"file": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
en_sl
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:covost2/en_sl')
- Description:
CoVoST 2, a large-scale multilingual speech translation corpus covering translations from 21 languages into English and from English into 15 languages. The dataset is created using Mozilla’s open source Common Voice database of crowdsourced voice recordings.
Note that, to limit the storage required for preparing this dataset, the audio
is stored in .mp3 format and is not converted to a float32 array. To convert the audio
file to a float32 array, use the `.map()` function as follows:

    import torchaudio

    def map_to_array(batch):
        # Decode the audio file; the returned sample rate is discarded.
        speech_array, _ = torchaudio.load(batch["file"])
        batch["speech"] = speech_array.numpy()
        return batch

    dataset = dataset.map(map_to_array, remove_columns=["file"])
- License: No known license.
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 15531 |
'train' | 289430 |
'validation' | 15531 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"file": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
en_et
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:covost2/en_et')
- Description:
CoVoST 2, a large-scale multilingual speech translation corpus covering translations from 21 languages into English and from English into 15 languages. The dataset is created using Mozilla’s open source Common Voice database of crowdsourced voice recordings.
Note that, to limit the storage required for preparing this dataset, the audio
is stored in .mp3 format and is not converted to a float32 array. To convert the audio
file to a float32 array, use the `.map()` function as follows:

    import torchaudio

    def map_to_array(batch):
        # Decode the audio file; the returned sample rate is discarded.
        speech_array, _ = torchaudio.load(batch["file"])
        batch["speech"] = speech_array.numpy()
        return batch

    dataset = dataset.map(map_to_array, remove_columns=["file"])
- License: No known license.
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 15531 |
'train' | 289430 |
'validation' | 15531 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"file": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
en_id
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:covost2/en_id')
- Description:
CoVoST 2, a large-scale multilingual speech translation corpus covering translations from 21 languages into English and from English into 15 languages. The dataset is created using Mozilla’s open source Common Voice database of crowdsourced voice recordings.
Note that, to limit the storage required for preparing this dataset, the audio
is stored in .mp3 format and is not converted to a float32 array. To convert the audio
file to a float32 array, use the `.map()` function as follows:

    import torchaudio

    def map_to_array(batch):
        # Decode the audio file; the returned sample rate is discarded.
        speech_array, _ = torchaudio.load(batch["file"])
        batch["speech"] = speech_array.numpy()
        return batch

    dataset = dataset.map(map_to_array, remove_columns=["file"])
- License: No known license.
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 15531 |
'train' | 289430 |
'validation' | 15531 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"file": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
en_ar
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:covost2/en_ar')
- Description:
CoVoST 2, a large-scale multilingual speech translation corpus covering translations from 21 languages into English and from English into 15 languages. The dataset is created using Mozilla’s open source Common Voice database of crowdsourced voice recordings.
Note that, to limit the storage required for preparing this dataset, the audio
is stored in .mp3 format and is not converted to a float32 array. To convert the audio
file to a float32 array, use the `.map()` function as follows:

    import torchaudio

    def map_to_array(batch):
        # Decode the audio file; the returned sample rate is discarded.
        speech_array, _ = torchaudio.load(batch["file"])
        batch["speech"] = speech_array.numpy()
        return batch

    dataset = dataset.map(map_to_array, remove_columns=["file"])
- License: No known license.
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 15531 |
'train' | 289430 |
'validation' | 15531 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"file": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
en_ta
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:covost2/en_ta')
- Description:
CoVoST 2, a large-scale multilingual speech translation corpus covering translations from 21 languages into English and from English into 15 languages. The dataset is created using Mozilla’s open source Common Voice database of crowdsourced voice recordings.
Note that, to limit the storage required for preparing this dataset, the audio
is stored in .mp3 format and is not converted to a float32 array. To convert the audio
file to a float32 array, use the `.map()` function as follows:

    import torchaudio

    def map_to_array(batch):
        # Decode the audio file; the returned sample rate is discarded.
        speech_array, _ = torchaudio.load(batch["file"])
        batch["speech"] = speech_array.numpy()
        return batch

    dataset = dataset.map(map_to_array, remove_columns=["file"])
- License: No known license.
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 15531 |
'train' | 289430 |
'validation' | 15531 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"file": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
en_lv
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:covost2/en_lv')
- Description:
CoVoST 2, a large-scale multilingual speech translation corpus covering translations from 21 languages into English and from English into 15 languages. The dataset is created using Mozilla’s open source Common Voice database of crowdsourced voice recordings.
Note that, to limit the storage required for preparing this dataset, the audio
is stored in .mp3 format and is not converted to a float32 array. To convert the audio
file to a float32 array, use the `.map()` function as follows:

    import torchaudio

    def map_to_array(batch):
        # Decode the audio file; the returned sample rate is discarded.
        speech_array, _ = torchaudio.load(batch["file"])
        batch["speech"] = speech_array.numpy()
        return batch

    dataset = dataset.map(map_to_array, remove_columns=["file"])
- License: No known license.
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 15531 |
'train' | 289430 |
'validation' | 15531 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"file": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
en_ja
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:covost2/en_ja')
- Description:
CoVoST 2, a large-scale multilingual speech translation corpus covering translations from 21 languages into English and from English into 15 languages. The dataset is created using Mozilla’s open source Common Voice database of crowdsourced voice recordings.
Note that, to limit the storage required for preparing this dataset, the audio
is stored in .mp3 format and is not converted to a float32 array. To convert the audio
file to a float32 array, use the `.map()` function as follows:

    import torchaudio

    def map_to_array(batch):
        # Decode the audio file; the returned sample rate is discarded.
        speech_array, _ = torchaudio.load(batch["file"])
        batch["speech"] = speech_array.numpy()
        return batch

    dataset = dataset.map(map_to_array, remove_columns=["file"])
- License: No known license.
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 15531 |
'train' | 289430 |
'validation' | 15531 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"file": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
fr_en
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:covost2/fr_en')
- Description:
CoVoST 2, a large-scale multilingual speech translation corpus covering translations from 21 languages into English and from English into 15 languages. The dataset is created using Mozilla’s open source Common Voice database of crowdsourced voice recordings.
Note that, to limit the storage required for preparing this dataset, the audio
is stored in .mp3 format and is not converted to a float32 array. To convert the audio
file to a float32 array, use the `.map()` function as follows:

    import torchaudio

    def map_to_array(batch):
        # Decode the audio file; the returned sample rate is discarded.
        speech_array, _ = torchaudio.load(batch["file"])
        batch["speech"] = speech_array.numpy()
        return batch

    dataset = dataset.map(map_to_array, remove_columns=["file"])
- License: No known license.
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 14760 |
'train' | 207374 |
'validation' | 14760 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"file": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
de_en
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:covost2/de_en')
- Description:
CoVoST 2, a large-scale multilingual speech translation corpus covering translations from 21 languages into English and from English into 15 languages. The dataset is created using Mozilla’s open source Common Voice database of crowdsourced voice recordings.
Note that, to limit the storage required for preparing this dataset, the audio
is stored in .mp3 format and is not converted to a float32 array. To convert the audio
file to a float32 array, use the `.map()` function as follows:

    import torchaudio

    def map_to_array(batch):
        # Decode the audio file; the returned sample rate is discarded.
        speech_array, _ = torchaudio.load(batch["file"])
        batch["speech"] = speech_array.numpy()
        return batch

    dataset = dataset.map(map_to_array, remove_columns=["file"])
- License: No known license.
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 13511 |
'train' | 127834 |
'validation' | 13511 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"file": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
es_en
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:covost2/es_en')
- Description:
CoVoST 2, a large-scale multilingual speech translation corpus covering translations from 21 languages into English and from English into 15 languages. The dataset is created using Mozilla’s open source Common Voice database of crowdsourced voice recordings.
Note that, to limit the storage required for preparing this dataset, the audio
is stored in .mp3 format and is not converted to a float32 array. To convert the audio
file to a float32 array, use the `.map()` function as follows:

    import torchaudio

    def map_to_array(batch):
        # Decode the audio file; the returned sample rate is discarded.
        speech_array, _ = torchaudio.load(batch["file"])
        batch["speech"] = speech_array.numpy()
        return batch

    dataset = dataset.map(map_to_array, remove_columns=["file"])
- License: No known license.
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 13221 |
'train' | 79015 |
'validation' | 13221 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"file": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
ca_en
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:covost2/ca_en')
- Description:
CoVoST 2, a large-scale multilingual speech translation corpus covering translations from 21 languages into English and from English into 15 languages. The dataset is created using Mozilla’s open source Common Voice database of crowdsourced voice recordings.
Note that, to limit the storage required for preparing this dataset, the audio
is stored in .mp3 format and is not converted to a float32 array. To convert the audio
file to a float32 array, use the `.map()` function as follows:

    import torchaudio

    def map_to_array(batch):
        # Decode the audio file; the returned sample rate is discarded.
        speech_array, _ = torchaudio.load(batch["file"])
        batch["speech"] = speech_array.numpy()
        return batch

    dataset = dataset.map(map_to_array, remove_columns=["file"])
- License: No known license.
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 12730 |
'train' | 95854 |
'validation' | 12730 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"file": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
it_en
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:covost2/it_en')
- Description:
CoVoST 2, a large-scale multilingual speech translation corpus covering translations from 21 languages into English and from English into 15 languages. The dataset is created using Mozilla’s open source Common Voice database of crowdsourced voice recordings.
Note that, to limit the storage required for preparing this dataset, the audio
is stored in .mp3 format and is not converted to a float32 array. To convert the audio
file to a float32 array, use the `.map()` function as follows:

    import torchaudio

    def map_to_array(batch):
        # Decode the audio file; the returned sample rate is discarded.
        speech_array, _ = torchaudio.load(batch["file"])
        batch["speech"] = speech_array.numpy()
        return batch

    dataset = dataset.map(map_to_array, remove_columns=["file"])
- License: No known license.
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 8951 |
'train' | 31698 |
'validation' | 8940 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"file": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
ru_en
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:covost2/ru_en')
- Description:
CoVoST 2, a large-scale multilingual speech translation corpus covering translations from 21 languages into English and from English into 15 languages. The dataset is created using Mozilla’s open source Common Voice database of crowdsourced voice recordings.
Note that, to limit the storage required for preparing this dataset, the audio
is stored in .mp3 format and is not converted to a float32 array. To convert the audio
file to a float32 array, use the `.map()` function as follows:

    import torchaudio

    def map_to_array(batch):
        # Decode the audio file; the returned sample rate is discarded.
        speech_array, _ = torchaudio.load(batch["file"])
        batch["speech"] = speech_array.numpy()
        return batch

    dataset = dataset.map(map_to_array, remove_columns=["file"])
- License: No known license.
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 6300 |
'train' | 12112 |
'validation' | 6110 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"file": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
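The split sizes vary considerably between configurations, and for the smaller ones the test and validation splits are not small relative to the training split. As a rough sketch, the share of each split can be computed from the counts in the ru_en table above. `split_fractions` is a hypothetical helper written for this illustration, not a TFDS function.

```python
# Split sizes copied from the ru_en table above.
RU_EN_SPLITS = {"test": 6300, "train": 12112, "validation": 6110}


def split_fractions(sizes: dict) -> dict:
    """Return each split's share of the total number of examples."""
    total = sum(sizes.values())
    return {name: count / total for name, count in sizes.items()}


fractions = split_fractions(RU_EN_SPLITS)
for name, share in fractions.items():
    # Print each share as a percentage with one decimal place.
    print(f"{name}: {share:.1%}")
```

The same helper can be applied to the counts of any other configuration listed on this page.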
zh-CN_en
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:covost2/zh-CN_en')
- Description:
CoVoST 2, a large-scale multilingual speech translation corpus covering translations from 21 languages into English and from English into 15 languages. The dataset is created using Mozilla’s open source Common Voice database of crowdsourced voice recordings.
Note that, to limit the storage required for preparing this dataset, the audio
is stored in .mp3 format and is not converted to a float32 array. To convert the audio
file to a float32 array, use the `.map()` function as follows:

    import torchaudio

    def map_to_array(batch):
        # Decode the audio file; the returned sample rate is discarded.
        speech_array, _ = torchaudio.load(batch["file"])
        batch["speech"] = speech_array.numpy()
        return batch

    dataset = dataset.map(map_to_array, remove_columns=["file"])
- License: No known license.
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 4898 |
'train' | 7085 |
'validation' | 4843 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"file": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
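The configuration names on this page appear to follow a `<source>_<target>` pattern, where a language code may itself contain a hyphen (for example `zh-CN_en` or `en_sv-SE`). Assuming that pattern holds for every configuration, a small sketch for splitting a name and building the dataset string used in the load commands might look like this. `parse_config` and `tfds_name` are hypothetical helpers, not part of TFDS.

```python
def parse_config(name: str) -> tuple:
    """Split a config name such as 'zh-CN_en' into (source, target) language codes."""
    # Assumption: exactly one underscore separates the two language codes.
    source, sep, target = name.partition("_")
    if not sep or not source or not target:
        raise ValueError(f"unexpected config name: {name!r}")
    return source, target


def tfds_name(name: str) -> str:
    """Build the dataset name string in the form used by the load commands above."""
    parse_config(name)  # validate the name before building the string
    return f"huggingface:covost2/{name}"
```

The resulting string can then be passed to `tfds.load`, as in the commands shown for each configuration.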
pt_en
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:covost2/pt_en')
- Description:
CoVoST 2, a large-scale multilingual speech translation corpus covering translations from 21 languages into English and from English into 15 languages. The dataset is created using Mozilla’s open source Common Voice database of crowdsourced voice recordings.
Note that, to limit the storage required for preparing this dataset, the audio
is stored in .mp3 format and is not converted to a float32 array. To convert the audio
file to a float32 array, use the `.map()` function as follows:

    import torchaudio

    def map_to_array(batch):
        # Decode the audio file; the returned sample rate is discarded.
        speech_array, _ = torchaudio.load(batch["file"])
        batch["speech"] = speech_array.numpy()
        return batch

    dataset = dataset.map(map_to_array, remove_columns=["file"])
- License: No known license.
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 4023 |
'train' | 9158 |
'validation' | 3318 |
- Features:
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"file": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
fa_en
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:covost2/fa_en')
- Description:
CoVoST 2, a large-scale multilingual speech translation corpus covering translations from 21 languages into English and from English into 15 languages. The dataset is created using Mozilla’s open source Common Voice database of crowdsourced voice recordings.
Note that, to limit the storage required for preparing this dataset, the audio
is stored in .mp3 format and is not converted to a float32 array. To convert the audio
file to a float32 array, use the `.map()` function as follows:

    import torchaudio

    def map_to_array(batch):
        # Decode the audio file; the returned sample rate is discarded.
        speech_array, _ = torchaudio.load(batch["file"])
        batch["speech"] = speech_array.numpy()
        return batch

    dataset = dataset.map(map_to_array, remove_columns=["file"])
- License : No known license.
- Version : 1.0.0
- Splits :
Split | Examples |
---|---|
'test' | 3445 |
'train' | 53949 |
'validation' | 3445 |
- Features :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"file": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
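To put the split sizes in proportion, a few lines of arithmetic suffice. This sketch copies the `fa_en` counts from the table above and computes each split's share of the total; the variable and function names are illustrative only.

```python
# Split sizes copied from the fa_en table above.
fa_en_splits = {"test": 3445, "train": 53949, "validation": 3445}

def split_shares(splits):
    """Map each split name to its fraction of all examples."""
    total = sum(splits.values())
    return {name: count / total for name, count in splits.items()}

shares = split_shares(fa_en_splits)
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

The same function works for any of the other configurations once its counts are substituted.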
et_en
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:covost2/et_en')
- Description :
CoVoST 2 is a large-scale multilingual speech translation corpus covering translations from 21 languages into English and from English into 15 languages. The dataset is created using Mozilla’s open-source Common Voice database of crowdsourced voice recordings.
Note that, to limit the storage required for preparing this dataset, the audio
is stored in .mp3 format and is not converted to a float32 array. To convert the audio
file to a float32 array, use the `.map()` function as follows:
python
import torchaudio
def map_to_array(batch):
    # Decode the audio file and store the waveform as a NumPy array.
    speech_array, _ = torchaudio.load(batch["file"])
    batch["speech"] = speech_array.numpy()
    return batch
dataset = dataset.map(map_to_array, remove_columns=["file"])
- License : No known license.
- Version : 1.0.0
- Splits :
Split | Examples |
---|---|
'test' | 1571 |
'train' | 1782 |
'validation' | 1576 |
- Features :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"file": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
mn_en
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:covost2/mn_en')
- Description :
CoVoST 2 is a large-scale multilingual speech translation corpus covering translations from 21 languages into English and from English into 15 languages. The dataset is created using Mozilla’s open-source Common Voice database of crowdsourced voice recordings.
Note that, to limit the storage required for preparing this dataset, the audio
is stored in .mp3 format and is not converted to a float32 array. To convert the audio
file to a float32 array, use the `.map()` function as follows:
python
import torchaudio
def map_to_array(batch):
    # Decode the audio file and store the waveform as a NumPy array.
    speech_array, _ = torchaudio.load(batch["file"])
    batch["speech"] = speech_array.numpy()
    return batch
dataset = dataset.map(map_to_array, remove_columns=["file"])
- License : No known license.
- Version : 1.0.0
- Splits :
Split | Examples |
---|---|
'test' | 1759 |
'train' | 2067 |
'validation' | 1761 |
- Features :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"file": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
nl_en
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:covost2/nl_en')
- Description :
CoVoST 2 is a large-scale multilingual speech translation corpus covering translations from 21 languages into English and from English into 15 languages. The dataset is created using Mozilla’s open-source Common Voice database of crowdsourced voice recordings.
Note that, to limit the storage required for preparing this dataset, the audio
is stored in .mp3 format and is not converted to a float32 array. To convert the audio
file to a float32 array, use the `.map()` function as follows:
python
import torchaudio
def map_to_array(batch):
    # Decode the audio file and store the waveform as a NumPy array.
    speech_array, _ = torchaudio.load(batch["file"])
    batch["speech"] = speech_array.numpy()
    return batch
dataset = dataset.map(map_to_array, remove_columns=["file"])
- License : No known license.
- Version : 1.0.0
- Splits :
Split | Examples |
---|---|
'test' | 1699 |
'train' | 7108 |
'validation' | 1699 |
- Features :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"file": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
tr_en
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:covost2/tr_en')
- Description :
CoVoST 2 is a large-scale multilingual speech translation corpus covering translations from 21 languages into English and from English into 15 languages. The dataset is created using Mozilla’s open-source Common Voice database of crowdsourced voice recordings.
Note that, to limit the storage required for preparing this dataset, the audio
is stored in .mp3 format and is not converted to a float32 array. To convert the audio
file to a float32 array, use the `.map()` function as follows:
python
import torchaudio
def map_to_array(batch):
    # Decode the audio file and store the waveform as a NumPy array.
    speech_array, _ = torchaudio.load(batch["file"])
    batch["speech"] = speech_array.numpy()
    return batch
dataset = dataset.map(map_to_array, remove_columns=["file"])
- License : No known license.
- Version : 1.0.0
- Splits :
Split | Examples |
---|---|
'test' | 1629 |
'train' | 3966 |
'validation' | 1624 |
- Features :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"file": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
ar_en
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:covost2/ar_en')
- Description :
CoVoST 2 is a large-scale multilingual speech translation corpus covering translations from 21 languages into English and from English into 15 languages. The dataset is created using Mozilla’s open-source Common Voice database of crowdsourced voice recordings.
Note that, to limit the storage required for preparing this dataset, the audio
is stored in .mp3 format and is not converted to a float32 array. To convert the audio
file to a float32 array, use the `.map()` function as follows:
python
import torchaudio
def map_to_array(batch):
    # Decode the audio file and store the waveform as a NumPy array.
    speech_array, _ = torchaudio.load(batch["file"])
    batch["speech"] = speech_array.numpy()
    return batch
dataset = dataset.map(map_to_array, remove_columns=["file"])
- License : No known license.
- Version : 1.0.0
- Splits :
Split | Examples |
---|---|
'test' | 1695 |
'train' | 2283 |
'validation' | 1758 |
- Features :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"file": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
sv-SE_en
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:covost2/sv-SE_en')
- Description :
CoVoST 2 is a large-scale multilingual speech translation corpus covering translations from 21 languages into English and from English into 15 languages. The dataset is created using Mozilla’s open-source Common Voice database of crowdsourced voice recordings.
Note that, to limit the storage required for preparing this dataset, the audio
is stored in .mp3 format and is not converted to a float32 array. To convert the audio
file to a float32 array, use the `.map()` function as follows:
python
import torchaudio
def map_to_array(batch):
    # Decode the audio file and store the waveform as a NumPy array.
    speech_array, _ = torchaudio.load(batch["file"])
    batch["speech"] = speech_array.numpy()
    return batch
dataset = dataset.map(map_to_array, remove_columns=["file"])
- License : No known license.
- Version : 1.0.0
- Splits :
Split | Examples |
---|---|
'test' | 1595 |
'train' | 2160 |
'validation' | 1349 |
- Features :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"file": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
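Each configuration name on this page follows a `<source>_<target>` pattern, and the TFDS name is the prefix `huggingface:covost2/` followed by that configuration. The sketch below builds those names and separates the language codes. `tfds_name` and `language_pair` are hypothetical helpers, not TFDS functions, and splitting on the last underscore is an assumption chosen so that codes such as `sv-SE` stay intact.

```python
PREFIX = "huggingface:covost2/"

def tfds_name(config):
    """Build the string passed to tfds.load for a given configuration."""
    return PREFIX + config

def language_pair(config):
    """Split a configuration such as 'sv-SE_en' into (source, target)."""
    source, target = config.rsplit("_", 1)
    return source, target

for config in ["fa_en", "sv-SE_en", "lv_en"]:
    print(tfds_name(config), language_pair(config))
```

The resulting string can be passed to `tfds.load` in the same way as the commands listed for each configuration.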
lv_en
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:covost2/lv_en')
- Description :
CoVoST 2 is a large-scale multilingual speech translation corpus covering translations from 21 languages into English and from English into 15 languages. The dataset is created using Mozilla’s open-source Common Voice database of crowdsourced voice recordings.
Note that, to limit the storage required for preparing this dataset, the audio
is stored in .mp3 format and is not converted to a float32 array. To convert the audio
file to a float32 array, use the `.map()` function as follows:
python
import torchaudio
def map_to_array(batch):
    # Decode the audio file and store the waveform as a NumPy array.
    speech_array, _ = torchaudio.load(batch["file"])
    batch["speech"] = speech_array.numpy()
    return batch
dataset = dataset.map(map_to_array, remove_columns=["file"])
- License : No known license.
- Version : 1.0.0
- Splits :
Split | Examples |
---|---|
'test' | 1629 |
'train' | 2337 |
'validation' | 1125 |
- Features :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"file": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
sl_en
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:covost2/sl_en')
- Description :
CoVoST 2 is a large-scale multilingual speech translation corpus covering translations from 21 languages into English and from English into 15 languages. The dataset is created using Mozilla’s open-source Common Voice database of crowdsourced voice recordings.
Note that, to limit the storage required for preparing this dataset, the audio
is stored in .mp3 format and is not converted to a float32 array. To convert the audio
file to a float32 array, use the `.map()` function as follows:
python
import torchaudio
def map_to_array(batch):
    # Decode the audio file and store the waveform as a NumPy array.
    speech_array, _ = torchaudio.load(batch["file"])
    batch["speech"] = speech_array.numpy()
    return batch
dataset = dataset.map(map_to_array, remove_columns=["file"])
- License : No known license.
- Version : 1.0.0
- Splits :
Split | Examples |
---|---|
'test' | 360 |
'train' | 1843 |
'validation' | 509 |
- Features :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"file": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
ta_en
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:covost2/ta_en')
- Description :
CoVoST 2 is a large-scale multilingual speech translation corpus covering translations from 21 languages into English and from English into 15 languages. The dataset is created using Mozilla’s open-source Common Voice database of crowdsourced voice recordings.
Note that, to limit the storage required for preparing this dataset, the audio
is stored in .mp3 format and is not converted to a float32 array. To convert the audio
file to a float32 array, use the `.map()` function as follows:
python
import torchaudio
def map_to_array(batch):
    # Decode the audio file and store the waveform as a NumPy array.
    speech_array, _ = torchaudio.load(batch["file"])
    batch["speech"] = speech_array.numpy()
    return batch
dataset = dataset.map(map_to_array, remove_columns=["file"])
- License : No known license.
- Version : 1.0.0
- Splits :
Split | Examples |
---|---|
'test' | 786 |
'train' | 1358 |
'validation' | 384 |
- Features :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"file": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
ja_en
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:covost2/ja_en')
- Description :
CoVoST 2 is a large-scale multilingual speech translation corpus covering translations from 21 languages into English and from English into 15 languages. The dataset is created using Mozilla’s open-source Common Voice database of crowdsourced voice recordings.
Note that, to limit the storage required for preparing this dataset, the audio
is stored in .mp3 format and is not converted to a float32 array. To convert the audio
file to a float32 array, use the `.map()` function as follows:
python
import torchaudio
def map_to_array(batch):
    # Decode the audio file and store the waveform as a NumPy array.
    speech_array, _ = torchaudio.load(batch["file"])
    batch["speech"] = speech_array.numpy()
    return batch
dataset = dataset.map(map_to_array, remove_columns=["file"])
- License : No known license.
- Version : 1.0.0
- Splits :
Split | Examples |
---|---|
'test' | 684 |
'train' | 1119 |
'validation' | 635 |
- Features :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"file": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
id_en
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:covost2/id_en')
- Description :
CoVoST 2 is a large-scale multilingual speech translation corpus covering translations from 21 languages into English and from English into 15 languages. The dataset is created using Mozilla’s open-source Common Voice database of crowdsourced voice recordings.
Note that, to limit the storage required for preparing this dataset, the audio
is stored in .mp3 format and is not converted to a float32 array. To convert the audio
file to a float32 array, use the `.map()` function as follows:
python
import torchaudio
def map_to_array(batch):
    # Decode the audio file and store the waveform as a NumPy array.
    speech_array, _ = torchaudio.load(batch["file"])
    batch["speech"] = speech_array.numpy()
    return batch
dataset = dataset.map(map_to_array, remove_columns=["file"])
- License : No known license.
- Version : 1.0.0
- Splits :
Split | Examples |
---|---|
'test' | 844 |
'train' | 1243 |
'validation' | 792 |
- Features :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"file": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
cy_en
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:covost2/cy_en')
- Description :
CoVoST 2 is a large-scale multilingual speech translation corpus covering translations from 21 languages into English and from English into 15 languages. The dataset is created using Mozilla’s open-source Common Voice database of crowdsourced voice recordings.
Note that, to limit the storage required for preparing this dataset, the audio
is stored in .mp3 format and is not converted to a float32 array. To convert the audio
file to a float32 array, use the `.map()` function as follows:
python
import torchaudio
def map_to_array(batch):
    # Decode the audio file and store the waveform as a NumPy array.
    speech_array, _ = torchaudio.load(batch["file"])
    batch["speech"] = speech_array.numpy()
    return batch
dataset = dataset.map(map_to_array, remove_columns=["file"])
- License : No known license.
- Version : 1.0.0
- Splits :
Split | Examples |
---|---|
'test' | 690 |
'train' | 1241 |
'validation' | 690 |
- Features :
{
"client_id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"file": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"translation": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}