References:
xor-retrieve
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:xor_tydi_qa/xor-retrieve')
- Description:
XOR-TyDi QA brings together for the first time information-seeking questions,
open-retrieval QA, and multilingual QA to create a multilingual open-retrieval
QA dataset that enables cross-lingual answer retrieval. It consists of questions
written by information-seeking native speakers in 7 typologically diverse languages
and answer annotations that are retrieved from multilingual document collections.
There are three sub-tasks: XOR-Retrieve, XOR-EnglishSpan, and XOR-Full.
XOR-Retrieve is a cross-lingual retrieval task where a question is written in the target
language (e.g., Japanese) and a system is required to retrieve an English document that answers the question.
- License: No known license
- Version: 1.1.0
- Splits:
Split | Examples |
---|---|
'test' | 2499 |
'train' | 15250 |
'validation' | 2110 |
- Features:
{
"question": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"lang": {
"num_classes": 7,
"names": [
"ar",
"bn",
"fi",
"ja",
"ko",
"ru",
"te"
],
"names_file": null,
"id": null,
"_type": "ClassLabel"
},
"answers": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
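The `lang` feature is a `ClassLabel` with seven names, so a loaded example would normally carry an integer index rather than the language code itself. The sketch below shows one way to map between the two in plain Python, assuming the index follows the order of the `names` list above. The helpers `decode_lang` and `encode_lang` and the sample record are hypothetical and are not part of TFDS or the dataset.

```python
# Language codes in the order given by the "names" list of the `lang` feature.
# Assumption: the stored integer label is an index into this list.
LANG_NAMES = ["ar", "bn", "fi", "ja", "ko", "ru", "te"]


def decode_lang(label: int) -> str:
    """Map an integer class index to its language code."""
    if not 0 <= label < len(LANG_NAMES):
        raise ValueError(f"label {label} is outside 0..{len(LANG_NAMES) - 1}")
    return LANG_NAMES[label]


def encode_lang(code: str) -> int:
    """Map a language code back to its integer class index."""
    return LANG_NAMES.index(code)


# Hypothetical record shaped like the feature spec (placeholder strings).
example = {"question": "placeholder question", "lang": 3, "answers": "placeholder answer"}
print(decode_lang(example["lang"]))  # ja
```

Keeping the list in the same order as the feature spec matters: reordering it would silently relabel every example.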
xor-full
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:xor_tydi_qa/xor-full')
- Description:
XOR-TyDi QA brings together for the first time information-seeking questions,
open-retrieval QA, and multilingual QA to create a multilingual open-retrieval
QA dataset that enables cross-lingual answer retrieval. It consists of questions
written by information-seeking native speakers in 7 typologically diverse languages
and answer annotations that are retrieved from multilingual document collections.
There are three sub-tasks: XOR-Retrieve, XOR-EnglishSpan, and XOR-Full.
XOR-Full is a cross-lingual retrieval task where a question is written in the target
language (e.g., Japanese) and a system is required to output a short answer in the target language.
- License: No known license
- Version: 1.1.0
- Splits:
Split | Examples |
---|---|
'test' | 8176 |
'train' | 61360 |
'validation' | 3473 |
- Features:
{
"question": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"lang": {
"num_classes": 7,
"names": [
"ar",
"bn",
"fi",
"ja",
"ko",
"ru",
"te"
],
"names_file": null,
"id": null,
"_type": "ClassLabel"
},
"answers": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
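To compare the two configurations at a glance, the split counts listed in the tables above can be totalled and turned into proportions. This is a small illustrative sketch: the counts are copied from this page, while the `SPLITS` dictionary and the `split_shares` helper are hypothetical names introduced only for the example.

```python
# Split sizes copied from the tables on this page.
SPLITS = {
    "xor-retrieve": {"test": 2499, "train": 15250, "validation": 2110},
    "xor-full": {"test": 8176, "train": 61360, "validation": 3473},
}


def split_shares(counts: dict[str, int]) -> dict[str, float]:
    """Return each split's fraction of the configuration's total examples."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


for config, counts in SPLITS.items():
    total = sum(counts.values())
    shares = split_shares(counts)
    summary = ", ".join(f"{name}: {share:.1%}" for name, share in shares.items())
    print(f"{config}: {total} examples ({summary})")
```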