References:
annotated
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:simple_questions_v2/annotated')
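A minimal sketch of wrapping the command above so that the config and split become parameters. The helper names `dataset_name` and `load_split` are illustrative and not part of TFDS; `split` is a standard argument of `tfds.load`. It is assumed here that the `huggingface:` namespace needs the Hugging Face `datasets` package installed alongside TFDS.

```python
# Config names taken from the three sections of this page.
CONFIGS = ("annotated", "freebase2m", "freebase5m")


def dataset_name(config: str) -> str:
    """Return the TFDS name for one config of simple_questions_v2."""
    if config not in CONFIGS:
        raise ValueError(f"unknown config: {config!r}")
    return f"huggingface:simple_questions_v2/{config}"


def load_split(config: str = "annotated", split: str = "train"):
    """Load one split as a tf.data.Dataset (data is downloaded on first use)."""
    # Imported lazily so dataset_name() works without TensorFlow installed.
    import tensorflow_datasets as tfds

    return tfds.load(dataset_name(config), split=split)
```

Calling `load_split("annotated", "validation")` would request only the validation split rather than a dictionary of all splits.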
- Description:
SimpleQuestions is a dataset for simple QA, which consists
of a total of 108,442 questions written in natural language by human
English-speaking annotators each paired with a corresponding fact,
formatted as (subject, relationship, object), that provides the answer
but also a complete explanation. Facts have been extracted from the
Knowledge Base Freebase (freebase.com). We randomly shuffle these
questions and use 70% of them (75910) as training set, 10% as
validation set (10845), and the remaining 20% as test set.
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 21687 |
'train' | 75910 |
'validation' | 10845 |
- Features:
{
    "id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "subject_entity": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "relationship": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "object_entity": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "question": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}
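The feature spec above describes five string fields per example. As a rough sketch, one row can be modeled as a small record that exposes the (subject, relationship, object) fact from the description. The class `AnnotatedExample` and the helper `_to_str` are hypothetical names, and the sample values are placeholders rather than real Freebase data. The sketch assumes rows arrive as dictionaries, with string features as `bytes` when read through `tfds.as_numpy`.

```python
from dataclasses import dataclass, fields
from typing import Mapping, Tuple, Union

Text = Union[str, bytes]


def _to_str(value: Text) -> str:
    # String features come back as bytes from tfds.as_numpy; accept str too.
    return value.decode("utf-8") if isinstance(value, bytes) else value


@dataclass(frozen=True)
class AnnotatedExample:
    """One row of the 'annotated' config, with every field as a Python str."""

    id: str
    subject_entity: str
    relationship: str
    object_entity: str
    question: str

    @classmethod
    def from_row(cls, row: Mapping[str, Text]) -> "AnnotatedExample":
        # Pick exactly the declared fields and decode each one.
        return cls(**{f.name: _to_str(row[f.name]) for f in fields(cls)})

    def fact(self) -> Tuple[str, str, str]:
        """Return the supporting fact as (subject, relationship, object)."""
        return (self.subject_entity, self.relationship, self.object_entity)


# Placeholder row, not taken from the dataset.
example = AnnotatedExample.from_row({
    "id": b"0",
    "subject_entity": b"subject_placeholder",
    "relationship": b"relationship_placeholder",
    "object_entity": b"object_placeholder",
    "question": b"placeholder question?",
})
print(example.question, example.fact())
```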
freebase2m
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:simple_questions_v2/freebase2m')
- Description:
SimpleQuestions is a dataset for simple QA, which consists
of a total of 108,442 questions written in natural language by human
English-speaking annotators each paired with a corresponding fact,
formatted as (subject, relationship, object), that provides the answer
but also a complete explanation. Facts have been extracted from the
Knowledge Base Freebase (freebase.com). We randomly shuffle these
questions and use 70% of them (75910) as training set, 10% as
validation set (10845), and the remaining 20% as test set.
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'train' | 10843106 |
- Features:
{
    "id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "subject_entity": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "relationship": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "object_entities": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}
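In this config `object_entities` is a variable-length sequence (`"length": -1`), so one row can carry several objects for the same subject and relationship. A minimal sketch of expanding such rows into individual (subject, relationship, object) triples follows. The function names `iter_triples` and `_decode` are illustrative, the sample row is a placeholder, and rows are assumed to be dictionaries keyed by the feature names above.

```python
from typing import Iterable, Iterator, Mapping, Tuple

Triple = Tuple[str, str, str]


def _decode(value) -> str:
    # Values read through tfds.as_numpy are bytes; plain str passes through.
    return value.decode("utf-8") if isinstance(value, bytes) else str(value)


def iter_triples(rows: Iterable[Mapping[str, object]]) -> Iterator[Triple]:
    """Yield one (subject, relationship, object) triple per listed object."""
    for row in rows:
        subject = _decode(row["subject_entity"])
        relationship = _decode(row["relationship"])
        for obj in row["object_entities"]:
            yield (subject, relationship, _decode(obj))


# Placeholder rows, not taken from the dataset.
rows = [
    {"id": b"0", "subject_entity": b"s1", "relationship": b"r1",
     "object_entities": [b"o1", b"o2"]},
    {"id": b"1", "subject_entity": b"s2", "relationship": b"r1",
     "object_entities": []},
]
triples = list(iter_triples(rows))
```

A row with an empty `object_entities` list contributes no triples, so the number of triples can differ from the number of rows in the split table.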
freebase5m
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:simple_questions_v2/freebase5m')
- Description:
SimpleQuestions is a dataset for simple QA, which consists
of a total of 108,442 questions written in natural language by human
English-speaking annotators each paired with a corresponding fact,
formatted as (subject, relationship, object), that provides the answer
but also a complete explanation. Facts have been extracted from the
Knowledge Base Freebase (freebase.com). We randomly shuffle these
questions and use 70% of them (75910) as training set, 10% as
validation set (10845), and the remaining 20% as test set.
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'train' | 12010500 |
- Features:
{
    "id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "subject_entity": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "relationship": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "object_entities": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}
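Since each row pairs a subject and relationship with its objects, a subset of these rows can serve as a lookup table for answering simple questions. The sketch below is one possible way to do that, not the dataset authors' method. It assumes the question has already been mapped to a subject and relationship (that step is not shown), that field values are already Python strings, and that only a sample small enough for memory is indexed. `build_index` and `lookup` are hypothetical names, and the rows are placeholders.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping, Set, Tuple

Key = Tuple[str, str]


def build_index(rows: Iterable[Mapping[str, object]]) -> Dict[Key, Set[str]]:
    """Map (subject, relationship) to the set of all objects seen for it."""
    index = defaultdict(set)
    for row in rows:
        key = (row["subject_entity"], row["relationship"])
        # Rows sharing a key are merged into one set of objects.
        index[key].update(row["object_entities"])
    return dict(index)


def lookup(index: Dict[Key, Set[str]], subject: str, relationship: str) -> Set[str]:
    """Return the known objects for a key, or an empty set if none exist."""
    return set(index.get((subject, relationship), ()))


# Placeholder rows, not taken from the dataset.
sample_rows = [
    {"subject_entity": "s1", "relationship": "r1", "object_entities": ["o1", "o2"]},
    {"subject_entity": "s1", "relationship": "r1", "object_entities": ["o3"]},
    {"subject_entity": "s2", "relationship": "r1", "object_entities": ["o1"]},
]
index = build_index(sample_rows)
answers = lookup(index, "s1", "r1")
```

A plain dictionary keeps the example short; for the full 'train' split an on-disk key-value store would likely be a better fit.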