References:
Use the following command to load this dataset in TFDS:
import tensorflow_datasets as tfds
ds = tfds.load('huggingface:assin2')
- Description:
The ASSIN 2 corpus is composed of rather simple sentences, following the procedures of SemEval 2014 Task 1.
The training and validation data are composed, respectively, of 6,500 and 500 sentence pairs in Brazilian Portuguese,
annotated for entailment and semantic similarity. Semantic similarity values range from 1 to 5, and text entailment
classes are either entailment or none. The test data are composed of approximately 3,000 sentence pairs with the same
annotation. All data were manually annotated.
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 2448 |
'train' | 6500 |
'validation' | 500 |
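As a quick sanity check on the split sizes listed above, the following minimal sketch sums the counts and reports each split's share of the corpus. The names `SPLIT_SIZES` and `split_fractions` are illustrative and not part of TFDS; the numbers are copied from the table.

```python
# Split sizes copied from the table above.
SPLIT_SIZES = {"test": 2448, "train": 6500, "validation": 500}


def split_fractions(sizes):
    """Return each split's share of the total number of examples."""
    total = sum(sizes.values())
    return {name: count / total for name, count in sizes.items()}


total_examples = sum(SPLIT_SIZES.values())
fractions = split_fractions(SPLIT_SIZES)

for name, share in fractions.items():
    print(f"{name}: {SPLIT_SIZES[name]} examples ({share:.1%})")
print(f"total: {total_examples} examples")
```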
- Features:
{
"sentence_pair_id": {
"dtype": "int64",
"id": null,
"_type": "Value"
},
"premise": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"hypothesis": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"relatedness_score": {
"dtype": "float32",
"id": null,
"_type": "Value"
},
"entailment_judgment": {
"num_classes": 2,
"names": [
"NONE",
"ENTAILMENT"
],
"id": null,
"_type": "ClassLabel"
}
}
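The schema above can be read as follows: `entailment_judgment` is stored as an integer index into `names`, and `relatedness_score` is a float that the description says ranges from 1 to 5. The sketch below checks a record against that reading in plain Python. The helper `check_record` and the sample record are hypothetical, invented only for illustration; they are not taken from the dataset or from any library.

```python
# Label names copied from the "entailment_judgment" feature above;
# a ClassLabel stores each label as an index into this list.
ENTAILMENT_NAMES = ["NONE", "ENTAILMENT"]

# Assumed Python type per field (int64 -> int, float32 -> float, string -> str).
FIELD_TYPES = {
    "sentence_pair_id": int,
    "premise": str,
    "hypothesis": str,
    "relatedness_score": float,
    "entailment_judgment": int,
}


def check_record(record):
    """Validate a record against the schema and return its label name."""
    for field, expected in FIELD_TYPES.items():
        if field not in record:
            raise ValueError(f"missing field: {field}")
        if not isinstance(record[field], expected):
            raise ValueError(f"{field} should be {expected.__name__}")
    # The description states that similarity values range from 1 to 5.
    if not 1.0 <= record["relatedness_score"] <= 5.0:
        raise ValueError("relatedness_score outside [1, 5]")
    label = record["entailment_judgment"]
    if not 0 <= label < len(ENTAILMENT_NAMES):
        raise ValueError("entailment_judgment is not a valid class index")
    return ENTAILMENT_NAMES[label]


# Hypothetical record, made up for illustration (not an actual ASSIN 2 example).
example = {
    "sentence_pair_id": 1,
    "premise": "Uma criança está brincando no parque.",
    "hypothesis": "Uma criança está brincando.",
    "relatedness_score": 4.5,
    "entailment_judgment": 1,
}

print(check_record(example))
```

Keeping the label names in a list mirrors how a `ClassLabel` maps indices to strings, so index 0 decodes to `NONE` and index 1 to `ENTAILMENT`.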