sci_tail

Description:

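The SciTail dataset is an entailment dataset created from multiple-choice science exams and web sentences. Each question and its correct answer choice are converted into an assertive statement to form the hypothesis. Information retrieval is used to obtain relevant text from a large corpus of web sentences, and these sentences serve as the premise P. Each premise-hypothesis pair is then annotated through crowdsourcing as either supporting the hypothesis (entails) or not (neutral), which yields the SciTail dataset. The dataset contains 27,026 examples: 10,101 labeled entails and 16,925 labeled neutral.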
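
The counts above imply a moderate class imbalance. As a quick arithmetic check, the following sketch recomputes the total and the share of each label from the two figures quoted in the description; it does not touch the dataset itself.

```python
# Label counts exactly as quoted in the description above.
label_counts = {"entails": 10_101, "neutral": 16_925}

total = sum(label_counts.values())
shares = {name: count / total for name, count in label_counts.items()}

print(f"total examples: {total}")
for name, share in shares.items():
    print(f"{name}: {share:.1%}")

# A classifier that always predicts the majority label would reach this
# accuracy on data with the same label distribution.
majority_baseline = max(shares.values())
print(f"majority-class baseline: {majority_baseline:.1%}")
```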

Additional documentation: Explore on Papers With Code
Homepage: https://allenai.org/data/scitail
Source code: tfds.datasets.sci_tail.Builder
Versions:
- 1.0.0 (default): Initial release.
Download size: 13.52 MiB
Dataset size: 6.01 MiB
Auto-cached (documentation): Yes
Splits:

Split	Examples
`'test'`	2,126
`'train'`	23,097
`'validation'`	1,304
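
A minimal sketch of loading these splits with `tfds.load`, assuming TensorFlow and `tensorflow_datasets` are installed. The names `load_scitail_splits`, `check_split_sizes`, and `EXPECTED_SPLIT_SIZES` are illustrative and not part of TFDS. The import is deferred into the function so that the size arithmetic at the bottom runs without triggering a download.

```python
# Split sizes copied from the table above.
EXPECTED_SPLIT_SIZES = {"train": 23_097, "validation": 1_304, "test": 2_126}


def load_scitail_splits(data_dir=None):
    """Return ({split_name: tf.data.Dataset}, info) for all three splits.

    The first call downloads and prepares the dataset.
    """
    import tensorflow_datasets as tfds  # deferred: requires TensorFlow + TFDS

    names = list(EXPECTED_SPLIT_SIZES)
    datasets, info = tfds.load(
        "sci_tail", split=names, data_dir=data_dir, with_info=True
    )
    return dict(zip(names, datasets)), info


def check_split_sizes(info):
    """Compare the sizes reported by TFDS metadata with the table above."""
    for name, expected in EXPECTED_SPLIT_SIZES.items():
        actual = info.splits[name].num_examples
        assert actual == expected, f"{name}: expected {expected}, got {actual}"


# Total number of examples across the three listed splits.
print(sum(EXPECTED_SPLIT_SIZES.values()))
```

Passing the `info` object returned by `load_scitail_splits()` to `check_split_sizes` confirms the table against the dataset metadata. The three splits sum to 26,527, which is lower than the 27,026 quoted in the description; this page does not explain the difference.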

Feature structure:

FeaturesDict({
    'hypothesis': Text(shape=(), dtype=string),
    'label': ClassLabel(shape=(), dtype=int64, num_classes=2),
    'premise': Text(shape=(), dtype=string),
})
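
To make the structure concrete, here is a hedged sketch that prints a few examples in readable form. `to_text`, `describe_example`, and `print_first_examples` are hypothetical helper names. The label names are read from `info.features["label"].names` rather than hard-coded, since this page does not state which integer maps to which class.

```python
def to_text(value):
    """Text features arrive as UTF-8 bytes when iterated via tfds.as_numpy."""
    return value.decode("utf-8") if isinstance(value, bytes) else str(value)


def describe_example(example, label_names):
    """Format one example dict with keys 'premise', 'hypothesis', 'label'."""
    label = label_names[int(example["label"])]
    return (
        f"premise: {to_text(example['premise'])}\n"
        f"hypothesis: {to_text(example['hypothesis'])}\n"
        f"label: {label}"
    )


def print_first_examples(n=3):
    """Download the dataset if needed and print the first n training examples."""
    import tensorflow_datasets as tfds  # deferred: requires TensorFlow + TFDS

    ds, info = tfds.load("sci_tail", split="train", with_info=True)
    label_names = info.features["label"].names  # ClassLabel with 2 classes
    for example in tfds.as_numpy(ds.take(n)):
        print(describe_example(example, label_names))
        print()
```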

Feature documentation:

Feature	Class	Dtype
	FeaturesDict
hypothesis	Text	string
label	ClassLabel	int64
premise	Text	string

Supervised keys (see as_supervised doc): None
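
Because no supervised keys are defined, passing `as_supervised=True` to `tfds.load` is not expected to work for this dataset, and any `(input, label)` pairing has to be built by hand. One possible sketch follows; `to_supervised` and `load_supervised` are hypothetical helper names, and treating `(premise, hypothesis)` as the model input is an assumption about the intended task rather than something this page specifies.

```python
def to_supervised(example):
    """Map a feature dict to ((premise, hypothesis), label)."""
    return (example["premise"], example["hypothesis"]), example["label"]


def load_supervised(split="train"):
    """Load one split and reshape each element into an (inputs, label) pair."""
    import tensorflow_datasets as tfds  # deferred: requires TensorFlow + TFDS

    ds = tfds.load("sci_tail", split=split)
    # tf.data passes each dict element to the mapped function as one argument.
    return ds.map(to_supervised)
```
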
Figure (tfds.show_examples): Not supported.
Examples (tfds.as_dataframe):
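
The example table is not reproduced on this page. A minimal sketch for generating a comparable preview locally, assuming TensorFlow, `tensorflow_datasets`, and pandas are installed; `preview_dataframe` is an illustrative name, not a TFDS function.

```python
def preview_dataframe(n=5, split="train"):
    """Return the first `n` examples of `split` as a pandas DataFrame."""
    import tensorflow_datasets as tfds  # deferred: requires TensorFlow + TFDS

    ds, info = tfds.load("sci_tail", split=split, with_info=True)
    # Passing `info` lets TFDS use the feature metadata when building the frame.
    return tfds.as_dataframe(ds.take(n), info)
```

In a notebook, evaluating `preview_dataframe()` in a cell displays the hypothesis, label, and premise columns for the first few training examples.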

Citation:

@inproceedings{khot2018scitail,
    title={Scitail: A textual entailment dataset from science question answering},
    author={Khot, Tushar and Sabharwal, Ashish and Clark, Peter},
    booktitle={Proceedings of the 32nd AAAI Conference on Artificial Intelligence (AAAI 2018)},
    url = "http://ai2-website.s3.amazonaws.com/publications/scitail-aaai-2018_cameraready.pdf",
    year={2018}
}