quality

Description:

QuALITY, a multiple-choice, long-reading comprehension dataset.

We provide only the raw version.

Homepage: https://github.com/nyu-mll/quality
Source code: tfds.datasets.quality.Builder
Versions:
- 1.0.0 (default): Initial release.
Download size: 17.26 MiB
Auto-cached (documentation): Yes
Splits:

Split	Examples
`'dev'`	230
`'test'`	232
`'train'`	300

Feature structure:

FeaturesDict({
    'article': Text(shape=(), dtype=string),
    'article_id': Text(shape=(), dtype=string),
    'difficults': Sequence(bool),
    'gold_labels': Sequence(int32),
    'options': Sequence(Sequence(Text(shape=(), dtype=string))),
    'question_ids': Sequence(Text(shape=(), dtype=string)),
    'questions': Sequence(Text(shape=(), dtype=string)),
    'set_unique_id': Text(shape=(), dtype=string),
    'source': Text(shape=(), dtype=string),
    'title': Text(shape=(), dtype=string),
    'topic': Text(shape=(), dtype=string),
    'url': Text(shape=(), dtype=string),
    'writer_id': Text(shape=(), dtype=string),
    'writer_labels': Sequence(int32),
})
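
The per-question fields in this structure (`question_ids`, `questions`, `options`, `gold_labels`, `writer_labels`, `difficults`) appear to be parallel sequences, one entry per question about the same article. The following is a minimal sketch, assuming that index alignment holds, of flattening one article-level example into per-question records. It uses a plain Python dict as a stand-in for a decoded example; the helper name `flatten_example` and all sample values are hypothetical and not part of TFDS.

```python
from typing import Any, Dict, List

# Per-question fields, ASSUMED to be index-aligned within one example.
PER_QUESTION_KEYS = (
    "question_ids", "questions", "options",
    "gold_labels", "writer_labels", "difficults",
)


def flatten_example(example: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Turn one article-level example into one record per question."""
    n = len(example["questions"])
    # Fail loudly if the alignment assumption does not hold.
    for key in PER_QUESTION_KEYS:
        if len(example[key]) != n:
            raise ValueError(
                f"{key!r} has {len(example[key])} entries, expected {n}"
            )
    records = []
    for i in range(n):
        records.append({
            "article_id": example["article_id"],
            "question_id": example["question_ids"][i],
            "question": example["questions"][i],
            "options": list(example["options"][i]),
            "gold_label": example["gold_labels"][i],
            "writer_label": example["writer_labels"][i],
            "difficult": example["difficults"][i],
        })
    return records


# Hypothetical, made-up example shaped like the FeaturesDict above
# (article-level fields other than article_id omitted for brevity).
sample = {
    "article_id": "a1",
    "question_ids": ["a1_q1", "a1_q2"],
    "questions": ["Who narrates?", "Where is it set?"],
    "options": [["Ann", "Bob", "Cy", "Di"], ["Mars", "Earth", "Moon", "Venus"]],
    "gold_labels": [2, 1],
    "writer_labels": [2, 1],
    "difficults": [False, True],
}
records = flatten_example(sample)
```

Keeping `article_id` on every record makes it possible to join each question back to its article text without copying the long `article` string into each row.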

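Feature documentation:

Feature	Class	Shape	Dtype
	FeaturesDict
article	Text		string
article_id	Text		string
difficults	Sequence(Tensor)	(None,)	bool
gold_labels	Sequence(Tensor)	(None,)	int32
options	Sequence(Sequence(Text))	(None, None)	string
question_ids	Sequence(Text)	(None,)	string
questions	Sequence(Text)	(None,)	string
set_unique_id	Text		string
source	Text		string
title	Text		string
topic	Text		string
url	Text		string
writer_id	Text		string
writer_labels	Sequence(Tensor)	(None,)	int32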

Supervised keys (see as_supervised doc): None
Figure (tfds.show_examples): Not supported.
Citation:

@article{pang2021quality,
  title={ {QuALITY}: Question Answering with Long Input Texts, Yes!},
  author={Pang, Richard Yuanzhe and Parrish, Alicia and Joshi, Nitish and Nangia, Nikita and Phang, Jason and Chen, Angelica and Padmakumar, Vishakh and Ma, Johnny and Thompson, Jana and He, He and Bowman, Samuel R.},
  journal={arXiv preprint arXiv:2112.08608},
  year={2021}
}
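
Because the supervised keys are listed as None, `as_supervised=True` is not expected to yield `(input, label)` tuples for this dataset, so such pairs have to be assembled by hand. The sketch below shows one possible way to do that on a plain dict whose values are assumed to be already decoded to `str` and `int`. The helper `build_pairs`, the prompt layout, and the `label_base` parameter are all hypothetical. In particular, treating `gold_labels` as 1-based is an assumption that should be checked against the loaded data.

```python
from typing import Any, Dict, List, Tuple


def build_pairs(
    example: Dict[str, Any], label_base: int = 1
) -> List[Tuple[str, int]]:
    """Return (prompt, zero_based_answer_index) pairs for one example.

    label_base is an ASSUMPTION about how gold_labels are numbered;
    pass 0 if the loaded labels turn out to be zero-based.
    """
    pairs = []
    for question, options, gold in zip(
        example["questions"], example["options"], example["gold_labels"]
    ):
        answer_index = gold - label_base
        if not 0 <= answer_index < len(options):
            # Skip questions whose label does not point at an option,
            # for instance a placeholder label on an unlabeled split.
            continue
        choices = "\n".join(
            f"({chr(ord('A') + i)}) {opt}" for i, opt in enumerate(options)
        )
        prompt = f"{example['article']}\n\nQuestion: {question}\n{choices}"
        pairs.append((prompt, answer_index))
    return pairs


# Hypothetical, made-up example; the second label is a placeholder.
sample = {
    "article": "A short story.",
    "questions": ["Who?", "Where?"],
    "options": [["Ann", "Bob"], ["Here", "There"]],
    "gold_labels": [2, -1],
}
pairs = build_pairs(sample)
```

Exposing `label_base` as a parameter keeps the numbering assumption in one visible place instead of burying an offset in the loop.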

quality/raw (default config)

Config description: Raw with HTML.
Dataset size: 22.18 MiB
Examples (tfds.as_dataframe):

quality/stripped

Config description: Stripped of HTML.
Dataset size: 20.73 MiB
Examples (tfds.as_dataframe):
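
The two configs described above differ only in whether HTML is retained in the text. A minimal sketch of choosing between them when loading a split follows. The helper names `quality_config_name` and `load_quality` are hypothetical, and the config name `stripped` is reconstructed from the translated heading, so it should be confirmed against the installed TFDS version. The TFDS import is deferred so that the name helper can be used without TFDS installed.

```python
def quality_config_name(strip_html: bool = False) -> str:
    """Map a preference to one of the two config names listed on this page."""
    # "quality/stripped" is an ASSUMED spelling of the HTML-free config.
    return "quality/stripped" if strip_html else "quality/raw"


def load_quality(split: str = "train", strip_html: bool = False):
    """Load one split ('train', 'dev', or 'test') of the chosen config.

    The first call downloads the data (about 17 MiB per this page).
    """
    # Imported lazily so quality_config_name works without TFDS installed.
    import tensorflow_datasets as tfds

    return tfds.load(quality_config_name(strip_html), split=split)


print(quality_config_name(strip_html=True))
```

Passing `with_info=True` to `tfds.load` additionally returns the dataset info object that `tfds.as_dataframe` accepts as its second argument.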
