gap

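Description:

GAP is a gender-balanced dataset containing 8,908 coreference-labeled pairs of (ambiguous pronoun, antecedent name), sampled from Wikipedia and released by Google AI Language for evaluating coreference resolution in practical applications.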

Additional documentation: Explore on Papers With Code
Homepage: https://github.com/google-research-datasets/gap-coreference
Source code: tfds.text.Gap
Versions:
- 0.1.0: Initial release.
- 0.1.1 (default): Fixes parsing of the boolean fields A-coref and B-coref.
Download size: 2.29 MiB
Dataset size: 2.96 MiB
Auto-cached (documentation): Yes
Splits:

Split	Examples
`'test'`	2,000
`'train'`	2,000
`'validation'`	454
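
The split sizes can be reconciled with the 8,908 labeled pairs quoted in the description. The following is a minimal sketch, assuming each example contributes two pairs (the pronoun with candidate A and the pronoun with candidate B); the variable names are illustrative and not part of any API.

```python
# Example counts per split, copied from the splits table.
SPLIT_SIZES = {"test": 2000, "train": 2000, "validation": 454}

# Assumption: every example yields two (pronoun, name) pairs, one for A and one for B.
PAIRS_PER_EXAMPLE = 2

total_examples = sum(SPLIT_SIZES.values())
total_pairs = total_examples * PAIRS_PER_EXAMPLE

print(total_examples)  # 4454
print(total_pairs)     # 8908
```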

Feature structure:

FeaturesDict({
    'A': Text(shape=(), dtype=string),
    'A-coref': bool,
    'A-offset': int32,
    'B': Text(shape=(), dtype=string),
    'B-coref': bool,
    'B-offset': int32,
    'ID': Text(shape=(), dtype=string),
    'Pronoun': Text(shape=(), dtype=string),
    'Pronoun-offset': int32,
    'Text': Text(shape=(), dtype=string),
    'URL': Text(shape=(), dtype=string),
})
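
The three offset fields locate each mention inside Text. The sketch below assumes they are character offsets into the decoded text, which is how the GAP repository describes them. The record is hypothetical and invented for illustration; it is not taken from the dataset, and `mention_at` is a helper name introduced here.

```python
def mention_at(text: str, offset: int, mention: str) -> bool:
    """Return True if `mention` starts at character `offset` of `text`."""
    return text[offset:offset + len(mention)] == mention


# Hypothetical record (not from the dataset), using the feature names above.
example = {
    "Text": "Alice met Beth after she finished work.",
    "Pronoun": "she", "Pronoun-offset": 21,
    "A": "Alice", "A-offset": 0,
    "B": "Beth", "B-offset": 10,
}

# Check that each mention really sits at its recorded offset.
checks = {
    key: mention_at(example["Text"], example[f"{key}-offset"], example[key])
    for key in ("A", "B", "Pronoun")
}
print(checks)
```

Strings loaded through TFDS arrive as bytes, so under this assumption they would need to be decoded (for example with `.decode("utf-8")`) before character offsets are applied.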

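Feature documentation:

Feature	Class	Dtype
	FeaturesDict
A	Text	string
A-coref	Tensor	bool
A-offset	Tensor	int32
B	Text	string
B-coref	Tensor	bool
B-offset	Tensor	int32
ID	Text	string
Pronoun	Text	string
Pronoun-offset	Tensor	int32
Text	Text	string
URL	Text	string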
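
For classification, the two boolean fields are often collapsed into a single label. This is a sketch of one possible convention, not something the dataset itself defines: the function name `coref_label` and the label strings are hypothetical, and the code assumes A-coref and B-coref are never both true for the same example.

```python
def coref_label(a_coref: bool, b_coref: bool) -> str:
    """Collapse the A-coref / B-coref flags into one of three labels."""
    if a_coref and b_coref:
        # Assumption: the pronoun refers to at most one of the two candidates.
        raise ValueError("A-coref and B-coref are both true")
    if a_coref:
        return "A"
    if b_coref:
        return "B"
    return "NEITHER"


print(coref_label(True, False))   # A
print(coref_label(False, True))   # B
print(coref_label(False, False))  # NEITHER
```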

Supervised keys (see as_supervised doc): None
Figure (tfds.show_examples): Not supported.
Examples (tfds.as_dataframe):
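
A few rows can be previewed by loading a split and passing it to `tfds.as_dataframe`. This is a minimal sketch, assuming `tensorflow_datasets` is installed and the data can be downloaded; `gap_name` and `load_gap_dataframe` are helper names introduced here, not part of TFDS.

```python
def gap_name(version: str = "0.1.1") -> str:
    """Build the 'name:version' string accepted by tfds.load."""
    return f"gap:{version}"


def load_gap_dataframe(split: str = "validation", n: int = 5, version: str = "0.1.1"):
    """Load `n` examples from one split and return them as a DataFrame."""
    # Imported lazily so the helper above works without TFDS installed.
    import tensorflow_datasets as tfds

    ds, info = tfds.load(gap_name(version), split=split, with_info=True)
    return tfds.as_dataframe(ds.take(n), info)


print(gap_name())  # gap:0.1.1
```

Calling `load_gap_dataframe()` triggers the download on first use. Because the supervised keys are None, the sketch does not pass `as_supervised=True`.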

Citation:

@article{DBLP:journals/corr/abs-1810-05201,
  author    = {Kellie Webster and
               Marta Recasens and
               Vera Axelrod and
               Jason Baldridge},
  title     = {Mind the {GAP:} {A} Balanced Corpus of Gendered Ambiguous Pronouns},
  journal   = {CoRR},
  volume    = {abs/1810.05201},
  year      = {2018},
  url       = {http://arxiv.org/abs/1810.05201},
  archivePrefix = {arXiv},
  eprint    = {1810.05201},
  timestamp = {Tue, 30 Oct 2018 20:39:56 +0100},
  biburl    = {https://dblp.org/rec/bib/journals/corr/abs-1810-05201},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}