lambada

Description:

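The LAMBADA dataset evaluates the capabilities of computational models for text understanding by means of a word prediction task. LAMBADA is a collection of narrative passages that share one characteristic: human subjects can guess the last word when they are exposed to the whole passage, but not when they see only the last sentence preceding the target word.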
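The word prediction task described above can be sketched in a few lines of Python. This is a minimal illustration rather than an official evaluation script: it assumes that words are separated by whitespace and that the target is the final whitespace-delimited token. The names `split_passage`, `last_word_accuracy`, and `repeat_last_context_word` are hypothetical helpers, and the toy passages are invented for the example.

```python
from typing import Callable, Iterable, Tuple


def split_passage(passage: str) -> Tuple[str, str]:
    """Split a passage into (context, target), where target is the final word.

    Assumes whitespace tokenization, which is an illustrative simplification.
    """
    context, _, target = passage.strip().rpartition(" ")
    return context, target


def last_word_accuracy(
    passages: Iterable[str], predict: Callable[[str], str]
) -> float:
    """Fraction of passages whose final word `predict` recovers from the context."""
    total = 0
    correct = 0
    for passage in passages:
        context, target = split_passage(passage)
        total += 1
        correct += int(predict(context) == target)
    return correct / total if total else 0.0


# Toy passages invented for illustration; they are not taken from LAMBADA.
toy_passages = [
    "she poured the tea and handed him the cup",
    "he opened the door and saw the dog",
]


def repeat_last_context_word(context: str) -> str:
    """Hypothetical baseline: always guess the last word of the context."""
    return context.split()[-1]


print(split_passage(toy_passages[0]))
print(last_word_accuracy(toy_passages, repeat_last_context_word))
```

A real model would replace `repeat_last_context_word` with a function that reads the full context and returns its predicted final word.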

Additional documentation: Explore on Papers With Code
Homepage: https://zenodo.org/record/2630551#.X4Xzn5NKjUI
Source code: tfds.datasets.lambada.Builder
Versions:
- 1.0.0 (default): Initial release.
Download size: 319.03 MiB
Dataset size: 3.49 MiB
Auto-cached (documentation): Yes
Splits:

Split	Examples
`'test'`	5,153
`'train'`	4,869

Feature structure:

FeaturesDict({
    'passage': Text(shape=(), dtype=string),
})
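
Given this structure, each example is a dictionary with a single `passage` string. The sketch below shows one way to read it, assuming the dataset is registered under the name `lambada` (inferred from the source code path listed on this page) and that `tensorflow_datasets` is installed. `decode_passage` and `iter_passages` are hypothetical helper names, and the stand-in example at the end is invented so that the snippet runs without downloading anything.

```python
from typing import Dict, Iterator


def decode_passage(example: Dict[str, bytes]) -> str:
    """Return the 'passage' feature of one NumPy-converted example as a str."""
    return example["passage"].decode("utf-8")


def iter_passages(split: str = "test") -> Iterator[str]:
    """Yield decoded passages from the given split ('test' or 'train').

    tensorflow_datasets is imported lazily so that decode_passage stays usable
    without it. The first call downloads and prepares the dataset.
    """
    import tensorflow_datasets as tfds

    ds = tfds.load("lambada", split=split)
    # tfds.as_numpy converts string tensors to Python bytes objects.
    for example in tfds.as_numpy(ds):
        yield decode_passage(example)


# Stand-in example with the same structure; the text is invented, not from the dataset.
fake_example = {"passage": b"an invented passage used only for illustration"}
print(decode_passage(fake_example))
```

With the library installed, `next(iter_passages("test"))` should return the first test passage as a Python string.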

Feature documentation:

Feature	Class	Shape	Dtype	Description
	FeaturesDict
passage	Text		string

Supervised keys (see as_supervised doc): None
Figure (tfds.show_examples): Not supported.
Examples (tfds.as_dataframe):

Citation:

@inproceedings{paperno-etal-2016-lambada,
    title = "The {LAMBADA} dataset: Word prediction requiring a broad discourse context",
    author = "Paperno, Denis  and
      Kruszewski, Germ{\'a}n  and
      Lazaridou, Angeliki  and
      Pham, Ngoc Quan  and
      Bernardi, Raffaella  and
      Pezzelle, Sandro  and
      Baroni, Marco  and
      Boleda, Gemma  and
      Fern{\'a}ndez, Raquel",
    booktitle = "Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
    month = aug,
    year = "2016",
    address = "Berlin, Germany",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/P16-1144",
    doi = "10.18653/v1/P16-1144",
    pages = "1525--1534",
}