mrqa

Description:

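The MRQA 2019 Shared Task focuses on generalization in question answering. An effective question answering system should do more than merely interpolate from the training set to answer test examples drawn from the same distribution. It should also be able to extrapolate to out-of-distribution examples, which is a significantly harder challenge.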

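MRQA adapts and unifies multiple distinct question answering datasets (carefully selected subsets of existing datasets) into the same format (SQuAD format). Of these, six datasets were made available for training and six for testing. Small portions of the training datasets were held out as in-domain data that may be used for development. The testing datasets contain only out-of-domain data. This benchmark was released as part of the MRQA 2019 Shared Task.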

More information can be found at <a href="https://mrqa.github.io/2019/shared.html">https://mrqa.github.io/2019/shared.html</a>.
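
To make the in-domain and out-of-domain grouping concrete, the sketch below tabulates the twelve configs with the example counts listed in the split tables on this page and sums them per split. The dictionary and function names are illustrative only and are not part of TFDS.

```python
# Example counts copied from the split tables on this page.
# Configs with train/validation splits are the in-domain (training) datasets;
# configs with only a test split are the out-of-domain (testing) datasets.
IN_DOMAIN = {
    "squad": {"train": 86_588, "validation": 10_507},
    "news_qa": {"train": 74_160, "validation": 4_212},
    "trivia_qa": {"train": 61_688, "validation": 7_785},
    "search_qa": {"train": 117_384, "validation": 16_980},
    "hotpot_qa": {"train": 72_928, "validation": 5_901},
    "natural_questions": {"train": 104_071, "validation": 12_836},
}
OUT_OF_DOMAIN = {
    "bio_asq": {"test": 1_504},
    "drop": {"test": 1_503},
    "duo_rc": {"test": 1_501},
    "race": {"test": 674},
    "relation_extraction": {"test": 2_948},
    "textbook_qa": {"test": 1_503},
}


def total_examples(configs, split):
    """Sum the example counts of one split over a group of configs."""
    return sum(splits.get(split, 0) for splits in configs.values())
```

Under these counts, `total_examples(IN_DOMAIN, "train")` gives 516,819 training examples, while the six out-of-domain test sets together hold 9,633 examples.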

Additional documentation: Explore on Papers With Code
Homepage: https://mrqa.github.io/2019/shared.html
Source code: tfds.text.mrqa.MRQA
Versions:
- 1.0.0 (default): Initial release.
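
A minimal sketch of how a specific config and version might be requested, assuming the usual TFDS naming scheme `dataset/config:version`. `mrqa_dataset_name` and `load_mrqa` are hypothetical helpers, not part of TFDS. Only `tfds.load` is a library call, and it is imported lazily because it needs TensorFlow Datasets installed and downloads data on first use.

```python
def mrqa_dataset_name(config: str, version: str = "1.0.0") -> str:
    """Build a TFDS name of the form 'mrqa/<config>:<version>'."""
    return f"mrqa/{config}:{version}"


def load_mrqa(config: str, split: str):
    """Load one split of one MRQA config as a tf.data.Dataset."""
    # Imported here so that building names does not require TensorFlow Datasets.
    import tensorflow_datasets as tfds

    return tfds.load(mrqa_dataset_name(config), split=split)
```

For example, `load_mrqa("squad", "validation")` would request the in-domain SQuAD development data, and `load_mrqa("bio_asq", "test")` one of the out-of-domain test sets.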
Feature structure:

FeaturesDict({
    'answers': Sequence(string),
    'context': string,
    'context_tokens': Sequence({
        'offsets': int32,
        'tokens': string,
    }),
    'detected_answers': Sequence({
        'char_spans': Sequence({
            'end': int32,
            'start': int32,
        }),
        'text': string,
        'token_spans': Sequence({
            'end': int32,
            'start': int32,
        }),
    }),
    'qid': string,
    'question': string,
    'question_tokens': Sequence({
        'offsets': int32,
        'tokens': string,
    }),
    'subset': string,
})
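
TFDS generally exposes a `Sequence` of dictionaries as a dictionary of sequences, so `detected_answers['char_spans']['start']` would be a nested list with one inner list per detected answer. The sketch below uses a hand-written example in that layout (not real data) and recovers answer strings from the character spans. It assumes that span ends are inclusive character offsets into `context`, which is worth verifying against real examples. Real TFDS examples also carry byte strings rather than `str`.

```python
# Hand-written example in the layout a decoded MRQA record is assumed to take.
example = {
    "context": "Paris is the capital of France.",
    "question": "What is the capital of France?",
    "answers": ["Paris"],
    "detected_answers": {
        "text": ["Paris"],
        # One inner list per detected answer, one entry per occurrence.
        "char_spans": {"start": [[0]], "end": [[4]]},
        "token_spans": {"start": [[0]], "end": [[0]]},
    },
}


def answers_from_char_spans(example):
    """Slice every detected answer occurrence out of the context."""
    context = example["context"]
    spans = example["detected_answers"]["char_spans"]
    found = []
    for starts, ends in zip(spans["start"], spans["end"]):
        for start, end in zip(starts, ends):
            # Assumption: `end` is inclusive, hence the +1.
            found.append(context[start:end + 1])
    return found
```

With the example above, `answers_from_char_spans(example)` returns `["Paris"]`, matching `detected_answers["text"]`.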

Feature documentation:

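Feature	Class	Shape	Dtype
	FeaturesDict
answers	Sequence(Tensor)	(None,)	string
context	Tensor		string
context_tokens	Sequence
context_tokens/offsets	Tensor		int32
context_tokens/tokens	Tensor		string
detected_answers	Sequence
detected_answers/char_spans	Sequence
detected_answers/char_spans/end	Tensor		int32
detected_answers/char_spans/start	Tensor		int32
detected_answers/text	Tensor		string
detected_answers/token_spans	Sequence
detected_answers/token_spans/end	Tensor		int32
detected_answers/token_spans/start	Tensor		int32
qid	Tensor		string
question	Tensor		string
question_tokens	Sequence
question_tokens/offsets	Tensor		int32
question_tokens/tokens	Tensor		string
subset	Tensor		string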
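
The token features pair each token with an offset, which suggests a way to map a token span back to text. The sketch below assumes that `context_tokens/offsets` holds the character position at which each token starts in `context` and that token span ends are inclusive. Both are assumptions to check on real data, and `token_span_to_text` is a hypothetical helper.

```python
def token_span_to_text(context, tokens, offsets, start_token, end_token):
    """Return the context substring covered by an inclusive token span."""
    char_start = offsets[start_token]
    # The last token contributes its full length; the result is an exclusive end.
    char_end = offsets[end_token] + len(tokens[end_token])
    return context[char_start:char_end]


# Hand-written tokenization of a made-up context.
context = "Paris is the capital of France."
tokens = ["Paris", "is", "the", "capital", "of", "France", "."]
offsets = [0, 6, 9, 13, 21, 24, 30]

span_text = token_span_to_text(context, tokens, offsets, 3, 5)
```

Here the span from token 3 to token 5 covers "capital of France". Slicing by offsets rather than joining tokens keeps the original whitespace of the context.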

Supervised keys (see as_supervised doc): None
Figure (tfds.show_examples): Not supported.
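
Because no supervised keys are defined, `as_supervised=True` is not available for this dataset, so any (input, target) pairing has to be built from the feature dictionary. The sketch below shows one possible pairing. The function name and the choice of the first reference answer as the target are assumptions for illustration, not a convention of the dataset.

```python
def to_supervised_pair(example):
    """Map one MRQA-style example to an (inputs, target) pair."""
    inputs = {"question": example["question"], "context": example["context"]}
    answers = example["answers"]
    # Assumption: use the first reference answer; an empty list maps to "".
    target = answers[0] if len(answers) > 0 else ""
    return inputs, target


# Hand-written example, not real data.
pair = to_supervised_pair({
    "question": "What is the capital of France?",
    "context": "Paris is the capital of France.",
    "answers": ["Paris"],
})
```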

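mrqa/squad (default config)

Config description: The SQuAD (Stanford Question Answering Dataset) dataset is used as the basis for the shared task format. Crowdworkers are shown paragraphs from Wikipedia and are asked to write questions with extractive answers.
Download size: 29.66 MiB
Dataset size: 271.43 MiB
Auto-cached (documentation): No
Splits:

Split	Examples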
`'train'`	86,588
`'validation'`	10,507

Examples (tfds.as_dataframe):

Citation:

@inproceedings{rajpurkar-etal-2016-squad,
    title = "{SQ}u{AD}: 100,000+ Questions for Machine Comprehension of Text",
    author = "Rajpurkar, Pranav  and
      Zhang, Jian  and
      Lopyrev, Konstantin  and
      Liang, Percy",
    booktitle = "Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing",
    month = nov,
    year = "2016",
    address = "Austin, Texas",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D16-1264",
    doi = "10.18653/v1/D16-1264",
    pages = "2383--2392",
}

@inproceedings{fisch-etal-2019-mrqa,
    title = "{MRQA} 2019 Shared Task: Evaluating Generalization in Reading Comprehension",
    author = "Fisch, Adam  and
      Talmor, Alon  and
      Jia, Robin  and
      Seo, Minjoon  and
      Choi, Eunsol  and
      Chen, Danqi",
    booktitle = "Proceedings of the 2nd Workshop on Machine Reading for Question Answering",
    month = nov,
    year = "2019",
    address = "Hong Kong, China",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D19-5801",
    doi = "10.18653/v1/D19-5801",
    pages = "1--13",
}

Note that each MRQA dataset has its own citation. Please see the source to see
the correct citation for each contained dataset.

mrqa/news_qa

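Config description: Two sets of crowdworkers ask and answer questions based on CNN news articles. The "questioners" see only the article's headline and summary, while the "answerers" see the full article. Questions that have no answer, or that are flagged in the dataset as lacking annotator agreement, are discarded.
Download size: 56.83 MiB
Dataset size: 654.25 MiB
Auto-cached (documentation): No
Splits:

Split	Examples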
`'train'`	74,160
`'validation'`	4,212

Examples (tfds.as_dataframe):

Citation:

@inproceedings{trischler-etal-2017-newsqa,
    title = "{N}ews{QA}: A Machine Comprehension Dataset",
    author = "Trischler, Adam  and
      Wang, Tong  and
      Yuan, Xingdi  and
      Harris, Justin  and
      Sordoni, Alessandro  and
      Bachman, Philip  and
      Suleman, Kaheer",
    booktitle = "Proceedings of the 2nd Workshop on Representation Learning for {NLP}",
    month = aug,
    year = "2017",
    address = "Vancouver, Canada",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/W17-2623",
    doi = "10.18653/v1/W17-2623",
    pages = "191--200",
}

@inproceedings{fisch-etal-2019-mrqa,
    title = "{MRQA} 2019 Shared Task: Evaluating Generalization in Reading Comprehension",
    author = "Fisch, Adam  and
      Talmor, Alon  and
      Jia, Robin  and
      Seo, Minjoon  and
      Choi, Eunsol  and
      Chen, Danqi",
    booktitle = "Proceedings of the 2nd Workshop on Machine Reading for Question Answering",
    month = nov,
    year = "2019",
    address = "Hong Kong, China",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D19-5801",
    doi = "10.18653/v1/D19-5801",
    pages = "1--13",
}

Note that each MRQA dataset has its own citation. Please see the source to see
the correct citation for each contained dataset.

mrqa/trivia_qa

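Config description: Question and answer pairs are sourced from trivia and quiz-league websites. The web version of TriviaQA, in which the contexts are retrieved from the results of a Bing search query, is used.
Download size: 383.14 MiB
Dataset size: 772.75 MiB
Auto-cached (documentation): No
Splits:

Split	Examples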
`'train'`	61,688
`'validation'`	7,785

Examples (tfds.as_dataframe):

Citation:

@inproceedings{joshi-etal-2017-triviaqa,
    title = "{T}rivia{QA}: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension",
    author = "Joshi, Mandar  and
      Choi, Eunsol  and
      Weld, Daniel  and
      Zettlemoyer, Luke",
    booktitle = "Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
    month = jul,
    year = "2017",
    address = "Vancouver, Canada",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/P17-1147",
    doi = "10.18653/v1/P17-1147",
    pages = "1601--1611",
}

@inproceedings{fisch-etal-2019-mrqa,
    title = "{MRQA} 2019 Shared Task: Evaluating Generalization in Reading Comprehension",
    author = "Fisch, Adam  and
      Talmor, Alon  and
      Jia, Robin  and
      Seo, Minjoon  and
      Choi, Eunsol  and
      Chen, Danqi",
    booktitle = "Proceedings of the 2nd Workshop on Machine Reading for Question Answering",
    month = nov,
    year = "2019",
    address = "Hong Kong, China",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D19-5801",
    doi = "10.18653/v1/D19-5801",
    pages = "1--13",
}

Note that each MRQA dataset has its own citation. Please see the source to see
the correct citation for each contained dataset.

mrqa/search_qa

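Config description: Question and answer pairs are sourced from the Jeopardy! TV show. The contexts are composed of snippets retrieved from a Google search query.
Download size: 699.86 MiB
Dataset size: 1.38 GiB
Auto-cached (documentation): No
Splits:

Split	Examples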
`'train'`	117,384
`'validation'`	16,980

Examples (tfds.as_dataframe):

Citation:

@article{dunn2017searchqa,
    title={Searchqa: A new q\&a dataset augmented with context from a search engine},
    author={Dunn, Matthew and Sagun, Levent and Higgins, Mike and Guney, V Ugur and Cirik, Volkan and Cho, Kyunghyun},
    journal={arXiv preprint arXiv:1704.05179},
    year={2017}
}

@inproceedings{fisch-etal-2019-mrqa,
    title = "{MRQA} 2019 Shared Task: Evaluating Generalization in Reading Comprehension",
    author = "Fisch, Adam  and
      Talmor, Alon  and
      Jia, Robin  and
      Seo, Minjoon  and
      Choi, Eunsol  and
      Chen, Danqi",
    booktitle = "Proceedings of the 2nd Workshop on Machine Reading for Question Answering",
    month = nov,
    year = "2019",
    address = "Hong Kong, China",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D19-5801",
    doi = "10.18653/v1/D19-5801",
    pages = "1--13",
}

Note that each MRQA dataset has its own citation. Please see the source to see
the correct citation for each contained dataset.

mrqa/hotpot_qa

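Config description: Crowdworkers are shown two entity-linked paragraphs from Wikipedia and are asked to write and answer questions that require multi-hop reasoning to solve. In the original setting, these paragraphs are mixed with additional distractor paragraphs to make inference harder. Here, the distractor paragraphs are not included.
Download size: 111.98 MiB
Dataset size: 272.87 MiB
Auto-cached (documentation): No
Splits:

Split	Examples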
`'train'`	72,928
`'validation'`	5,901

Examples (tfds.as_dataframe):

Citation:

@inproceedings{yang-etal-2018-hotpotqa,
    title = "{H}otpot{QA}: A Dataset for Diverse, Explainable Multi-hop Question Answering",
    author = "Yang, Zhilin  and
      Qi, Peng  and
      Zhang, Saizheng  and
      Bengio, Yoshua  and
      Cohen, William  and
      Salakhutdinov, Ruslan  and
      Manning, Christopher D.",
    booktitle = "Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing",
    month = oct # "-" # nov,
    year = "2018",
    address = "Brussels, Belgium",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D18-1259",
    doi = "10.18653/v1/D18-1259",
    pages = "2369--2380",
}

@inproceedings{fisch-etal-2019-mrqa,
    title = "{MRQA} 2019 Shared Task: Evaluating Generalization in Reading Comprehension",
    author = "Fisch, Adam  and
      Talmor, Alon  and
      Jia, Robin  and
      Seo, Minjoon  and
      Choi, Eunsol  and
      Chen, Danqi",
    booktitle = "Proceedings of the 2nd Workshop on Machine Reading for Question Answering",
    month = nov,
    year = "2019",
    address = "Hong Kong, China",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D19-5801",
    doi = "10.18653/v1/D19-5801",
    pages = "1--13",
}

Note that each MRQA dataset has its own citation. Please see the source to see
the correct citation for each contained dataset.

mrqa/natural_questions

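Config description: Questions are collected from information-seeking queries issued to the Google search engine by real users under natural conditions. Answers to the questions are annotated in a retrieved Wikipedia page by crowdworkers. Two types of annotations are collected: 1) the HTML bounding box containing enough information to completely infer the answer to the question (Long Answer), and 2) the sub-span or sub-spans within the bounding box that comprise the actual answer (Short Answer). Only the examples that have short answers are used, and the long answer is used as the context.
Download size: 121.15 MiB
Dataset size: 339.03 MiB
Auto-cached (documentation): No
Splits:

Split	Examples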
`'train'`	104,071
`'validation'`	12,836

Examples (tfds.as_dataframe):

Citation:

@article{kwiatkowski-etal-2019-natural,
    title = "Natural Questions: A Benchmark for Question Answering Research",
    author = "Kwiatkowski, Tom  and
      Palomaki, Jennimaria  and
      Redfield, Olivia  and
      Collins, Michael  and
      Parikh, Ankur  and
      Alberti, Chris  and
      Epstein, Danielle  and
      Polosukhin, Illia  and
      Devlin, Jacob  and
      Lee, Kenton  and
      Toutanova, Kristina  and
      Jones, Llion  and
      Kelcey, Matthew  and
      Chang, Ming-Wei  and
      Dai, Andrew M.  and
      Uszkoreit, Jakob  and
      Le, Quoc  and
      Petrov, Slav",
    journal = "Transactions of the Association for Computational Linguistics",
    volume = "7",
    year = "2019",
    address = "Cambridge, MA",
    publisher = "MIT Press",
    url = "https://aclanthology.org/Q19-1026",
    doi = "10.1162/tacl_a_00276",
    pages = "452--466",
}

@inproceedings{fisch-etal-2019-mrqa,
    title = "{MRQA} 2019 Shared Task: Evaluating Generalization in Reading Comprehension",
    author = "Fisch, Adam  and
      Talmor, Alon  and
      Jia, Robin  and
      Seo, Minjoon  and
      Choi, Eunsol  and
      Chen, Danqi",
    booktitle = "Proceedings of the 2nd Workshop on Machine Reading for Question Answering",
    month = nov,
    year = "2019",
    address = "Hong Kong, China",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D19-5801",
    doi = "10.18653/v1/D19-5801",
    pages = "1--13",
}

Note that each MRQA dataset has its own citation. Please see the source to see
the correct citation for each contained dataset.

mrqa/bio_asq

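Config description: BioASQ, a challenge on large-scale biomedical semantic indexing and question answering, contains question and answer pairs that are created by domain experts. They are then manually linked to multiple related science (PubMed) articles. The full abstract of each linked article is downloaded and used as an individual context (e.g., a single question can be linked to multiple, independent articles to create multiple QA-context pairs). Abstracts that do not exactly contain the answer are discarded.
Download size: 2.54 MiB
Dataset size: 6.70 MiB
Auto-cached (documentation): Yes
Splits:

Split	Examples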
`'test'`	1,504
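
The expansion of one question into several QA-context pairs, as outlined in the config description above, can be sketched as follows. This is an illustration under stated assumptions, not the actual MRQA preprocessing: the function name is hypothetical, the abstracts are made up, and "exactly contain" is interpreted here as a case-sensitive substring match.

```python
def expand_to_qa_context_pairs(question, answer, abstracts):
    """Create one QA-context pair per abstract that contains the answer."""
    pairs = []
    for abstract in abstracts:
        # Assumption: exact containment means a case-sensitive substring match.
        if answer in abstract:
            pairs.append(
                {"question": question, "context": abstract, "answer": answer}
            )
    return pairs


# Made-up abstracts for illustration; the second lacks the answer string.
pairs = expand_to_qa_context_pairs(
    question="Which protein is studied?",
    answer="protein X",
    abstracts=[
        "We report that protein X binds the receptor.",
        "This abstract discusses a different topic.",
        "Levels of protein X were measured in all samples.",
    ],
)
```

In this toy case the second abstract is dropped, leaving two independent pairs that share the same question and answer.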

Examples (tfds.as_dataframe):

Citation:

@article{tsatsaronis2015overview,
    title={An overview of the BIOASQ large-scale biomedical semantic indexing and question answering competition},
    author={Tsatsaronis, George and Balikas, Georgios and Malakasiotis, Prodromos and Partalas, Ioannis and Zschunke, Matthias and Alvers, Michael R and Weissenborn, Dirk and Krithara, Anastasia and Petridis, Sergios and Polychronopoulos, Dimitris and others},
    journal={BMC bioinformatics},
    volume={16},
    number={1},
    pages={1--28},
    year={2015},
    publisher={Springer}
}

@inproceedings{fisch-etal-2019-mrqa,
    title = "{MRQA} 2019 Shared Task: Evaluating Generalization in Reading Comprehension",
    author = "Fisch, Adam  and
      Talmor, Alon  and
      Jia, Robin  and
      Seo, Minjoon  and
      Choi, Eunsol  and
      Chen, Danqi",
    booktitle = "Proceedings of the 2nd Workshop on Machine Reading for Question Answering",
    month = nov,
    year = "2019",
    address = "Hong Kong, China",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D19-5801",
    doi = "10.18653/v1/D19-5801",
    pages = "1--13",
}

Note that each MRQA dataset has its own citation. Please see the source to see
the correct citation for each contained dataset.

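mrqa/drop

Config description: DROP (Discrete Reasoning Over the content of Paragraphs) examples were collected similarly to SQuAD, where crowdworkers are asked to create question-answer pairs from Wikipedia paragraphs. The questions focus on quantitative reasoning, and the original dataset contains non-extractive numeric answers as well as extractive text answers. Only the set of questions that are extractive is used.
Download size: 578.25 KiB
Dataset size: 5.41 MiB
Auto-cached (documentation): Yes
Splits:

Split	Examples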
`'test'`	1,503

Examples (tfds.as_dataframe):

Citation:

@inproceedings{dua-etal-2019-drop,
    title = "{DROP}: A Reading Comprehension Benchmark Requiring Discrete Reasoning Over Paragraphs",
    author = "Dua, Dheeru  and
      Wang, Yizhong  and
      Dasigi, Pradeep  and
      Stanovsky, Gabriel  and
      Singh, Sameer  and
      Gardner, Matt",
    booktitle = "Proceedings of the 2019 Conference of the North {A}merican Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)",
    month = jun,
    year = "2019",
    address = "Minneapolis, Minnesota",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/N19-1246",
    doi = "10.18653/v1/N19-1246",
    pages = "2368--2378",
}

@inproceedings{fisch-etal-2019-mrqa,
    title = "{MRQA} 2019 Shared Task: Evaluating Generalization in Reading Comprehension",
    author = "Fisch, Adam  and
      Talmor, Alon  and
      Jia, Robin  and
      Seo, Minjoon  and
      Choi, Eunsol  and
      Chen, Danqi",
    booktitle = "Proceedings of the 2nd Workshop on Machine Reading for Question Answering",
    month = nov,
    year = "2019",
    address = "Hong Kong, China",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D19-5801",
    doi = "10.18653/v1/D19-5801",
    pages = "1--13",
}

Note that each MRQA dataset has its own citation. Please see the source to see
the correct citation for each contained dataset.

mrqa/duo_rc

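Config description: The ParaphraseRC split of the DuoRC dataset is used. In this setting, two different plot summaries of the same movie are collected: one from Wikipedia and the other from IMDb. Two different sets of crowdworkers ask and answer questions about the movie plot, where the "questioners" are shown only the Wikipedia page and the "answerers" are shown only the IMDb page. Questions that are marked as unanswerable are discarded.
Download size: 1.14 MiB
Dataset size: 15.04 MiB
Auto-cached (documentation): Yes
Splits:

Split	Examples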
`'test'`	1,501

Examples (tfds.as_dataframe):

Citation:

@inproceedings{saha-etal-2018-duorc,
    title = "{D}uo{RC}: Towards Complex Language Understanding with Paraphrased Reading Comprehension",
    author = "Saha, Amrita  and
      Aralikatte, Rahul  and
      Khapra, Mitesh M.  and
      Sankaranarayanan, Karthik",
    booktitle = "Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
    month = jul,
    year = "2018",
    address = "Melbourne, Australia",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/P18-1156",
    doi = "10.18653/v1/P18-1156",
    pages = "1683--1693",
}

@inproceedings{fisch-etal-2019-mrqa,
    title = "{MRQA} 2019 Shared Task: Evaluating Generalization in Reading Comprehension",
    author = "Fisch, Adam  and
      Talmor, Alon  and
      Jia, Robin  and
      Seo, Minjoon  and
      Choi, Eunsol  and
      Chen, Danqi",
    booktitle = "Proceedings of the 2nd Workshop on Machine Reading for Question Answering",
    month = nov,
    year = "2019",
    address = "Hong Kong, China",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D19-5801",
    doi = "10.18653/v1/D19-5801",
    pages = "1--13",
}

Note that each MRQA dataset has its own citation. Please see the source to see
the correct citation for each contained dataset.

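mrqa/race

Config description: The ReAding Comprehension Dataset From Examinations (RACE) is collected from English reading comprehension exams for Chinese middle and high school students. The high school split (which is more challenging) is used, and the implicit "fill in the blank" style questions (which are unnatural for this task) are filtered out.
Download size: 1.49 MiB
Dataset size: 3.53 MiB
Auto-cached (documentation): Yes
Splits:

Split	Examples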
`'test'`	674

Examples (tfds.as_dataframe):

Citation:

@inproceedings{lai-etal-2017-race,
    title = "{RACE}: Large-scale {R}e{A}ding Comprehension Dataset From Examinations",
    author = "Lai, Guokun  and
      Xie, Qizhe  and
      Liu, Hanxiao  and
      Yang, Yiming  and
      Hovy, Eduard",
    booktitle = "Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing",
    month = sep,
    year = "2017",
    address = "Copenhagen, Denmark",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D17-1082",
    doi = "10.18653/v1/D17-1082",
    pages = "785--794",
}

@inproceedings{fisch-etal-2019-mrqa,
    title = "{MRQA} 2019 Shared Task: Evaluating Generalization in Reading Comprehension",
    author = "Fisch, Adam  and
      Talmor, Alon  and
      Jia, Robin  and
      Seo, Minjoon  and
      Choi, Eunsol  and
      Chen, Danqi",
    booktitle = "Proceedings of the 2nd Workshop on Machine Reading for Question Answering",
    month = nov,
    year = "2019",
    address = "Hong Kong, China",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D19-5801",
    doi = "10.18653/v1/D19-5801",
    pages = "1--13",
}

Note that each MRQA dataset has its own citation. Please see the source to see
the correct citation for each contained dataset.

mrqa/relation_extraction

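Config description: Given a slot-filling dataset, relations among entities are systematically transformed into question-answer pairs using templates. For example, the educated_at(x, y) relationship between two entities x and y appearing in a sentence can be expressed as "Where was x educated at?" with answer y. Multiple templates for each type of relation are collected. The dataset's zero-shot benchmark split (generalization to unseen relations) is used, and only the positive examples are kept.
Download size: 830.88 KiB
Dataset size: 3.71 MiB
Auto-cached (documentation): Yes
Splits:

Split	Examples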
`'test'`	2,948
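
The template-based conversion from the config description above can be sketched as follows. This is a toy illustration rather than the original pipeline: the first template mirrors the example question from the description, while the second template, the entity names, and the function name are hypothetical.

```python
# Question templates per relation type; "{x}" is filled with the subject entity.
TEMPLATES = {
    "educated_at": [
        "Where was {x} educated at?",   # example template from the description
        "Which school did {x} attend?",  # hypothetical additional template
    ],
}


def relation_to_qa(relation, x, y, templates=TEMPLATES):
    """Turn a relation(x, y) fact into one QA pair per template."""
    return [
        {"question": template.format(x=x), "answer": y}
        for template in templates[relation]
    ]


# Made-up entities for illustration.
qa_pairs = relation_to_qa("educated_at", "Alice", "Example University")
```

Each fact yields as many QA pairs as there are templates for its relation type, all sharing the answer y.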

Examples (tfds.as_dataframe):

Citation:

@inproceedings{levy-etal-2017-zero,
    title = "Zero-Shot Relation Extraction via Reading Comprehension",
    author = "Levy, Omer  and
      Seo, Minjoon  and
      Choi, Eunsol  and
      Zettlemoyer, Luke",
    booktitle = "Proceedings of the 21st Conference on Computational Natural Language Learning ({C}o{NLL} 2017)",
    month = aug,
    year = "2017",
    address = "Vancouver, Canada",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/K17-1034",
    doi = "10.18653/v1/K17-1034",
    pages = "333--342",
}

@inproceedings{fisch-etal-2019-mrqa,
    title = "{MRQA} 2019 Shared Task: Evaluating Generalization in Reading Comprehension",
    author = "Fisch, Adam  and
      Talmor, Alon  and
      Jia, Robin  and
      Seo, Minjoon  and
      Choi, Eunsol  and
      Chen, Danqi",
    booktitle = "Proceedings of the 2nd Workshop on Machine Reading for Question Answering",
    month = nov,
    year = "2019",
    address = "Hong Kong, China",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D19-5801",
    doi = "10.18653/v1/D19-5801",
    pages = "1--13",
}

Note that each MRQA dataset has its own citation. Please see the source to see
the correct citation for each contained dataset.

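mrqa/textbook_qa

Config description: TextbookQA is collected from lessons in middle school Life Science, Earth Science, and Physical Science textbooks. Questions that are accompanied by a diagram, or that are "True or False" questions, are not included.
Download size: 1.79 MiB
Dataset size: 14.04 MiB
Auto-cached (documentation): Yes
Splits:

Split	Examples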
`'test'`	1,503

Examples (tfds.as_dataframe):

Citation:

@inproceedings{kembhavi2017you,
    title={Are you smarter than a sixth grader? textbook question answering for multimodal machine comprehension},
    author={Kembhavi, Aniruddha and Seo, Minjoon and Schwenk, Dustin and Choi, Jonghyun and Farhadi, Ali and Hajishirzi, Hannaneh},
    booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern recognition},
    pages={4999--5007},
    year={2017}
}

@inproceedings{fisch-etal-2019-mrqa,
    title = "{MRQA} 2019 Shared Task: Evaluating Generalization in Reading Comprehension",
    author = "Fisch, Adam  and
      Talmor, Alon  and
      Jia, Robin  and
      Seo, Minjoon  and
      Choi, Eunsol  and
      Chen, Danqi",
    booktitle = "Proceedings of the 2nd Workshop on Machine Reading for Question Answering",
    month = nov,
    year = "2019",
    address = "Hong Kong, China",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D19-5801",
    doi = "10.18653/v1/D19-5801",
    pages = "1--13",
}

Note that each MRQA dataset has its own citation. Please see the source to see
the correct citation for each contained dataset.