TFDS artık Kruvasan 🥐 formatını destekliyor! Daha fazlasını öğrenmek için belgeleri okuyun.

Bu sayfa, Cloud Translation API ile çevrilmiştir.

birleşik_qa

Açıklama :

UnifiedQA kıyaslaması, farklı biçimleri ve çeşitli karmaşık dil olaylarını hedefleyen 20 ana soru yanıtlama (QA) veri setinden (her birinin birden fazla sürümü olabilir) oluşur. Bu veri kümeleri, çeşitli biçimler/kategoriler halinde gruplandırılmıştır: çıkarımsal KG, soyutlamalı KG, çoktan seçmeli KG ve evet/hayır KG. Ek olarak, çeşitli veri kümeleri için kontrast kümeleri kullanılır ("kontrast kümeleri " ile gösterilir). Bu değerlendirme kümeleri, orijinal veri kümesinde yaygın olan kalıplardan sapan, uzmanlar tarafından oluşturulmuş tedirginliklerdir. Kanıt paragraflarıyla birlikte gelmeyen çeşitli veri kümeleri için iki değişken dahil edilmiştir: biri veri kümelerinin olduğu gibi kullanıldığı, diğeri ise ek kanıt olarak bir bilgi alma sistemi aracılığıyla getirilen paragrafları kullanan ve "_ir" etiketleriyle gösterilen.

Daha fazla bilgi şu adreste bulunabilir: https://github.com/allenai/unifiedqa

Ana Sayfa : https://github.com/allenai/unifiedqa
Kaynak kodu : tfds.text.unifiedqa.UnifiedQA
sürümler :
- 1.0.0 (varsayılan): İlk sürüm.
Özellik yapısı :

FeaturesDict({
    'input': string,
    'output': string,
})

Özellik belgeleri :

Özellik	Sınıf	Dtipi
	ÖzelliklerDict
giriş	tensör	sicim
çıktı	tensör	sicim

Denetlenen anahtarlar (Bkz as_supervised doc ): None
Şekil ( tfds.show_examples ): Desteklenmiyor.

unified_qa/ai2_science_elementary (varsayılan yapılandırma)

Yapılandırma açıklaması : AI2 Science Questions veri kümesi, Amerika Birleşik Devletleri'nde ilkokul ve ortaokul sınıf seviyelerinde öğrenci değerlendirmelerinde kullanılan sorulardan oluşur. Her soru 4'lü çoktan seçmeli formattadır ve bir diyagram unsuru içerebilir veya içermeyebilir. Bu set ilkokul sınıf seviyeleri için kullanılan sorulardan oluşmaktadır.
İndirme boyutu : 345.59 KiB
Veri kümesi boyutu : 390.02 KiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'test'`	542
`'train'`	623
`'validation'`	123

Örnekler ( tfds.as_dataframe ):

Alıntı :

http://data.allenai.org/ai2-science-questions

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

birleşik_qa/ai2_science_middle

Yapılandırma açıklaması : AI2 Science Questions veri kümesi, Amerika Birleşik Devletleri'nde ilkokul ve ortaokul sınıf seviyelerinde öğrenci değerlendirmelerinde kullanılan sorulardan oluşur. Her soru 4'lü çoktan seçmeli formattadır ve bir diyagram unsuru içerebilir veya içermeyebilir. Bu set ortaokul sınıf seviyeleri için kullanılan sorulardan oluşmaktadır.
İndirme boyutu : 428.41 KiB
Veri kümesi boyutu : 477.40 KiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'test'`	679
`'train'`	605
`'validation'`	125

Örnekler ( tfds.as_dataframe ):

Alıntı :

http://data.allenai.org/ai2-science-questions

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

birleşik_qa/ambigqa

Yapılandırma açıklaması : AmbigQA, her makul yanıtı bulmayı ve ardından belirsizliği çözmek için soruyu her biri için yeniden yazmayı içeren açık alanlı bir soru yanıtlama görevidir.
İndirme boyutu : 2.27 MiB
Veri kümesi boyutu : 3.04 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'train'`	19.806
`'validation'`	5.674

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{min-etal-2020-ambigqa,
    title = "{A}mbig{QA}: Answering Ambiguous Open-domain Questions",
    author = "Min, Sewon  and
      Michael, Julian  and
      Hajishirzi, Hannaneh  and
      Zettlemoyer, Luke",
    booktitle = "Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.emnlp-main.466",
    doi = "10.18653/v1/2020.emnlp-main.466",
    pages = "5783--5797",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

birleşik_qa/arc_easy

Yapılandırma açıklaması : Bu veri kümesi, gelişmiş soru yanıtlama alanında araştırmayı teşvik etmek için bir araya getirilmiş, gerçek ilkokul düzeyinde, çoktan seçmeli bilim sorularından oluşur. Veri kümesi, bir Zorluk Kümesi ve bir Kolay Küme olarak bölünmüştür; burada ilki, yalnızca hem alma tabanlı bir algoritma hem de bir kelime birlikte oluşum algoritması tarafından yanlış yanıtlanan soruları içerir. Bu set "kolay" sorulardan oluşmaktadır.
İndirme boyutu : 1.24 MiB
Veri kümesi boyutu : 1.42 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'test'`	2.376
`'train'`	2.251
`'validation'`	570

Örnekler ( tfds.as_dataframe ):

Alıntı :

@article{clark2018think,
    title={Think you have solved question answering? try arc, the ai2 reasoning challenge},
    author={Clark, Peter and Cowhey, Isaac and Etzioni, Oren and Khot, Tushar and Sabharwal, Ashish and Schoenick, Carissa and Tafjord, Oyvind},
    journal={arXiv preprint arXiv:1803.05457},
    year={2018}
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

birleşik_qa/arc_easy_dev

Yapılandırma açıklaması : Bu veri kümesi, gelişmiş soru yanıtlama alanında araştırmayı teşvik etmek için bir araya getirilmiş, gerçek ilkokul düzeyinde, çoktan seçmeli bilim sorularından oluşur. Veri kümesi, bir Zorluk Kümesi ve bir Kolay Küme olarak bölünmüştür; burada ilki, yalnızca hem alma tabanlı bir algoritma hem de bir kelime birlikte oluşum algoritması tarafından yanlış yanıtlanan soruları içerir. Bu set "kolay" sorulardan oluşmaktadır.
İndirme boyutu : 1.24 MiB
Veri kümesi boyutu : 1.42 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'test'`	2.376
`'train'`	2.251
`'validation'`	570

Örnekler ( tfds.as_dataframe ):

Alıntı :

@article{clark2018think,
    title={Think you have solved question answering? try arc, the ai2 reasoning challenge},
    author={Clark, Peter and Cowhey, Isaac and Etzioni, Oren and Khot, Tushar and Sabharwal, Ashish and Schoenick, Carissa and Tafjord, Oyvind},
    journal={arXiv preprint arXiv:1803.05457},
    year={2018}
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

unified_qa/arc_easy_with_ir

Yapılandırma açıklaması : Bu veri kümesi, gelişmiş soru yanıtlama alanında araştırmayı teşvik etmek için bir araya getirilmiş, gerçek ilkokul düzeyinde, çoktan seçmeli bilim sorularından oluşur. Veri kümesi, bir Zorluk Kümesi ve bir Kolay Küme olarak bölünmüştür; burada ilki, yalnızca hem alma tabanlı bir algoritma hem de bir kelime birlikte oluşum algoritması tarafından yanlış yanıtlanan soruları içerir. Bu set "kolay" sorulardan oluşmaktadır. Bu sürüm, ek kanıt olarak bir bilgi alma sistemi aracılığıyla getirilen paragrafları içerir.
İndirme boyutu : 7.00 MiB
Veri kümesi boyutu : 7.17 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'test'`	2.376
`'train'`	2.251
`'validation'`	570

Örnekler ( tfds.as_dataframe ):

Alıntı :

@article{clark2018think,
    title={Think you have solved question answering? try arc, the ai2 reasoning challenge},
    author={Clark, Peter and Cowhey, Isaac and Etzioni, Oren and Khot, Tushar and Sabharwal, Ashish and Schoenick, Carissa and Tafjord, Oyvind},
    journal={arXiv preprint arXiv:1803.05457},
    year={2018}
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

unified_qa/arc_easy_with_ir_dev

Yapılandırma açıklaması : Bu veri kümesi, gelişmiş soru yanıtlama alanında araştırmayı teşvik etmek için bir araya getirilmiş, gerçek ilkokul düzeyinde, çoktan seçmeli bilim sorularından oluşur. Veri kümesi, bir Zorluk Kümesi ve bir Kolay Küme olarak bölünmüştür; burada ilki, yalnızca hem alma tabanlı bir algoritma hem de bir kelime birlikte oluşum algoritması tarafından yanlış yanıtlanan soruları içerir. Bu set "kolay" sorulardan oluşmaktadır. Bu sürüm, ek kanıt olarak bir bilgi alma sistemi aracılığıyla getirilen paragrafları içerir.
İndirme boyutu : 7.00 MiB
Veri kümesi boyutu : 7.17 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'test'`	2.376
`'train'`	2.251
`'validation'`	570

Örnekler ( tfds.as_dataframe ):

Alıntı :

@article{clark2018think,
    title={Think you have solved question answering? try arc, the ai2 reasoning challenge},
    author={Clark, Peter and Cowhey, Isaac and Etzioni, Oren and Khot, Tushar and Sabharwal, Ashish and Schoenick, Carissa and Tafjord, Oyvind},
    journal={arXiv preprint arXiv:1803.05457},
    year={2018}
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

birleşik_qa/arc_hard

Yapılandırma açıklaması : Bu veri kümesi, gelişmiş soru yanıtlama alanında araştırmayı teşvik etmek için bir araya getirilmiş, gerçek ilkokul düzeyinde, çoktan seçmeli bilim sorularından oluşur. Veri kümesi, bir Zorluk Kümesi ve bir Kolay Küme olarak bölünmüştür; burada ilki, yalnızca hem alma tabanlı bir algoritma hem de bir kelime birlikte oluşum algoritması tarafından yanlış yanıtlanan soruları içerir. Bu set "zor" sorulardan oluşmaktadır.
İndirme boyutu : 758.03 KiB
Veri kümesi boyutu : 848.28 KiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'test'`	1.172
`'train'`	1.119
`'validation'`	299

Örnekler ( tfds.as_dataframe ):

Alıntı :

@article{clark2018think,
    title={Think you have solved question answering? try arc, the ai2 reasoning challenge},
    author={Clark, Peter and Cowhey, Isaac and Etzioni, Oren and Khot, Tushar and Sabharwal, Ashish and Schoenick, Carissa and Tafjord, Oyvind},
    journal={arXiv preprint arXiv:1803.05457},
    year={2018}
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

birleşik_qa/arc_hard_dev

Yapılandırma açıklaması : Bu veri kümesi, gelişmiş soru yanıtlama alanında araştırmayı teşvik etmek için bir araya getirilmiş, gerçek ilkokul düzeyinde, çoktan seçmeli bilim sorularından oluşur. Veri kümesi, bir Zorluk Kümesi ve bir Kolay Küme olarak bölünmüştür; burada ilki, yalnızca hem alma tabanlı bir algoritma hem de bir kelime birlikte oluşum algoritması tarafından yanlış yanıtlanan soruları içerir. Bu set "zor" sorulardan oluşmaktadır.
İndirme boyutu : 758.03 KiB
Veri kümesi boyutu : 848.28 KiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'test'`	1.172
`'train'`	1.119
`'validation'`	299

Örnekler ( tfds.as_dataframe ):

Alıntı :

@article{clark2018think,
    title={Think you have solved question answering? try arc, the ai2 reasoning challenge},
    author={Clark, Peter and Cowhey, Isaac and Etzioni, Oren and Khot, Tushar and Sabharwal, Ashish and Schoenick, Carissa and Tafjord, Oyvind},
    journal={arXiv preprint arXiv:1803.05457},
    year={2018}
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

birleşik_qa/arc_hard_with_ir

Yapılandırma açıklaması : Bu veri kümesi, gelişmiş soru yanıtlama alanında araştırmayı teşvik etmek için bir araya getirilmiş, gerçek ilkokul düzeyinde, çoktan seçmeli bilim sorularından oluşur. Veri kümesi, bir Zorluk Kümesi ve bir Kolay Küme olarak bölünmüştür; burada ilki, yalnızca hem alma tabanlı bir algoritma hem de bir kelime birlikte oluşum algoritması tarafından yanlış yanıtlanan soruları içerir. Bu set "zor" sorulardan oluşmaktadır. Bu sürüm, ek kanıt olarak bir bilgi alma sistemi aracılığıyla getirilen paragrafları içerir.
İndirme boyutu : 3.53 MiB
Veri kümesi boyutu : 3.62 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'test'`	1.172
`'train'`	1.119
`'validation'`	299

Örnekler ( tfds.as_dataframe ):

Alıntı :

@article{clark2018think,
    title={Think you have solved question answering? try arc, the ai2 reasoning challenge},
    author={Clark, Peter and Cowhey, Isaac and Etzioni, Oren and Khot, Tushar and Sabharwal, Ashish and Schoenick, Carissa and Tafjord, Oyvind},
    journal={arXiv preprint arXiv:1803.05457},
    year={2018}
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

birleşik_qa/arc_hard_with_ir_dev

Yapılandırma açıklaması : Bu veri kümesi, gelişmiş soru yanıtlama alanında araştırmayı teşvik etmek için bir araya getirilmiş, gerçek ilkokul düzeyinde, çoktan seçmeli bilim sorularından oluşur. Veri kümesi, bir Zorluk Kümesi ve bir Kolay Küme olarak bölünmüştür; burada ilki, yalnızca hem alma tabanlı bir algoritma hem de bir kelime birlikte oluşum algoritması tarafından yanlış yanıtlanan soruları içerir. Bu set "zor" sorulardan oluşmaktadır. Bu sürüm, ek kanıt olarak bir bilgi alma sistemi aracılığıyla getirilen paragrafları içerir.
İndirme boyutu : 3.53 MiB
Veri kümesi boyutu : 3.62 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'test'`	1.172
`'train'`	1.119
`'validation'`	299

Örnekler ( tfds.as_dataframe ):

Alıntı :

@article{clark2018think,
    title={Think you have solved question answering? try arc, the ai2 reasoning challenge},
    author={Clark, Peter and Cowhey, Isaac and Etzioni, Oren and Khot, Tushar and Sabharwal, Ashish and Schoenick, Carissa and Tafjord, Oyvind},
    journal={arXiv preprint arXiv:1803.05457},
    year={2018}
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

birleşik_qa/boolq

Yapılandırma açıklaması : BoolQ, evet/hayır soruları için veri kümesini yanıtlayan bir sorudur. Bu sorular doğal olarak ortaya çıkıyor --- sorulmamış ve kısıtlanmamış ortamlarda üretiliyorlar. Her örnek, isteğe bağlı ek bağlam olarak sayfanın başlığıyla birlikte (soru, pasaj, cevap) üçlüsüdür. Metin çifti sınıflandırma kurulumu, mevcut doğal dil çıkarım görevlerine benzer.
İndirme boyutu : 7.77 MiB
Veri kümesi boyutu : 8.20 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'train'`	9.427
`'validation'`	3.270

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{clark-etal-2019-boolq,
    title = "{B}ool{Q}: Exploring the Surprising Difficulty of Natural Yes/No Questions",
    author = "Clark, Christopher  and
      Lee, Kenton  and
      Chang, Ming-Wei  and
      Kwiatkowski, Tom  and
      Collins, Michael  and
      Toutanova, Kristina",
    booktitle = "Proceedings of the 2019 Conference of the North {A}merican Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)",
    month = jun,
    year = "2019",
    address = "Minneapolis, Minnesota",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/N19-1300",
    doi = "10.18653/v1/N19-1300",
    pages = "2924--2936",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

birleşik_qa/boolq_np

Yapılandırma açıklaması : BoolQ, evet/hayır soruları için veri kümesini yanıtlayan bir sorudur. Bu sorular doğal olarak ortaya çıkıyor --- sorulmamış ve kısıtlanmamış ortamlarda üretiliyorlar. Her örnek, isteğe bağlı ek bağlam olarak sayfanın başlığıyla birlikte (soru, pasaj, cevap) üçlüsüdür. Metin çifti sınıflandırma kurulumu, mevcut doğal dil çıkarım görevlerine benzer. Bu sürüm, orijinal sürüme doğal bozulmalar ekler.
İndirme boyutu : 10.80 MiB
Veri kümesi boyutu : 11.40 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'train'`	9.727
`'validation'`	7.596

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{khashabi-etal-2020-bang,
    title = "More Bang for Your Buck: Natural Perturbation for Robust Question Answering",
    author = "Khashabi, Daniel  and
      Khot, Tushar  and
      Sabharwal, Ashish",
    booktitle = "Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.emnlp-main.12",
    doi = "10.18653/v1/2020.emnlp-main.12",
    pages = "163--170",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

birleşik_qa/commonsenseqa

Yapılandırma açıklaması : CommonsenseQA, doğru yanıtları tahmin etmek için farklı türde sağduyu bilgisi gerektiren yeni bir çoktan seçmeli soru yanıtlama veri kümesidir. Bir doğru yanıtı ve dört çeldirici yanıtı olan sorular içerir.
İndirme boyutu : 1.79 MiB
Veri kümesi boyutu : 2.19 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'test'`	1.140
`'train'`	9.741
`'validation'`	1.221

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{talmor-etal-2019-commonsenseqa,
    title = "{C}ommonsense{QA}: A Question Answering Challenge Targeting Commonsense Knowledge",
    author = "Talmor, Alon  and
      Herzig, Jonathan  and
      Lourie, Nicholas  and
      Berant, Jonathan",
    booktitle = "Proceedings of the 2019 Conference of the North {A}merican Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)",
    month = jun,
    year = "2019",
    address = "Minneapolis, Minnesota",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/N19-1421",
    doi = "10.18653/v1/N19-1421",
    pages = "4149--4158",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

birleşik_qa/commonsenseqa_test

Yapılandırma açıklaması : CommonsenseQA, doğru yanıtları tahmin etmek için farklı türde sağduyu bilgisi gerektiren yeni bir çoktan seçmeli soru yanıtlama veri kümesidir. Bir doğru yanıtı ve dört çeldirici yanıtı olan sorular içerir.
İndirme boyutu : 1.79 MiB
Veri kümesi boyutu : 2.19 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'test'`	1.140
`'train'`	9.741
`'validation'`	1.221

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{talmor-etal-2019-commonsenseqa,
    title = "{C}ommonsense{QA}: A Question Answering Challenge Targeting Commonsense Knowledge",
    author = "Talmor, Alon  and
      Herzig, Jonathan  and
      Lourie, Nicholas  and
      Berant, Jonathan",
    booktitle = "Proceedings of the 2019 Conference of the North {A}merican Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)",
    month = jun,
    year = "2019",
    address = "Minneapolis, Minnesota",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/N19-1421",
    doi = "10.18653/v1/N19-1421",
    pages = "4149--4158",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

birleşik_qa/contrast_sets_boolq

Yapılandırma açıklaması : BoolQ, evet/hayır soruları için veri kümesini yanıtlayan bir sorudur. Bu sorular doğal olarak ortaya çıkıyor --- sorulmamış ve kısıtlanmamış ortamlarda üretiliyorlar. Her örnek, isteğe bağlı ek bağlam olarak sayfanın başlığıyla birlikte (soru, pasaj, cevap) üçlüsüdür. Metin çifti sınıflandırma kurulumu, mevcut doğal dil çıkarım görevlerine benzer. Bu sürüm kontrast setleri kullanır. Bu değerlendirme kümeleri, orijinal veri kümesinde yaygın olan kalıplardan sapan, uzmanlar tarafından oluşturulmuş tedirginliklerdir.
İndirme boyutu : 438.51 KiB
Veri kümesi boyutu : 462.35 KiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'train'`	340
`'validation'`	340

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{clark-etal-2019-boolq,
    title = "{B}ool{Q}: Exploring the Surprising Difficulty of Natural Yes/No Questions",
    author = "Clark, Christopher  and
      Lee, Kenton  and
      Chang, Ming-Wei  and
      Kwiatkowski, Tom  and
      Collins, Michael  and
      Toutanova, Kristina",
    booktitle = "Proceedings of the 2019 Conference of the North {A}merican Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)",
    month = jun,
    year = "2019",
    address = "Minneapolis, Minnesota",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/N19-1300",
    doi = "10.18653/v1/N19-1300",
    pages = "2924--2936",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

birleşik_qa/contrast_sets_drop

Yapılandırma açıklaması : DROP, bir sistemin bir sorudaki referansları, belki de birden fazla giriş konumuna çözümlemesi ve bunlar üzerinde ayrı işlemler (toplama, sayma veya sıralama gibi) gerçekleştirmesi gereken, kitle kaynaklı, çekişmeli olarak oluşturulmuş bir KG kıyaslamasıdır. Bu işlemler, önceki veri kümeleri için gerekli olandan çok daha kapsamlı bir paragraf içeriği anlayışı gerektirir. Bu sürüm kontrast setleri kullanır. Bu değerlendirme kümeleri, orijinal veri kümesinde yaygın olan kalıplardan sapan, uzmanlar tarafından oluşturulmuş tedirginliklerdir.
İndirme boyutu : 2.20 MiB
Veri kümesi boyutu : 2.26 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'train'`	947
`'validation'`	947

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{dua-etal-2019-drop,
    title = "{DROP}: A Reading Comprehension Benchmark Requiring Discrete Reasoning Over Paragraphs",
    author = "Dua, Dheeru  and
      Wang, Yizhong  and
      Dasigi, Pradeep  and
      Stanovsky, Gabriel  and
      Singh, Sameer  and
      Gardner, Matt",
    booktitle = "Proceedings of the 2019 Conference of the North {A}merican Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)",
    month = jun,
    year = "2019",
    address = "Minneapolis, Minnesota",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/N19-1246",
    doi = "10.18653/v1/N19-1246",
    pages = "2368--2378",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

birleşik_qa/contrast_sets_quoref

Yapılandırma açıklaması : Bu veri kümesi, okuduğunu anlama sistemlerinin temel muhakeme yeteneğini test eder. Vikipedi'den paragraflar üzerinden sorular içeren bu açıklık-seçim kıyaslamasında, bir sistemin soruları yanıtlamak için paragraflarda uygun aralık(ları) seçmeden önce katı referansları çözmesi gerekir. Bu sürüm kontrast setleri kullanır. Bu değerlendirme kümeleri, orijinal veri kümesinde yaygın olan kalıplardan sapan, uzmanlar tarafından oluşturulmuş tedirginliklerdir.
İndirme boyutu : 2.60 MiB
Veri kümesi boyutu : 2.65 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'train'`	700
`'validation'`	700

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{dasigi-etal-2019-quoref,
    title = "{Q}uoref: A Reading Comprehension Dataset with Questions Requiring Coreferential Reasoning",
    author = "Dasigi, Pradeep  and
      Liu, Nelson F.  and
      Marasovi{'c}, Ana  and
      Smith, Noah A.  and
      Gardner, Matt",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)",
    month = nov,
    year = "2019",
    address = "Hong Kong, China",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D19-1606",
    doi = "10.18653/v1/D19-1606",
    pages = "5925--5932",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

birleşik_qa/contrast_sets_ropes

Yapılandırma açıklaması : Bu veri kümesi, bir sistemin bilgiyi bir metin geçişinden yeni bir duruma uygulama becerisini test eder. Bir sistem, nedensel veya niteliksel ilişki(ler) içeren bir arka plan pasajı (örneğin, "hayvan tozlayıcıları çiçeklerde döllenmenin etkinliğini artırır"), bu arka planı kullanan yeni bir durum ve ilişkilerin etkileri hakkında akıl yürütmeyi gerektiren sorular sunulur. durum bağlamında arka plan pasajı. Bu sürüm kontrast setleri kullanır. Bu değerlendirme kümeleri, orijinal veri kümesinde yaygın olan kalıplardan sapan, uzmanlar tarafından oluşturulmuş tedirginliklerdir.
İndirme boyutu : 1.97 MiB
Veri kümesi boyutu : 2.04 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'train'`	974
`'validation'`	974

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{lin-etal-2019-reasoning,
    title = "Reasoning Over Paragraph Effects in Situations",
    author = "Lin, Kevin  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Gardner, Matt",
    booktitle = "Proceedings of the 2nd Workshop on Machine Reading for Question Answering",
    month = nov,
    year = "2019",
    address = "Hong Kong, China",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D19-5808",
    doi = "10.18653/v1/D19-5808",
    pages = "58--62",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

birleşik_qa/bırak

Yapılandırma açıklaması : DROP, bir sistemin bir sorudaki referansları, belki de birden fazla giriş konumuna çözümlemesi ve bunlar üzerinde ayrı işlemler (toplama, sayma veya sıralama gibi) gerçekleştirmesi gereken, kitle kaynaklı, çekişmeli olarak oluşturulmuş bir KG kıyaslamasıdır. Bu işlemler, önceki veri kümeleri için gerekli olandan çok daha kapsamlı bir paragraf içeriği anlayışı gerektirir.
İndirme boyutu : 105.18 MiB
Veri kümesi boyutu : 108.16 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'train'`	77.399
`'validation'`	9.536

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{dua-etal-2019-drop,
    title = "{DROP}: A Reading Comprehension Benchmark Requiring Discrete Reasoning Over Paragraphs",
    author = "Dua, Dheeru  and
      Wang, Yizhong  and
      Dasigi, Pradeep  and
      Stanovsky, Gabriel  and
      Singh, Sameer  and
      Gardner, Matt",
    booktitle = "Proceedings of the 2019 Conference of the North {A}merican Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)",
    month = jun,
    year = "2019",
    address = "Minneapolis, Minnesota",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/N19-1246",
    doi = "10.18653/v1/N19-1246",
    pages = "2368--2378",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

birleşik_qa/mctest

Yapılandırma açıklaması : MCTest, makinelerin kurgusal hikayelerle ilgili çoktan seçmeli okuduğunu anlama sorularını yanıtlamasını gerektirir ve doğrudan açık alanlı makine kavrayışının üst düzey hedefini ele alır. Okuduğunu anlama, nedensel muhakeme ve dünyayı anlama gibi gelişmiş becerileri test edebilir, ancak çoktan seçmeli olması yine de net bir ölçüm sağlar. Kurgusal olmakla, cevap tipik olarak sadece hikayenin kendisinde bulunabilir. Hikayeler ve sorular da dikkatli bir şekilde küçük bir çocuğun anlayabileceği şekilde sınırlandırılmıştır, bu da görev için gerekli olan dünya bilgisini azaltır.
İndirme boyutu : 2.14 MiB
Veri kümesi boyutu : 2.20 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'train'`	1.480
`'validation'`	320

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{richardson-etal-2013-mctest,
    title = "{MCT}est: A Challenge Dataset for the Open-Domain Machine Comprehension of Text",
    author = "Richardson, Matthew  and
      Burges, Christopher J.C.  and
      Renshaw, Erin",
    booktitle = "Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing",
    month = oct,
    year = "2013",
    address = "Seattle, Washington, USA",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D13-1020",
    pages = "193--203",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

unified_qa/mctest_corrected_the_separator

Yapılandırma açıklaması : MCTest, makinelerin kurgusal hikayelerle ilgili çoktan seçmeli okuduğunu anlama sorularını yanıtlamasını gerektirir ve doğrudan açık alanlı makine kavrayışının üst düzey hedefini ele alır. Okuduğunu anlama, nedensel muhakeme ve dünyayı anlama gibi gelişmiş becerileri test edebilir, ancak çoktan seçmeli olması yine de net bir ölçüm sağlar. Kurgusal olmakla, cevap tipik olarak sadece hikayenin kendisinde bulunabilir. Hikayeler ve sorular da dikkatli bir şekilde küçük bir çocuğun anlayabileceği şekilde sınırlandırılmıştır, bu da görev için gerekli olan dünya bilgisini azaltır.
İndirme boyutu : 2.15 MiB
Veri kümesi boyutu : 2.21 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'train'`	1.480
`'validation'`	320

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{richardson-etal-2013-mctest,
    title = "{MCT}est: A Challenge Dataset for the Open-Domain Machine Comprehension of Text",
    author = "Richardson, Matthew  and
      Burges, Christopher J.C.  and
      Renshaw, Erin",
    booktitle = "Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing",
    month = oct,
    year = "2013",
    address = "Seattle, Washington, USA",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D13-1020",
    pages = "193--203",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

birleşik_qa/çoklu

Yapılandırma açıklaması : MultiRC, soruların yalnızca birden fazla cümleden alınan bilgiler dikkate alınarak yanıtlanabildiği bir okuduğunu anlama yarışmasıdır. Bu meydan okumaya ilişkin sorular ve yanıtlar, 4 adımlı bir kitle kaynak kullanımı deneyi aracılığıyla istendi ve doğrulandı. Veri seti, metinlere ve soru ifadelerine dilsel çeşitlilik getiren 7 farklı alanda (ilkokul bilimi, haberler, gezi rehberleri, kurgu hikayeleri vb.) paragraflar için sorular içerir.
İndirme boyutu : 897.09 KiB
Veri kümesi boyutu : 918.42 KiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'train'`	312
`'validation'`	312

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{khashabi-etal-2018-looking,
    title = "Looking Beyond the Surface: A Challenge Set for Reading Comprehension over Multiple Sentences",
    author = "Khashabi, Daniel  and
      Chaturvedi, Snigdha  and
      Roth, Michael  and
      Upadhyay, Shyam  and
      Roth, Dan",
    booktitle = "Proceedings of the 2018 Conference of the North {A}merican Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)",
    month = jun,
    year = "2018",
    address = "New Orleans, Louisiana",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/N18-1023",
    doi = "10.18653/v1/N18-1023",
    pages = "252--262",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

birleşik_qa/anlatıqa

Yapılandırma açıklaması : NarrativeQA, özellikle uzun belgeler üzerinde okuduğunu anlamayı test etmek için tasarlanmış öyküler ve ilgili sorulardan oluşan İngilizce bir veri kümesidir.
İndirme boyutu : 308.28 MiB
Veri kümesi boyutu : 311.22 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Hayır
bölmeler :

Bölmek	örnekler
`'test'`	21.114
`'train'`	65.494
`'validation'`	6.922

Örnekler ( tfds.as_dataframe ):

Alıntı :

@article{kocisky-etal-2018-narrativeqa,
    title = "The {N}arrative{QA} Reading Comprehension Challenge",
    author = "Ko{
{c} }isk{'y}, Tom{'a}{
{s} }  and
      Schwarz, Jonathan  and
      Blunsom, Phil  and
      Dyer, Chris  and
      Hermann, Karl Moritz  and
      Melis, G{'a}bor  and
      Grefenstette, Edward",
    journal = "Transactions of the Association for Computational Linguistics",
    volume = "6",
    year = "2018",
    address = "Cambridge, MA",
    publisher = "MIT Press",
    url = "https://aclanthology.org/Q18-1023",
    doi = "10.1162/tacl_a_00023",
    pages = "317--328",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

birleşik_qa/narrativeqa_dev

Yapılandırma açıklaması : NarrativeQA, özellikle uzun belgeler üzerinde okuduğunu anlamayı test etmek için tasarlanmış öyküler ve ilgili sorulardan oluşan İngilizce bir veri kümesidir.
İndirme boyutu : 308.28 MiB
Veri kümesi boyutu : 311.22 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Hayır
bölmeler :

Bölmek	örnekler
`'test'`	21.114
`'train'`	65.494
`'validation'`	6.922

Örnekler ( tfds.as_dataframe ):

Alıntı :

@article{kocisky-etal-2018-narrativeqa,
    title = "The {N}arrative{QA} Reading Comprehension Challenge",
    author = "Ko{
{c} }isk{'y}, Tom{'a}{
{s} }  and
      Schwarz, Jonathan  and
      Blunsom, Phil  and
      Dyer, Chris  and
      Hermann, Karl Moritz  and
      Melis, G{'a}bor  and
      Grefenstette, Edward",
    journal = "Transactions of the Association for Computational Linguistics",
    volume = "6",
    year = "2018",
    address = "Cambridge, MA",
    publisher = "MIT Press",
    url = "https://aclanthology.org/Q18-1023",
    doi = "10.1162/tacl_a_00023",
    pages = "317--328",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

unified_qa/natural_questions

Yapılandırma açıklaması : NQ külliyatı, gerçek kullanıcılardan gelen soruları içerir ve QA sistemlerinin, sorunun yanıtını içerebilecek veya içermeyebilecek tüm bir Wikipedia makalesini okumasını ve anlamasını gerektirir. Gerçek kullanıcı sorularının dahil edilmesi ve çözümlerin yanıtı bulmak için tüm sayfayı okuması gerekliliği, NQ'nun önceki KG veri kümelerinden daha gerçekçi ve zorlu bir görev olmasına neden olur.
İndirme boyutu : 6.95 MiB
Veri kümesi boyutu : 9.88 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'train'`	96.075
`'validation'`	2.295

Örnekler ( tfds.as_dataframe ):

Alıntı :

@article{kwiatkowski-etal-2019-natural,
    title = "Natural Questions: A Benchmark for Question Answering Research",
    author = "Kwiatkowski, Tom  and
      Palomaki, Jennimaria  and
      Redfield, Olivia  and
      Collins, Michael  and
      Parikh, Ankur  and
      Alberti, Chris  and
      Epstein, Danielle  and
      Polosukhin, Illia  and
      Devlin, Jacob  and
      Lee, Kenton  and
      Toutanova, Kristina  and
      Jones, Llion  and
      Kelcey, Matthew  and
      Chang, Ming-Wei  and
      Dai, Andrew M.  and
      Uszkoreit, Jakob  and
      Le, Quoc  and
      Petrov, Slav",
    journal = "Transactions of the Association for Computational Linguistics",
    volume = "7",
    year = "2019",
    address = "Cambridge, MA",
    publisher = "MIT Press",
    url = "https://aclanthology.org/Q19-1026",
    doi = "10.1162/tacl_a_00276",
    pages = "452--466",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

unified_qa/natural_questions_direct_ans

Yapılandırma açıklaması : NQ külliyatı, gerçek kullanıcılardan gelen soruları içerir ve QA sistemlerinin, sorunun yanıtını içerebilecek veya içermeyebilecek tüm bir Wikipedia makalesini okumasını ve anlamasını gerektirir. Gerçek kullanıcı sorularının dahil edilmesi ve çözümlerin yanıtı bulmak için tüm sayfayı okuması gerekliliği, NQ'nun önceki KG veri kümelerinden daha gerçekçi ve zorlu bir görev olmasına neden olur. Bu sürüm doğrudan cevaplı sorulardan oluşmaktadır.
İndirme boyutu : 6.82 MiB
Veri kümesi boyutu : 10.19 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'test'`	6.468
`'train'`	96.676
`'validation'`	10.693

Örnekler ( tfds.as_dataframe ):

Alıntı :

@article{kwiatkowski-etal-2019-natural,
    title = "Natural Questions: A Benchmark for Question Answering Research",
    author = "Kwiatkowski, Tom  and
      Palomaki, Jennimaria  and
      Redfield, Olivia  and
      Collins, Michael  and
      Parikh, Ankur  and
      Alberti, Chris  and
      Epstein, Danielle  and
      Polosukhin, Illia  and
      Devlin, Jacob  and
      Lee, Kenton  and
      Toutanova, Kristina  and
      Jones, Llion  and
      Kelcey, Matthew  and
      Chang, Ming-Wei  and
      Dai, Andrew M.  and
      Uszkoreit, Jakob  and
      Le, Quoc  and
      Petrov, Slav",
    journal = "Transactions of the Association for Computational Linguistics",
    volume = "7",
    year = "2019",
    address = "Cambridge, MA",
    publisher = "MIT Press",
    url = "https://aclanthology.org/Q19-1026",
    doi = "10.1162/tacl_a_00276",
    pages = "452--466",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

unified_qa/natural_questions_direct_ans_test

Yapılandırma açıklaması : NQ külliyatı, gerçek kullanıcılardan gelen soruları içerir ve QA sistemlerinin, sorunun yanıtını içerebilecek veya içermeyebilecek tüm bir Wikipedia makalesini okumasını ve anlamasını gerektirir. Gerçek kullanıcı sorularının dahil edilmesi ve çözümlerin yanıtı bulmak için tüm sayfayı okuması gerekliliği, NQ'nun önceki KG veri kümelerinden daha gerçekçi ve zorlu bir görev olmasına neden olur. Bu sürüm doğrudan cevaplı sorulardan oluşmaktadır.
İndirme boyutu : 6.82 MiB
Veri kümesi boyutu : 10.19 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'test'`	6.468
`'train'`	96.676
`'validation'`	10.693

Örnekler ( tfds.as_dataframe ):

Alıntı :

@article{kwiatkowski-etal-2019-natural,
    title = "Natural Questions: A Benchmark for Question Answering Research",
    author = "Kwiatkowski, Tom  and
      Palomaki, Jennimaria  and
      Redfield, Olivia  and
      Collins, Michael  and
      Parikh, Ankur  and
      Alberti, Chris  and
      Epstein, Danielle  and
      Polosukhin, Illia  and
      Devlin, Jacob  and
      Lee, Kenton  and
      Toutanova, Kristina  and
      Jones, Llion  and
      Kelcey, Matthew  and
      Chang, Ming-Wei  and
      Dai, Andrew M.  and
      Uszkoreit, Jakob  and
      Le, Quoc  and
      Petrov, Slav",
    journal = "Transactions of the Association for Computational Linguistics",
    volume = "7",
    year = "2019",
    address = "Cambridge, MA",
    publisher = "MIT Press",
    url = "https://aclanthology.org/Q19-1026",
    doi = "10.1162/tacl_a_00276",
    pages = "452--466",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

unified_qa/natural_questions_with_dpr_para

Yapılandırma açıklaması : NQ külliyatı, gerçek kullanıcılardan gelen soruları içerir ve QA sistemlerinin, sorunun yanıtını içerebilecek veya içermeyebilecek tüm bir Wikipedia makalesini okumasını ve anlamasını gerektirir. Gerçek kullanıcı sorularının dahil edilmesi ve çözümlerin yanıtı bulmak için tüm sayfayı okuması gerekliliği, NQ'nun önceki KG veri kümelerinden daha gerçekçi ve zorlu bir görev olmasına neden olur. Bu sürüm, her soruyu artırmak için (DPR alma motoru kullanılarak elde edilen) ek paragraflar içerir.
İndirme boyutu : 319.22 MiB
Veri kümesi boyutu : 322.91 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Hayır
bölmeler :

Bölmek	örnekler
`'train'`	96.676
`'validation'`	10.693

Örnekler ( tfds.as_dataframe ):

Alıntı :

@article{kwiatkowski-etal-2019-natural,
    title = "Natural Questions: A Benchmark for Question Answering Research",
    author = "Kwiatkowski, Tom  and
      Palomaki, Jennimaria  and
      Redfield, Olivia  and
      Collins, Michael  and
      Parikh, Ankur  and
      Alberti, Chris  and
      Epstein, Danielle  and
      Polosukhin, Illia  and
      Devlin, Jacob  and
      Lee, Kenton  and
      Toutanova, Kristina  and
      Jones, Llion  and
      Kelcey, Matthew  and
      Chang, Ming-Wei  and
      Dai, Andrew M.  and
      Uszkoreit, Jakob  and
      Le, Quoc  and
      Petrov, Slav",
    journal = "Transactions of the Association for Computational Linguistics",
    volume = "7",
    year = "2019",
    address = "Cambridge, MA",
    publisher = "MIT Press",
    url = "https://aclanthology.org/Q19-1026",
    doi = "10.1162/tacl_a_00276",
    pages = "452--466",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

unified_qa/natural_questions_with_dpr_para_test

Yapılandırma açıklaması : NQ külliyatı, gerçek kullanıcılardan gelen soruları içerir ve QA sistemlerinin, sorunun yanıtını içerebilecek veya içermeyebilecek tüm bir Wikipedia makalesini okumasını ve anlamasını gerektirir. Gerçek kullanıcı sorularının dahil edilmesi ve çözümlerin yanıtı bulmak için tüm sayfayı okuması gerekliliği, NQ'nun önceki KG veri kümelerinden daha gerçekçi ve zorlu bir görev olmasına neden olur. Bu sürüm, her soruyu artırmak için (DPR alma motoru kullanılarak elde edilen) ek paragraflar içerir.
İndirme boyutu : 306.94 MiB
Veri kümesi boyutu : 310.48 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Hayır
bölmeler :

Bölmek	örnekler
`'test'`	6.468
`'train'`	96.676

Örnekler ( tfds.as_dataframe ):

Alıntı :

@article{kwiatkowski-etal-2019-natural,
    title = "Natural Questions: A Benchmark for Question Answering Research",
    author = "Kwiatkowski, Tom  and
      Palomaki, Jennimaria  and
      Redfield, Olivia  and
      Collins, Michael  and
      Parikh, Ankur  and
      Alberti, Chris  and
      Epstein, Danielle  and
      Polosukhin, Illia  and
      Devlin, Jacob  and
      Lee, Kenton  and
      Toutanova, Kristina  and
      Jones, Llion  and
      Kelcey, Matthew  and
      Chang, Ming-Wei  and
      Dai, Andrew M.  and
      Uszkoreit, Jakob  and
      Le, Quoc  and
      Petrov, Slav",
    journal = "Transactions of the Association for Computational Linguistics",
    volume = "7",
    year = "2019",
    address = "Cambridge, MA",
    publisher = "MIT Press",
    url = "https://aclanthology.org/Q19-1026",
    doi = "10.1162/tacl_a_00276",
    pages = "452--466",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

birleşik_qa/newsqa

Yapılandırma açıklaması : NewsQA, insan yapımı soru-cevap çiftlerinden oluşan zorlu bir makine anlama veri kümesidir. Crowdworkers, CNN'den gelen bir dizi haber makalesine dayalı olarak sorular ve yanıtlar sağlar ve yanıtlar, ilgili makalelerden alınan uzun metinlerden oluşur.
İndirme boyutu : 283.33 MiB
Veri kümesi boyutu : 285.94 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Hayır
bölmeler :

Bölmek	örnekler
`'train'`	75.882
`'validation'`	4.309

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{trischler-etal-2017-newsqa,
    title = "{N}ews{QA}: A Machine Comprehension Dataset",
    author = "Trischler, Adam  and
      Wang, Tong  and
      Yuan, Xingdi  and
      Harris, Justin  and
      Sordoni, Alessandro  and
      Bachman, Philip  and
      Suleman, Kaheer",
    booktitle = "Proceedings of the 2nd Workshop on Representation Learning for {NLP}",
    month = aug,
    year = "2017",
    address = "Vancouver, Canada",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/W17-2623",
    doi = "10.18653/v1/W17-2623",
    pages = "191--200",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

birleşik_qa/openbookqa

Yapılandırma açıklaması : OpenBookQA, hem konunun (açık bir kitap olarak özetlenen, veri kümesiyle birlikte sağlanan göze çarpan gerçeklerle) hem de ifade edildiği dilin daha derin bir şekilde anlaşılmasını sağlayarak gelişmiş soru yanıtlama araştırmalarını teşvik etmeyi amaçlar. çok adımlı muhakeme, ek ortak ve sağduyu bilgisi ve zengin metin anlayışı gerektiren sorular içerir. OpenBookQA, insanın bir konuyu anlamasını değerlendirmek için açık kitap sınavlarından sonra modellenen yeni bir tür soru yanıtlama veri kümesidir.
İndirme boyutu : 942.34 KiB
Veri kümesi boyutu : 1.11 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'test'`	500
`'train'`	4.957
`'validation'`	500

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{mihaylov-etal-2018-suit,
    title = "Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Question Answering",
    author = "Mihaylov, Todor  and
      Clark, Peter  and
      Khot, Tushar  and
      Sabharwal, Ashish",
    booktitle = "Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing",
    month = oct # "-" # nov,
    year = "2018",
    address = "Brussels, Belgium",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D18-1260",
    doi = "10.18653/v1/D18-1260",
    pages = "2381--2391",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

birleşik_qa/openbookqa_dev

Yapılandırma açıklaması : OpenBookQA, hem konunun (açık bir kitap olarak özetlenen, veri kümesiyle birlikte sağlanan göze çarpan gerçeklerle) hem de ifade edildiği dilin daha derin bir şekilde anlaşılmasını sağlayarak gelişmiş soru yanıtlama araştırmalarını teşvik etmeyi amaçlar. çok adımlı muhakeme, ek ortak ve sağduyu bilgisi ve zengin metin anlayışı gerektiren sorular içerir. OpenBookQA, insanın bir konuyu anlamasını değerlendirmek için açık kitap sınavlarından sonra modellenen yeni bir tür soru yanıtlama veri kümesidir.
İndirme boyutu : 942.34 KiB
Veri kümesi boyutu : 1.11 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'test'`	500
`'train'`	4.957
`'validation'`	500

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{mihaylov-etal-2018-suit,
    title = "Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Question Answering",
    author = "Mihaylov, Todor  and
      Clark, Peter  and
      Khot, Tushar  and
      Sabharwal, Ashish",
    booktitle = "Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing",
    month = oct # "-" # nov,
    year = "2018",
    address = "Brussels, Belgium",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D18-1260",
    doi = "10.18653/v1/D18-1260",
    pages = "2381--2391",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

birleşik_qa/openbookqa_with_ir

Yapılandırma açıklaması : OpenBookQA, hem konunun (açık bir kitap olarak özetlenen, veri kümesiyle birlikte sağlanan göze çarpan gerçeklerle) hem de ifade edildiği dilin daha derin bir şekilde anlaşılmasını sağlayarak gelişmiş soru yanıtlama araştırmalarını teşvik etmeyi amaçlar. çok adımlı muhakeme, ek ortak ve sağduyu bilgisi ve zengin metin anlayışı gerektiren sorular içerir. OpenBookQA, insanın bir konuyu anlamasını değerlendirmek için açık kitap sınavlarından sonra modellenen yeni bir tür soru yanıtlama veri kümesidir. Bu sürüm, ek kanıt olarak bir bilgi alma sistemi aracılığıyla getirilen paragrafları içerir.
İndirme boyutu : 6.08 MiB
Veri kümesi boyutu : 6.28 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'test'`	500
`'train'`	4.957
`'validation'`	500

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{mihaylov-etal-2018-suit,
    title = "Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Question Answering",
    author = "Mihaylov, Todor  and
      Clark, Peter  and
      Khot, Tushar  and
      Sabharwal, Ashish",
    booktitle = "Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing",
    month = oct # "-" # nov,
    year = "2018",
    address = "Brussels, Belgium",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D18-1260",
    doi = "10.18653/v1/D18-1260",
    pages = "2381--2391",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

birleşik_qa/openbookqa_with_ir_dev

Yapılandırma açıklaması : OpenBookQA, hem konunun (açık bir kitap olarak özetlenen, veri kümesiyle birlikte sağlanan göze çarpan gerçeklerle) hem de ifade edildiği dilin daha derin bir şekilde anlaşılmasını sağlayarak gelişmiş soru yanıtlama araştırmalarını teşvik etmeyi amaçlar. çok adımlı muhakeme, ek ortak ve sağduyu bilgisi ve zengin metin anlayışı gerektiren sorular içerir. OpenBookQA, insanın bir konuyu anlamasını değerlendirmek için açık kitap sınavlarından sonra modellenen yeni bir tür soru yanıtlama veri kümesidir. Bu sürüm, ek kanıt olarak bir bilgi alma sistemi aracılığıyla getirilen paragrafları içerir.
İndirme boyutu : 6.08 MiB
Veri kümesi boyutu : 6.28 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'test'`	500
`'train'`	4.957
`'validation'`	500

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{mihaylov-etal-2018-suit,
    title = "Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Question Answering",
    author = "Mihaylov, Todor  and
      Clark, Peter  and
      Khot, Tushar  and
      Sabharwal, Ashish",
    booktitle = "Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing",
    month = oct # "-" # nov,
    year = "2018",
    address = "Brussels, Belgium",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D18-1260",
    doi = "10.18653/v1/D18-1260",
    pages = "2381--2391",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

birleşik_qa/fiziksel_iqa

Yapılandırma açıklaması : Bu, fiziksel sağduyu anlayışındaki ilerlemeyi kıyaslamak için bir veri kümesidir. Altta yatan görev çoktan seçmeli soru cevaplamadır: bir soru q ve iki olası çözüm s1, s2 verildiğinde, bir model veya bir insan en uygun çözümü seçmelidir ve bunlardan tam olarak biri doğrudur. Veri seti, atipik çözümleri tercih ederek günlük durumlara odaklanır. Veri kümesi, kullanıcılara günlük malzemeleri kullanarak nesneleri nasıl inşa edecekleri, zanaat yapacakları, pişirecekleri veya manipüle edecekleri konusunda talimatlar sağlayan Instructables.com'dan esinlenmiştir. Açıklayıcılardan, fiziksel bilginin hedeflenmesini sağlamak için sözdizimsel ve topikal olarak benzer olan semantik karışıklıklar veya alternatif yaklaşımlar sağlamaları istenir. Veri kümesi, AFLite algoritması kullanılarak temel yapılardan daha fazla temizlenir.
İndirme boyutu : 6.01 MiB
Veri kümesi boyutu : 6.59 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'train'`	16.113
`'validation'`	1.838

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{bisk2020piqa,
    title={Piqa: Reasoning about physical commonsense in natural language},
    author={Bisk, Yonatan and Zellers, Rowan and Gao, Jianfeng and Choi, Yejin and others},
    booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
    volume={34},
    number={05},
    pages={7432--7439},
    year={2020}
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

birleşik_qa/qasc

Yapılandırma açıklaması : QASC, cümle kompozisyonuna odaklanan soru yanıtlayan bir veri kümesidir. İlkokul bilimiyle ilgili 8'li çoktan seçmeli sorulardan oluşur ve 17 milyon cümlelik bir külliyatla birlikte gelir.
İndirme boyutu : 1.75 MiB
Veri kümesi boyutu : 2.09 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'test'`	920
`'train'`	8.134
`'validation'`	926

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{khot2020qasc,
    title={Qasc: A dataset for question answering via sentence composition},
    author={Khot, Tushar and Clark, Peter and Guerquin, Michal and Jansen, Peter and Sabharwal, Ashish},
    booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
    volume={34},
    number={05},
    pages={8082--8090},
    year={2020}
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

birleşik_qa/qasc_testi

Yapılandırma açıklaması : QASC, cümle kompozisyonuna odaklanan soru yanıtlayan bir veri kümesidir. İlkokul bilimiyle ilgili 8'li çoktan seçmeli sorulardan oluşur ve 17 milyon cümlelik bir külliyatla birlikte gelir.
İndirme boyutu : 1.75 MiB
Veri kümesi boyutu : 2.09 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'test'`	920
`'train'`	8.134
`'validation'`	926

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{khot2020qasc,
    title={Qasc: A dataset for question answering via sentence composition},
    author={Khot, Tushar and Clark, Peter and Guerquin, Michal and Jansen, Peter and Sabharwal, Ashish},
    booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
    volume={34},
    number={05},
    pages={8082--8090},
    year={2020}
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

birleşik_qa/qasc_with_ir

Yapılandırma açıklaması : QASC, cümle kompozisyonuna odaklanan soru yanıtlayan bir veri kümesidir. İlkokul bilimiyle ilgili 8'li çoktan seçmeli sorulardan oluşur ve 17 milyon cümlelik bir külliyatla birlikte gelir. Bu sürüm, ek kanıt olarak bir bilgi alma sistemi aracılığıyla getirilen paragrafları içerir.
İndirme boyutu : 16.95 MiB
Veri kümesi boyutu : 17.30 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'test'`	920
`'train'`	8.134
`'validation'`	926

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{khot2020qasc,
    title={Qasc: A dataset for question answering via sentence composition},
    author={Khot, Tushar and Clark, Peter and Guerquin, Michal and Jansen, Peter and Sabharwal, Ashish},
    booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
    volume={34},
    number={05},
    pages={8082--8090},
    year={2020}
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

unified_qa/qasc_with_ir_test

Yapılandırma açıklaması : QASC, cümle kompozisyonuna odaklanan soru yanıtlayan bir veri kümesidir. İlkokul bilimiyle ilgili 8'li çoktan seçmeli sorulardan oluşur ve 17 milyon cümlelik bir külliyatla birlikte gelir. Bu sürüm, ek kanıt olarak bir bilgi alma sistemi aracılığıyla getirilen paragrafları içerir.
İndirme boyutu : 16.95 MiB
Veri kümesi boyutu : 17.30 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'test'`	920
`'train'`	8.134
`'validation'`	926

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{khot2020qasc,
    title={Qasc: A dataset for question answering via sentence composition},
    author={Khot, Tushar and Clark, Peter and Guerquin, Michal and Jansen, Peter and Sabharwal, Ashish},
    booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
    volume={34},
    number={05},
    pages={8082--8090},
    year={2020}
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

birleşik_qa/quoref

Yapılandırma açıklaması : Bu veri kümesi, okuduğunu anlama sistemlerinin temel muhakeme yeteneğini test eder. Vikipedi'den paragraflar üzerinden sorular içeren bu açıklık-seçim kıyaslamasında, bir sistemin soruları yanıtlamak için paragraflarda uygun aralık(ları) seçmeden önce katı referansları çözmesi gerekir.
İndirme boyutu : 51.43 MiB
Veri kümesi boyutu : 52.29 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'train'`	22.265
`'validation'`	2.768

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{dasigi-etal-2019-quoref,
    title = "{Q}uoref: A Reading Comprehension Dataset with Questions Requiring Coreferential Reasoning",
    author = "Dasigi, Pradeep  and
      Liu, Nelson F.  and
      Marasovi{'c}, Ana  and
      Smith, Noah A.  and
      Gardner, Matt",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)",
    month = nov,
    year = "2019",
    address = "Hong Kong, China",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D19-1606",
    doi = "10.18653/v1/D19-1606",
    pages = "5925--5932",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

birleşik_qa/yarış_dizesi

Yapılandırma açıklaması : Race, büyük ölçekli bir okuduğunu anlama veri kümesidir. Veri seti, Çin'deki ortaokul ve lise öğrencileri için tasarlanmış İngilizce sınavlarından toplanmıştır. Veri kümesi, makine kavrayışı için eğitim ve test kümeleri olarak sunulabilir.
İndirme boyutu : 167.97 MiB
Veri kümesi boyutu : 171.23 MiB
Otomatik önbelleğe alınmış ( belgeler ): Evet (test, doğrulama), Yalnızca shuffle_files=False (tren) olduğunda
bölmeler :

Bölmek	örnekler
`'test'`	4.934
`'train'`	87.863
`'validation'`	4.887

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{lai-etal-2017-race,
    title = "{RACE}: Large-scale {R}e{A}ding Comprehension Dataset From Examinations",
    author = "Lai, Guokun  and
      Xie, Qizhe  and
      Liu, Hanxiao  and
      Yang, Yiming  and
      Hovy, Eduard",
    booktitle = "Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing",
    month = sep,
    year = "2017",
    address = "Copenhagen, Denmark",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D17-1082",
    doi = "10.18653/v1/D17-1082",
    pages = "785--794",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

unified_qa/race_string_dev

Yapılandırma açıklaması : Race, büyük ölçekli bir okuduğunu anlama veri kümesidir. Veri seti, Çin'deki ortaokul ve lise öğrencileri için tasarlanmış İngilizce sınavlarından toplanmıştır. Veri kümesi, makine kavrayışı için eğitim ve test kümeleri olarak sunulabilir.
İndirme boyutu : 167.97 MiB
Veri kümesi boyutu : 171.23 MiB
Otomatik önbelleğe alınmış ( belgeler ): Evet (test, doğrulama), Yalnızca shuffle_files=False (tren) olduğunda
bölmeler :

Bölmek	örnekler
`'test'`	4.934
`'train'`	87.863
`'validation'`	4.887

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{lai-etal-2017-race,
    title = "{RACE}: Large-scale {R}e{A}ding Comprehension Dataset From Examinations",
    author = "Lai, Guokun  and
      Xie, Qizhe  and
      Liu, Hanxiao  and
      Yang, Yiming  and
      Hovy, Eduard",
    booktitle = "Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing",
    month = sep,
    year = "2017",
    address = "Copenhagen, Denmark",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D17-1082",
    doi = "10.18653/v1/D17-1082",
    pages = "785--794",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

birleşik_qa/ipler

Yapılandırma açıklaması : Bu veri kümesi, bir sistemin bilgiyi bir metin geçişinden yeni bir duruma uygulama becerisini test eder. Bir sistem, nedensel veya niteliksel ilişki(ler) içeren bir arka plan pasajı (örneğin, "hayvan tozlayıcıları çiçeklerde döllenmenin etkinliğini artırır"), bu arka planı kullanan yeni bir durum ve ilişkilerin etkileri hakkında akıl yürütmeyi gerektiren sorular sunulur. durum bağlamında arka plan pasajı.
İndirme boyutu : 12.91 MiB
Veri kümesi boyutu : 13.35 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'train'`	10.924
`'validation'`	1.688

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{lin-etal-2019-reasoning,
    title = "Reasoning Over Paragraph Effects in Situations",
    author = "Lin, Kevin  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Gardner, Matt",
    booktitle = "Proceedings of the 2nd Workshop on Machine Reading for Question Answering",
    month = nov,
    year = "2019",
    address = "Hong Kong, China",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D19-5808",
    doi = "10.18653/v1/D19-5808",
    pages = "58--62",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

birleşik_qa/social_iqa

Yapılandırma açıklaması : Bu, sosyal durumlar hakkında sağduyulu muhakeme için geniş ölçekli bir kıyaslamadır. Social IQa, çeşitli günlük durumlarda duygusal ve sosyal zekayı araştırmak için çoktan seçmeli sorular içerir. Kitle kaynak kullanımı yoluyla, sosyal etkileşimlerle ilgili doğru ve yanlış yanıtların yanı sıra sağduyulu sorular toplanır ve işçilerden farklı ama ilgili bir soruya doğru yanıtı vermeleri istenerek yanlış yanıtlardaki biçimsel bozuklukları azaltan yeni bir çerçeve kullanılır.
İndirme boyutu : 7.08 MiB
Veri kümesi boyutu : 8.22 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'train'`	33.410
`'validation'`	1.954

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{sap-etal-2019-social,
    title = "Social {IQ}a: Commonsense Reasoning about Social Interactions",
    author = "Sap, Maarten  and
      Rashkin, Hannah  and
      Chen, Derek  and
      Le Bras, Ronan  and
      Choi, Yejin",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)",
    month = nov,
    year = "2019",
    address = "Hong Kong, China",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D19-1454",
    doi = "10.18653/v1/D19-1454",
    pages = "4463--4473",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

birleşik_qa/squad1_1

Yapılandırma açıklaması : Bu, kalabalık çalışanlar tarafından bir dizi Wikipedia makalesinde sorulan sorulardan oluşan bir okuduğunu anlama veri kümesidir ve her sorunun yanıtı, ilgili okuma pasajından bir metin bölümüdür.
İndirme boyutu : 80.62 MiB
Veri kümesi boyutu : 83.99 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'train'`	87.514
`'validation'`	10.570

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{rajpurkar-etal-2016-squad,
    title = "{SQ}u{AD}: 100,000+ Questions for Machine Comprehension of Text",
    author = "Rajpurkar, Pranav  and
      Zhang, Jian  and
      Lopyrev, Konstantin  and
      Liang, Percy",
    booktitle = "Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing",
    month = nov,
    year = "2016",
    address = "Austin, Texas",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D16-1264",
    doi = "10.18653/v1/D16-1264",
    pages = "2383--2392",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

birleşik_qa/squad2

Yapılandırma açıklaması : Bu veri kümesi, orijinal Stanford Soru Yanıt Veri Kümesi (SQuAD) veri kümesini, yanıtlanabilir sorulara benzer görünmek için kalabalık çalışanlar tarafından düşmanca yazılmış yanıtlanamaz sorularla birleştirir.
İndirme boyutu : 116.56 MiB
Veri kümesi boyutu : 121.43 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'train'`	130.149
`'validation'`	11.873

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{rajpurkar-etal-2018-know,
    title = "Know What You Don{'}t Know: Unanswerable Questions for {SQ}u{AD}",
    author = "Rajpurkar, Pranav  and
      Jia, Robin  and
      Liang, Percy",
    booktitle = "Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)",
    month = jul,
    year = "2018",
    address = "Melbourne, Australia",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/P18-2124",
    doi = "10.18653/v1/P18-2124",
    pages = "784--789",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

birleşik_qa/winogrande_l

Yapılandırma açıklaması : Bu veri kümesi, orijinal Winograd Schema Challenge tasarımından esinlenmiştir, ancak veri kümesinin hem ölçeğini hem de sertliğini iyileştirmek için ayarlanmıştır. Veri kümesi oluşturmanın temel adımları, (1) dikkatlice tasarlanmış bir kitle kaynak kullanımı prosedüründen ve ardından (2) insan tarafından algılanabilen kelime ilişkilendirmelerini makine tarafından algılanabilen gömme ilişkilendirmelerine genelleştiren yeni bir AfLite algoritması kullanılarak sistematik önyargı azaltmadan oluşur. Farklı boyutlarda eğitim setleri verilmektedir. Bu set l bedene karşılık gelir.
İndirme boyutu : 1.49 MiB
Veri kümesi boyutu : 1.83 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'train'`	10.234
`'validation'`	1.267

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{sakaguchi2020winogrande,
  title={Winogrande: An adversarial winograd schema challenge at scale},
  author={Sakaguchi, Keisuke and Le Bras, Ronan and Bhagavatula, Chandra and Choi, Yejin},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={34},
  number={05},
  pages={8732--8740},
  year={2020}
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

birleşik_qa/winogrande_m

Yapılandırma açıklaması : Bu veri kümesi, orijinal Winograd Schema Challenge tasarımından esinlenmiştir, ancak veri kümesinin hem ölçeğini hem de sertliğini iyileştirmek için ayarlanmıştır. Veri kümesi oluşturmanın temel adımları, (1) dikkatlice tasarlanmış bir kitle kaynak kullanımı prosedüründen ve ardından (2) insan tarafından algılanabilen kelime ilişkilendirmelerini makine tarafından algılanabilen gömme ilişkilendirmelerine genelleştiren yeni bir AfLite algoritması kullanılarak sistematik önyargı azaltmadan oluşur. Farklı boyutlarda eğitim setleri verilmektedir. Bu set m bedene karşılık gelir.
İndirme boyutu : 507.46 KiB
Veri kümesi boyutu : 623.15 KiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'train'`	2.558
`'validation'`	1.267

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{sakaguchi2020winogrande,
  title={Winogrande: An adversarial winograd schema challenge at scale},
  author={Sakaguchi, Keisuke and Le Bras, Ronan and Bhagavatula, Chandra and Choi, Yejin},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={34},
  number={05},
  pages={8732--8740},
  year={2020}
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

birleşik_qa/winogrande_s

Yapılandırma açıklaması : Bu veri kümesi, orijinal Winograd Schema Challenge tasarımından esinlenmiştir, ancak veri kümesinin hem ölçeğini hem de sertliğini iyileştirmek için ayarlanmıştır. Veri kümesi oluşturmanın temel adımları, (1) dikkatlice tasarlanmış bir kitle kaynak kullanımı prosedüründen ve ardından (2) insan tarafından algılanabilen kelime ilişkilendirmelerini makine tarafından algılanabilen gömme ilişkilendirmelerine genelleştiren yeni bir AfLite algoritması kullanılarak sistematik önyargı azaltmadan oluşur. Farklı boyutlarda eğitim setleri verilmektedir. Bu set s bedene karşılık gelir.
İndirme boyutu : 479.24 KiB
Veri kümesi boyutu : 590.47 KiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'test'`	1.767
`'train'`	640
`'validation'`	1.267

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{sakaguchi2020winogrande,
  title={Winogrande: An adversarial winograd schema challenge at scale},
  author={Sakaguchi, Keisuke and Le Bras, Ronan and Bhagavatula, Chandra and Choi, Yejin},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={34},
  number={05},
  pages={8732--8740},
  year={2020}
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

Açıklama :

Daha fazla bilgi şu adreste bulunabilir: https://github.com/allenai/unifiedqa

Ana Sayfa : https://github.com/allenai/unifiedqa
Kaynak kodu : tfds.text.unifiedqa.UnifiedQA
sürümler :
- 1.0.0 (varsayılan): İlk sürüm.
Özellik yapısı :

FeaturesDict({
    'input': string,
    'output': string,
})

Özellik belgeleri :

Özellik	Sınıf	Dtipi
	ÖzelliklerDict
giriş	tensör	sicim
çıktı	tensör	sicim

Denetlenen anahtarlar (Bkz as_supervised doc ): None
Şekil ( tfds.show_examples ): Desteklenmiyor.

unified_qa/ai2_science_elementary (varsayılan yapılandırma)

Yapılandırma açıklaması : AI2 Science Questions veri kümesi, Amerika Birleşik Devletleri'nde ilkokul ve ortaokul sınıf seviyelerinde öğrenci değerlendirmelerinde kullanılan sorulardan oluşur. Her soru 4'lü çoktan seçmeli formattadır ve bir diyagram unsuru içerebilir veya içermeyebilir. Bu set ilkokul sınıf seviyeleri için kullanılan sorulardan oluşmaktadır.
İndirme boyutu : 345.59 KiB
Veri kümesi boyutu : 390.02 KiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'test'`	542
`'train'`	623
`'validation'`	123

Örnekler ( tfds.as_dataframe ):

Alıntı :

http://data.allenai.org/ai2-science-questions

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

birleşik_qa/ai2_science_middle

Yapılandırma açıklaması : AI2 Science Questions veri kümesi, Amerika Birleşik Devletleri'nde ilkokul ve ortaokul sınıf seviyelerinde öğrenci değerlendirmelerinde kullanılan sorulardan oluşur. Her soru 4'lü çoktan seçmeli formattadır ve bir diyagram unsuru içerebilir veya içermeyebilir. Bu set ortaokul sınıf seviyeleri için kullanılan sorulardan oluşmaktadır.
İndirme boyutu : 428.41 KiB
Veri kümesi boyutu : 477.40 KiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'test'`	679
`'train'`	605
`'validation'`	125

Örnekler ( tfds.as_dataframe ):

Alıntı :

http://data.allenai.org/ai2-science-questions

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

birleşik_qa/ambigqa

Yapılandırma açıklaması : AmbigQA, her makul yanıtı bulmayı ve ardından belirsizliği çözmek için soruyu her biri için yeniden yazmayı içeren açık alanlı bir soru yanıtlama görevidir.
İndirme boyutu : 2.27 MiB
Veri kümesi boyutu : 3.04 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'train'`	19.806
`'validation'`	5.674

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{min-etal-2020-ambigqa,
    title = "{A}mbig{QA}: Answering Ambiguous Open-domain Questions",
    author = "Min, Sewon  and
      Michael, Julian  and
      Hajishirzi, Hannaneh  and
      Zettlemoyer, Luke",
    booktitle = "Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.emnlp-main.466",
    doi = "10.18653/v1/2020.emnlp-main.466",
    pages = "5783--5797",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

birleşik_qa/arc_easy

Yapılandırma açıklaması : Bu veri kümesi, gelişmiş soru yanıtlama alanında araştırmayı teşvik etmek için bir araya getirilmiş, gerçek ilkokul düzeyinde, çoktan seçmeli bilim sorularından oluşur. Veri kümesi, bir Zorluk Kümesi ve bir Kolay Küme olarak bölünmüştür; burada ilki, yalnızca hem alma tabanlı bir algoritma hem de bir kelime birlikte oluşum algoritması tarafından yanlış yanıtlanan soruları içerir. Bu set "kolay" sorulardan oluşmaktadır.
İndirme boyutu : 1.24 MiB
Veri kümesi boyutu : 1.42 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'test'`	2.376
`'train'`	2.251
`'validation'`	570

Örnekler ( tfds.as_dataframe ):

Alıntı :

@article{clark2018think,
    title={Think you have solved question answering? try arc, the ai2 reasoning challenge},
    author={Clark, Peter and Cowhey, Isaac and Etzioni, Oren and Khot, Tushar and Sabharwal, Ashish and Schoenick, Carissa and Tafjord, Oyvind},
    journal={arXiv preprint arXiv:1803.05457},
    year={2018}
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

birleşik_qa/arc_easy_dev

Yapılandırma açıklaması : Bu veri kümesi, gelişmiş soru yanıtlama alanında araştırmayı teşvik etmek için bir araya getirilmiş, gerçek ilkokul düzeyinde, çoktan seçmeli bilim sorularından oluşur. Veri kümesi, bir Zorluk Kümesi ve bir Kolay Küme olarak bölünmüştür; burada ilki, yalnızca hem alma tabanlı bir algoritma hem de bir kelime birlikte oluşum algoritması tarafından yanlış yanıtlanan soruları içerir. Bu set "kolay" sorulardan oluşmaktadır.
İndirme boyutu : 1.24 MiB
Veri kümesi boyutu : 1.42 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'test'`	2.376
`'train'`	2.251
`'validation'`	570

Örnekler ( tfds.as_dataframe ):

Alıntı :

@article{clark2018think,
    title={Think you have solved question answering? try arc, the ai2 reasoning challenge},
    author={Clark, Peter and Cowhey, Isaac and Etzioni, Oren and Khot, Tushar and Sabharwal, Ashish and Schoenick, Carissa and Tafjord, Oyvind},
    journal={arXiv preprint arXiv:1803.05457},
    year={2018}
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

unified_qa/arc_easy_with_ir

Yapılandırma açıklaması : Bu veri kümesi, gelişmiş soru yanıtlama alanında araştırmayı teşvik etmek için bir araya getirilmiş, gerçek ilkokul düzeyinde, çoktan seçmeli bilim sorularından oluşur. Veri kümesi, bir Zorluk Kümesi ve bir Kolay Küme olarak bölünmüştür; burada ilki, yalnızca hem alma tabanlı bir algoritma hem de bir kelime birlikte oluşum algoritması tarafından yanlış yanıtlanan soruları içerir. Bu set "kolay" sorulardan oluşmaktadır. Bu sürüm, ek kanıt olarak bir bilgi alma sistemi aracılığıyla getirilen paragrafları içerir.
İndirme boyutu : 7.00 MiB
Veri kümesi boyutu : 7.17 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'test'`	2.376
`'train'`	2.251
`'validation'`	570

Örnekler ( tfds.as_dataframe ):

Alıntı :

@article{clark2018think,
    title={Think you have solved question answering? try arc, the ai2 reasoning challenge},
    author={Clark, Peter and Cowhey, Isaac and Etzioni, Oren and Khot, Tushar and Sabharwal, Ashish and Schoenick, Carissa and Tafjord, Oyvind},
    journal={arXiv preprint arXiv:1803.05457},
    year={2018}
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

unified_qa/arc_easy_with_ir_dev

Yapılandırma açıklaması : Bu veri kümesi, gelişmiş soru yanıtlama alanında araştırmayı teşvik etmek için bir araya getirilmiş, gerçek ilkokul düzeyinde, çoktan seçmeli bilim sorularından oluşur. Veri kümesi, bir Zorluk Kümesi ve bir Kolay Küme olarak bölünmüştür; burada ilki, yalnızca hem alma tabanlı bir algoritma hem de bir kelime birlikte oluşum algoritması tarafından yanlış yanıtlanan soruları içerir. Bu set "kolay" sorulardan oluşmaktadır. Bu sürüm, ek kanıt olarak bir bilgi alma sistemi aracılığıyla getirilen paragrafları içerir.
İndirme boyutu : 7.00 MiB
Veri kümesi boyutu : 7.17 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'test'`	2.376
`'train'`	2.251
`'validation'`	570

Örnekler ( tfds.as_dataframe ):

Alıntı :

@article{clark2018think,
    title={Think you have solved question answering? try arc, the ai2 reasoning challenge},
    author={Clark, Peter and Cowhey, Isaac and Etzioni, Oren and Khot, Tushar and Sabharwal, Ashish and Schoenick, Carissa and Tafjord, Oyvind},
    journal={arXiv preprint arXiv:1803.05457},
    year={2018}
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

birleşik_qa/arc_hard

Yapılandırma açıklaması : Bu veri kümesi, gelişmiş soru yanıtlama alanında araştırmayı teşvik etmek için bir araya getirilmiş, gerçek ilkokul düzeyinde, çoktan seçmeli bilim sorularından oluşur. Veri kümesi, bir Zorluk Kümesi ve bir Kolay Küme olarak bölünmüştür; burada ilki, yalnızca hem alma tabanlı bir algoritma hem de bir kelime birlikte oluşum algoritması tarafından yanlış yanıtlanan soruları içerir. Bu set "zor" sorulardan oluşmaktadır.
İndirme boyutu : 758.03 KiB
Veri kümesi boyutu : 848.28 KiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'test'`	1.172
`'train'`	1.119
`'validation'`	299

Örnekler ( tfds.as_dataframe ):

Alıntı :

@article{clark2018think,
    title={Think you have solved question answering? try arc, the ai2 reasoning challenge},
    author={Clark, Peter and Cowhey, Isaac and Etzioni, Oren and Khot, Tushar and Sabharwal, Ashish and Schoenick, Carissa and Tafjord, Oyvind},
    journal={arXiv preprint arXiv:1803.05457},
    year={2018}
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

birleşik_qa/arc_hard_dev

Yapılandırma açıklaması : Bu veri kümesi, gelişmiş soru yanıtlama alanında araştırmayı teşvik etmek için bir araya getirilmiş, gerçek ilkokul düzeyinde, çoktan seçmeli bilim sorularından oluşur. Veri kümesi, bir Zorluk Kümesi ve bir Kolay Küme olarak bölünmüştür; burada ilki, yalnızca hem alma tabanlı bir algoritma hem de bir kelime birlikte oluşum algoritması tarafından yanlış yanıtlanan soruları içerir. Bu set "zor" sorulardan oluşmaktadır.
İndirme boyutu : 758.03 KiB
Veri kümesi boyutu : 848.28 KiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'test'`	1.172
`'train'`	1.119
`'validation'`	299

Örnekler ( tfds.as_dataframe ):

Alıntı :

@article{clark2018think,
    title={Think you have solved question answering? try arc, the ai2 reasoning challenge},
    author={Clark, Peter and Cowhey, Isaac and Etzioni, Oren and Khot, Tushar and Sabharwal, Ashish and Schoenick, Carissa and Tafjord, Oyvind},
    journal={arXiv preprint arXiv:1803.05457},
    year={2018}
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

birleşik_qa/arc_hard_with_ir

Yapılandırma açıklaması : Bu veri kümesi, gelişmiş soru yanıtlama alanında araştırmayı teşvik etmek için bir araya getirilmiş, gerçek ilkokul düzeyinde, çoktan seçmeli bilim sorularından oluşur. Veri kümesi, bir Zorluk Kümesi ve bir Kolay Küme olarak bölünmüştür; burada ilki, yalnızca hem alma tabanlı bir algoritma hem de bir kelime birlikte oluşum algoritması tarafından yanlış yanıtlanan soruları içerir. Bu set "zor" sorulardan oluşmaktadır. Bu sürüm, ek kanıt olarak bir bilgi alma sistemi aracılığıyla getirilen paragrafları içerir.
İndirme boyutu : 3.53 MiB
Veri kümesi boyutu : 3.62 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'test'`	1.172
`'train'`	1.119
`'validation'`	299

Örnekler ( tfds.as_dataframe ):

Alıntı :

@article{clark2018think,
    title={Think you have solved question answering? try arc, the ai2 reasoning challenge},
    author={Clark, Peter and Cowhey, Isaac and Etzioni, Oren and Khot, Tushar and Sabharwal, Ashish and Schoenick, Carissa and Tafjord, Oyvind},
    journal={arXiv preprint arXiv:1803.05457},
    year={2018}
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

birleşik_qa/arc_hard_with_ir_dev

Yapılandırma açıklaması : Bu veri kümesi, gelişmiş soru yanıtlama alanında araştırmayı teşvik etmek için bir araya getirilmiş, gerçek ilkokul düzeyinde, çoktan seçmeli bilim sorularından oluşur. Veri kümesi, bir Zorluk Kümesi ve bir Kolay Küme olarak bölünmüştür; burada ilki, yalnızca hem alma tabanlı bir algoritma hem de bir kelime birlikte oluşum algoritması tarafından yanlış yanıtlanan soruları içerir. Bu set "zor" sorulardan oluşmaktadır. Bu sürüm, ek kanıt olarak bir bilgi alma sistemi aracılığıyla getirilen paragrafları içerir.
İndirme boyutu : 3.53 MiB
Veri kümesi boyutu : 3.62 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'test'`	1.172
`'train'`	1.119
`'validation'`	299

Örnekler ( tfds.as_dataframe ):

Alıntı :

@article{clark2018think,
    title={Think you have solved question answering? try arc, the ai2 reasoning challenge},
    author={Clark, Peter and Cowhey, Isaac and Etzioni, Oren and Khot, Tushar and Sabharwal, Ashish and Schoenick, Carissa and Tafjord, Oyvind},
    journal={arXiv preprint arXiv:1803.05457},
    year={2018}
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

birleşik_qa/boolq

Yapılandırma açıklaması : BoolQ, evet/hayır soruları için veri kümesini yanıtlayan bir sorudur. Bu sorular doğal olarak ortaya çıkıyor --- sorulmamış ve kısıtlanmamış ortamlarda üretiliyorlar. Her örnek, isteğe bağlı ek bağlam olarak sayfanın başlığıyla birlikte (soru, pasaj, cevap) üçlüsüdür. Metin çifti sınıflandırma kurulumu, mevcut doğal dil çıkarım görevlerine benzer.
İndirme boyutu : 7.77 MiB
Veri kümesi boyutu : 8.20 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'train'`	9.427
`'validation'`	3.270

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{clark-etal-2019-boolq,
    title = "{B}ool{Q}: Exploring the Surprising Difficulty of Natural Yes/No Questions",
    author = "Clark, Christopher  and
      Lee, Kenton  and
      Chang, Ming-Wei  and
      Kwiatkowski, Tom  and
      Collins, Michael  and
      Toutanova, Kristina",
    booktitle = "Proceedings of the 2019 Conference of the North {A}merican Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)",
    month = jun,
    year = "2019",
    address = "Minneapolis, Minnesota",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/N19-1300",
    doi = "10.18653/v1/N19-1300",
    pages = "2924--2936",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

birleşik_qa/boolq_np

Yapılandırma açıklaması : BoolQ, evet/hayır soruları için veri kümesini yanıtlayan bir sorudur. Bu sorular doğal olarak ortaya çıkıyor --- sorulmamış ve kısıtlanmamış ortamlarda üretiliyorlar. Her örnek, isteğe bağlı ek bağlam olarak sayfanın başlığıyla birlikte (soru, pasaj, cevap) üçlüsüdür. Metin çifti sınıflandırma kurulumu, mevcut doğal dil çıkarım görevlerine benzer. Bu sürüm, orijinal sürüme doğal bozulmalar ekler.
İndirme boyutu : 10.80 MiB
Veri kümesi boyutu : 11.40 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'train'`	9.727
`'validation'`	7.596

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{khashabi-etal-2020-bang,
    title = "More Bang for Your Buck: Natural Perturbation for Robust Question Answering",
    author = "Khashabi, Daniel  and
      Khot, Tushar  and
      Sabharwal, Ashish",
    booktitle = "Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.emnlp-main.12",
    doi = "10.18653/v1/2020.emnlp-main.12",
    pages = "163--170",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

birleşik_qa/commonsenseqa

Config description : CommonsenseQA is a new multiple-choice question answering dataset that requires different types of commonsense knowledge to predict the correct answers . It contains questions with one correct answer and four distractor answers.
Download size : 1.79 MiB
Dataset size : 2.19 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'test'`	1,140
`'train'`	9,741
`'validation'`	1,221

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{talmor-etal-2019-commonsenseqa,
    title = "{C}ommonsense{QA}: A Question Answering Challenge Targeting Commonsense Knowledge",
    author = "Talmor, Alon  and
      Herzig, Jonathan  and
      Lourie, Nicholas  and
      Berant, Jonathan",
    booktitle = "Proceedings of the 2019 Conference of the North {A}merican Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)",
    month = jun,
    year = "2019",
    address = "Minneapolis, Minnesota",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/N19-1421",
    doi = "10.18653/v1/N19-1421",
    pages = "4149--4158",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

unified_qa/commonsenseqa_test

Config description : CommonsenseQA is a new multiple-choice question answering dataset that requires different types of commonsense knowledge to predict the correct answers . It contains questions with one correct answer and four distractor answers.
Download size : 1.79 MiB
Dataset size : 2.19 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'test'`	1,140
`'train'`	9,741
`'validation'`	1,221

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{talmor-etal-2019-commonsenseqa,
    title = "{C}ommonsense{QA}: A Question Answering Challenge Targeting Commonsense Knowledge",
    author = "Talmor, Alon  and
      Herzig, Jonathan  and
      Lourie, Nicholas  and
      Berant, Jonathan",
    booktitle = "Proceedings of the 2019 Conference of the North {A}merican Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)",
    month = jun,
    year = "2019",
    address = "Minneapolis, Minnesota",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/N19-1421",
    doi = "10.18653/v1/N19-1421",
    pages = "4149--4158",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

unified_qa/contrast_sets_boolq

Config description : BoolQ is a question answering dataset for yes/no questions. These questions are naturally occurring ---they are generated in unprompted and unconstrained settings. Each example is a triplet of (question, passage, answer), with the title of the page as optional additional context. The text-pair classification setup is similar to existing natural language inference tasks. This version uses contrast sets. These evaluation sets are expert-generated perturbations that deviate from the patterns common in the original dataset.
Download size : 438.51 KiB
Dataset size : 462.35 KiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'train'`	340
`'validation'`	340

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{clark-etal-2019-boolq,
    title = "{B}ool{Q}: Exploring the Surprising Difficulty of Natural Yes/No Questions",
    author = "Clark, Christopher  and
      Lee, Kenton  and
      Chang, Ming-Wei  and
      Kwiatkowski, Tom  and
      Collins, Michael  and
      Toutanova, Kristina",
    booktitle = "Proceedings of the 2019 Conference of the North {A}merican Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)",
    month = jun,
    year = "2019",
    address = "Minneapolis, Minnesota",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/N19-1300",
    doi = "10.18653/v1/N19-1300",
    pages = "2924--2936",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

unified_qa/contrast_sets_drop

Config description : DROP is a crowdsourced, adversarially-created QA benchmark, in which a system must resolve references in a question, perhaps to multiple input positions, and perform discrete operations over them (such as addition, counting, or sorting). These operations require a much more comprehensive understanding of the content of paragraphs than what was necessary for prior datasets. This version uses contrast sets. These evaluation sets are expert-generated perturbations that deviate from the patterns common in the original dataset.
Download size : 2.20 MiB
Dataset size : 2.26 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'train'`	947
`'validation'`	947

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{dua-etal-2019-drop,
    title = "{DROP}: A Reading Comprehension Benchmark Requiring Discrete Reasoning Over Paragraphs",
    author = "Dua, Dheeru  and
      Wang, Yizhong  and
      Dasigi, Pradeep  and
      Stanovsky, Gabriel  and
      Singh, Sameer  and
      Gardner, Matt",
    booktitle = "Proceedings of the 2019 Conference of the North {A}merican Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)",
    month = jun,
    year = "2019",
    address = "Minneapolis, Minnesota",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/N19-1246",
    doi = "10.18653/v1/N19-1246",
    pages = "2368--2378",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

unified_qa/contrast_sets_quoref

Config description : This dataset tests the coreferential reasoning capability of reading comprehension systems. In this span-selection benchmark containing questions over paragraphs from Wikipedia, a system must resolve hard coreferences before selecting the appropriate span(s) in the paragraphs for answering questions. This version uses contrast sets. These evaluation sets are expert-generated perturbations that deviate from the patterns common in the original dataset.
Download size : 2.60 MiB
Dataset size : 2.65 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'train'`	700
`'validation'`	700

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{dasigi-etal-2019-quoref,
    title = "{Q}uoref: A Reading Comprehension Dataset with Questions Requiring Coreferential Reasoning",
    author = "Dasigi, Pradeep  and
      Liu, Nelson F.  and
      Marasovi{'c}, Ana  and
      Smith, Noah A.  and
      Gardner, Matt",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)",
    month = nov,
    year = "2019",
    address = "Hong Kong, China",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D19-1606",
    doi = "10.18653/v1/D19-1606",
    pages = "5925--5932",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

unified_qa/contrast_sets_ropes

Config description : This dataset tests a system's ability to apply knowledge from a passage of text to a new situation. A system is presented a background passage containing a causal or qualitative relation(s) (eg, "animal pollinators increase efficiency of fertilization in flowers"), a novel situation that uses this background, and questions that require reasoning about effects of the relationships in the background passage in the context of the situation. This version uses contrast sets. These evaluation sets are expert-generated perturbations that deviate from the patterns common in the original dataset.
Download size : 1.97 MiB
Dataset size : 2.04 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'train'`	974
`'validation'`	974

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{lin-etal-2019-reasoning,
    title = "Reasoning Over Paragraph Effects in Situations",
    author = "Lin, Kevin  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Gardner, Matt",
    booktitle = "Proceedings of the 2nd Workshop on Machine Reading for Question Answering",
    month = nov,
    year = "2019",
    address = "Hong Kong, China",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D19-5808",
    doi = "10.18653/v1/D19-5808",
    pages = "58--62",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

unified_qa/drop

Config description : DROP is a crowdsourced, adversarially-created QA benchmark, in which a system must resolve references in a question, perhaps to multiple input positions, and perform discrete operations over them (such as addition, counting, or sorting). These operations require a much more comprehensive understanding of the content of paragraphs than what was necessary for prior datasets.
Download size : 105.18 MiB
Dataset size : 108.16 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'train'`	77,399
`'validation'`	9,536

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{dua-etal-2019-drop,
    title = "{DROP}: A Reading Comprehension Benchmark Requiring Discrete Reasoning Over Paragraphs",
    author = "Dua, Dheeru  and
      Wang, Yizhong  and
      Dasigi, Pradeep  and
      Stanovsky, Gabriel  and
      Singh, Sameer  and
      Gardner, Matt",
    booktitle = "Proceedings of the 2019 Conference of the North {A}merican Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)",
    month = jun,
    year = "2019",
    address = "Minneapolis, Minnesota",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/N19-1246",
    doi = "10.18653/v1/N19-1246",
    pages = "2368--2378",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

unified_qa/mctest

Config description : MCTest requires machines to answer multiple-choice reading comprehension questions about fictional stories, directly tackling the high-level goal of open-domain machine comprehension. Reading comprehension can test advanced abilities such as causal reasoning and understanding the world, yet, by being multiple-choice, still provide a clear metric. By being fictional, the answer typically can be found only in the story itself. The stories and questions are also carefully limited to those a young child would understand, reducing the world knowledge that is required for the task.
Download size : 2.14 MiB
Dataset size : 2.20 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'train'`	1,480
`'validation'`	320

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{richardson-etal-2013-mctest,
    title = "{MCT}est: A Challenge Dataset for the Open-Domain Machine Comprehension of Text",
    author = "Richardson, Matthew  and
      Burges, Christopher J.C.  and
      Renshaw, Erin",
    booktitle = "Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing",
    month = oct,
    year = "2013",
    address = "Seattle, Washington, USA",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D13-1020",
    pages = "193--203",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

unified_qa/mctest_corrected_the_separator

Config description : MCTest requires machines to answer multiple-choice reading comprehension questions about fictional stories, directly tackling the high-level goal of open-domain machine comprehension. Reading comprehension can test advanced abilities such as causal reasoning and understanding the world, yet, by being multiple-choice, still provide a clear metric. By being fictional, the answer typically can be found only in the story itself. The stories and questions are also carefully limited to those a young child would understand, reducing the world knowledge that is required for the task.
Download size : 2.15 MiB
Dataset size : 2.21 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'train'`	1,480
`'validation'`	320

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{richardson-etal-2013-mctest,
    title = "{MCT}est: A Challenge Dataset for the Open-Domain Machine Comprehension of Text",
    author = "Richardson, Matthew  and
      Burges, Christopher J.C.  and
      Renshaw, Erin",
    booktitle = "Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing",
    month = oct,
    year = "2013",
    address = "Seattle, Washington, USA",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D13-1020",
    pages = "193--203",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

unified_qa/multirc

Config description : MultiRC is a reading comprehension challenge in which questions can only be answered by taking into account information from multiple sentences. Questions and answers for this challenge were solicited and verified through a 4-step crowdsourcing experiment. The dataset contains questions for paragraphs across 7 different domains ( elementary school science, news, travel guides, fiction stories, etc) bringing in linguistic diversity to the texts and to the questions wordings.
Download size : 897.09 KiB
Dataset size : 918.42 KiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'train'`	312
`'validation'`	312

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{khashabi-etal-2018-looking,
    title = "Looking Beyond the Surface: A Challenge Set for Reading Comprehension over Multiple Sentences",
    author = "Khashabi, Daniel  and
      Chaturvedi, Snigdha  and
      Roth, Michael  and
      Upadhyay, Shyam  and
      Roth, Dan",
    booktitle = "Proceedings of the 2018 Conference of the North {A}merican Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)",
    month = jun,
    year = "2018",
    address = "New Orleans, Louisiana",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/N18-1023",
    doi = "10.18653/v1/N18-1023",
    pages = "252--262",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

unified_qa/narrativeqa

Config description : NarrativeQA is an English-lanaguage dataset of stories and corresponding questions designed to test reading comprehension, especially on long documents.
Download size : 308.28 MiB
Dataset size : 311.22 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Hayır
bölmeler :

Bölmek	örnekler
`'test'`	21,114
`'train'`	65,494
`'validation'`	6,922

Örnekler ( tfds.as_dataframe ):

Alıntı :

@article{kocisky-etal-2018-narrativeqa,
    title = "The {N}arrative{QA} Reading Comprehension Challenge",
    author = "Ko{
{c} }isk{'y}, Tom{'a}{
{s} }  and
      Schwarz, Jonathan  and
      Blunsom, Phil  and
      Dyer, Chris  and
      Hermann, Karl Moritz  and
      Melis, G{'a}bor  and
      Grefenstette, Edward",
    journal = "Transactions of the Association for Computational Linguistics",
    volume = "6",
    year = "2018",
    address = "Cambridge, MA",
    publisher = "MIT Press",
    url = "https://aclanthology.org/Q18-1023",
    doi = "10.1162/tacl_a_00023",
    pages = "317--328",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

unified_qa/narrativeqa_dev

Config description : NarrativeQA is an English-lanaguage dataset of stories and corresponding questions designed to test reading comprehension, especially on long documents.
Download size : 308.28 MiB
Dataset size : 311.22 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Hayır
bölmeler :

Bölmek	örnekler
`'test'`	21,114
`'train'`	65,494
`'validation'`	6,922

Örnekler ( tfds.as_dataframe ):

Alıntı :

@article{kocisky-etal-2018-narrativeqa,
    title = "The {N}arrative{QA} Reading Comprehension Challenge",
    author = "Ko{
{c} }isk{'y}, Tom{'a}{
{s} }  and
      Schwarz, Jonathan  and
      Blunsom, Phil  and
      Dyer, Chris  and
      Hermann, Karl Moritz  and
      Melis, G{'a}bor  and
      Grefenstette, Edward",
    journal = "Transactions of the Association for Computational Linguistics",
    volume = "6",
    year = "2018",
    address = "Cambridge, MA",
    publisher = "MIT Press",
    url = "https://aclanthology.org/Q18-1023",
    doi = "10.1162/tacl_a_00023",
    pages = "317--328",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

unified_qa/natural_questions

Config description : The NQ corpus contains questions from real users, and it requires QA systems to read and comprehend an entire Wikipedia article that may or may not contain the answer to the question. The inclusion of real user questions, and the requirement that solutions should read an entire page to find the answer, cause NQ to be a more realistic and challenging task than prior QA datasets.
Download size : 6.95 MiB
Dataset size : 9.88 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'train'`	96,075
`'validation'`	2,295

Örnekler ( tfds.as_dataframe ):

Alıntı :

@article{kwiatkowski-etal-2019-natural,
    title = "Natural Questions: A Benchmark for Question Answering Research",
    author = "Kwiatkowski, Tom  and
      Palomaki, Jennimaria  and
      Redfield, Olivia  and
      Collins, Michael  and
      Parikh, Ankur  and
      Alberti, Chris  and
      Epstein, Danielle  and
      Polosukhin, Illia  and
      Devlin, Jacob  and
      Lee, Kenton  and
      Toutanova, Kristina  and
      Jones, Llion  and
      Kelcey, Matthew  and
      Chang, Ming-Wei  and
      Dai, Andrew M.  and
      Uszkoreit, Jakob  and
      Le, Quoc  and
      Petrov, Slav",
    journal = "Transactions of the Association for Computational Linguistics",
    volume = "7",
    year = "2019",
    address = "Cambridge, MA",
    publisher = "MIT Press",
    url = "https://aclanthology.org/Q19-1026",
    doi = "10.1162/tacl_a_00276",
    pages = "452--466",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

unified_qa/natural_questions_direct_ans

Config description : The NQ corpus contains questions from real users, and it requires QA systems to read and comprehend an entire Wikipedia article that may or may not contain the answer to the question. The inclusion of real user questions, and the requirement that solutions should read an entire page to find the answer, cause NQ to be a more realistic and challenging task than prior QA datasets. This version consists of direct-answer questions.
Download size : 6.82 MiB
Dataset size : 10.19 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'test'`	6,468
`'train'`	96,676
`'validation'`	10,693

Örnekler ( tfds.as_dataframe ):

Alıntı :

@article{kwiatkowski-etal-2019-natural,
    title = "Natural Questions: A Benchmark for Question Answering Research",
    author = "Kwiatkowski, Tom  and
      Palomaki, Jennimaria  and
      Redfield, Olivia  and
      Collins, Michael  and
      Parikh, Ankur  and
      Alberti, Chris  and
      Epstein, Danielle  and
      Polosukhin, Illia  and
      Devlin, Jacob  and
      Lee, Kenton  and
      Toutanova, Kristina  and
      Jones, Llion  and
      Kelcey, Matthew  and
      Chang, Ming-Wei  and
      Dai, Andrew M.  and
      Uszkoreit, Jakob  and
      Le, Quoc  and
      Petrov, Slav",
    journal = "Transactions of the Association for Computational Linguistics",
    volume = "7",
    year = "2019",
    address = "Cambridge, MA",
    publisher = "MIT Press",
    url = "https://aclanthology.org/Q19-1026",
    doi = "10.1162/tacl_a_00276",
    pages = "452--466",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

unified_qa/natural_questions_direct_ans_test

Config description : The NQ corpus contains questions from real users, and it requires QA systems to read and comprehend an entire Wikipedia article that may or may not contain the answer to the question. The inclusion of real user questions, and the requirement that solutions should read an entire page to find the answer, cause NQ to be a more realistic and challenging task than prior QA datasets. This version consists of direct-answer questions.
Download size : 6.82 MiB
Dataset size : 10.19 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'test'`	6,468
`'train'`	96,676
`'validation'`	10,693

Örnekler ( tfds.as_dataframe ):

Alıntı :

@article{kwiatkowski-etal-2019-natural,
    title = "Natural Questions: A Benchmark for Question Answering Research",
    author = "Kwiatkowski, Tom  and
      Palomaki, Jennimaria  and
      Redfield, Olivia  and
      Collins, Michael  and
      Parikh, Ankur  and
      Alberti, Chris  and
      Epstein, Danielle  and
      Polosukhin, Illia  and
      Devlin, Jacob  and
      Lee, Kenton  and
      Toutanova, Kristina  and
      Jones, Llion  and
      Kelcey, Matthew  and
      Chang, Ming-Wei  and
      Dai, Andrew M.  and
      Uszkoreit, Jakob  and
      Le, Quoc  and
      Petrov, Slav",
    journal = "Transactions of the Association for Computational Linguistics",
    volume = "7",
    year = "2019",
    address = "Cambridge, MA",
    publisher = "MIT Press",
    url = "https://aclanthology.org/Q19-1026",
    doi = "10.1162/tacl_a_00276",
    pages = "452--466",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

unified_qa/natural_questions_with_dpr_para

Config description : The NQ corpus contains questions from real users, and it requires QA systems to read and comprehend an entire Wikipedia article that may or may not contain the answer to the question. The inclusion of real user questions, and the requirement that solutions should read an entire page to find the answer, cause NQ to be a more realistic and challenging task than prior QA datasets. This version includes additional paragraphs (obtained using the DPR retrieval engine) to augment each question.
Download size : 319.22 MiB
Dataset size : 322.91 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Hayır
bölmeler :

Bölmek	örnekler
`'train'`	96,676
`'validation'`	10,693

Örnekler ( tfds.as_dataframe ):

Alıntı :

@article{kwiatkowski-etal-2019-natural,
    title = "Natural Questions: A Benchmark for Question Answering Research",
    author = "Kwiatkowski, Tom  and
      Palomaki, Jennimaria  and
      Redfield, Olivia  and
      Collins, Michael  and
      Parikh, Ankur  and
      Alberti, Chris  and
      Epstein, Danielle  and
      Polosukhin, Illia  and
      Devlin, Jacob  and
      Lee, Kenton  and
      Toutanova, Kristina  and
      Jones, Llion  and
      Kelcey, Matthew  and
      Chang, Ming-Wei  and
      Dai, Andrew M.  and
      Uszkoreit, Jakob  and
      Le, Quoc  and
      Petrov, Slav",
    journal = "Transactions of the Association for Computational Linguistics",
    volume = "7",
    year = "2019",
    address = "Cambridge, MA",
    publisher = "MIT Press",
    url = "https://aclanthology.org/Q19-1026",
    doi = "10.1162/tacl_a_00276",
    pages = "452--466",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

unified_qa/natural_questions_with_dpr_para_test

Config description : The NQ corpus contains questions from real users, and it requires QA systems to read and comprehend an entire Wikipedia article that may or may not contain the answer to the question. The inclusion of real user questions, and the requirement that solutions should read an entire page to find the answer, cause NQ to be a more realistic and challenging task than prior QA datasets. This version includes additional paragraphs (obtained using the DPR retrieval engine) to augment each question.
Download size : 306.94 MiB
Dataset size : 310.48 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Hayır
bölmeler :

Bölmek	örnekler
`'test'`	6,468
`'train'`	96,676

Örnekler ( tfds.as_dataframe ):

Alıntı :

@article{kwiatkowski-etal-2019-natural,
    title = "Natural Questions: A Benchmark for Question Answering Research",
    author = "Kwiatkowski, Tom  and
      Palomaki, Jennimaria  and
      Redfield, Olivia  and
      Collins, Michael  and
      Parikh, Ankur  and
      Alberti, Chris  and
      Epstein, Danielle  and
      Polosukhin, Illia  and
      Devlin, Jacob  and
      Lee, Kenton  and
      Toutanova, Kristina  and
      Jones, Llion  and
      Kelcey, Matthew  and
      Chang, Ming-Wei  and
      Dai, Andrew M.  and
      Uszkoreit, Jakob  and
      Le, Quoc  and
      Petrov, Slav",
    journal = "Transactions of the Association for Computational Linguistics",
    volume = "7",
    year = "2019",
    address = "Cambridge, MA",
    publisher = "MIT Press",
    url = "https://aclanthology.org/Q19-1026",
    doi = "10.1162/tacl_a_00276",
    pages = "452--466",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

unified_qa/newsqa

Config description : NewsQA is a challenging machine comprehension dataset of human-generated question-answer pairs. Crowdworkers supply questions and answers based on a set of news articles from CNN, with answers consisting of spans of text from the corresponding articles.
Download size : 283.33 MiB
Dataset size : 285.94 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Hayır
bölmeler :

Bölmek	örnekler
`'train'`	75,882
`'validation'`	4,309

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{trischler-etal-2017-newsqa,
    title = "{N}ews{QA}: A Machine Comprehension Dataset",
    author = "Trischler, Adam  and
      Wang, Tong  and
      Yuan, Xingdi  and
      Harris, Justin  and
      Sordoni, Alessandro  and
      Bachman, Philip  and
      Suleman, Kaheer",
    booktitle = "Proceedings of the 2nd Workshop on Representation Learning for {NLP}",
    month = aug,
    year = "2017",
    address = "Vancouver, Canada",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/W17-2623",
    doi = "10.18653/v1/W17-2623",
    pages = "191--200",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

unified_qa/openbookqa

Config description : OpenBookQA aims to promote research in advanced question-answering, probing a deeper understanding of both the topic (with salient facts summarized as an open book, also provided with the dataset) and the language it is expressed in. In particular, it contains questions that require multi-step reasoning, use of additional common and commonsense knowledge, and rich text comprehension. OpenBookQA is a new kind of question-answering dataset modeled after open book exams for assessing human understanding of a subject.
Download size : 942.34 KiB
Dataset size : 1.11 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'test'`	500
`'train'`	4,957
`'validation'`	500

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{mihaylov-etal-2018-suit,
    title = "Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Question Answering",
    author = "Mihaylov, Todor  and
      Clark, Peter  and
      Khot, Tushar  and
      Sabharwal, Ashish",
    booktitle = "Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing",
    month = oct # "-" # nov,
    year = "2018",
    address = "Brussels, Belgium",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D18-1260",
    doi = "10.18653/v1/D18-1260",
    pages = "2381--2391",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

unified_qa/openbookqa_dev

Config description : OpenBookQA aims to promote research in advanced question-answering, probing a deeper understanding of both the topic (with salient facts summarized as an open book, also provided with the dataset) and the language it is expressed in. In particular, it contains questions that require multi-step reasoning, use of additional common and commonsense knowledge, and rich text comprehension. OpenBookQA is a new kind of question-answering dataset modeled after open book exams for assessing human understanding of a subject.
Download size : 942.34 KiB
Dataset size : 1.11 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'test'`	500
`'train'`	4,957
`'validation'`	500

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{mihaylov-etal-2018-suit,
    title = "Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Question Answering",
    author = "Mihaylov, Todor  and
      Clark, Peter  and
      Khot, Tushar  and
      Sabharwal, Ashish",
    booktitle = "Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing",
    month = oct # "-" # nov,
    year = "2018",
    address = "Brussels, Belgium",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D18-1260",
    doi = "10.18653/v1/D18-1260",
    pages = "2381--2391",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

unified_qa/openbookqa_with_ir

Config description : OpenBookQA aims to promote research in advanced question-answering, probing a deeper understanding of both the topic (with salient facts summarized as an open book, also provided with the dataset) and the language it is expressed in. In particular, it contains questions that require multi-step reasoning, use of additional common and commonsense knowledge, and rich text comprehension. OpenBookQA is a new kind of question-answering dataset modeled after open book exams for assessing human understanding of a subject. This version includes paragraphs fetched via an information retrieval system as additional evidence.
Download size : 6.08 MiB
Dataset size : 6.28 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'test'`	500
`'train'`	4,957
`'validation'`	500

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{mihaylov-etal-2018-suit,
    title = "Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Question Answering",
    author = "Mihaylov, Todor  and
      Clark, Peter  and
      Khot, Tushar  and
      Sabharwal, Ashish",
    booktitle = "Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing",
    month = oct # "-" # nov,
    year = "2018",
    address = "Brussels, Belgium",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D18-1260",
    doi = "10.18653/v1/D18-1260",
    pages = "2381--2391",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

unified_qa/openbookqa_with_ir_dev

Config description : OpenBookQA aims to promote research in advanced question-answering, probing a deeper understanding of both the topic (with salient facts summarized as an open book, also provided with the dataset) and the language it is expressed in. In particular, it contains questions that require multi-step reasoning, use of additional common and commonsense knowledge, and rich text comprehension. OpenBookQA is a new kind of question-answering dataset modeled after open book exams for assessing human understanding of a subject. This version includes paragraphs fetched via an information retrieval system as additional evidence.
Download size : 6.08 MiB
Dataset size : 6.28 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'test'`	500
`'train'`	4,957
`'validation'`	500

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{mihaylov-etal-2018-suit,
    title = "Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Question Answering",
    author = "Mihaylov, Todor  and
      Clark, Peter  and
      Khot, Tushar  and
      Sabharwal, Ashish",
    booktitle = "Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing",
    month = oct # "-" # nov,
    year = "2018",
    address = "Brussels, Belgium",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D18-1260",
    doi = "10.18653/v1/D18-1260",
    pages = "2381--2391",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

unified_qa/physical_iqa

Config description : This is a dataset for benchmarking progress in physical commonsense understanding. The underlying task is multiple choice question answering: given a question q and two possible solutions s1, s2, a model or a human must choose the most appropriate solution, of which exactly one is correct. The dataset focuses on everyday situations with a preference for atypical solutions. The dataset is inspired by instructables.com, which provides users with instructions on how to build, craft, bake, or manipulate objects using everyday materials. Annotators are asked to provide semantic perturbations or alternative approaches which are otherwise syntactically and topically similar to ensure physical knowledge is targeted. The dataset is further cleaned of basic artifacts using the AFLite algorithm.
Download size : 6.01 MiB
Dataset size : 6.59 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'train'`	16.113
`'validation'`	1.838

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{bisk2020piqa,
    title={Piqa: Reasoning about physical commonsense in natural language},
    author={Bisk, Yonatan and Zellers, Rowan and Gao, Jianfeng and Choi, Yejin and others},
    booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
    volume={34},
    number={05},
    pages={7432--7439},
    year={2020}
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

unified_qa/qasc

Config description : QASC is a question-answering dataset with a focus on sentence composition. It consists of 8-way multiple-choice questions about grade school science, and comes with a corpus of 17M sentences.
Download size : 1.75 MiB
Dataset size : 2.09 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'test'`	920
`'train'`	8,134
`'validation'`	926

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{khot2020qasc,
    title={Qasc: A dataset for question answering via sentence composition},
    author={Khot, Tushar and Clark, Peter and Guerquin, Michal and Jansen, Peter and Sabharwal, Ashish},
    booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
    volume={34},
    number={05},
    pages={8082--8090},
    year={2020}
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

unified_qa/qasc_test

Config description : QASC is a question-answering dataset with a focus on sentence composition. It consists of 8-way multiple-choice questions about grade school science, and comes with a corpus of 17M sentences.
Download size : 1.75 MiB
Dataset size : 2.09 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'test'`	920
`'train'`	8,134
`'validation'`	926

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{khot2020qasc,
    title={Qasc: A dataset for question answering via sentence composition},
    author={Khot, Tushar and Clark, Peter and Guerquin, Michal and Jansen, Peter and Sabharwal, Ashish},
    booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
    volume={34},
    number={05},
    pages={8082--8090},
    year={2020}
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

unified_qa/qasc_with_ir

Config description : QASC is a question-answering dataset with a focus on sentence composition. It consists of 8-way multiple-choice questions about grade school science, and comes with a corpus of 17M sentences. This version includes paragraphs fetched via an information retrieval system as additional evidence.
Download size : 16.95 MiB
Dataset size : 17.30 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'test'`	920
`'train'`	8,134
`'validation'`	926

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{khot2020qasc,
    title={Qasc: A dataset for question answering via sentence composition},
    author={Khot, Tushar and Clark, Peter and Guerquin, Michal and Jansen, Peter and Sabharwal, Ashish},
    booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
    volume={34},
    number={05},
    pages={8082--8090},
    year={2020}
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

unified_qa/qasc_with_ir_test

Config description : QASC is a question-answering dataset with a focus on sentence composition. It consists of 8-way multiple-choice questions about grade school science, and comes with a corpus of 17M sentences. This version includes paragraphs fetched via an information retrieval system as additional evidence.
Download size : 16.95 MiB
Dataset size : 17.30 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'test'`	920
`'train'`	8,134
`'validation'`	926

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{khot2020qasc,
    title={Qasc: A dataset for question answering via sentence composition},
    author={Khot, Tushar and Clark, Peter and Guerquin, Michal and Jansen, Peter and Sabharwal, Ashish},
    booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
    volume={34},
    number={05},
    pages={8082--8090},
    year={2020}
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

unified_qa/quoref

Config description : This dataset tests the coreferential reasoning capability of reading comprehension systems. In this span-selection benchmark containing questions over paragraphs from Wikipedia, a system must resolve hard coreferences before selecting the appropriate span(s) in the paragraphs for answering questions.
Download size : 51.43 MiB
Dataset size : 52.29 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'train'`	22,265
`'validation'`	2,768

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{dasigi-etal-2019-quoref,
    title = "{Q}uoref: A Reading Comprehension Dataset with Questions Requiring Coreferential Reasoning",
    author = "Dasigi, Pradeep  and
      Liu, Nelson F.  and
      Marasovi{'c}, Ana  and
      Smith, Noah A.  and
      Gardner, Matt",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)",
    month = nov,
    year = "2019",
    address = "Hong Kong, China",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D19-1606",
    doi = "10.18653/v1/D19-1606",
    pages = "5925--5932",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

unified_qa/race_string

Config description : Race is a large-scale reading comprehension dataset. The dataset is collected from English examinations in China, which are designed for middle school and high school students. The dataset can be served as the training and test sets for machine comprehension.
Download size : 167.97 MiB
Dataset size : 171.23 MiB
Auto-cached ( documentation ): Yes (test, validation), Only when shuffle_files=False (train)
bölmeler :

Bölmek	örnekler
`'test'`	4,934
`'train'`	87,863
`'validation'`	4,887

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{lai-etal-2017-race,
    title = "{RACE}: Large-scale {R}e{A}ding Comprehension Dataset From Examinations",
    author = "Lai, Guokun  and
      Xie, Qizhe  and
      Liu, Hanxiao  and
      Yang, Yiming  and
      Hovy, Eduard",
    booktitle = "Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing",
    month = sep,
    year = "2017",
    address = "Copenhagen, Denmark",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D17-1082",
    doi = "10.18653/v1/D17-1082",
    pages = "785--794",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

unified_qa/race_string_dev

Config description : Race is a large-scale reading comprehension dataset. The dataset is collected from English examinations in China, which are designed for middle school and high school students. The dataset can be served as the training and test sets for machine comprehension.
Download size : 167.97 MiB
Dataset size : 171.23 MiB
Auto-cached ( documentation ): Yes (test, validation), Only when shuffle_files=False (train)
bölmeler :

Bölmek	örnekler
`'test'`	4,934
`'train'`	87,863
`'validation'`	4,887

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{lai-etal-2017-race,
    title = "{RACE}: Large-scale {R}e{A}ding Comprehension Dataset From Examinations",
    author = "Lai, Guokun  and
      Xie, Qizhe  and
      Liu, Hanxiao  and
      Yang, Yiming  and
      Hovy, Eduard",
    booktitle = "Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing",
    month = sep,
    year = "2017",
    address = "Copenhagen, Denmark",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D17-1082",
    doi = "10.18653/v1/D17-1082",
    pages = "785--794",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

unified_qa/ropes

Config description : This dataset tests a system's ability to apply knowledge from a passage of text to a new situation. A system is presented a background passage containing a causal or qualitative relation(s) (eg, "animal pollinators increase efficiency of fertilization in flowers"), a novel situation that uses this background, and questions that require reasoning about effects of the relationships in the background passage in the context of the situation.
Download size : 12.91 MiB
Dataset size : 13.35 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'train'`	10,924
`'validation'`	1,688

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{lin-etal-2019-reasoning,
    title = "Reasoning Over Paragraph Effects in Situations",
    author = "Lin, Kevin  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Gardner, Matt",
    booktitle = "Proceedings of the 2nd Workshop on Machine Reading for Question Answering",
    month = nov,
    year = "2019",
    address = "Hong Kong, China",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D19-5808",
    doi = "10.18653/v1/D19-5808",
    pages = "58--62",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

unified_qa/social_iqa

Config description : This is a large-scale benchmark for commonsense reasoning about social situations. Social IQa contains multiple choice questions for probing emotional and social intelligence in a variety of everyday situations. Through crowdsourcing, commonsense questions along with correct and incorrect answers about social interactions are collected, using a new framework that mitigates stylistic artifacts in incorrect answers by asking workers to provide the right answer to a different but related question.
Download size : 7.08 MiB
Dataset size : 8.22 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'train'`	33,410
`'validation'`	1,954

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{sap-etal-2019-social,
    title = "Social {IQ}a: Commonsense Reasoning about Social Interactions",
    author = "Sap, Maarten  and
      Rashkin, Hannah  and
      Chen, Derek  and
      Le Bras, Ronan  and
      Choi, Yejin",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)",
    month = nov,
    year = "2019",
    address = "Hong Kong, China",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D19-1454",
    doi = "10.18653/v1/D19-1454",
    pages = "4463--4473",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

unified_qa/squad1_1

Config description : This is a reading comprehension dataset consisting of questions posed by crowdworkers on a set of Wikipedia articles, where the answer to each question is a segment of text from the corresponding reading passage.
Download size : 80.62 MiB
Dataset size : 83.99 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'train'`	87,514
`'validation'`	10,570

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{rajpurkar-etal-2016-squad,
    title = "{SQ}u{AD}: 100,000+ Questions for Machine Comprehension of Text",
    author = "Rajpurkar, Pranav  and
      Zhang, Jian  and
      Lopyrev, Konstantin  and
      Liang, Percy",
    booktitle = "Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing",
    month = nov,
    year = "2016",
    address = "Austin, Texas",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/D16-1264",
    doi = "10.18653/v1/D16-1264",
    pages = "2383--2392",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

unified_qa/squad2

Config description : This dataset combines the original Stanford Question Answering Dataset (SQuAD) dataset with unanswerable questions written adversarially by crowdworkers to look similar to answerable ones.
Download size : 116.56 MiB
Dataset size : 121.43 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'train'`	130,149
`'validation'`	11.873

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{rajpurkar-etal-2018-know,
    title = "Know What You Don{'}t Know: Unanswerable Questions for {SQ}u{AD}",
    author = "Rajpurkar, Pranav  and
      Jia, Robin  and
      Liang, Percy",
    booktitle = "Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)",
    month = jul,
    year = "2018",
    address = "Melbourne, Australia",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/P18-2124",
    doi = "10.18653/v1/P18-2124",
    pages = "784--789",
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

unified_qa/winogrande_l

Config description : This dataset is inspired by the original Winograd Schema Challenge design, but adjusted to improve both the scale and the hardness of the dataset. The key steps of the dataset construction consist of (1) a carefully designed crowdsourcing procedure, followed by (2) systematic bias reduction using a novel AfLite algorithm that generalizes human-detectable word associations to machine-detectable embedding associations. Training sets with differnt sizes are provided. This set corresponds to size l .
Download size : 1.49 MiB
Dataset size : 1.83 MiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'train'`	10,234
`'validation'`	1,267

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{sakaguchi2020winogrande,
  title={Winogrande: An adversarial winograd schema challenge at scale},
  author={Sakaguchi, Keisuke and Le Bras, Ronan and Bhagavatula, Chandra and Choi, Yejin},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={34},
  number={05},
  pages={8732--8740},
  year={2020}
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

unified_qa/winogrande_m

Config description : This dataset is inspired by the original Winograd Schema Challenge design, but adjusted to improve both the scale and the hardness of the dataset. The key steps of the dataset construction consist of (1) a carefully designed crowdsourcing procedure, followed by (2) systematic bias reduction using a novel AfLite algorithm that generalizes human-detectable word associations to machine-detectable embedding associations. Training sets with differnt sizes are provided. This set corresponds to size m .
Download size : 507.46 KiB
Dataset size : 623.15 KiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'train'`	2,558
`'validation'`	1,267

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{sakaguchi2020winogrande,
  title={Winogrande: An adversarial winograd schema challenge at scale},
  author={Sakaguchi, Keisuke and Le Bras, Ronan and Bhagavatula, Chandra and Choi, Yejin},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={34},
  number={05},
  pages={8732--8740},
  year={2020}
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

unified_qa/winogrande_s

Config description : This dataset is inspired by the original Winograd Schema Challenge design, but adjusted to improve both the scale and the hardness of the dataset. The key steps of the dataset construction consist of (1) a carefully designed crowdsourcing procedure, followed by (2) systematic bias reduction using a novel AfLite algorithm that generalizes human-detectable word associations to machine-detectable embedding associations. Training sets with differnt sizes are provided. This set corresponds to size s .
Download size : 479.24 KiB
Dataset size : 590.47 KiB
Otomatik önbelleğe alınmış ( belgeleme ): Evet
bölmeler :

Bölmek	örnekler
`'test'`	1,767
`'train'`	640
`'validation'`	1,267

Örnekler ( tfds.as_dataframe ):

Alıntı :

@inproceedings{sakaguchi2020winogrande,
  title={Winogrande: An adversarial winograd schema challenge at scale},
  author={Sakaguchi, Keisuke and Le Bras, Ronan and Bhagavatula, Chandra and Choi, Yejin},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={34},
  number={05},
  pages={8732--8740},
  year={2020}
}

@inproceedings{khashabi-etal-2020-unifiedqa,
    title = "{UNIFIEDQA}: Crossing Format Boundaries with a Single {QA} System",
    author = "Khashabi, Daniel  and
      Min, Sewon  and
      Khot, Tushar  and
      Sabharwal, Ashish  and
      Tafjord, Oyvind  and
      Clark, Peter  and
      Hajishirzi, Hannaneh",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2020",
    month = nov,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2020.findings-emnlp.171",
    doi = "10.18653/v1/2020.findings-emnlp.171",
    pages = "1896--1907",
}

Note that each UnifiedQA dataset has its own citation. Please see the source to
see the correct citation for each contained dataset."

birleşik_qa Koleksiyonlar ile düzeninizi koruyun İçeriği tercihlerinize göre kaydedin ve kategorilere ayırın.

unified_qa/ai2_science_elementary (varsayılan yapılandırma)

birleşik_qa/ai2_science_middle

birleşik_qa/ambigqa

birleşik_qa/arc_easy

birleşik_qa/arc_easy_dev

unified_qa/arc_easy_with_ir

unified_qa/arc_easy_with_ir_dev

birleşik_qa/arc_hard

birleşik_qa/arc_hard_dev

birleşik_qa/arc_hard_with_ir

birleşik_qa/arc_hard_with_ir_dev

birleşik_qa/boolq

birleşik_qa/boolq_np

birleşik_qa/commonsenseqa

birleşik_qa/commonsenseqa_test

birleşik_qa/contrast_sets_boolq

birleşik_qa/contrast_sets_drop

birleşik_qa/contrast_sets_quoref

birleşik_qa/contrast_sets_ropes

birleşik_qa/bırak

birleşik_qa/mctest

unified_qa/mctest_corrected_the_separator

birleşik_qa/çoklu

birleşik_qa/anlatıqa

birleşik_qa/narrativeqa_dev

unified_qa/natural_questions

unified_qa/natural_questions_direct_ans

unified_qa/natural_questions_direct_ans_test

unified_qa/natural_questions_with_dpr_para

unified_qa/natural_questions_with_dpr_para_test

birleşik_qa/newsqa

birleşik_qa/openbookqa

birleşik_qa/openbookqa_dev

birleşik_qa/openbookqa_with_ir

birleşik_qa/openbookqa_with_ir_dev

birleşik_qa/fiziksel_iqa

birleşik_qa/qasc

birleşik_qa/qasc_testi

birleşik_qa/qasc_with_ir

unified_qa/qasc_with_ir_test

birleşik_qa/quoref

birleşik_qa/yarış_dizesi

unified_qa/race_string_dev

birleşik_qa/ipler

birleşik_qa/social_iqa

birleşik_qa/squad1_1

birleşik_qa/squad2

birleşik_qa/winogrande_l

birleşik_qa/winogrande_m

birleşik_qa/winogrande_s

unified_qa/ai2_science_elementary (varsayılan yapılandırma)

birleşik_qa/ai2_science_middle

birleşik_qa/ambigqa

birleşik_qa/arc_easy

birleşik_qa/arc_easy_dev

unified_qa/arc_easy_with_ir

unified_qa/arc_easy_with_ir_dev

birleşik_qa/arc_hard

birleşik_qa/arc_hard_dev

birleşik_qa/arc_hard_with_ir

birleşik_qa/arc_hard_with_ir_dev

birleşik_qa/boolq

birleşik_qa/boolq_np

birleşik_qa/commonsenseqa

unified_qa/commonsenseqa_test

unified_qa/contrast_sets_boolq

unified_qa/contrast_sets_drop

unified_qa/contrast_sets_quoref

unified_qa/contrast_sets_ropes

unified_qa/drop

unified_qa/mctest

unified_qa/mctest_corrected_the_separator

unified_qa/multirc

unified_qa/narrativeqa

unified_qa/narrativeqa_dev

unified_qa/natural_questions

unified_qa/natural_questions_direct_ans

unified_qa/natural_questions_direct_ans_test

unified_qa/natural_questions_with_dpr_para

birleşik_qa