- Description:
Question generation on the SQuAD dataset, using the data splits described in 'Neural Question Generation from Text: A Preliminary Study' (Zhou et al, 2017) and 'Learning to Ask: Neural Question Generation for Reading Comprehension' (Du et al, 2017).
Homepage: https://github.com/xinyadu/nqg
Source code: tfds.text.squad_question_generation.SquadQuestionGeneration
Versions:
1.0.0: Initial build with unique SQuAD QAS ids in each split, using passage-level context (Zhou et al, 2017).
2.0.0: Matches the original split of (Zhou et al, 2017), allows both sentence- and passage-level contexts, and uses answers from (Zhou et al, 2017).
3.0.0 (default): Added the split of (Du et al, 2017) also.
Auto-cached (documentation): Yes
Supervised keys (See as_supervised doc): ('context_passage', 'question')
Figure (tfds.show_examples): Not supported.
Citation:
@inproceedings{du-etal-2017-learning,
title = "Learning to Ask: Neural Question Generation for Reading Comprehension",
author = "Du, Xinya and Shao, Junru and Cardie, Claire",
booktitle = "Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
month = jul,
year = "2017",
address = "Vancouver, Canada",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/P17-1123",
doi = "10.18653/v1/P17-1123",
pages = "1342--1352",
}
@inproceedings{rajpurkar-etal-2016-squad,
title = "{SQ}u{AD}: 100,000+ Questions for Machine Comprehension of Text",
author = "Rajpurkar, Pranav and Zhang, Jian and Lopyrev, Konstantin and Liang, Percy",
booktitle = "Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing",
month = nov,
year = "2016",
address = "Austin, Texas",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/D16-1264",
doi = "10.18653/v1/D16-1264",
pages = "2383--2392",
}
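As a rough illustration of how the config names and supervised keys listed above fit together, the sketch below builds a full dataset name and loads it with `tfds.load`. The helper names `config_name` and `load_supervised` are hypothetical and not part of TFDS; the sketch assumes `tensorflow_datasets` is installed and that the data can be downloaded.

```python
def config_name(split_style: str = "du") -> str:
    """Return the full TFDS name for one of the two configs on this page.

    `split_style` is a hypothetical shorthand: "du" or "zhou".
    """
    if split_style not in ("du", "zhou"):
        raise ValueError(f"unknown split style: {split_style!r}")
    return f"squad_question_generation/split_{split_style}"


def load_supervised(split_style: str = "du", split: str = "train"):
    """Load one split as (context_passage, question) tuples."""
    # Imported lazily so config_name() works without TensorFlow installed.
    import tensorflow_datasets as tfds

    # as_supervised=True yields 2-tuples that follow the supervised keys.
    return tfds.load(config_name(split_style), split=split, as_supervised=True)
```

Calling `load_supervised("du", "validation")` should return a `tf.data.Dataset` whose elements are `(context_passage, question)` pairs of string tensors.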
squad_question_generation/split_du (default config)
Config description: Answer-independent question generation from passage-level contexts (Du et al, 2017).
Download size: 62.83 MiB
Dataset size: 84.67 MiB
Splits:
Split | Examples |
---|---|
'test' | 11,877 |
'train' | 75,722 |
'validation' | 10,570 |
- Feature structure:
FeaturesDict({
'answer': Text(shape=(), dtype=string),
'context_passage': Text(shape=(), dtype=string),
'question': Text(shape=(), dtype=string),
})
- Feature documentation:
Feature | Class | Shape | Dtype | Description |
---|---|---|---|---|
FeaturesDict | | | | |
answer | Text | | string | |
context_passage | Text | | string | |
question | Text | | string | |
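For the answer-independent setting, one plausible way to use these features is to map each example to a (source, target) text pair and ignore `answer`. The sketch below is an assumption about usage, not the preprocessing of Du et al; `to_seq2seq_pair` and `_text` are hypothetical helpers, and the sample dictionary is invented for illustration.

```python
from typing import Tuple


def _text(value) -> str:
    """Decode bytes (as produced by tfds.as_numpy) to str; pass str through."""
    return value.decode("utf-8") if isinstance(value, bytes) else str(value)


def to_seq2seq_pair(example: dict) -> Tuple[str, str]:
    """Map one example to (source, target) = (passage, question).

    The 'answer' feature is deliberately unused in this answer-independent sketch.
    """
    return _text(example["context_passage"]), _text(example["question"])


# Invented example with the same keys as the feature structure above.
sample = {
    "answer": b"Vancouver",
    "context_passage": b"The meeting was held in Vancouver.",
    "question": b"Where was the meeting held?",
}
source, target = to_seq2seq_pair(sample)
```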
- Examples (tfds.as_dataframe):
squad_question_generation/split_zhou
Config description: Answer-span dependent question generation from sentence- and passage-level contexts (Zhou et al, 2017).
Download size: 62.52 MiB
Dataset size: 111.02 MiB
Splits:
Split | Examples |
---|---|
'test' | 8,964 |
'train' | 86,635 |
'validation' | 8,965 |
- Feature structure:
FeaturesDict({
'answer': Text(shape=(), dtype=string),
'context_passage': Text(shape=(), dtype=string),
'context_sentence': Text(shape=(), dtype=string),
'question': Text(shape=(), dtype=string),
})
- Feature documentation:
Feature | Class | Shape | Dtype | Description |
---|---|---|---|---|
FeaturesDict | | | | |
answer | Text | | string | |
context_passage | Text | | string | |
context_sentence | Text | | string | |
question | Text | | string | |
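In the answer-span dependent setting, a model needs to know which span of the context the question should target. One simple way to expose that is to wrap the first occurrence of `answer` inside `context_sentence` with marker tokens. This is only a sketch: the `<hl>` marker and the `highlight_answer` helper are assumptions made for illustration and are not defined by the dataset or taken from Zhou et al.

```python
def highlight_answer(sentence: str, answer: str, marker: str = "<hl>") -> str:
    """Wrap the first occurrence of `answer` in `sentence` with `marker`.

    Returns the sentence unchanged when the answer is empty or not found.
    """
    start = sentence.find(answer)
    if not answer or start < 0:
        return sentence
    end = start + len(answer)
    return f"{sentence[:start]}{marker} {answer} {marker}{sentence[end:]}"


# Invented sentence/answer pair, used only to show the output shape.
marked = highlight_answer("Paris is the capital of France.", "Paris")
```

Matching on the first occurrence is a simplification; when the answer string appears more than once in the sentence, the marked span may not be the intended one.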
- Examples (tfds.as_dataframe):
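The example previews for either config can be regenerated locally with `tfds.as_dataframe`. The following is a minimal sketch, assuming `tensorflow_datasets` and pandas are installed; the `preview` helper and the `CONFIGS` tuple are hypothetical names introduced here.

```python
CONFIGS = ("split_du", "split_zhou")  # the two configs listed on this page


def preview(config: str = "split_du", split: str = "validation", n: int = 5):
    """Return the first `n` examples of a split as a pandas DataFrame."""
    if config not in CONFIGS:
        raise ValueError(f"unknown config: {config!r}")
    # Imported lazily so the argument check above runs without TensorFlow.
    import tensorflow_datasets as tfds

    ds, info = tfds.load(
        f"squad_question_generation/{config}", split=split, with_info=True
    )
    # Passing `info` lets TFDS label and format the columns by feature.
    return tfds.as_dataframe(ds.take(n), info)
```

For instance, `preview("split_zhou", "test", n=3)` should give a three-row frame that includes the `context_sentence` column.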