- Description:
Grounded SCAN (gSCAN) is a synthetic dataset for evaluating compositional generalization in situated language understanding. gSCAN pairs natural language instructions with action sequences, and requires the agent to interpret instructions within the context of a grid-based visual navigation environment.
More information can be found at:
- For the compositional_splits and the target_length_split: https://github.com/LauraRuis/groundedSCAN
- For the spatial_relation_splits: https://github.com/google-research/language/tree/master/language/gscan/data

Source code: tfds.vision_language.grounded_scan.GroundedScan
Versions:
- 1.0.0: Initial release.
- 1.1.0: Changed vector feature to Text().
- 2.0.0 (default): Adds the new spatial_relation_splits config.
Auto-cached (documentation): No
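The configs and versions listed above follow the usual TFDS naming scheme of `dataset/config:version`. The following is a minimal sketch, assuming `tensorflow_datasets` is installed; the helper names `gscan_name` and `load_gscan` are illustrative and not part of TFDS.

```python
# Config names as listed on this page.
GSCAN_CONFIGS = (
    "compositional_splits",
    "target_length_split",
    "spatial_relation_splits",
)


def gscan_name(config="compositional_splits", version=None):
    """Builds a TFDS name such as 'grounded_scan/compositional_splits:2.0.0'."""
    if config not in GSCAN_CONFIGS:
        raise ValueError(f"Unknown gSCAN config: {config!r}")
    name = f"grounded_scan/{config}"
    # Pinning a version is optional; omitting it selects the default version.
    return f"{name}:{version}" if version else name


def load_gscan(config="compositional_splits", split="train", version=None):
    """Loads one split as a tf.data.Dataset (downloads the data on first use)."""
    import tensorflow_datasets as tfds  # imported lazily: heavy dependency

    return tfds.load(gscan_name(config, version), split=split)
```

For instance, `load_gscan("spatial_relation_splits", split="dev")` would request the dev split of the config introduced in version 2.0.0.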
Feature structure:
FeaturesDict({
    'command': Sequence(Text(shape=(), dtype=string)),
    'manner': Text(shape=(), dtype=string),
    'meaning': Sequence(Text(shape=(), dtype=string)),
    'referred_target': Text(shape=(), dtype=string),
    'situation': FeaturesDict({
        'agent_direction': int32,
        'agent_position': FeaturesDict({
            'column': int32,
            'row': int32,
        }),
        'direction_to_target': Text(shape=(), dtype=string),
        'distance_to_target': int32,
        'grid_size': int32,
        'placed_objects': Sequence({
            'object': FeaturesDict({
                'color': Text(shape=(), dtype=string),
                'shape': Text(shape=(), dtype=string),
                'size': int32,
            }),
            'position': FeaturesDict({
                'column': int32,
                'row': int32,
            }),
            'vector': Text(shape=(), dtype=string),
        }),
        'target_object': FeaturesDict({
            'object': FeaturesDict({
                'color': Text(shape=(), dtype=string),
                'shape': Text(shape=(), dtype=string),
                'size': int32,
            }),
            'position': FeaturesDict({
                'column': int32,
                'row': int32,
            }),
            'vector': Text(shape=(), dtype=string),
        }),
    }),
    'target_commands': Sequence(Text(shape=(), dtype=string)),
    'verb_in_command': Text(shape=(), dtype=string),
})
- Feature documentation:
| Feature | Class | Shape | Dtype | Description |
|---|---|---|---|---|
| | FeaturesDict | | | |
| command | Sequence(Text) | (None,) | string | |
| manner | Text | | string | |
| meaning | Sequence(Text) | (None,) | string | |
| referred_target | Text | | string | |
| situation | FeaturesDict | | | |
| situation/agent_direction | Tensor | | int32 | |
| situation/agent_position | FeaturesDict | | | |
| situation/agent_position/column | Tensor | | int32 | |
| situation/agent_position/row | Tensor | | int32 | |
| situation/direction_to_target | Text | | string | |
| situation/distance_to_target | Tensor | | int32 | |
| situation/grid_size | Tensor | | int32 | |
| situation/placed_objects | Sequence | | | |
| situation/placed_objects/object | FeaturesDict | | | |
| situation/placed_objects/object/color | Text | | string | |
| situation/placed_objects/object/shape | Text | | string | |
| situation/placed_objects/object/size | Tensor | | int32 | |
| situation/placed_objects/position | FeaturesDict | | | |
| situation/placed_objects/position/column | Tensor | | int32 | |
| situation/placed_objects/position/row | Tensor | | int32 | |
| situation/placed_objects/vector | Text | | string | |
| situation/target_object | FeaturesDict | | | |
| situation/target_object/object | FeaturesDict | | | |
| situation/target_object/object/color | Text | | string | |
| situation/target_object/object/shape | Text | | string | |
| situation/target_object/object/size | Tensor | | int32 | |
| situation/target_object/position | FeaturesDict | | | |
| situation/target_object/position/column | Tensor | | int32 | |
| situation/target_object/position/row | Tensor | | int32 | |
| situation/target_object/vector | Text | | string | |
| target_commands | Sequence(Text) | (None,) | string | |
| verb_in_command | Text | | string | |
Supervised keys (See as_supervised doc): None
Figure (tfds.show_examples): Not supported.
Citation:
@inproceedings{NEURIPS2020_e5a90182,
author = {Ruis, Laura and Andreas, Jacob and Baroni, Marco and Bouchacourt, Diane and Lake, Brenden M},
booktitle = {Advances in Neural Information Processing Systems},
editor = {H. Larochelle and M. Ranzato and R. Hadsell and M. F. Balcan and H. Lin},
pages = {19861--19872},
publisher = {Curran Associates, Inc.},
title = {A Benchmark for Systematic Generalization in Grounded Language Understanding},
url = {https://proceedings.neurips.cc/paper/2020/file/e5a90182cc81e12ab5e72d66e0b46fe3-Paper.pdf},
volume = {33},
year = {2020}
}
@inproceedings{qiu-etal-2021-systematic,
title = "Systematic Generalization on g{SCAN}: {W}hat is Nearly Solved and What is Next?",
author = "Qiu, Linlu and
Hu, Hexiang and
Zhang, Bowen and
Shaw, Peter and
Sha, Fei",
booktitle = "Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing",
month = nov,
year = "2021",
address = "Online and Punta Cana, Dominican Republic",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2021.emnlp-main.166",
doi = "10.18653/v1/2021.emnlp-main.166",
pages = "2180--2188",
}
grounded_scan/compositional_splits (default config)
Config description: Examples for compositional generalization.
Download size: 82.10 MiB
Dataset size: 998.11 MiB
Splits:
Split | Examples |
---|---|
'adverb_1' | 112,880 |
'adverb_2' | 38,582 |
'contextual' | 11,460 |
'dev' | 3,716 |
'situational_1' | 88,642 |
'situational_2' | 16,808 |
'test' | 19,282 |
'train' | 367,933 |
'visual' | 37,436 |
'visual_easier' | 18,718 |
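Because each held-out split probes a different kind of generalization, it is usually more informative to score the splits separately against `target_commands` than to pool them. The following is a minimal sketch of sequence-level exact match, assuming predictions have already been collected per split as lists of action tokens. The function names, the dictionary layout, and the toy action tokens are illustrative assumptions, not part of the dataset API.

```python
def exact_match(predictions, references):
    """Fraction of examples whose predicted action sequence equals the target."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    if not references:
        return 0.0
    hits = sum(list(p) == list(r) for p, r in zip(predictions, references))
    return hits / len(references)


def score_splits(outputs_by_split):
    """Maps split name -> exact match, given {split: (predictions, references)}."""
    return {
        split: exact_match(preds, refs)
        for split, (preds, refs) in outputs_by_split.items()
    }


# Toy inputs; the action tokens here are invented for illustration.
toy = {
    "test": ([["walk", "walk"], ["turn left"]], [["walk", "walk"], ["turn left"]]),
    "adverb_1": ([["walk"], ["push"]], [["walk"], ["pull"]]),
}
print(score_splits(toy))
```

Keeping the result keyed by split name makes it straightforward to report one number per row of the table above.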
grounded_scan/target_length_split
Config description: Examples for generalizing to larger target lengths.
Download size: 53.41 MiB
Dataset size: 546.73 MiB
Splits:
Split | Examples |
---|---|
'dev' | 1,821 |
'target_lengths' | 198,588 |
'test' | 37,784 |
'train' | 180,301 |
grounded_scan/spatial_relation_splits
Config description: Examples for spatial relation reasoning.
Download size: 89.59 MiB
Dataset size: 675.09 MiB
Splits:
Split | Examples |
---|---|
'dev' | 2,617 |
'referent' | 30,492 |
'relation' | 6,285 |
'relative_position_1' | 41,576 |
'relative_position_2' | 41,529 |
'test' | 28,526 |
'train' | 259,088 |
'visual' | 62,250 |