grounded_scan

Description:

Grounded SCAN (gSCAN) is a synthetic dataset for evaluating compositional generalization in situated language understanding. gSCAN pairs natural language instructions with action sequences, and requires the agent to interpret instructions within the context of a grid-based visual navigation environment.

More information can be found at:

For the compositional_splits and the target_length_split: https://github.com/LauraRuis/groundedSCAN
For the spatial_relation_splits: https://github.com/google-research/language/tree/master/language/gscan/data
Homepage: https://github.com/LauraRuis/groundedSCAN
Source code: tfds.vision_language.grounded_scan.GroundedScan
Versions:
- 1.0.0: Initial release.
- 1.1.0: Changed the vector feature to Text().
- 2.0.0 (default): Adds the new spatial_relation_splits config.
Auto-cached (documentation): No
Feature structure:

FeaturesDict({
    'command': Sequence(Text(shape=(), dtype=string)),
    'manner': Text(shape=(), dtype=string),
    'meaning': Sequence(Text(shape=(), dtype=string)),
    'referred_target': Text(shape=(), dtype=string),
    'situation': FeaturesDict({
        'agent_direction': int32,
        'agent_position': FeaturesDict({
            'column': int32,
            'row': int32,
        }),
        'direction_to_target': Text(shape=(), dtype=string),
        'distance_to_target': int32,
        'grid_size': int32,
        'placed_objects': Sequence({
            'object': FeaturesDict({
                'color': Text(shape=(), dtype=string),
                'shape': Text(shape=(), dtype=string),
                'size': int32,
            }),
            'position': FeaturesDict({
                'column': int32,
                'row': int32,
            }),
            'vector': Text(shape=(), dtype=string),
        }),
        'target_object': FeaturesDict({
            'object': FeaturesDict({
                'color': Text(shape=(), dtype=string),
                'shape': Text(shape=(), dtype=string),
                'size': int32,
            }),
            'position': FeaturesDict({
                'column': int32,
                'row': int32,
            }),
            'vector': Text(shape=(), dtype=string),
        }),
    }),
    'target_commands': Sequence(Text(shape=(), dtype=string)),
    'verb_in_command': Text(shape=(), dtype=string),
})
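
The slash-separated names in the feature documentation table correspond to paths through this nested structure. The following is a minimal sketch of that mapping, assuming an example is available as a plain nested Python dict. The `example` values and the `flatten` helper are hypothetical and only illustrate the layout; they are not taken from the dataset. The dict-of-lists shape used for `placed_objects` assumes the TFDS convention that a Sequence of a dict is returned as a dict of sequences.

```python
from typing import Any, Iterator, Tuple

# Hypothetical example that mirrors part of the feature structure.
# Every value here is made up for illustration.
example = {
    "command": ["walk", "to", "the", "red", "circle"],
    "manner": "",
    "situation": {
        "agent_direction": 0,
        "agent_position": {"column": 1, "row": 2},
        "grid_size": 6,
        # Assumed layout: a Sequence of dicts exposed as a dict of sequences.
        "placed_objects": {
            "object": {"color": ["red"], "shape": ["circle"], "size": [2]},
            "position": {"column": [4], "row": [2]},
            "vector": ["placeholder"],
        },
    },
    "target_commands": ["walk", "walk", "walk"],
}


def flatten(node: Any, prefix: str = "") -> Iterator[Tuple[str, Any]]:
    """Yield (path, value) pairs, joining nested dict keys with '/'."""
    if isinstance(node, dict):
        for key, value in node.items():
            path = f"{prefix}/{key}" if prefix else key
            yield from flatten(value, path)
    else:
        yield prefix, node


paths = dict(flatten(example))
for path in sorted(paths):
    print(path, "=", paths[path])
```

Each printed path, such as `situation/agent_position/column`, matches a row name in the feature documentation table.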

Feature documentation:

Feature	Class	Shape	Dtype
	FeaturesDict
command	Sequence(Text)	(None,)	string
manner	Text		string
meaning	Sequence(Text)	(None,)	string
referred_target	Text		string
situation	FeaturesDict
situation/agent_direction	Tensor		int32
situation/agent_position	FeaturesDict
situation/agent_position/column	Tensor		int32
situation/agent_position/row	Tensor		int32
situation/direction_to_target	Text		string
situation/distance_to_target	Tensor		int32
situation/grid_size	Tensor		int32
situation/placed_objects	Sequence
situation/placed_objects/object	FeaturesDict
situation/placed_objects/object/color	Text		string
situation/placed_objects/object/shape	Text		string
situation/placed_objects/object/size	Tensor		int32
situation/placed_objects/position	FeaturesDict
situation/placed_objects/position/column	Tensor		int32
situation/placed_objects/position/row	Tensor		int32
situation/placed_objects/vector	Text		string
situation/target_object	FeaturesDict
situation/target_object/object	FeaturesDict
situation/target_object/object/color	Text		string
situation/target_object/object/shape	Text		string
situation/target_object/object/size	Tensor		int32
situation/target_object/position	FeaturesDict
situation/target_object/position/column	Tensor		int32
situation/target_object/position/row	Tensor		int32
situation/target_object/vector	Text		string
target_commands	Sequence(Text)	(None,)	string
verb_in_command	Text		string

Supervised keys (see as_supervised doc): None
Figure (tfds.show_examples): Not supported.
Citation:

@inproceedings{NEURIPS2020_e5a90182,
 author = {Ruis, Laura and Andreas, Jacob and Baroni, Marco and Bouchacourt, Diane and Lake, Brenden M},
 booktitle = {Advances in Neural Information Processing Systems},
 editor = {H. Larochelle and M. Ranzato and R. Hadsell and M. F. Balcan and H. Lin},
 pages = {19861--19872},
 publisher = {Curran Associates, Inc.},
 title = {A Benchmark for Systematic Generalization in Grounded Language Understanding},
 url = {https://proceedings.neurips.cc/paper/2020/file/e5a90182cc81e12ab5e72d66e0b46fe3-Paper.pdf},
 volume = {33},
 year = {2020}
}

@inproceedings{qiu-etal-2021-systematic,
    title = "Systematic Generalization on g{SCAN}: {W}hat is Nearly Solved and What is Next?",
    author = "Qiu, Linlu  and
      Hu, Hexiang  and
      Zhang, Bowen  and
      Shaw, Peter  and
      Sha, Fei",
    booktitle = "Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing",
    month = nov,
    year = "2021",
    address = "Online and Punta Cana, Dominican Republic",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.emnlp-main.166",
    doi = "10.18653/v1/2021.emnlp-main.166",
    pages = "2180--2188",
}

grounded_scan/compositional_splits (default config)

Config description: Examples for compositional generalization.
Download size: 82.10 MiB
Dataset size: 998.11 MiB
Splits:

Split	Examples
`'adverb_1'`	112,880
`'adverb_2'`	38,582
`'contextual'`	11,460
`'dev'`	3,716
`'situational_1'`	88,642
`'situational_2'`	16,808
`'test'`	19,282
`'train'`	367,933
`'visual'`	37,436
`'visual_easier'`	18,718
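
The counts in this table can be summed to see how the examples are distributed across splits. The following is a small sketch that uses only the numbers listed above; the variable names are illustrative.

```python
# Example counts copied from the compositional_splits table.
split_sizes = {
    "adverb_1": 112_880,
    "adverb_2": 38_582,
    "contextual": 11_460,
    "dev": 3_716,
    "situational_1": 88_642,
    "situational_2": 16_808,
    "test": 19_282,
    "train": 367_933,
    "visual": 37_436,
    "visual_easier": 18_718,
}

# Total number of examples in this config and each split's share of it.
total = sum(split_sizes.values())
shares = {name: count / total for name, count in split_sizes.items()}

for name, count in sorted(split_sizes.items(), key=lambda item: -item[1]):
    print(f"{name:15s} {count:>8,d} {shares[name]:6.1%}")
print(f"{'total':15s} {total:>8,d}")
```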

Examples (tfds.as_dataframe):

grounded_scan/target_length_split

Config description: Examples for generalizing to larger target lengths.
Download size: 53.41 MiB
Dataset size: 546.73 MiB
Splits:

Split	Examples
`'dev'`	1,821
`'target_lengths'`	198,588
`'test'`	37,784
`'train'`	180,301

Examples (tfds.as_dataframe):

grounded_scan/spatial_relation_splits

Config description: Examples for spatial relation reasoning.
Download size: 89.59 MiB
Dataset size: 675.09 MiB
Splits:

Split	Examples
`'dev'`	2,617
`'referent'`	30,492
`'relation'`	6,285
`'relative_position_1'`	41,576
`'relative_position_2'`	41,529
`'test'`	28,526
`'train'`	259,088
`'visual'`	62,250

Examples (tfds.as_dataframe):
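
A preview table for any config and split can be generated locally with `tfds.load` and `tfds.as_dataframe`. The following is a minimal sketch, assuming `tensorflow_datasets` is installed. The helper names (`builder_name`, `check_split`, `preview`) are hypothetical, and the config and split names are copied from the tables on this page. Calling `preview` downloads and prepares the selected config, so expect the download and dataset sizes listed above.

```python
DATASET = "grounded_scan"

# Config and split names as listed in the split tables on this page.
SPLITS = {
    "compositional_splits": (
        "adverb_1", "adverb_2", "contextual", "dev", "situational_1",
        "situational_2", "test", "train", "visual", "visual_easier",
    ),
    "target_length_split": ("dev", "target_lengths", "test", "train"),
    "spatial_relation_splits": (
        "dev", "referent", "relation", "relative_position_1",
        "relative_position_2", "test", "train", "visual",
    ),
}


def builder_name(config: str) -> str:
    """Return the 'dataset/config' name, e.g. 'grounded_scan/target_length_split'."""
    if config not in SPLITS:
        raise ValueError(f"Unknown config: {config!r}")
    return f"{DATASET}/{config}"


def check_split(config: str, split: str) -> None:
    """Raise ValueError if `split` is not listed for `config`."""
    builder_name(config)  # validates the config name first
    if split not in SPLITS[config]:
        raise ValueError(f"Config {config!r} has no split {split!r}")


def preview(config: str, split: str, n: int = 4):
    """Load the first `n` examples of a split as a pandas DataFrame."""
    # Imported lazily so the helpers above work without TFDS installed.
    import tensorflow_datasets as tfds

    check_split(config, split)
    ds, info = tfds.load(builder_name(config), split=split, with_info=True)
    return tfds.as_dataframe(ds.take(n), info)


if __name__ == "__main__":
    print(preview("spatial_relation_splits", "dev"))
```

Validating the names before calling `tfds.load` is a design choice in this sketch so that a mistyped split fails before any download starts.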