common_gen

References:

Use the following command to load this dataset in TFDS:

import tensorflow_datasets as tfds
ds = tfds.load('huggingface:common_gen')
  • Description:
CommonGen is a constrained text generation task, associated with a benchmark dataset, 
to explicitly test machines for the ability of generative commonsense reasoning. Given 
a set of common concepts, the task is to generate a coherent sentence describing an 
everyday scenario using these concepts.

CommonGen is challenging because it inherently requires 1) relational reasoning using 
background commonsense knowledge, and 2) compositional generalization ability to work 
on unseen concept combinations. Our dataset, constructed through a combination of 
crowd-sourcing from AMT and existing caption corpora, consists of 30k concept-sets and 
50k sentences in total.
  • License: No known license
  • Version: 2020.5.30
  • Splits:
Split         Examples
'test'        1497
'train'       67389
'validation'  4018
  • Features:
{
    "concept_set_idx": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "concepts": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "target": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}
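
The feature spec above describes three fields per record: an int32 `concept_set_idx`, a variable-length list of string `concepts`, and a string `target`. The following is a minimal sketch of how such records might be represented and checked in plain Python. The names `CommonGenRecord`, `is_valid_record`, `group_targets`, and `SAMPLE_RECORDS` are hypothetical, and the sample records are hand-written for illustration rather than taken from the dataset. The grouping step assumes that rows sharing a `concept_set_idx` are alternative target sentences for the same concept set.

```python
from collections import defaultdict
from typing import Dict, List, TypedDict


class CommonGenRecord(TypedDict):
    """Hypothetical Python view of one record, mirroring the feature spec."""
    concept_set_idx: int   # "int32" Value in the spec
    concepts: List[str]    # Sequence of string Values, length -1 (variable)
    target: str            # "string" Value


def is_valid_record(record: dict) -> bool:
    """Check that a dict has exactly the three fields with the expected types."""
    if set(record) != {"concept_set_idx", "concepts", "target"}:
        return False
    idx = record["concept_set_idx"]
    # Reject bool explicitly, since bool is a subclass of int in Python.
    if isinstance(idx, bool) or not isinstance(idx, int):
        return False
    if not -2**31 <= idx < 2**31:  # must fit in int32
        return False
    concepts = record["concepts"]
    if not isinstance(concepts, list):
        return False
    if not all(isinstance(c, str) for c in concepts):
        return False
    return isinstance(record["target"], str)


def group_targets(records: List[CommonGenRecord]) -> Dict[int, List[str]]:
    """Collect target sentences per concept_set_idx (assumed grouping key)."""
    grouped: Dict[int, List[str]] = defaultdict(list)
    for record in records:
        grouped[record["concept_set_idx"]].append(record["target"])
    return dict(grouped)


# Hand-written illustrative records, NOT actual dataset rows.
SAMPLE_RECORDS: List[CommonGenRecord] = [
    {"concept_set_idx": 0, "concepts": ["dog", "frisbee", "catch"],
     "target": "A dog jumps to catch a frisbee."},
    {"concept_set_idx": 0, "concepts": ["dog", "frisbee", "catch"],
     "target": "The dog catches the frisbee in the park."},
    {"concept_set_idx": 1, "concepts": ["guitar", "stage", "play"],
     "target": "A man plays a guitar on stage."},
]

if __name__ == "__main__":
    print(all(is_valid_record(r) for r in SAMPLE_RECORDS))
    print(group_targets(SAMPLE_RECORDS))
```

Grouping by `concept_set_idx` is one way to gather multiple reference sentences for a concept set before evaluation. Whether that matches the intended use of the index should be confirmed against the actual data.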