다트

참고자료:

TFDS에 이 데이터세트를 로드하려면 다음 명령어를 사용하세요.

ds = tfds.load('huggingface:dart')
  • 설명 :
DART is a large and open-domain structured DAta Record to Text generation corpus with high-quality
sentence annotations with each input being a set of entity-relation triples following a tree-structured ontology.
It consists of 82191 examples across different domains with each input being a semantic RDF triple set derived
from data records in tables and the tree ontology of table schema, annotated with sentence description that
covers all facts in the triple set.

DART is released in the following paper where you can find more details and baseline results:
https://arxiv.org/abs/2007.02871
  • 라이센스 : 알려진 라이센스 없음
  • 버전 : 0.0.0
  • 분할 :
나뉘다
'test' 6959
'train' 30526
'validation' 2768
  • 특징 :
{
    "tripleset": {
        "feature": {
            "feature": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "length": -1,
            "id": null,
            "_type": "Sequence"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "subtree_was_extended": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "annotations": {
        "feature": {
            "source": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "text": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}