break_data

مراجع:

QDMR-سطح بالا

برای بارگذاری این مجموعه داده در TFDS از دستور زیر استفاده کنید:

ds = tfds.load('huggingface:break_data/QDMR-high-level')

توضیحات :

Break is a human annotated dataset of natural language questions and their Question Decomposition Meaning Representations
(QDMRs). Break consists of 83,978 examples sampled from 10 question answering datasets over text, images and databases. 
This repository contains the Break dataset along with information on the exact data format.

مجوز : مجوز شناخته شده ای وجود ندارد
نسخه : 1.0.0
تقسیم ها :

تقسیم کنید	نمونه ها
`'test'`	3195
`'train'`	17503
`'validation'`	3130

ویژگی ها :

{
    "question_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "question_text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "decomposition": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "operators": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "split": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

QDMR - واژگان سطح بالا

برای بارگذاری این مجموعه داده در TFDS از دستور زیر استفاده کنید:

ds = tfds.load('huggingface:break_data/QDMR-high-level-lexicon')

توضیحات :

Break is a human annotated dataset of natural language questions and their Question Decomposition Meaning Representations
(QDMRs). Break consists of 83,978 examples sampled from 10 question answering datasets over text, images and databases. 
This repository contains the Break dataset along with information on the exact data format.

مجوز : مجوز شناخته شده ای وجود ندارد
نسخه : 1.0.0
تقسیم ها :

تقسیم کنید	نمونه ها
`'test'`	3195
`'train'`	17503
`'validation'`	3130

ویژگی ها :

{
    "source": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "allowed_tokens": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

QDMR

برای بارگذاری این مجموعه داده در TFDS از دستور زیر استفاده کنید:

ds = tfds.load('huggingface:break_data/QDMR')

توضیحات :

Break is a human annotated dataset of natural language questions and their Question Decomposition Meaning Representations
(QDMRs). Break consists of 83,978 examples sampled from 10 question answering datasets over text, images and databases. 
This repository contains the Break dataset along with information on the exact data format.

مجوز : مجوز شناخته شده ای وجود ندارد
نسخه : 1.0.0
تقسیم ها :

تقسیم کنید	نمونه ها
`'test'`	8069
`'train'`	44321
`'validation'`	7760

ویژگی ها :

{
    "question_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "question_text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "decomposition": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "operators": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "split": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

واژگان QDMR

برای بارگذاری این مجموعه داده در TFDS از دستور زیر استفاده کنید:

ds = tfds.load('huggingface:break_data/QDMR-lexicon')

توضیحات :

Break is a human annotated dataset of natural language questions and their Question Decomposition Meaning Representations
(QDMRs). Break consists of 83,978 examples sampled from 10 question answering datasets over text, images and databases. 
This repository contains the Break dataset along with information on the exact data format.

مجوز : مجوز شناخته شده ای وجود ندارد
نسخه : 1.0.0
تقسیم ها :

تقسیم کنید	نمونه ها
`'test'`	8069
`'train'`	44321
`'validation'`	7760

ویژگی ها :

{
    "source": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "allowed_tokens": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

اشکال منطقی

برای بارگذاری این مجموعه داده در TFDS از دستور زیر استفاده کنید:

ds = tfds.load('huggingface:break_data/logical-forms')

توضیحات :

Break is a human annotated dataset of natural language questions and their Question Decomposition Meaning Representations
(QDMRs). Break consists of 83,978 examples sampled from 10 question answering datasets over text, images and databases. 
This repository contains the Break dataset along with information on the exact data format.

مجوز : مجوز شناخته شده ای وجود ندارد
نسخه : 1.0.0
تقسیم ها :

تقسیم کنید	نمونه ها
`'test'`	8006
`'train'`	44098
`'validation'`	7719

ویژگی ها :

{
    "question_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "question_text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "decomposition": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "operators": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "split": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "program": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}