multi_eurlex

References:

en

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/en')
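A minimal sketch of loading the per-language configs programmatically, since every config on this page follows the same `huggingface:multi_eurlex/<lang>` naming pattern. The helper names `config_name` and `load_language` are hypothetical, not part of TFDS. The sketch assumes `tensorflow_datasets` (and whatever it needs to reach the Hugging Face community datasets) is installed; the import is done lazily so the name helper works without it.

```python
DATASET = "huggingface:multi_eurlex"


def config_name(lang: str) -> str:
    """Build the TFDS name for one language config, e.g. 'en' or 'da'."""
    return f"{DATASET}/{lang.lower()}"


def load_language(lang: str, split: str = "train"):
    """Load one split of one language config (requires tensorflow_datasets)."""
    import tensorflow_datasets as tfds  # lazy import: only needed when loading

    return tfds.load(config_name(lang), split=split)
```

For example, `load_language("de", split="validation")` would request the German validation split under these assumptions.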
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some of them relatively low-resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 55000
'validation' 5000
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}
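Because `labels` is a variable-length sequence of `ClassLabel` indices over 21 classes, a common preprocessing step for multi-label classification is to turn each example's indices into a fixed-length multi-hot vector. The following is a sketch in plain Python; the function names `to_multi_hot` and `to_eurovoc_ids` are hypothetical, and the sketch assumes the indices follow the order of the `names` list in the schema above.

```python
# EUROVOC concept IDs, in the order given by the "names" list in the schema.
LABEL_NAMES = [
    "100149", "100160", "100148", "100147", "100152", "100143", "100156",
    "100158", "100154", "100153", "100142", "100145", "100150", "100162",
    "100159", "100144", "100151", "100157", "100161", "100146", "100155",
]


def to_multi_hot(label_indices, num_classes=len(LABEL_NAMES)):
    """Convert a list of class indices into a 0/1 vector of length num_classes."""
    vector = [0] * num_classes
    for index in label_indices:
        vector[index] = 1
    return vector


def to_eurovoc_ids(label_indices):
    """Map class indices back to their EUROVOC concept ID strings."""
    return [LABEL_NAMES[index] for index in label_indices]
```

The multi-hot form is what a sigmoid output layer with a binary cross-entropy loss would typically consume.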

da

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/da')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some of them relatively low-resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 55000
'validation' 5000
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

de

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/de')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some of them relatively low-resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 55000
'validation' 5000
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

nl

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/nl')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some of them relatively low-resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 55000
'validation' 5000
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

sv

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/sv')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some of them relatively low-resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 42490
'validation' 5000
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

bg

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/bg')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some of them relatively low-resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 15986
'validation' 5000
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

cs

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/cs')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some of them relatively low-resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 23187
'validation' 5000
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

hr

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/hr')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some of them relatively low-resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 7944
'validation' 2500
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}
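The training split shrinks considerably for some configs: `hr` has 7944 training examples against 55000 for `en`. As a rough illustration, the sketch below records the training sizes listed for the configs up to this point and compares each one to English. The names `TRAIN_SIZES` and `relative_train_size` are hypothetical, and the numbers are copied from the split tables on this page rather than queried from TFDS.

```python
# Training-split sizes as listed in the split tables above (en through hr).
TRAIN_SIZES = {
    "en": 55000,
    "da": 55000,
    "de": 55000,
    "nl": 55000,
    "sv": 42490,
    "bg": 15986,
    "cs": 23187,
    "hr": 7944,
}


def relative_train_size(lang: str, reference: str = "en") -> float:
    """Return the train size of `lang` as a fraction of the reference config."""
    return TRAIN_SIZES[lang] / TRAIN_SIZES[reference]


# Config with the fewest training examples among those recorded above.
smallest = min(TRAIN_SIZES, key=TRAIN_SIZES.get)

for lang, size in sorted(TRAIN_SIZES.items(), key=lambda item: item[1]):
    print(f"{lang}: {size} examples ({relative_train_size(lang):.1%} of en)")
```

A comparison like this can help decide which configs need few-shot or cross-lingual transfer setups rather than monolingual training.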

pl

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/pl')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some of them relatively low-resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 23197
'validation' 5000
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

sk

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/sk')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some of them relatively low-resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 22971
'validation' 5000
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

sl

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/sl')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some of them relatively low-resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 23184
'validation' 5000
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

es

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/es')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some of them relatively low-resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 52785
'validation' 5000
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

fr

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/fr')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some of them relatively low-resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 55000
'validation' 5000
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

it

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/it')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some of them relatively low-resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 55000
'validation' 5000
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

pt

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/pt')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some of them relatively low-resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 52370
'validation' 5000
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

ro

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/ro')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some of them relatively low-resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 15921
'validation' 5000
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

et

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/et')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some of them relatively low-resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 23126
'validation' 5000
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

fi

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/fi')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some of them relatively low-resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 42497
'validation' 5000
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

hu

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/hu')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some of them relatively low-resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 22664
'validation' 5000
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

lt

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/lt')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some of them relatively low-resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 23188
'validation' 5000
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}
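The `names` list in the feature spec gives the EUROVOC concept identifier for each class index. Assuming, as is usual for a `ClassLabel` feature, that integer label `i` refers to `names[i]`, the mapping in both directions can be sketched as follows. The helper names `decode_labels` and `encode_labels` are illustrative only.

```python
from typing import Dict, List, Sequence

# Class names copied from the "names" list in the feature spec, in index order.
EUROVOC_IDS: List[str] = [
    "100149", "100160", "100148", "100147", "100152", "100143", "100156",
    "100158", "100154", "100153", "100142", "100145", "100150", "100162",
    "100159", "100144", "100151", "100157", "100161", "100146", "100155",
]

# Reverse lookup: EUROVOC identifier -> class index.
ID_TO_INDEX: Dict[str, int] = {name: i for i, name in enumerate(EUROVOC_IDS)}


def decode_labels(label_ids: Sequence[int]) -> List[str]:
    """Map class indices to EUROVOC concept identifiers."""
    return [EUROVOC_IDS[i] for i in label_ids]


def encode_labels(concept_ids: Sequence[str]) -> List[int]:
    """Map EUROVOC concept identifiers back to class indices."""
    return [ID_TO_INDEX[c] for c in concept_ids]


decoded = decode_labels([0, 10, 20])
print(decoded)
print(encode_labels(decoded))
```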

lv

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/lv')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some of them relatively low-resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 23208
'validation' 5000
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}
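Each per-language configuration on this page is loaded under a name of the form `huggingface:multi_eurlex/<config>`. The sketch below builds that name and rejects unknown configurations before any download is attempted. The helper `tfds_name` is a hypothetical convenience function; the language codes are the ones listed in the `all_languages` feature spec on this page.

```python
# Language codes as listed in the "languages" field of the all_languages config.
LANGUAGES = (
    "en", "da", "de", "nl", "sv", "bg", "cs", "hr", "pl", "sk", "sl", "es",
    "fr", "it", "pt", "ro", "et", "fi", "hu", "lt", "lv", "el", "mt",
)
CONFIGS = LANGUAGES + ("all_languages",)


def tfds_name(config: str) -> str:
    """Return the TFDS name for a multi_eurlex configuration, validating it first."""
    if config not in CONFIGS:
        raise ValueError(f"unknown multi_eurlex configuration: {config!r}")
    return f"huggingface:multi_eurlex/{config}"


name = tfds_name("lv")
print(name)
# With tensorflow_datasets installed, the name can then be passed on:
# ds = tfds.load(name)
```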

el

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/el')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some of them relatively low-resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 55000
'validation' 5000
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

mt

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/mt')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some of them relatively low-resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 17521
'validation' 5000
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

all_languages

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/all_languages')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some of them relatively low-resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 55000
'validation' 5000
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "languages": [
            "en",
            "da",
            "de",
            "nl",
            "sv",
            "bg",
            "cs",
            "hr",
            "pl",
            "sk",
            "sl",
            "es",
            "fr",
            "it",
            "pt",
            "ro",
            "et",
            "fi",
            "hu",
            "lt",
            "lv",
            "el",
            "mt"
        ],
        "id": null,
        "_type": "Translation"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}
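Unlike the per-language configurations, `all_languages` declares `text` as a `Translation` feature over 23 language codes rather than a single string. How the TFDS wrapper exposes this field is not stated here; the sketch below assumes each example's `text` behaves like a mapping from language code to string, where a translation may be missing or `None`. The helper `pick_text` and the sample record are hypothetical.

```python
from typing import Mapping, Optional, Sequence, Tuple


def pick_text(
    translations: Mapping[str, Optional[str]],
    preferred: Sequence[str] = ("en",),
) -> Optional[Tuple[str, str]]:
    """Return (language, text) for the first preferred language that has a translation."""
    for lang in preferred:
        text = translations.get(lang)
        if text:  # skips both missing keys and None/empty values
            return lang, text
    return None


# Hypothetical record: the Croatian translation is absent, English is present.
record_text = {"en": "Commission Regulation (illustrative placeholder)", "hr": None}

chosen = pick_text(record_text, preferred=("hr", "en"))
print(chosen)
```

Listing several languages in `preferred` gives a simple fallback order for documents that were not translated into every language.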

References:

en

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/en')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some of them relatively low-resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 55000
'validation' 5000
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

da

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/da')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some of them relatively low-resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 55000
'validation' 5000
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

de

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/de')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some of them relatively low-resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 55000
'validation' 5000
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

nl

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/nl')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some of them relatively low-resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 55000
'validation' 5000
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

sv

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/sv')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some of them relatively low-resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 42490
'validation' 5000
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

bg

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/bg')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some of them relatively low-resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 15986
'validation' 5000
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

cs

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/cs')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some of them relatively low-resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 23187
'validation' 5000
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

hr

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/hr')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some of them relatively low-resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 7944
'validation' 2500
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}
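The training splits differ considerably in size from one configuration to the next, which matters when comparing per-language results. As a rough sketch, the snippet below expresses a few of the training split sizes listed on this page as a fraction of the largest one; the variable names are illustrative, and only a subset of configurations is included.

```python
# Training split sizes copied from the split tables on this page (subset only).
TRAIN_SIZES = {
    "en": 55000,
    "sv": 42490,
    "cs": 23187,
    "bg": 15986,
    "hr": 7944,
}

largest = max(TRAIN_SIZES.values())

# Fraction of the largest training split available in each configuration.
coverage = {lang: size / largest for lang, size in TRAIN_SIZES.items()}

for lang, frac in sorted(coverage.items(), key=lambda item: item[1]):
    print(f"{lang}: {TRAIN_SIZES[lang]:>6} examples ({frac:.1%} of largest)")
```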

pl

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/pl')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some of them relatively low-resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 23197
'validation' 5000
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

sk

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/sk')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some of them relatively low-resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 22971
'validation' 5000
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

sl

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/sl')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some of them relatively low-resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 23184
'validation' 5000
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

es

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/es')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some of them relatively low-resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 52785
'validation' 5000
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

fr

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/fr')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some of them relatively low-resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 55000
'validation' 5000
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

it

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/it')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some of them relatively low-resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 55000
'validation' 5000
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

pt

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/pt')
  • Description:
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some of them relatively low-resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publications Office of the EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test' 5000
'train' 52370
'validation' 5000
  • Features:
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}
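
Because `labels` is a variable-length sequence (`"length": -1`) over 21 classes, a multi-label classifier typically needs it converted to a fixed-size 0/1 target vector. The sketch below shows one way to do that in plain Python; `to_multi_hot` and the sample indices are hypothetical and not part of the dataset or TFDS.

```python
NUM_CLASSES = 21  # "num_classes" in the feature spec above


def to_multi_hot(label_indices, num_classes=NUM_CLASSES):
    """Turn a variable-length list of class indices into a fixed-size 0/1 vector."""
    vector = [0] * num_classes
    for index in label_indices:
        if not 0 <= index < num_classes:
            raise ValueError(f"label index {index} is outside 0..{num_classes - 1}")
        vector[index] = 1
    return vector


# Made-up indices for illustration; not taken from a real example.
target = to_multi_hot([0, 3, 20])
print(target)
```

An example with no labels maps to an all-zero vector, which is a valid target for a per-class sigmoid output.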

ro

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/ro')
  • Description :
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some low-ish resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publication Office of EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License : No known license
  • Version : 1.0.0
  • Splits :
Split Examples
'test' 5000
'train' 15921
'validation' 5000
  • Features :
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

et

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/et')
  • Description :
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some low-ish resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publication Office of EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License : No known license
  • Version : 1.0.0
  • Splits :
Split Examples
'test' 5000
'train' 23126
'validation' 5000
  • Features :
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

fi

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/fi')
  • Description :
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some low-ish resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publication Office of EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License : No known license
  • Version : 1.0.0
  • Splits :
Split Examples
'test' 5000
'train' 42497
'validation' 5000
  • Features :
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

hu

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/hu')
  • Description :
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some low-ish resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publication Office of EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License : No known license
  • Version : 1.0.0
  • Splits :
Split Examples
'test' 5000
'train' 22664
'validation' 5000
  • Features :
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

lt

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/lt')
  • Description :
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some low-ish resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publication Office of EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License : No known license
  • Version : 1.0.0
  • Splits :
Split Examples
'test' 5000
'train' 23188
'validation' 5000
  • Features :
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

lv

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/lv')
  • Description :
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some low-ish resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publication Office of EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License : No known license
  • Version : 1.0.0
  • Splits :
Split Examples
'test' 5000
'train' 23208
'validation' 5000
  • Features :
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

el

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/el')
  • Description :
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some low-ish resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publication Office of EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License : No known license
  • Version : 1.0.0
  • Splits :
Split Examples
'test' 5000
'train' 55000
'validation' 5000
  • Features :
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

mt

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/mt')
  • Description :
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some low-ish resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publication Office of EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License : No known license
  • Version : 1.0.0
  • Splits :
Split Examples
'test' 5000
'train' 17521
'validation' 5000
  • Features :
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

all_languages

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:multi_eurlex/all_languages')
  • Description :
MultiEURLEX comprises 65k EU laws in 23 official EU languages (some low-ish resource).
Each EU law has been annotated with EUROVOC concepts (labels) by the Publication Office of EU.
As with the English EURLEX, the goal is to predict the relevant EUROVOC concepts (labels);
this is a multi-label classification task (given the text, predict multiple labels).
  • License : No known license
  • Version : 1.0.0
  • Splits :
Split Examples
'test' 5000
'train' 55000
'validation' 5000
  • Features :
{
    "celex_id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "languages": [
            "en",
            "da",
            "de",
            "nl",
            "sv",
            "bg",
            "cs",
            "hr",
            "pl",
            "sk",
            "sl",
            "es",
            "fr",
            "it",
            "pt",
            "ro",
            "et",
            "fi",
            "hu",
            "lt",
            "lv",
            "el",
            "mt"
        ],
        "id": null,
        "_type": "Translation"
    },
    "labels": {
        "feature": {
            "num_classes": 21,
            "names": [
                "100149",
                "100160",
                "100148",
                "100147",
                "100152",
                "100143",
                "100156",
                "100158",
                "100154",
                "100153",
                "100142",
                "100145",
                "100150",
                "100162",
                "100159",
                "100144",
                "100151",
                "100157",
                "100161",
                "100146",
                "100155"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}
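
In the `all_languages` config, `text` is a `Translation` feature over the 23 language codes rather than a single string. The sketch below assumes each example arrives as a plain dict whose `text` field maps language codes to strings, with `None` for a law that has no version in that language; that layout, the helper name `flatten_translations`, and the sample record are all assumptions for illustration, and the structure TFDS actually yields may differ.

```python
def flatten_translations(example):
    """Expand one multilingual example into one row per available language.

    Assumes example["text"] is a dict of language code -> text (or None).
    """
    rows = []
    for lang, text in example["text"].items():
        if text is None:
            continue  # no version of this law in this language
        rows.append({
            "celex_id": example["celex_id"],
            "lang": lang,
            "text": text,
            "labels": list(example["labels"]),
        })
    return rows


# Made-up record for illustration; not taken from the dataset.
sample = {
    "celex_id": "EXAMPLE-ID",
    "text": {"en": "Sample English text", "da": "Eksempel på dansk tekst", "mt": None},
    "labels": [1, 13],
}
flat_rows = flatten_translations(sample)
for row in flat_rows:
    print(row["lang"], row["labels"])
```

Every row keeps the same `celex_id` and `labels`, since the annotations belong to the law rather than to a particular language version.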