xglue

ner

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:xglue/ner')
  • Description:
XGLUE is a new benchmark dataset to evaluate the performance of cross-lingual pre-trained
models with respect to cross-lingual natural language understanding and generation.
The benchmark is composed of the following 11 tasks:
- NER
- POS Tagging (POS)
- News Classification (NC)
- MLQA
- XNLI
- PAWS-X
- Query-Ad Matching (QADSM)
- Web Page Ranking (WPR)
- QA Matching (QAM)
- Question Generation (QG)
- News Title Generation (NTG)

For more information, please take a look at https://microsoft.github.io/XGLUE/.
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test.de' 3007
'test.en' 3454
'test.es' 1523
'test.nl' 5202
'train' 14042
'validation.de' 2874
'validation.en' 3252
'validation.es' 1923
'validation.nl' 2895
  • Features:
{
    "words": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "ner": {
        "feature": {
            "num_classes": 9,
            "names": [
                "O",
                "B-PER",
                "I-PER",
                "B-ORG",
                "I-ORG",
                "B-LOC",
                "I-LOC",
                "B-MISC",
                "I-MISC"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}
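
The `ner` feature stores one integer class id per word, and each id indexes into the `names` list above. As a rough sketch (not part of the dataset tooling), the snippet below maps ids back to BIO tags and groups them into entity spans. The helper `extract_entities` and the sample sentence are hypothetical, and treating a stray `I-` tag as the start of a new span is an assumption made here.

```python
# Label names copied from the "ner" ClassLabel listed above.
NER_NAMES = ["O", "B-PER", "I-PER", "B-ORG", "I-ORG",
             "B-LOC", "I-LOC", "B-MISC", "I-MISC"]


def extract_entities(words, label_ids):
    """Group BIO-tagged words into (entity_type, text) spans."""
    entities = []
    current_type, current_words = None, []
    for word, label_id in zip(words, label_ids):
        tag = NER_NAMES[label_id]
        starts_span = tag.startswith("B-") or (
            tag.startswith("I-") and tag[2:] != current_type
        )
        if starts_span:
            # Close any open span, then open a new one.
            if current_words:
                entities.append((current_type, " ".join(current_words)))
            current_type, current_words = tag[2:], [word]
        elif tag.startswith("I-"):
            # Continuation of the span that is currently open.
            current_words.append(word)
        else:
            # "O": close any open span.
            if current_words:
                entities.append((current_type, " ".join(current_words)))
            current_type, current_words = None, []
    if current_words:
        entities.append((current_type, " ".join(current_words)))
    return entities


# Hypothetical sentence, not taken from the dataset.
words = ["Angela", "Merkel", "visited", "Paris", "."]
label_ids = [1, 2, 0, 5, 0]
print(extract_entities(words, label_ids))
```

When the examples come from TFDS rather than plain Python lists, the words will typically arrive as byte strings and need decoding first.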

pos

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:xglue/pos')
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test.ar' 679
'test.bg' 1115
'test.de' 976
'test.el' 455
'test.en' 2076
'test.es' 425
'test.fr' 415
'test.hi' 1683
'test.it' 481
'test.nl' 595
'test.pl' 2214
'test.ru' 600
'test.th' 497
'test.tr' 982
'test.ur' 534
'test.vi' 799
'test.zh' 499
'train' 25376
'validation.ar' 908
'validation.bg' 1114
'validation.de' 798
'validation.el' 402
'validation.en' 2001
'validation.es' 1399
'validation.fr' 1475
'validation.hi' 1658
'validation.it' 563
'validation.nl' 717
'validation.pl' 2214
'validation.ru' 578
'validation.th' 497
'validation.tr' 987
'validation.ur' 551
'validation.vi' 799
'validation.zh' 499
  • Features:
{
    "words": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "pos": {
        "feature": {
            "num_classes": 17,
            "names": [
                "ADJ",
                "ADP",
                "ADV",
                "AUX",
                "CCONJ",
                "DET",
                "INTJ",
                "NOUN",
                "NUM",
                "PART",
                "PRON",
                "PROPN",
                "PUNCT",
                "SCONJ",
                "SYM",
                "VERB",
                "X"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}
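
The split names listed for this config follow a `<split>.<language>` pattern, while `train` carries no language suffix. The sketch below groups a few of them by split kind and language code. `parse_split` is a hypothetical helper, and the counts are a subset copied from the `pos` split table above.

```python
def parse_split(name):
    """Turn 'validation.de' into ('validation', 'de'); 'train' has no language."""
    kind, _, lang = name.partition(".")
    return kind, (lang or None)


# A subset of the pos splits and example counts listed above.
POS_SPLITS = {
    "train": 25376,
    "test.en": 2076,
    "test.de": 976,
    "validation.en": 2001,
    "validation.de": 798,
}

# Nested mapping: split kind -> language code -> number of examples.
by_kind = {}
for name, count in POS_SPLITS.items():
    kind, lang = parse_split(name)
    by_kind.setdefault(kind, {})[lang] = count

print(by_kind["test"])
print(sorted(by_kind["validation"]))
```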

mlqa

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:xglue/mlqa')
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test.ar' 5335
'test.de' 4517
'test.en' 11590
'test.es' 5253
'test.hi' 4918
'test.vi' 5495
'test.zh' 5137
'train' 87599
'validation.ar' 517
'validation.de' 512
'validation.en' 1148
'validation.es' 500
'validation.hi' 507
'validation.vi' 511
'validation.zh' 504
  • Features:
{
    "context": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "question": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "answers": {
        "feature": {
            "answer_start": {
                "dtype": "int32",
                "id": null,
                "_type": "Value"
            },
            "text": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}
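
In the schema above, `answers` is a sequence with parallel `answer_start` and `text` entries. Assuming `answer_start` is a character offset into `context` (the usual SQuAD-style convention, not something stated on this page), a quick sanity check is to slice the context and compare it with the answer text. The helper name and the sample record below are hypothetical.

```python
def answer_spans_match(example):
    """For each answer, check that context[start:start + len(text)] equals text."""
    context = example["context"]
    answers = example["answers"]
    return [
        context[start:start + len(text)] == text
        for start, text in zip(answers["answer_start"], answers["text"])
    ]


# Hypothetical record shaped like the feature schema above.
example = {
    "context": "XGLUE covers eleven tasks.",
    "question": "How many tasks does XGLUE cover?",
    "answers": {"answer_start": [13], "text": ["eleven"]},
}
print(answer_spans_match(example))
```

If the strings arrive as bytes, decode them to `str` before slicing; otherwise the offsets may not line up for non-ASCII text.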

nc

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:xglue/nc')
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test.de' 10000
'test.en' 10000
'test.es' 10000
'test.fr' 10000
'test.ru' 10000
'train' 100000
'validation.de' 10000
'validation.en' 10000
'validation.es' 10000
'validation.fr' 10000
'validation.ru' 10000
  • Features:
{
    "news_title": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "news_body": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "news_category": {
        "num_classes": 10,
        "names": [
            "foodanddrink",
            "sports",
            "travel",
            "finance",
            "lifestyle",
            "news",
            "entertainment",
            "health",
            "video",
            "autos"
        ],
        "names_file": null,
        "id": null,
        "_type": "ClassLabel"
    }
}

xnli

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:xglue/xnli')
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test.ar' 5010
'test.bg' 5010
'test.de' 5010
'test.el' 5010
'test.en' 5010
'test.es' 5010
'test.fr' 5010
'test.hi' 5010
'test.ru' 5010
'test.sw' 5010
'test.th' 5010
'test.tr' 5010
'test.ur' 5010
'test.vi' 5010
'test.zh' 5010
'train' 392702
'validation.ar' 2490
'validation.bg' 2490
'validation.de' 2490
'validation.el' 2490
'validation.en' 2490
'validation.es' 2490
'validation.fr' 2490
'validation.hi' 2490
'validation.ru' 2490
'validation.sw' 2490
'validation.th' 2490
'validation.tr' 2490
'validation.ur' 2490
'validation.vi' 2490
'validation.zh' 2490
  • Features:
{
    "premise": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "hypothesis": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "label": {
        "num_classes": 3,
        "names": [
            "entailment",
            "neutral",
            "contradiction"
        ],
        "names_file": null,
        "id": null,
        "_type": "ClassLabel"
    }
}
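
The `label` feature is a 3-way `ClassLabel`, so each example carries an integer id in the order given by `names` above. The sketch below shows one way to turn ids back into names and to score predictions for a single split with plain accuracy. Accuracy is a common choice for this kind of task, but whether it matches the benchmark's official scoring should be checked separately. The gold and predicted ids are made up for illustration.

```python
# Label names copied from the "label" ClassLabel listed above.
XNLI_NAMES = ["entailment", "neutral", "contradiction"]


def accuracy(gold_ids, predicted_ids):
    """Fraction of positions where the predicted class id equals the gold id."""
    if len(gold_ids) != len(predicted_ids):
        raise ValueError("gold and predicted sequences differ in length")
    if not gold_ids:
        raise ValueError("cannot score an empty split")
    correct = sum(g == p for g, p in zip(gold_ids, predicted_ids))
    return correct / len(gold_ids)


# Hypothetical ids, not taken from the dataset.
gold = [0, 1, 2, 1]
predicted = [0, 2, 2, 1]
print([XNLI_NAMES[i] for i in gold])
print(accuracy(gold, predicted))
```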

paws-x

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:xglue/paws-x')
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test.de' 2000
'test.en' 2000
'test.es' 2000
'test.fr' 2000
'train' 49401
'validation.de' 2000
'validation.en' 2000
'validation.es' 2000
'validation.fr' 2000
  • Features:
{
    "sentence1": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "sentence2": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "label": {
        "num_classes": 2,
        "names": [
            "different",
            "same"
        ],
        "names_file": null,
        "id": null,
        "_type": "ClassLabel"
    }
}

qadsm

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:xglue/qadsm')
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test.de' 10000
'test.en' 10000
'test.fr' 10000
'train' 100000
'validation.de' 10000
'validation.en' 10000
'validation.fr' 10000
  • Features:
{
    "query": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "ad_title": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "ad_description": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "relevance_label": {
        "num_classes": 2,
        "names": [
            "Bad",
            "Good"
        ],
        "names_file": null,
        "id": null,
        "_type": "ClassLabel"
    }
}

wpr

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:xglue/wpr')
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test.de' 9997
'test.en' 10004
'test.es' 10006
'test.fr' 10020
'test.it' 10001
'test.pt' 10015
'test.zh' 9999
'train' 99997
'validation.de' 10004
'validation.en' 10008
'validation.es' 10004
'validation.fr' 10005
'validation.it' 10003
'validation.pt' 10001
'validation.zh' 10002
  • Features:
{
    "query": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "web_page_title": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "web_page_snippet": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "relavance_label": {
        "num_classes": 5,
        "names": [
            "Bad",
            "Fair",
            "Good",
            "Excellent",
            "Perfect"
        ],
        "names_file": null,
        "id": null,
        "_type": "ClassLabel"
    }
}
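
Note that the label field is spelled `relavance_label` in the schema above, so code has to use that exact key. Its five ordered classes (`Bad` through `Perfect`) map to ids 0 to 4, which makes them usable as graded relevance scores. The sketch below computes nDCG, a common metric for graded ranking, over one query's results. The exponential gain and log2 discount used here are one standard variant and an assumption on this page's behalf, not a confirmed match for the benchmark's official scorer. The ranked list is made up.

```python
import math

# Label names copied from the "relavance_label" ClassLabel listed above.
WPR_NAMES = ["Bad", "Fair", "Good", "Excellent", "Perfect"]


def dcg(grades):
    """Discounted cumulative gain with gain 2**grade - 1 and a log2 discount."""
    return sum(
        (2 ** grade - 1) / math.log2(rank + 2)
        for rank, grade in enumerate(grades)
    )


def ndcg(grades_in_ranked_order):
    """DCG of the given ranking divided by the DCG of the ideal ordering."""
    ideal = dcg(sorted(grades_in_ranked_order, reverse=True))
    if ideal == 0:
        # Every result is graded 0, so no ordering is better than another.
        return 0.0
    return dcg(grades_in_ranked_order) / ideal


# Hypothetical ranking for one query, given as label names in ranked order.
ranked = [WPR_NAMES.index(name) for name in ["Good", "Perfect", "Bad"]]
print(round(ndcg(ranked), 4))
```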

qam

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:xglue/qam')
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test.de' 10000
'test.en' 10000
'test.fr' 10000
'train' 100000
'validation.de' 10000
'validation.en' 10000
'validation.fr' 10000
  • Features:
{
    "question": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "answer": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "label": {
        "num_classes": 2,
        "names": [
            "False",
            "True"
        ],
        "names_file": null,
        "id": null,
        "_type": "ClassLabel"
    }
}

qg

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:xglue/qg')
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test.de' 10000
'test.en' 10000
'test.es' 10000
'test.fr' 10000
'test.it' 10000
'test.pt' 10000
'train' 100000
'validation.de' 10000
'validation.en' 10000
'validation.es' 10000
'validation.fr' 10000
'validation.it' 10000
'validation.pt' 10000
  • Features:
{
    "answer_passage": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "question": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

ntg

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:xglue/ntg')
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test.de' 10000
'test.en' 10000
'test.es' 10000
'test.fr' 10000
'test.ru' 10000
'train' 300000
'validation.de' 10000
'validation.en' 10000
'validation.es' 10000
'validation.fr' 10000
'validation.ru' 10000
  • Features:
{
    "news_body": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "news_title": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}