xglue

References:

ner

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:xglue/ner')
  • Description:
XGLUE is a new benchmark dataset to evaluate the performance of cross-lingual pre-trained
models with respect to cross-lingual natural language understanding and generation.
The benchmark is composed of the following 11 tasks:
- NER
- POS Tagging (POS)
- News Classification (NC)
- MLQA
- XNLI
- PAWS-X
- Query-Ad Matching (QADSM)
- Web Page Ranking (WPR)
- QA Matching (QAM)
- Question Generation (QG)
- News Title Generation (NTG)

For more information, please take a look at https://microsoft.github.io/XGLUE/.
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test.de' 3007
'test.en' 3454
'test.es' 1523
'test.nl' 5202
'train' 14042
'validation.de' 2874
'validation.en' 3252
'validation.es' 1923
'validation.nl' 2895
  • Features:
{
    "words": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "ner": {
        "feature": {
            "num_classes": 9,
            "names": [
                "O",
                "B-PER",
                "I-PER",
                "B-ORG",
                "I-ORG",
                "B-LOC",
                "I-LOC",
                "B-MISC",
                "I-MISC"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}
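
The `ner` feature stores each tag as an integer index into the `names` list above. A minimal sketch of turning those indices back into entity spans, assuming the usual BIO reading of the tag names. The helper `decode_entities` and the sample sentence are hypothetical and are not part of the dataset or of TFDS:

```python
# Tag names copied from the "ner" ClassLabel feature above.
NER_NAMES = ["O", "B-PER", "I-PER", "B-ORG", "I-ORG",
             "B-LOC", "I-LOC", "B-MISC", "I-MISC"]


def decode_entities(words, tag_ids):
    """Group words into (entity_type, text) spans, assuming BIO tagging."""
    entities = []
    current_type, current_words = None, []
    for word, tag_id in zip(words, tag_ids):
        tag = NER_NAMES[tag_id]
        if tag.startswith("I-") and current_type == tag[2:]:
            # Continuation of the entity that is currently open.
            current_words.append(word)
            continue
        # Any other tag closes the open entity, if there is one.
        if current_type is not None:
            entities.append((current_type, " ".join(current_words)))
        if tag == "O":
            current_type, current_words = None, []
        else:
            # "B-X" (or a stray "I-X") starts a new entity of type X.
            current_type, current_words = tag[2:], [word]
    if current_type is not None:
        entities.append((current_type, " ".join(current_words)))
    return entities


# Hypothetical example, not taken from the dataset.
words = ["Angela", "Merkel", "visited", "Paris", "."]
tag_ids = [1, 2, 0, 5, 0]
print(decode_entities(words, tag_ids))
```

Keeping the name list next to the decoder makes the mapping explicit; if the feature definition changes, only `NER_NAMES` needs updating.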

pos

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:xglue/pos')
  • Description:
XGLUE is a new benchmark dataset to evaluate the performance of cross-lingual pre-trained
models with respect to cross-lingual natural language understanding and generation.
The benchmark is composed of the following 11 tasks:
- NER
- POS Tagging (POS)
- News Classification (NC)
- MLQA
- XNLI
- PAWS-X
- Query-Ad Matching (QADSM)
- Web Page Ranking (WPR)
- QA Matching (QAM)
- Question Generation (QG)
- News Title Generation (NTG)

For more information, please take a look at https://microsoft.github.io/XGLUE/.
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test.ar' 679
'test.bg' 1115
'test.de' 976
'test.el' 455
'test.en' 2076
'test.es' 425
'test.fr' 415
'test.hi' 1683
'test.it' 481
'test.nl' 595
'test.pl' 2214
'test.ru' 600
'test.th' 497
'test.tr' 982
'test.ur' 534
'test.vi' 799
'test.zh' 499
'train' 25376
'validation.ar' 908
'validation.bg' 1114
'validation.de' 798
'validation.el' 402
'validation.en' 2001
'validation.es' 1399
'validation.fr' 1475
'validation.hi' 1658
'validation.it' 563
'validation.nl' 717
'validation.pl' 2214
'validation.ru' 578
'validation.th' 497
'validation.tr' 987
'validation.ur' 551
'validation.vi' 799
'validation.zh' 499
  • Features:
{
    "words": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "pos": {
        "feature": {
            "num_classes": 17,
            "names": [
                "ADJ",
                "ADP",
                "ADV",
                "AUX",
                "CCONJ",
                "DET",
                "INTJ",
                "NOUN",
                "NUM",
                "PART",
                "PRON",
                "PROPN",
                "PUNCT",
                "SCONJ",
                "SYM",
                "VERB",
                "X"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}
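
The `pos` feature is a sequence of integer indices into the 17 tag names listed above, which match the Universal Dependencies universal POS tag set. A short sketch of pairing words with tag names and counting tags; `tag_sentence` and the sample sentence are hypothetical illustrations, not part of the dataset:

```python
from collections import Counter

# Tag names copied from the "pos" ClassLabel feature above.
POS_NAMES = ["ADJ", "ADP", "ADV", "AUX", "CCONJ", "DET", "INTJ", "NOUN",
             "NUM", "PART", "PRON", "PROPN", "PUNCT", "SCONJ", "SYM",
             "VERB", "X"]


def tag_sentence(words, tag_ids):
    """Pair each word with the name of its part-of-speech tag."""
    if len(words) != len(tag_ids):
        raise ValueError("words and tag_ids must have the same length")
    return [(word, POS_NAMES[i]) for word, i in zip(words, tag_ids)]


# Hypothetical example, not taken from the dataset.
words = ["The", "cat", "sleeps", "."]
tag_ids = [5, 7, 15, 12]
tagged = tag_sentence(words, tag_ids)
print(tagged)
print(Counter(tag for _, tag in tagged))
```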

mlqa

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:xglue/mlqa')
  • Description:
XGLUE is a new benchmark dataset to evaluate the performance of cross-lingual pre-trained
models with respect to cross-lingual natural language understanding and generation.
The benchmark is composed of the following 11 tasks:
- NER
- POS Tagging (POS)
- News Classification (NC)
- MLQA
- XNLI
- PAWS-X
- Query-Ad Matching (QADSM)
- Web Page Ranking (WPR)
- QA Matching (QAM)
- Question Generation (QG)
- News Title Generation (NTG)

For more information, please take a look at https://microsoft.github.io/XGLUE/.
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test.ar' 5335
'test.de' 4517
'test.en' 11590
'test.es' 5253
'test.hi' 4918
'test.vi' 5495
'test.zh' 5137
'train' 87599
'validation.ar' 517
'validation.de' 512
'validation.en' 1148
'validation.es' 500
'validation.hi' 507
'validation.vi' 511
'validation.zh' 504
  • Features:
{
    "context": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "question": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "answers": {
        "feature": {
            "answer_start": {
                "dtype": "int32",
                "id": null,
                "_type": "Value"
            },
            "text": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}
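
Each `mlqa` record pairs a `context` and a `question` with a sequence of answers, where every answer has an `answer_start` and a `text`. A minimal sketch of recovering the answer spans, assuming that `answers` arrives as a dict of parallel lists and that `answer_start` is a character offset into `context` (the SQuAD-style convention). Both points are assumptions to check against real records; `answer_spans` and the sample record are hypothetical:

```python
def answer_spans(example):
    """Return (start, end, text, matches) for each answer in one example.

    Assumes "answers" is a dict of parallel lists and that "answer_start"
    is a character offset into "context".
    """
    context = example["context"]
    answers = example["answers"]
    spans = []
    for start, text in zip(answers["answer_start"], answers["text"]):
        end = start + len(text)
        # "matches" reports whether the offset really points at the text.
        spans.append((start, end, text, context[start:end] == text))
    return spans


# Hypothetical record shaped like the feature schema above.
example = {
    "context": "The river Rhine flows through Basel.",
    "question": "Which river flows through Basel?",
    "answers": {"answer_start": [10], "text": ["Rhine"]},
}
print(answer_spans(example))
```

The `matches` flag is a cheap sanity check: if it is false for many records, the offset assumption (characters rather than bytes or tokens) is probably wrong.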

nc

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:xglue/nc')
  • Description:
XGLUE is a new benchmark dataset to evaluate the performance of cross-lingual pre-trained
models with respect to cross-lingual natural language understanding and generation.
The benchmark is composed of the following 11 tasks:
- NER
- POS Tagging (POS)
- News Classification (NC)
- MLQA
- XNLI
- PAWS-X
- Query-Ad Matching (QADSM)
- Web Page Ranking (WPR)
- QA Matching (QAM)
- Question Generation (QG)
- News Title Generation (NTG)

For more information, please take a look at https://microsoft.github.io/XGLUE/.
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test.de' 10000
'test.en' 10000
'test.es' 10000
'test.fr' 10000
'test.ru' 10000
'train' 100000
'validation.de' 10000
'validation.en' 10000
'validation.es' 10000
'validation.fr' 10000
'validation.ru' 10000
  • Features:
{
    "news_title": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "news_body": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "news_category": {
        "num_classes": 10,
        "names": [
            "foodanddrink",
            "sports",
            "travel",
            "finance",
            "lifestyle",
            "news",
            "entertainment",
            "health",
            "video",
            "autos"
        ],
        "names_file": null,
        "id": null,
        "_type": "ClassLabel"
    }
}

xnli

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:xglue/xnli')
  • Description:
XGLUE is a new benchmark dataset to evaluate the performance of cross-lingual pre-trained
models with respect to cross-lingual natural language understanding and generation.
The benchmark is composed of the following 11 tasks:
- NER
- POS Tagging (POS)
- News Classification (NC)
- MLQA
- XNLI
- PAWS-X
- Query-Ad Matching (QADSM)
- Web Page Ranking (WPR)
- QA Matching (QAM)
- Question Generation (QG)
- News Title Generation (NTG)

For more information, please take a look at https://microsoft.github.io/XGLUE/.
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test.ar' 5010
'test.bg' 5010
'test.de' 5010
'test.el' 5010
'test.en' 5010
'test.es' 5010
'test.fr' 5010
'test.hi' 5010
'test.ru' 5010
'test.sw' 5010
'test.th' 5010
'test.tr' 5010
'test.ur' 5010
'test.vi' 5010
'test.zh' 5010
'train' 392702
'validation.ar' 2490
'validation.bg' 2490
'validation.de' 2490
'validation.el' 2490
'validation.en' 2490
'validation.es' 2490
'validation.fr' 2490
'validation.hi' 2490
'validation.ru' 2490
'validation.sw' 2490
'validation.th' 2490
'validation.tr' 2490
'validation.ur' 2490
'validation.vi' 2490
'validation.zh' 2490
  • Features:
{
    "premise": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "hypothesis": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "label": {
        "num_classes": 3,
        "names": [
            "entailment",
            "neutral",
            "contradiction"
        ],
        "names_file": null,
        "id": null,
        "_type": "ClassLabel"
    }
}
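
The split names in the table for this config follow a `<split>.<language>` pattern, while the single `train` split carries no language suffix. A small sketch of parsing those names and totalling examples per split type; `parse_split` is a hypothetical helper, and only a few rows of the table are copied here to keep the example short:

```python
from collections import defaultdict

# A few rows copied from the xnli split table above.
SPLIT_SIZES = {
    "test.ar": 5010,
    "test.en": 5010,
    "train": 392702,
    "validation.ar": 2490,
    "validation.en": 2490,
}


def parse_split(name):
    """Split 'test.de' into ('test', 'de'); 'train' has no language suffix."""
    base, _, language = name.partition(".")
    return base, language or None


totals = defaultdict(int)
languages = defaultdict(list)
for name, size in SPLIT_SIZES.items():
    base, language = parse_split(name)
    totals[base] += size
    if language is not None:
        languages[base].append(language)

print(dict(totals))
print(dict(languages))
```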

paws-x

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:xglue/paws-x')
  • Description:
XGLUE is a new benchmark dataset to evaluate the performance of cross-lingual pre-trained
models with respect to cross-lingual natural language understanding and generation.
The benchmark is composed of the following 11 tasks:
- NER
- POS Tagging (POS)
- News Classification (NC)
- MLQA
- XNLI
- PAWS-X
- Query-Ad Matching (QADSM)
- Web Page Ranking (WPR)
- QA Matching (QAM)
- Question Generation (QG)
- News Title Generation (NTG)

For more information, please take a look at https://microsoft.github.io/XGLUE/.
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test.de' 2000
'test.en' 2000
'test.es' 2000
'test.fr' 2000
'train' 49401
'validation.de' 2000
'validation.en' 2000
'validation.es' 2000
'validation.fr' 2000
  • Features:
{
    "sentence1": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "sentence2": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "label": {
        "num_classes": 2,
        "names": [
            "different",
            "same"
        ],
        "names_file": null,
        "id": null,
        "_type": "ClassLabel"
    }
}

qadsm

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:xglue/qadsm')
  • Description:
XGLUE is a new benchmark dataset to evaluate the performance of cross-lingual pre-trained
models with respect to cross-lingual natural language understanding and generation.
The benchmark is composed of the following 11 tasks:
- NER
- POS Tagging (POS)
- News Classification (NC)
- MLQA
- XNLI
- PAWS-X
- Query-Ad Matching (QADSM)
- Web Page Ranking (WPR)
- QA Matching (QAM)
- Question Generation (QG)
- News Title Generation (NTG)

For more information, please take a look at https://microsoft.github.io/XGLUE/.
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test.de' 10000
'test.en' 10000
'test.fr' 10000
'train' 100000
'validation.de' 10000
'validation.en' 10000
'validation.fr' 10000
  • Features:
{
    "query": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "ad_title": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "ad_description": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "relevance_label": {
        "num_classes": 2,
        "names": [
            "Bad",
            "Good"
        ],
        "names_file": null,
        "id": null,
        "_type": "ClassLabel"
    }
}

wpr

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:xglue/wpr')
  • Description:
XGLUE is a new benchmark dataset to evaluate the performance of cross-lingual pre-trained
models with respect to cross-lingual natural language understanding and generation.
The benchmark is composed of the following 11 tasks:
- NER
- POS Tagging (POS)
- News Classification (NC)
- MLQA
- XNLI
- PAWS-X
- Query-Ad Matching (QADSM)
- Web Page Ranking (WPR)
- QA Matching (QAM)
- Question Generation (QG)
- News Title Generation (NTG)

For more information, please take a look at https://microsoft.github.io/XGLUE/.
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test.de' 9997
'test.en' 10004
'test.es' 10006
'test.fr' 10020
'test.it' 10001
'test.pt' 10015
'test.zh' 9999
'train' 99997
'validation.de' 10004
'validation.en' 10008
'validation.es' 10004
'validation.fr' 10005
'validation.it' 10003
'validation.pt' 10001
'validation.zh' 10002
  • Features:
{
    "query": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "web_page_title": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "web_page_snippet": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "relavance_label": {
        "num_classes": 5,
        "names": [
            "Bad",
            "Fair",
            "Good",
            "Excellent",
            "Perfect"
        ],
        "names_file": null,
        "id": null,
        "_type": "ClassLabel"
    }
}
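
The `relavance_label` feature (spelled this way in the schema) grades each query and page pair on a five-level scale from "Bad" to "Perfect". Graded labels like these are commonly scored with nDCG. The page above does not say which metric to use, so the following is only a sketch under that assumption, using the exponential-gain variant of DCG; `dcg`, `ndcg`, and the sample ranking are hypothetical:

```python
import math

# Label names copied from the "relavance_label" ClassLabel feature above.
RELEVANCE_NAMES = ["Bad", "Fair", "Good", "Excellent", "Perfect"]


def dcg(grades):
    """Discounted cumulative gain with exponential gain (2**grade - 1)."""
    return sum((2 ** g - 1) / math.log2(rank + 2)
               for rank, g in enumerate(grades))


def ndcg(grades_in_ranked_order):
    """nDCG of one ranked list of integer relevance grades (0..4)."""
    ideal = dcg(sorted(grades_in_ranked_order, reverse=True))
    return dcg(grades_in_ranked_order) / ideal if ideal > 0 else 0.0


# Hypothetical ranking for one query: the grades of the pages, listed in
# the order a model ranked them.
ranked = [RELEVANCE_NAMES.index(n) for n in ["Good", "Perfect", "Bad"]]
print(round(ndcg(ranked), 4))
```

A ranking that already lists pages from most to least relevant scores 1.0, and a query whose pages are all "Bad" is scored 0.0 here by convention.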

qam

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:xglue/qam')
  • Description:
XGLUE is a new benchmark dataset to evaluate the performance of cross-lingual pre-trained
models with respect to cross-lingual natural language understanding and generation.
The benchmark is composed of the following 11 tasks:
- NER
- POS Tagging (POS)
- News Classification (NC)
- MLQA
- XNLI
- PAWS-X
- Query-Ad Matching (QADSM)
- Web Page Ranking (WPR)
- QA Matching (QAM)
- Question Generation (QG)
- News Title Generation (NTG)

For more information, please take a look at https://microsoft.github.io/XGLUE/.
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test.de' 10000
'test.en' 10000
'test.fr' 10000
'train' 100000
'validation.de' 10000
'validation.en' 10000
'validation.fr' 10000
  • Features:
{
    "question": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "answer": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "label": {
        "num_classes": 2,
        "names": [
            "False",
            "True"
        ],
        "names_file": null,
        "id": null,
        "_type": "ClassLabel"
    }
}

qg

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:xglue/qg')
  • Description:
XGLUE is a new benchmark dataset to evaluate the performance of cross-lingual pre-trained
models with respect to cross-lingual natural language understanding and generation.
The benchmark is composed of the following 11 tasks:
- NER
- POS Tagging (POS)
- News Classification (NC)
- MLQA
- XNLI
- PAWS-X
- Query-Ad Matching (QADSM)
- Web Page Ranking (WPR)
- QA Matching (QAM)
- Question Generation (QG)
- News Title Generation (NTG)

For more information, please take a look at https://microsoft.github.io/XGLUE/.
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test.de' 10000
'test.en' 10000
'test.es' 10000
'test.fr' 10000
'test.it' 10000
'test.pt' 10000
'train' 100000
'validation.de' 10000
'validation.en' 10000
'validation.es' 10000
'validation.fr' 10000
'validation.it' 10000
'validation.pt' 10000
  • Features:
{
    "answer_passage": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "question": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

ntg

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:xglue/ntg')
  • Description:
XGLUE is a new benchmark dataset to evaluate the performance of cross-lingual pre-trained
models with respect to cross-lingual natural language understanding and generation.
The benchmark is composed of the following 11 tasks:
- NER
- POS Tagging (POS)
- News Classification (NC)
- MLQA
- XNLI
- PAWS-X
- Query-Ad Matching (QADSM)
- Web Page Ranking (WPR)
- QA Matching (QAM)
- Question Generation (QG)
- News Title Generation (NTG)

For more information, please take a look at https://microsoft.github.io/XGLUE/.
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split Examples
'test.de' 10000
'test.en' 10000
'test.es' 10000
'test.fr' 10000
'test.ru' 10000
'train' 300000
'validation.de' 10000
'validation.en' 10000
'validation.es' 10000
'validation.fr' 10000
'validation.ru' 10000
  • Features:
{
    "news_body": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "news_title": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}