References:
ner
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:xglue/ner')
- Description:
XGLUE is a new benchmark dataset to evaluate the performance of cross-lingual pre-trained
models with respect to cross-lingual natural language understanding and generation.
The benchmark is composed of the following 11 tasks:
- NER
- POS Tagging (POS)
- News Classification (NC)
- MLQA
- XNLI
- PAWS-X
- Query-Ad Matching (QADSM)
- Web Page Ranking (WPR)
- QA Matching (QAM)
- Question Generation (QG)
- News Title Generation (NTG)
For more information, please take a look at https://microsoft.github.io/XGLUE/.
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test.de' | 3007 |
'test.en' | 3454 |
'test.es' | 1523 |
'test.nl' | 5202 |
'train' | 14042 |
'validation.de' | 2874 |
'validation.en' | 3252 |
'validation.es' | 1923 |
'validation.nl' | 2895 |
- Features:
{
  "words": {
    "feature": {
      "dtype": "string",
      "id": null,
      "_type": "Value"
    },
    "length": -1,
    "id": null,
    "_type": "Sequence"
  },
  "ner": {
    "feature": {
      "num_classes": 9,
      "names": [
        "O",
        "B-PER",
        "I-PER",
        "B-ORG",
        "I-ORG",
        "B-LOC",
        "I-LOC",
        "B-MISC",
        "I-MISC"
      ],
      "names_file": null,
      "id": null,
      "_type": "ClassLabel"
    },
    "length": -1,
    "id": null,
    "_type": "Sequence"
  }
}
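The `ner` feature stores integer class ids whose order follows the `names` list above, parallel to `words`. A minimal sketch of decoding those ids and grouping BIO tags into entity spans is shown here. The sentence is a made-up toy example rather than a record from the dataset, and `extract_entities` is a hypothetical helper name, not part of TFDS or the dataset.

```python
# Label names copied from the "ner" ClassLabel feature above; the list index is the class id.
NER_NAMES = ["O", "B-PER", "I-PER", "B-ORG", "I-ORG", "B-LOC", "I-LOC", "B-MISC", "I-MISC"]


def extract_entities(words, tag_ids):
    """Group BIO-tagged words into (entity_type, text) pairs."""
    entities = []
    current_type, current_words = None, []
    for word, tag_id in zip(words, tag_ids):
        tag = NER_NAMES[tag_id]
        if tag.startswith("B-") or (tag.startswith("I-") and tag[2:] != current_type):
            # A new entity starts here (a stray I- tag is treated as a start).
            if current_words:
                entities.append((current_type, " ".join(current_words)))
            current_type, current_words = tag[2:], [word]
        elif tag.startswith("I-"):
            # Continuation of the entity that is currently open.
            current_words.append(word)
        else:
            # "O" closes any open entity.
            if current_words:
                entities.append((current_type, " ".join(current_words)))
            current_type, current_words = None, []
    if current_words:
        entities.append((current_type, " ".join(current_words)))
    return entities


# Toy sentence invented for illustration only.
words = ["Ada", "Lovelace", "visited", "London", "."]
tag_ids = [1, 2, 0, 5, 0]
print(extract_entities(words, tag_ids))
```

Treating a stray `I-` tag as the start of a new entity is one common convention; stricter decoders may discard such tags instead.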
pos
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:xglue/pos')
- Description:
XGLUE is a new benchmark dataset to evaluate the performance of cross-lingual pre-trained
models with respect to cross-lingual natural language understanding and generation.
The benchmark is composed of the following 11 tasks:
- NER
- POS Tagging (POS)
- News Classification (NC)
- MLQA
- XNLI
- PAWS-X
- Query-Ad Matching (QADSM)
- Web Page Ranking (WPR)
- QA Matching (QAM)
- Question Generation (QG)
- News Title Generation (NTG)
For more information, please take a look at https://microsoft.github.io/XGLUE/.
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test.ar' | 679 |
'test.bg' | 1115 |
'test.de' | 976 |
'test.el' | 455 |
'test.en' | 2076 |
'test.es' | 425 |
'test.fr' | 415 |
'test.hi' | 1683 |
'test.it' | 481 |
'test.nl' | 595 |
'test.pl' | 2214 |
'test.ru' | 600 |
'test.th' | 497 |
'test.tr' | 982 |
'test.ur' | 534 |
'test.vi' | 799 |
'test.zh' | 499 |
'train' | 25376 |
'validation.ar' | 908 |
'validation.bg' | 1114 |
'validation.de' | 798 |
'validation.el' | 402 |
'validation.en' | 2001 |
'validation.es' | 1399 |
'validation.fr' | 1475 |
'validation.hi' | 1658 |
'validation.it' | 563 |
'validation.nl' | 717 |
'validation.pl' | 2214 |
'validation.ru' | 578 |
'validation.th' | 497 |
'validation.tr' | 987 |
'validation.ur' | 551 |
'validation.vi' | 799 |
'validation.zh' | 499 |
- Features:
{
  "words": {
    "feature": {
      "dtype": "string",
      "id": null,
      "_type": "Value"
    },
    "length": -1,
    "id": null,
    "_type": "Sequence"
  },
  "pos": {
    "feature": {
      "num_classes": 17,
      "names": [
        "ADJ",
        "ADP",
        "ADV",
        "AUX",
        "CCONJ",
        "DET",
        "INTJ",
        "NOUN",
        "NUM",
        "PART",
        "PRON",
        "PROPN",
        "PUNCT",
        "SCONJ",
        "SYM",
        "VERB",
        "X"
      ],
      "names_file": null,
      "id": null,
      "_type": "ClassLabel"
    },
    "length": -1,
    "id": null,
    "_type": "Sequence"
  }
}
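The split names for this config follow a regular pattern: a single `train` split plus one `validation.<lang>` and one `test.<lang>` split per language. A small sketch that rebuilds the names from the language codes is shown here; `split_names` is a hypothetical helper, and the language list is simply read off the `pos` split table.

```python
# Language codes read off the pos split table above.
POS_LANGS = [
    "ar", "bg", "de", "el", "en", "es", "fr", "hi", "it",
    "nl", "pl", "ru", "th", "tr", "ur", "vi", "zh",
]


def split_names(langs):
    """Return 'train' followed by the per-language validation and test split names."""
    names = ["train"]
    for prefix in ("validation", "test"):
        names.extend(f"{prefix}.{lang}" for lang in langs)
    return names


pos_splits = split_names(POS_LANGS)
print(len(pos_splits), pos_splits[:3])
```

The same helper applies to the other configs when given their own language lists.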
mlqa
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:xglue/mlqa')
- Description:
XGLUE is a new benchmark dataset to evaluate the performance of cross-lingual pre-trained
models with respect to cross-lingual natural language understanding and generation.
The benchmark is composed of the following 11 tasks:
- NER
- POS Tagging (POS)
- News Classification (NC)
- MLQA
- XNLI
- PAWS-X
- Query-Ad Matching (QADSM)
- Web Page Ranking (WPR)
- QA Matching (QAM)
- Question Generation (QG)
- News Title Generation (NTG)
For more information, please take a look at https://microsoft.github.io/XGLUE/.
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test.ar' | 5335 |
'test.de' | 4517 |
'test.en' | 11590 |
'test.es' | 5253 |
'test.hi' | 4918 |
'test.vi' | 5495 |
'test.zh' | 5137 |
'train' | 87599 |
'validation.ar' | 517 |
'validation.de' | 512 |
'validation.en' | 1148 |
'validation.es' | 500 |
'validation.hi' | 507 |
'validation.vi' | 511 |
'validation.zh' | 504 |
- Features:
{
  "context": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  },
  "question": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  },
  "answers": {
    "feature": {
      "answer_start": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
      },
      "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
      }
    },
    "length": -1,
    "id": null,
    "_type": "Sequence"
  }
}
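The `answers` feature pairs each answer `text` with an `answer_start` position. The sketch below checks that each answer can be recovered from `context`, under two assumptions that this page does not state: that `answer_start` is a character offset into `context` (the SQuAD convention), and that a sequence of dicts is materialized as a dict of parallel lists. The record is a made-up toy example and `answer_spans` is a hypothetical helper name.

```python
def answer_spans(example):
    """Return (start, end, matches) for each answer, where matches is True
    when context[start:end] equals the stored answer text."""
    context = example["context"]
    answers = example["answers"]
    spans = []
    for start, text in zip(answers["answer_start"], answers["text"]):
        end = start + len(text)
        spans.append((start, end, context[start:end] == text))
    return spans


# Toy record invented for illustration; not taken from the dataset.
example = {
    "context": "XGLUE covers eleven tasks.",
    "question": "How many tasks does XGLUE cover?",
    "answers": {"answer_start": [13], "text": ["eleven"]},
}
print(answer_spans(example))
```

A `False` in the third position would suggest that the offset assumption does not hold for that record.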
nc
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:xglue/nc')
- Description:
XGLUE is a new benchmark dataset to evaluate the performance of cross-lingual pre-trained
models with respect to cross-lingual natural language understanding and generation.
The benchmark is composed of the following 11 tasks:
- NER
- POS Tagging (POS)
- News Classification (NC)
- MLQA
- XNLI
- PAWS-X
- Query-Ad Matching (QADSM)
- Web Page Ranking (WPR)
- QA Matching (QAM)
- Question Generation (QG)
- News Title Generation (NTG)
For more information, please take a look at https://microsoft.github.io/XGLUE/.
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test.de' | 10000 |
'test.en' | 10000 |
'test.es' | 10000 |
'test.fr' | 10000 |
'test.ru' | 10000 |
'train' | 100000 |
'validation.de' | 10000 |
'validation.en' | 10000 |
'validation.es' | 10000 |
'validation.fr' | 10000 |
'validation.ru' | 10000 |
- Features:
{
  "news_title": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  },
  "news_body": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  },
  "news_category": {
    "num_classes": 10,
    "names": [
      "foodanddrink",
      "sports",
      "travel",
      "finance",
      "lifestyle",
      "news",
      "entertainment",
      "health",
      "video",
      "autos"
    ],
    "names_file": null,
    "id": null,
    "_type": "ClassLabel"
  }
}
xnli
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:xglue/xnli')
- Description:
XGLUE is a new benchmark dataset to evaluate the performance of cross-lingual pre-trained
models with respect to cross-lingual natural language understanding and generation.
The benchmark is composed of the following 11 tasks:
- NER
- POS Tagging (POS)
- News Classification (NC)
- MLQA
- XNLI
- PAWS-X
- Query-Ad Matching (QADSM)
- Web Page Ranking (WPR)
- QA Matching (QAM)
- Question Generation (QG)
- News Title Generation (NTG)
For more information, please take a look at https://microsoft.github.io/XGLUE/.
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test.ar' | 5010 |
'test.bg' | 5010 |
'test.de' | 5010 |
'test.el' | 5010 |
'test.en' | 5010 |
'test.es' | 5010 |
'test.fr' | 5010 |
'test.hi' | 5010 |
'test.ru' | 5010 |
'test.sw' | 5010 |
'test.th' | 5010 |
'test.tr' | 5010 |
'test.ur' | 5010 |
'test.vi' | 5010 |
'test.zh' | 5010 |
'train' | 392702 |
'validation.ar' | 2490 |
'validation.bg' | 2490 |
'validation.de' | 2490 |
'validation.el' | 2490 |
'validation.en' | 2490 |
'validation.es' | 2490 |
'validation.fr' | 2490 |
'validation.hi' | 2490 |
'validation.ru' | 2490 |
'validation.sw' | 2490 |
'validation.th' | 2490 |
'validation.tr' | 2490 |
'validation.ur' | 2490 |
'validation.vi' | 2490 |
'validation.zh' | 2490 |
- Features:
{
  "premise": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  },
  "hypothesis": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  },
  "label": {
    "num_classes": 3,
    "names": [
      "entailment",
      "neutral",
      "contradiction"
    ],
    "names_file": null,
    "id": null,
    "_type": "ClassLabel"
  }
}
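Because `label` is stored as a class id, predictions expressed as label names have to be mapped back to ids before scoring. The sketch below builds that mapping from the `names` list above and computes plain accuracy; the gold ids and predictions are invented for illustration, and `accuracy` is a hypothetical helper rather than an official evaluation script.

```python
# Label names copied from the "label" ClassLabel feature above; index == class id.
XNLI_NAMES = ["entailment", "neutral", "contradiction"]
NAME_TO_ID = {name: class_id for class_id, name in enumerate(XNLI_NAMES)}


def accuracy(gold_ids, predicted_names):
    """Fraction of examples whose predicted label name maps to the gold class id."""
    if not gold_ids:
        return 0.0
    correct = sum(
        gold == NAME_TO_ID[pred] for gold, pred in zip(gold_ids, predicted_names)
    )
    return correct / len(gold_ids)


# Invented gold ids and predictions, for illustration only.
gold = [0, 1, 2, 2]
predictions = ["entailment", "neutral", "neutral", "contradiction"]
print(accuracy(gold, predictions))
```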
paws-x
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:xglue/paws-x')
- Description:
XGLUE is a new benchmark dataset to evaluate the performance of cross-lingual pre-trained
models with respect to cross-lingual natural language understanding and generation.
The benchmark is composed of the following 11 tasks:
- NER
- POS Tagging (POS)
- News Classification (NC)
- MLQA
- XNLI
- PAWS-X
- Query-Ad Matching (QADSM)
- Web Page Ranking (WPR)
- QA Matching (QAM)
- Question Generation (QG)
- News Title Generation (NTG)
For more information, please take a look at https://microsoft.github.io/XGLUE/.
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test.de' | 2000 |
'test.en' | 2000 |
'test.es' | 2000 |
'test.fr' | 2000 |
'train' | 49401 |
'validation.de' | 2000 |
'validation.en' | 2000 |
'validation.es' | 2000 |
'validation.fr' | 2000 |
- Features:
{
  "sentence1": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  },
  "sentence2": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  },
  "label": {
    "num_classes": 2,
    "names": [
      "different",
      "same"
    ],
    "names_file": null,
    "id": null,
    "_type": "ClassLabel"
  }
}
qadsm
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:xglue/qadsm')
- Description:
XGLUE is a new benchmark dataset to evaluate the performance of cross-lingual pre-trained
models with respect to cross-lingual natural language understanding and generation.
The benchmark is composed of the following 11 tasks:
- NER
- POS Tagging (POS)
- News Classification (NC)
- MLQA
- XNLI
- PAWS-X
- Query-Ad Matching (QADSM)
- Web Page Ranking (WPR)
- QA Matching (QAM)
- Question Generation (QG)
- News Title Generation (NTG)
For more information, please take a look at https://microsoft.github.io/XGLUE/.
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test.de' | 10000 |
'test.en' | 10000 |
'test.fr' | 10000 |
'train' | 100000 |
'validation.de' | 10000 |
'validation.en' | 10000 |
'validation.fr' | 10000 |
- Features:
{
  "query": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  },
  "ad_title": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  },
  "ad_description": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  },
  "relevance_label": {
    "num_classes": 2,
    "names": [
      "Bad",
      "Good"
    ],
    "names_file": null,
    "id": null,
    "_type": "ClassLabel"
  }
}
wpr
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:xglue/wpr')
- Description:
XGLUE is a new benchmark dataset to evaluate the performance of cross-lingual pre-trained
models with respect to cross-lingual natural language understanding and generation.
The benchmark is composed of the following 11 tasks:
- NER
- POS Tagging (POS)
- News Classification (NC)
- MLQA
- XNLI
- PAWS-X
- Query-Ad Matching (QADSM)
- Web Page Ranking (WPR)
- QA Matching (QAM)
- Question Generation (QG)
- News Title Generation (NTG)
For more information, please take a look at https://microsoft.github.io/XGLUE/.
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test.de' | 9997 |
'test.en' | 10004 |
'test.es' | 10006 |
'test.fr' | 10020 |
'test.it' | 10001 |
'test.pt' | 10015 |
'test.zh' | 9999 |
'train' | 99997 |
'validation.de' | 10004 |
'validation.en' | 10008 |
'validation.es' | 10004 |
'validation.fr' | 10005 |
'validation.it' | 10003 |
'validation.pt' | 10001 |
'validation.zh' | 10002 |
- Features:
{
  "query": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  },
  "web_page_title": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  },
  "web_page_snippet": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  },
  "relavance_label": {
    "num_classes": 5,
    "names": [
      "Bad",
      "Fair",
      "Good",
      "Excellent",
      "Perfect"
    ],
    "names_file": null,
    "id": null,
    "_type": "ClassLabel"
  }
}
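The five classes of `relavance_label` (the key is spelled this way in the feature spec) are ordered from "Bad" (id 0) to "Perfect" (id 4), so the class id can serve as a graded relevance score for ranking metrics. The sketch below computes nDCG for one query's ranked results, assuming linear gains and a log2 rank discount; this page does not say which variant the official XGLUE evaluation uses, and `dcg` and `ndcg` are hypothetical helper names.

```python
import math

# Label names copied from the "relavance_label" ClassLabel feature above; index == class id.
WPR_NAMES = ["Bad", "Fair", "Good", "Excellent", "Perfect"]


def dcg(gains):
    """Discounted cumulative gain with a log2 discount (rank 0 is the top result)."""
    return sum(gain / math.log2(rank + 2) for rank, gain in enumerate(gains))


def ndcg(label_ids_in_ranked_order):
    """nDCG for one query, given the class ids of its results in predicted rank order."""
    ideal = dcg(sorted(label_ids_in_ranked_order, reverse=True))
    if ideal == 0:
        return 0.0  # No relevant result at all for this query.
    return dcg(label_ids_in_ranked_order) / ideal


# Invented rankings, for illustration only.
print(ndcg([4, 2, 0]))  # already in ideal order
print(ndcg([0, 4]))     # the "Perfect" page is ranked below a "Bad" one
```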
qam
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:xglue/qam')
- Description:
XGLUE is a new benchmark dataset to evaluate the performance of cross-lingual pre-trained
models with respect to cross-lingual natural language understanding and generation.
The benchmark is composed of the following 11 tasks:
- NER
- POS Tagging (POS)
- News Classification (NC)
- MLQA
- XNLI
- PAWS-X
- Query-Ad Matching (QADSM)
- Web Page Ranking (WPR)
- QA Matching (QAM)
- Question Generation (QG)
- News Title Generation (NTG)
For more information, please take a look at https://microsoft.github.io/XGLUE/.
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test.de' | 10000 |
'test.en' | 10000 |
'test.fr' | 10000 |
'train' | 100000 |
'validation.de' | 10000 |
'validation.en' | 10000 |
'validation.fr' | 10000 |
- Features:
{
  "question": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  },
  "answer": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  },
  "label": {
    "num_classes": 2,
    "names": [
      "False",
      "True"
    ],
    "names_file": null,
    "id": null,
    "_type": "ClassLabel"
  }
}
qg
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:xglue/qg')
- Description:
XGLUE is a new benchmark dataset to evaluate the performance of cross-lingual pre-trained
models with respect to cross-lingual natural language understanding and generation.
The benchmark is composed of the following 11 tasks:
- NER
- POS Tagging (POS)
- News Classification (NC)
- MLQA
- XNLI
- PAWS-X
- Query-Ad Matching (QADSM)
- Web Page Ranking (WPR)
- QA Matching (QAM)
- Question Generation (QG)
- News Title Generation (NTG)
For more information, please take a look at https://microsoft.github.io/XGLUE/.
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test.de' | 10000 |
'test.en' | 10000 |
'test.es' | 10000 |
'test.fr' | 10000 |
'test.it' | 10000 |
'test.pt' | 10000 |
'train' | 100000 |
'validation.de' | 10000 |
'validation.en' | 10000 |
'validation.es' | 10000 |
'validation.fr' | 10000 |
'validation.it' | 10000 |
'validation.pt' | 10000 |
- Features:
{
  "answer_passage": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  },
  "question": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  }
}
ntg
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:xglue/ntg')
- Description:
XGLUE is a new benchmark dataset to evaluate the performance of cross-lingual pre-trained
models with respect to cross-lingual natural language understanding and generation.
The benchmark is composed of the following 11 tasks:
- NER
- POS Tagging (POS)
- News Classification (NC)
- MLQA
- XNLI
- PAWS-X
- Query-Ad Matching (QADSM)
- Web Page Ranking (WPR)
- QA Matching (QAM)
- Question Generation (QG)
- News Title Generation (NTG)
For more information, please take a look at https://microsoft.github.io/XGLUE/.
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test.de' | 10000 |
'test.en' | 10000 |
'test.es' | 10000 |
'test.fr' | 10000 |
'test.ru' | 10000 |
'train' | 300000 |
'validation.de' | 10000 |
'validation.en' | 10000 |
'validation.es' | 10000 |
'validation.fr' | 10000 |
'validation.ru' | 10000 |
- Features:
{
  "news_body": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  },
  "news_title": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  }
}