lid_spaeng
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:lince/lid_spaeng')
- Description:
LinCE is a centralized Linguistic Code-switching Evaluation benchmark
(https://ritual.uh.edu/lince/) that contains data for training and evaluating
NLP systems on code-switching tasks.
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 8289 |
'train' | 21030 |
'validation' | 3332 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"words": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"lid": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
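As a rough illustration of the feature layout above, the sketch below pairs each entry of `words` with the entry of `lid` at the same position. The record is hand-written for illustration, not taken from the dataset: the tokens and the tag strings (`lang1`, `lang2`, `other`) are assumptions, as is the idea that both sequences always have equal length.

```python
from typing import Dict, List, Tuple

# Hypothetical record shaped like the schema above. The tokens and the
# tag strings ("lang1", "lang2", "other") are illustrative assumptions.
example = {
    "idx": 0,
    "words": ["I", "love", "tacos", "mucho", "!"],
    "lid": ["lang1", "lang1", "lang1", "lang2", "other"],
}


def pair_tokens(record: Dict) -> List[Tuple[str, str]]:
    """Zip each word with its language-ID tag, checking that lengths match."""
    words, tags = record["words"], record["lid"]
    if len(words) != len(tags):
        raise ValueError("words and lid must have the same length")
    return list(zip(words, tags))


pairs = pair_tokens(example)
for word, tag in pairs:
    print(f"{word}\t{tag}")
```

The explicit length check is a defensive choice: the schema declares both sequences as variable-length (`"length": -1`) and does not itself state that they are aligned.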
lid_hineng
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:lince/lid_hineng')
- Description:
LinCE is a centralized Linguistic Code-switching Evaluation benchmark
(https://ritual.uh.edu/lince/) that contains data for training and evaluating
NLP systems on code-switching tasks.
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 1854 |
'train' | 4823 |
'validation' | 744 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"words": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"lid": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
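The split sizes listed for `lid_hineng` can be turned into proportions with a few lines of arithmetic. This is only a convenience sketch; the counts are copied from the split table above, and the variable names are arbitrary.

```python
# Split sizes copied from the lid_hineng split table above.
split_sizes = {"test": 1854, "train": 4823, "validation": 744}

total = sum(split_sizes.values())

# Fraction of all examples that falls in each split.
fractions = {name: count / total for name, count in split_sizes.items()}

for name, frac in fractions.items():
    print(f"{name}: {frac:.1%}")
```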
lid_msaea
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:lince/lid_msaea')
- Description:
LinCE is a centralized Linguistic Code-switching Evaluation benchmark
(https://ritual.uh.edu/lince/) that contains data for training and evaluating
NLP systems on code-switching tasks.
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 1663 |
'train' | 8464 |
'validation' | 1116 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"words": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"lid": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
lid_nepeng
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:lince/lid_nepeng')
- Description:
LinCE is a centralized Linguistic Code-switching Evaluation benchmark
(https://ritual.uh.edu/lince/) that contains data for training and evaluating
NLP systems on code-switching tasks.
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 3228 |
'train' | 8451 |
'validation' | 1332 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"words": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"lid": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
pos_spaeng
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:lince/pos_spaeng')
- Description:
LinCE is a centralized Linguistic Code-switching Evaluation benchmark
(https://ritual.uh.edu/lince/) that contains data for training and evaluating
NLP systems on code-switching tasks.
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 10720 |
'train' | 27893 |
'validation' | 4298 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"words": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"lid": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"pos": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
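Because the `pos_*` configs carry both `lid` and `pos` sequences, the two can be combined, for example to group part-of-speech tags by language. The sketch below assumes the sequences are aligned token by token; the record and every tag string in it are made up for illustration and are not taken from the dataset.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical record shaped like the schema above. Tokens, language
# tags, and POS tags are illustrative assumptions.
record = {
    "idx": 0,
    "words": ["the", "casa", "is", "grande"],
    "lid": ["lang1", "lang2", "lang1", "lang2"],
    "pos": ["DET", "NOUN", "VERB", "ADJ"],
}


def pos_by_language(rec: Dict) -> Dict[str, List[str]]:
    """Collect POS tags under the language tag of the same token."""
    grouped = defaultdict(list)
    for lang, tag in zip(rec["lid"], rec["pos"]):
        grouped[lang].append(tag)
    return dict(grouped)


grouped = pos_by_language(record)
print(grouped)
```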
pos_hineng
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:lince/pos_hineng')
- Description:
LinCE is a centralized Linguistic Code-switching Evaluation benchmark
(https://ritual.uh.edu/lince/) that contains data for training and evaluating
NLP systems on code-switching tasks.
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 299 |
'train' | 1030 |
'validation' | 160 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"words": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"lid": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"pos": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
ner_spaeng
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:lince/ner_spaeng')
- Description:
LinCE is a centralized Linguistic Code-switching Evaluation benchmark
(https://ritual.uh.edu/lince/) that contains data for training and evaluating
NLP systems on code-switching tasks.
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 23527 |
'train' | 33611 |
'validation' | 10085 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"words": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"lid": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"ner": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
ner_msaea
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:lince/ner_msaea')
- Description:
LinCE is a centralized Linguistic Code-switching Evaluation benchmark
(https://ritual.uh.edu/lince/) that contains data for training and evaluating
NLP systems on code-switching tasks.
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 1110 |
'train' | 10103 |
'validation' | 1122 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"words": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"ner": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
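Unlike the other `ner_*` configs on this page, the `ner_msaea` features list no `lid` field, so code that handles several configs may want to discover the token-level columns rather than hard-code them. A minimal sketch, assuming records arrive as plain dictionaries; the helper name `token_level_columns` and both sample records are hypothetical.

```python
from typing import Dict, List


def token_level_columns(record: Dict) -> List[str]:
    """Return the names of list-valued fields other than "words"."""
    return sorted(
        key
        for key, value in record.items()
        if isinstance(value, list) and key != "words"
    )


# Hypothetical records mirroring the ner_msaea and ner_spaeng schemas.
msaea_record = {"idx": 0, "words": ["w1", "w2"], "ner": ["O", "O"]}
spaeng_record = {
    "idx": 0,
    "words": ["w1", "w2"],
    "lid": ["lang1", "lang2"],
    "ner": ["O", "O"],
}

print(token_level_columns(msaea_record))
print(token_level_columns(spaeng_record))
```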
ner_hineng
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:lince/ner_hineng')
- Description:
LinCE is a centralized Linguistic Code-switching Evaluation benchmark
(https://ritual.uh.edu/lince/) that contains data for training and evaluating
NLP systems on code-switching tasks.
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 522 |
'train' | 1243 |
'validation' | 314 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"words": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"lid": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"ner": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
sa_spaeng
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:lince/sa_spaeng')
- Description:
LinCE is a centralized Linguistic Code-switching Evaluation benchmark
(https://ritual.uh.edu/lince/) that contains data for training and evaluating
NLP systems on code-switching tasks.
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 4736 |
'train' | 12194 |
'validation' | 1859 |
- Features:
{
"idx": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"words": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"lid": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"sa": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
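In `sa_spaeng`, the `sa` feature is a single string per example rather than a sequence, so one label applies to the whole token list. The sketch below turns such a record into a `(text, label)` pair by joining the tokens with spaces. The record, the label value `"positive"`, and the space-joining convention are all assumptions made for illustration.

```python
from typing import Dict, Tuple

# Hypothetical record shaped like the schema above. The tokens, language
# tags, and the label string "positive" are illustrative assumptions.
record = {
    "idx": 0,
    "words": ["me", "encanta", "this", "song"],
    "lid": ["lang2", "lang2", "lang1", "lang1"],
    "sa": "positive",
}


def to_text_label(rec: Dict) -> Tuple[str, str]:
    """Join the tokens into one string and return it with the label."""
    return " ".join(rec["words"]), rec["sa"]


text, label = to_text_label(record)
print(text, "->", label)
```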