References:
amh
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:masakhaner/amh')
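The same command pattern applies to every configuration on this page. As a minimal sketch, the helper below builds the dataset name from a config code and loads one split. The names `MASAKHANER_CONFIGS`, `tfds_name`, and `load_config` are hypothetical and not part of TFDS; the pairing of codes with language names follows the order of the language list in the description.

```python
# Config codes as used in the load commands on this page, paired with the
# language names from the dataset description (pairing is an assumption
# based on list order).
MASAKHANER_CONFIGS = {
    "amh": "Amharic",
    "hau": "Hausa",
    "ibo": "Igbo",
    "kin": "Kinyarwanda",
    "lug": "Luganda",
    "luo": "Luo",
    "pcm": "Nigerian-Pidgin",
    "swa": "Swahili",
    "wol": "Wolof",
    "yor": "Yoruba",
}


def tfds_name(config):
    """Return the TFDS name for a MasakhaNER config, e.g. 'huggingface:masakhaner/amh'."""
    if config not in MASAKHANER_CONFIGS:
        raise ValueError(f"Unknown MasakhaNER config: {config!r}")
    return f"huggingface:masakhaner/{config}"


def load_config(config, split="train"):
    """Load one split of one config (hypothetical wrapper around tfds.load)."""
    # Imported lazily so the helpers above work without TFDS installed.
    import tensorflow_datasets as tfds

    return tfds.load(tfds_name(config), split=split)
```

Calling `load_config("hau", split="validation")` would then request the Hausa validation split, assuming TFDS and its Hugging Face bridge are installed and the dataset can be downloaded.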
- Description:
MasakhaNER is the first large publicly available high-quality dataset for named entity recognition (NER) in ten African languages.
Named entities are phrases that contain the names of persons, organizations, locations, times and quantities.
Example:
[PER Wolff] , currently a journalist in [LOC Argentina] , played with [PER Del Bosque] in the final years of the seventies in [ORG Real Madrid] .
MasakhaNER is a named entity dataset consisting of PER, ORG, LOC, and DATE entities annotated by Masakhane for ten African languages:
- Amharic
- Hausa
- Igbo
- Kinyarwanda
- Luganda
- Luo
- Nigerian-Pidgin
- Swahili
- Wolof
- Yoruba
The train/validation/test sets are available for all ten languages.
For more details see https://arxiv.org/abs/2103.11811
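The bracketed example in the description corresponds to the token-level `B-`/`I-`/`O` tags listed under the features. The sketch below shows one way to convert the bracketed notation into tokens and BIO tags. The function name `bracketed_to_bio` is hypothetical, and the bracket format is assumed to be exactly `[LABEL word word ...]` as in the example sentence.

```python
import re


def bracketed_to_bio(text):
    """Convert '[PER Del Bosque] played' into (tokens, BIO tags)."""
    tokens, tags = [], []
    # Match either a bracketed entity such as "[PER Del Bosque]" or a plain token.
    for match in re.finditer(r"\[(\w+) ([^\]]+)\]|(\S+)", text):
        label, span, plain = match.groups()
        if label is None:
            # A token outside any entity gets the "O" tag.
            tokens.append(plain)
            tags.append("O")
            continue
        words = span.split()
        tokens.extend(words)
        # First word of the entity is B-<label>, the rest are I-<label>.
        tags.extend([f"B-{label}"] + [f"I-{label}"] * (len(words) - 1))
    return tokens, tags


tokens, tags = bracketed_to_bio("[PER Del Bosque] played in [ORG Real Madrid] .")
for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```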
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 500 |
'train' | 1750 |
'validation' | 250 |
- Features:
{
    "id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "tokens": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "ner_tags": {
        "feature": {
            "num_classes": 9,
            "names": [
                "O",
                "B-PER",
                "I-PER",
                "B-ORG",
                "I-ORG",
                "B-LOC",
                "I-LOC",
                "B-DATE",
                "I-DATE"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}
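In this schema, `ner_tags` is a sequence of integer class ids aligned with `tokens`, and the `names` list gives the label for each id. A minimal sketch of decoding those ids and grouping them into entity spans follows. The helper names `decode_tags` and `extract_entities` are hypothetical, and the sample tokens and ids are made up for illustration rather than taken from the dataset.

```python
# Label names in the order given by the ClassLabel feature above.
NER_TAG_NAMES = [
    "O", "B-PER", "I-PER", "B-ORG", "I-ORG",
    "B-LOC", "I-LOC", "B-DATE", "I-DATE",
]


def decode_tags(tag_ids):
    """Map integer class ids to their string labels."""
    return [NER_TAG_NAMES[i] for i in tag_ids]


def extract_entities(tokens, tag_ids):
    """Group BIO-tagged tokens into (label, text) entity spans."""
    entities = []
    current_label, current_words = None, []
    for token, tag in zip(tokens, decode_tags(tag_ids)):
        starts_new = tag.startswith("B-") or (
            # Tolerate an I- tag that does not continue the current entity.
            tag.startswith("I-") and tag[2:] != current_label
        )
        if starts_new:
            if current_words:
                entities.append((current_label, " ".join(current_words)))
            current_label, current_words = tag[2:], [token]
        elif tag.startswith("I-"):
            current_words.append(token)
        else:  # "O" closes any open entity
            if current_words:
                entities.append((current_label, " ".join(current_words)))
            current_label, current_words = None, []
    if current_words:
        entities.append((current_label, " ".join(current_words)))
    return entities


# Illustrative input only, not a real dataset row.
sample_tokens = ["Wolff", ",", "played", "in", "Real", "Madrid", "."]
sample_tag_ids = [1, 0, 0, 0, 3, 4, 0]
print(extract_entities(sample_tokens, sample_tag_ids))
```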
hau
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:masakhaner/hau')
- Description:
MasakhaNER is the first large publicly available high-quality dataset for named entity recognition (NER) in ten African languages.
Named entities are phrases that contain the names of persons, organizations, locations, times and quantities.
Example:
[PER Wolff] , currently a journalist in [LOC Argentina] , played with [PER Del Bosque] in the final years of the seventies in [ORG Real Madrid] .
MasakhaNER is a named entity dataset consisting of PER, ORG, LOC, and DATE entities annotated by Masakhane for ten African languages:
- Amharic
- Hausa
- Igbo
- Kinyarwanda
- Luganda
- Luo
- Nigerian-Pidgin
- Swahili
- Wolof
- Yoruba
The train/validation/test sets are available for all ten languages.
For more details see https://arxiv.org/abs/2103.11811
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 552 |
'train' | 1912 |
'validation' | 276 |
- Features:
{
    "id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "tokens": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "ner_tags": {
        "feature": {
            "num_classes": 9,
            "names": [
                "O",
                "B-PER",
                "I-PER",
                "B-ORG",
                "I-ORG",
                "B-LOC",
                "I-LOC",
                "B-DATE",
                "I-DATE"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}
ibo
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:masakhaner/ibo')
- Description:
MasakhaNER is the first large publicly available high-quality dataset for named entity recognition (NER) in ten African languages.
Named entities are phrases that contain the names of persons, organizations, locations, times and quantities.
Example:
[PER Wolff] , currently a journalist in [LOC Argentina] , played with [PER Del Bosque] in the final years of the seventies in [ORG Real Madrid] .
MasakhaNER is a named entity dataset consisting of PER, ORG, LOC, and DATE entities annotated by Masakhane for ten African languages:
- Amharic
- Hausa
- Igbo
- Kinyarwanda
- Luganda
- Luo
- Nigerian-Pidgin
- Swahili
- Wolof
- Yoruba
The train/validation/test sets are available for all ten languages.
For more details see https://arxiv.org/abs/2103.11811
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 638 |
'train' | 2235 |
'validation' | 320 |
- Features:
{
    "id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "tokens": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "ner_tags": {
        "feature": {
            "num_classes": 9,
            "names": [
                "O",
                "B-PER",
                "I-PER",
                "B-ORG",
                "I-ORG",
                "B-LOC",
                "I-LOC",
                "B-DATE",
                "I-DATE"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}
kin
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:masakhaner/kin')
- Description:
MasakhaNER is the first large publicly available high-quality dataset for named entity recognition (NER) in ten African languages.
Named entities are phrases that contain the names of persons, organizations, locations, times and quantities.
Example:
[PER Wolff] , currently a journalist in [LOC Argentina] , played with [PER Del Bosque] in the final years of the seventies in [ORG Real Madrid] .
MasakhaNER is a named entity dataset consisting of PER, ORG, LOC, and DATE entities annotated by Masakhane for ten African languages:
- Amharic
- Hausa
- Igbo
- Kinyarwanda
- Luganda
- Luo
- Nigerian-Pidgin
- Swahili
- Wolof
- Yoruba
The train/validation/test sets are available for all ten languages.
For more details see https://arxiv.org/abs/2103.11811
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 605 |
'train' | 2116 |
'validation' | 302 |
- Features:
{
    "id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "tokens": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "ner_tags": {
        "feature": {
            "num_classes": 9,
            "names": [
                "O",
                "B-PER",
                "I-PER",
                "B-ORG",
                "I-ORG",
                "B-LOC",
                "I-LOC",
                "B-DATE",
                "I-DATE"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}
lug
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:masakhaner/lug')
- Description:
MasakhaNER is the first large publicly available high-quality dataset for named entity recognition (NER) in ten African languages.
Named entities are phrases that contain the names of persons, organizations, locations, times and quantities.
Example:
[PER Wolff] , currently a journalist in [LOC Argentina] , played with [PER Del Bosque] in the final years of the seventies in [ORG Real Madrid] .
MasakhaNER is a named entity dataset consisting of PER, ORG, LOC, and DATE entities annotated by Masakhane for ten African languages:
- Amharic
- Hausa
- Igbo
- Kinyarwanda
- Luganda
- Luo
- Nigerian-Pidgin
- Swahili
- Wolof
- Yoruba
The train/validation/test sets are available for all ten languages.
For more details see https://arxiv.org/abs/2103.11811
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 407 |
'train' | 1428 |
'validation' | 200 |
- Features:
{
    "id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "tokens": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "ner_tags": {
        "feature": {
            "num_classes": 9,
            "names": [
                "O",
                "B-PER",
                "I-PER",
                "B-ORG",
                "I-ORG",
                "B-LOC",
                "I-LOC",
                "B-DATE",
                "I-DATE"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}
luo
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:masakhaner/luo')
- Description:
MasakhaNER is the first large publicly available high-quality dataset for named entity recognition (NER) in ten African languages.
Named entities are phrases that contain the names of persons, organizations, locations, times and quantities.
Example:
[PER Wolff] , currently a journalist in [LOC Argentina] , played with [PER Del Bosque] in the final years of the seventies in [ORG Real Madrid] .
MasakhaNER is a named entity dataset consisting of PER, ORG, LOC, and DATE entities annotated by Masakhane for ten African languages:
- Amharic
- Hausa
- Igbo
- Kinyarwanda
- Luganda
- Luo
- Nigerian-Pidgin
- Swahili
- Wolof
- Yoruba
The train/validation/test sets are available for all ten languages.
For more details see https://arxiv.org/abs/2103.11811
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 186 |
'train' | 644 |
'validation' | 92 |
- Features:
{
    "id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "tokens": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "ner_tags": {
        "feature": {
            "num_classes": 9,
            "names": [
                "O",
                "B-PER",
                "I-PER",
                "B-ORG",
                "I-ORG",
                "B-LOC",
                "I-LOC",
                "B-DATE",
                "I-DATE"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}
pcm
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:masakhaner/pcm')
- Description:
MasakhaNER is the first large publicly available high-quality dataset for named entity recognition (NER) in ten African languages.
Named entities are phrases that contain the names of persons, organizations, locations, times and quantities.
Example:
[PER Wolff] , currently a journalist in [LOC Argentina] , played with [PER Del Bosque] in the final years of the seventies in [ORG Real Madrid] .
MasakhaNER is a named entity dataset consisting of PER, ORG, LOC, and DATE entities annotated by Masakhane for ten African languages:
- Amharic
- Hausa
- Igbo
- Kinyarwanda
- Luganda
- Luo
- Nigerian-Pidgin
- Swahili
- Wolof
- Yoruba
The train/validation/test sets are available for all ten languages.
For more details see https://arxiv.org/abs/2103.11811
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 600 |
'train' | 2124 |
'validation' | 306 |
- Features:
{
    "id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "tokens": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "ner_tags": {
        "feature": {
            "num_classes": 9,
            "names": [
                "O",
                "B-PER",
                "I-PER",
                "B-ORG",
                "I-ORG",
                "B-LOC",
                "I-LOC",
                "B-DATE",
                "I-DATE"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}
swa
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:masakhaner/swa')
- Description:
MasakhaNER is the first large publicly available high-quality dataset for named entity recognition (NER) in ten African languages.
Named entities are phrases that contain the names of persons, organizations, locations, times and quantities.
Example:
[PER Wolff] , currently a journalist in [LOC Argentina] , played with [PER Del Bosque] in the final years of the seventies in [ORG Real Madrid] .
MasakhaNER is a named entity dataset consisting of PER, ORG, LOC, and DATE entities annotated by Masakhane for ten African languages:
- Amharic
- Hausa
- Igbo
- Kinyarwanda
- Luganda
- Luo
- Nigerian-Pidgin
- Swahili
- Wolof
- Yoruba
The train/validation/test sets are available for all ten languages.
For more details see https://arxiv.org/abs/2103.11811
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 604 |
'train' | 2109 |
'validation' | 300 |
- Features:
{
    "id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "tokens": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "ner_tags": {
        "feature": {
            "num_classes": 9,
            "names": [
                "O",
                "B-PER",
                "I-PER",
                "B-ORG",
                "I-ORG",
                "B-LOC",
                "I-LOC",
                "B-DATE",
                "I-DATE"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}
wol
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:masakhaner/wol')
- Description:
MasakhaNER is the first large publicly available high-quality dataset for named entity recognition (NER) in ten African languages.
Named entities are phrases that contain the names of persons, organizations, locations, times and quantities.
Example:
[PER Wolff] , currently a journalist in [LOC Argentina] , played with [PER Del Bosque] in the final years of the seventies in [ORG Real Madrid] .
MasakhaNER is a named entity dataset consisting of PER, ORG, LOC, and DATE entities annotated by Masakhane for ten African languages:
- Amharic
- Hausa
- Igbo
- Kinyarwanda
- Luganda
- Luo
- Nigerian-Pidgin
- Swahili
- Wolof
- Yoruba
The train/validation/test sets are available for all ten languages.
For more details see https://arxiv.org/abs/2103.11811
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 539 |
'train' | 1871 |
'validation' | 267 |
- Features:
{
    "id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "tokens": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "ner_tags": {
        "feature": {
            "num_classes": 9,
            "names": [
                "O",
                "B-PER",
                "I-PER",
                "B-ORG",
                "I-ORG",
                "B-LOC",
                "I-LOC",
                "B-DATE",
                "I-DATE"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}
yor
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:masakhaner/yor')
- Description:
MasakhaNER is the first large publicly available high-quality dataset for named entity recognition (NER) in ten African languages.
Named entities are phrases that contain the names of persons, organizations, locations, times and quantities.
Example:
[PER Wolff] , currently a journalist in [LOC Argentina] , played with [PER Del Bosque] in the final years of the seventies in [ORG Real Madrid] .
MasakhaNER is a named entity dataset consisting of PER, ORG, LOC, and DATE entities annotated by Masakhane for ten African languages:
- Amharic
- Hausa
- Igbo
- Kinyarwanda
- Luganda
- Luo
- Nigerian-Pidgin
- Swahili
- Wolof
- Yoruba
The train/validation/test sets are available for all ten languages.
For more details see https://arxiv.org/abs/2103.11811
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 645 |
'train' | 2171 |
'validation' | 305 |
- Features:
{
    "id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "tokens": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "ner_tags": {
        "feature": {
            "num_classes": 9,
            "names": [
                "O",
                "B-PER",
                "I-PER",
                "B-ORG",
                "I-ORG",
                "B-LOC",
                "I-LOC",
                "B-DATE",
                "I-DATE"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}
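The split sizes listed for the ten configurations on this page can be compared side by side. The sketch below copies the counts from the split tables into a dictionary and computes the total size and train share per configuration. `SPLIT_SIZES` and `summarize` are hypothetical names introduced for illustration.

```python
# Example counts copied from the split tables on this page.
SPLIT_SIZES = {
    "amh": {"train": 1750, "validation": 250, "test": 500},
    "hau": {"train": 1912, "validation": 276, "test": 552},
    "ibo": {"train": 2235, "validation": 320, "test": 638},
    "kin": {"train": 2116, "validation": 302, "test": 605},
    "lug": {"train": 1428, "validation": 200, "test": 407},
    "luo": {"train": 644, "validation": 92, "test": 186},
    "pcm": {"train": 2124, "validation": 306, "test": 600},
    "swa": {"train": 2109, "validation": 300, "test": 604},
    "wol": {"train": 1871, "validation": 267, "test": 539},
    "yor": {"train": 2171, "validation": 305, "test": 645},
}


def summarize(split_sizes):
    """Return total example count and train fraction for each config."""
    summary = {}
    for config, sizes in split_sizes.items():
        total = sum(sizes.values())
        summary[config] = {
            "total": total,
            "train_fraction": sizes["train"] / total,
        }
    return summary


summary = summarize(SPLIT_SIZES)
for config, row in summary.items():
    print(f"{config}: {row['total']} examples, {row['train_fraction']:.1%} in train")
```

By these counts, `luo` is the smallest configuration and `ibo` the largest, which may matter when comparing per-language results.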