References:
amh
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:masakhaner/amh')
- Description:
MasakhaNER is the first large publicly available high-quality dataset for named entity recognition (NER) in ten African languages.
Named entities are phrases that contain the names of persons, organizations, locations, times and quantities.
Example:
[PER Wolff] , currently a journalist in [LOC Argentina] , played with [PER Del Bosque] in the final years of the seventies in [ORG Real Madrid] .
MasakhaNER is a named entity dataset consisting of PER, ORG, LOC, and DATE entities annotated by Masakhane for ten African languages:
- Amharic
- Hausa
- Igbo
- Kinyarwanda
- Luganda
- Luo
- Nigerian-Pidgin
- Swahili
- Wolof
- Yoruba
The train/validation/test sets are available for all the ten languages.
For more details see https://arxiv.org/abs/2103.11811
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 500 |
'train' | 1750 |
'validation' | 250 |
- Features:
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"tokens": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"ner_tags": {
"feature": {
"num_classes": 9,
"names": [
"O",
"B-PER",
"I-PER",
"B-ORG",
"I-ORG",
"B-LOC",
"I-LOC",
"B-DATE",
"I-DATE"
],
"names_file": null,
"id": null,
"_type": "ClassLabel"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
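The `ner_tags` feature stores integer class ids whose order follows the `names` list above. As a minimal sketch, the ids can be mapped back to BIO labels and grouped into the bracketed style used in the description. The function name `bracket_entities` and the sample record are hypothetical (English tokens borrowed from the description, not real dataset rows).

```python
# Label names copied from the "ner_tags" ClassLabel above; list index = class id.
NER_NAMES = ["O", "B-PER", "I-PER", "B-ORG", "I-ORG",
             "B-LOC", "I-LOC", "B-DATE", "I-DATE"]


def bracket_entities(tokens, tag_ids):
    """Render tokens and BIO tag ids as bracketed text, e.g. "[PER Wolff]"."""
    out = []          # finished output pieces
    span = []         # tokens of the entity currently being built
    span_type = None  # entity type of the open span, e.g. "PER"

    def flush():
        # Close the open entity span, if any, and append it to the output.
        nonlocal span, span_type
        if span:
            out.append(f"[{span_type} {' '.join(span)}]")
        span, span_type = [], None

    for token, tag_id in zip(tokens, tag_ids):
        tag = NER_NAMES[tag_id]
        if tag == "O":
            flush()
            out.append(token)
        elif tag.startswith("B-") or tag[2:] != span_type:
            # A B- tag, or an I- tag that does not continue the open span,
            # starts a new entity.
            flush()
            span, span_type = [token], tag[2:]
        else:
            span.append(token)
    flush()
    return " ".join(out)


# Hypothetical record for illustration only.
tokens = ["Wolff", "played", "in", "Real", "Madrid", "."]
tag_ids = [1, 0, 0, 3, 4, 0]
print(bracket_entities(tokens, tag_ids))
```

Treating a stray `I-` tag as the start of a new entity is a design choice of this sketch, not something the dataset specifies.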
hau
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:masakhaner/hau')
- Description:
MasakhaNER is the first large publicly available high-quality dataset for named entity recognition (NER) in ten African languages.
Named entities are phrases that contain the names of persons, organizations, locations, times and quantities.
Example:
[PER Wolff] , currently a journalist in [LOC Argentina] , played with [PER Del Bosque] in the final years of the seventies in [ORG Real Madrid] .
MasakhaNER is a named entity dataset consisting of PER, ORG, LOC, and DATE entities annotated by Masakhane for ten African languages:
- Amharic
- Hausa
- Igbo
- Kinyarwanda
- Luganda
- Luo
- Nigerian-Pidgin
- Swahili
- Wolof
- Yoruba
The train/validation/test sets are available for all the ten languages.
For more details see https://arxiv.org/abs/2103.11811
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 552 |
'train' | 1912 |
'validation' | 276 |
- Features:
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"tokens": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"ner_tags": {
"feature": {
"num_classes": 9,
"names": [
"O",
"B-PER",
"I-PER",
"B-ORG",
"I-ORG",
"B-LOC",
"I-LOC",
"B-DATE",
"I-DATE"
],
"names_file": null,
"id": null,
"_type": "ClassLabel"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
ibo
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:masakhaner/ibo')
- Description:
MasakhaNER is the first large publicly available high-quality dataset for named entity recognition (NER) in ten African languages.
Named entities are phrases that contain the names of persons, organizations, locations, times and quantities.
Example:
[PER Wolff] , currently a journalist in [LOC Argentina] , played with [PER Del Bosque] in the final years of the seventies in [ORG Real Madrid] .
MasakhaNER is a named entity dataset consisting of PER, ORG, LOC, and DATE entities annotated by Masakhane for ten African languages:
- Amharic
- Hausa
- Igbo
- Kinyarwanda
- Luganda
- Luo
- Nigerian-Pidgin
- Swahili
- Wolof
- Yoruba
The train/validation/test sets are available for all the ten languages.
For more details see https://arxiv.org/abs/2103.11811
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 638 |
'train' | 2235 |
'validation' | 320 |
- Features:
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"tokens": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"ner_tags": {
"feature": {
"num_classes": 9,
"names": [
"O",
"B-PER",
"I-PER",
"B-ORG",
"I-ORG",
"B-LOC",
"I-LOC",
"B-DATE",
"I-DATE"
],
"names_file": null,
"id": null,
"_type": "ClassLabel"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
kin
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:masakhaner/kin')
- Description:
MasakhaNER is the first large publicly available high-quality dataset for named entity recognition (NER) in ten African languages.
Named entities are phrases that contain the names of persons, organizations, locations, times and quantities.
Example:
[PER Wolff] , currently a journalist in [LOC Argentina] , played with [PER Del Bosque] in the final years of the seventies in [ORG Real Madrid] .
MasakhaNER is a named entity dataset consisting of PER, ORG, LOC, and DATE entities annotated by Masakhane for ten African languages:
- Amharic
- Hausa
- Igbo
- Kinyarwanda
- Luganda
- Luo
- Nigerian-Pidgin
- Swahili
- Wolof
- Yoruba
The train/validation/test sets are available for all the ten languages.
For more details see https://arxiv.org/abs/2103.11811
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 605 |
'train' | 2116 |
'validation' | 302 |
- Features:
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"tokens": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"ner_tags": {
"feature": {
"num_classes": 9,
"names": [
"O",
"B-PER",
"I-PER",
"B-ORG",
"I-ORG",
"B-LOC",
"I-LOC",
"B-DATE",
"I-DATE"
],
"names_file": null,
"id": null,
"_type": "ClassLabel"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
lug
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:masakhaner/lug')
- Description:
MasakhaNER is the first large publicly available high-quality dataset for named entity recognition (NER) in ten African languages.
Named entities are phrases that contain the names of persons, organizations, locations, times and quantities.
Example:
[PER Wolff] , currently a journalist in [LOC Argentina] , played with [PER Del Bosque] in the final years of the seventies in [ORG Real Madrid] .
MasakhaNER is a named entity dataset consisting of PER, ORG, LOC, and DATE entities annotated by Masakhane for ten African languages:
- Amharic
- Hausa
- Igbo
- Kinyarwanda
- Luganda
- Luo
- Nigerian-Pidgin
- Swahili
- Wolof
- Yoruba
The train/validation/test sets are available for all the ten languages.
For more details see https://arxiv.org/abs/2103.11811
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 407 |
'train' | 1428 |
'validation' | 200 |
- Features:
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"tokens": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"ner_tags": {
"feature": {
"num_classes": 9,
"names": [
"O",
"B-PER",
"I-PER",
"B-ORG",
"I-ORG",
"B-LOC",
"I-LOC",
"B-DATE",
"I-DATE"
],
"names_file": null,
"id": null,
"_type": "ClassLabel"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
luo
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:masakhaner/luo')
- Description:
MasakhaNER is the first large publicly available high-quality dataset for named entity recognition (NER) in ten African languages.
Named entities are phrases that contain the names of persons, organizations, locations, times and quantities.
Example:
[PER Wolff] , currently a journalist in [LOC Argentina] , played with [PER Del Bosque] in the final years of the seventies in [ORG Real Madrid] .
MasakhaNER is a named entity dataset consisting of PER, ORG, LOC, and DATE entities annotated by Masakhane for ten African languages:
- Amharic
- Hausa
- Igbo
- Kinyarwanda
- Luganda
- Luo
- Nigerian-Pidgin
- Swahili
- Wolof
- Yoruba
The train/validation/test sets are available for all the ten languages.
For more details see https://arxiv.org/abs/2103.11811
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 186 |
'train' | 644 |
'validation' | 92 |
- Features:
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"tokens": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"ner_tags": {
"feature": {
"num_classes": 9,
"names": [
"O",
"B-PER",
"I-PER",
"B-ORG",
"I-ORG",
"B-LOC",
"I-LOC",
"B-DATE",
"I-DATE"
],
"names_file": null,
"id": null,
"_type": "ClassLabel"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
pcm
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:masakhaner/pcm')
- Description:
MasakhaNER is the first large publicly available high-quality dataset for named entity recognition (NER) in ten African languages.
Named entities are phrases that contain the names of persons, organizations, locations, times and quantities.
Example:
[PER Wolff] , currently a journalist in [LOC Argentina] , played with [PER Del Bosque] in the final years of the seventies in [ORG Real Madrid] .
MasakhaNER is a named entity dataset consisting of PER, ORG, LOC, and DATE entities annotated by Masakhane for ten African languages:
- Amharic
- Hausa
- Igbo
- Kinyarwanda
- Luganda
- Luo
- Nigerian-Pidgin
- Swahili
- Wolof
- Yoruba
The train/validation/test sets are available for all the ten languages.
For more details see https://arxiv.org/abs/2103.11811
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 600 |
'train' | 2124 |
'validation' | 306 |
- Features:
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"tokens": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"ner_tags": {
"feature": {
"num_classes": 9,
"names": [
"O",
"B-PER",
"I-PER",
"B-ORG",
"I-ORG",
"B-LOC",
"I-LOC",
"B-DATE",
"I-DATE"
],
"names_file": null,
"id": null,
"_type": "ClassLabel"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
swa
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:masakhaner/swa')
- Description:
MasakhaNER is the first large publicly available high-quality dataset for named entity recognition (NER) in ten African languages.
Named entities are phrases that contain the names of persons, organizations, locations, times and quantities.
Example:
[PER Wolff] , currently a journalist in [LOC Argentina] , played with [PER Del Bosque] in the final years of the seventies in [ORG Real Madrid] .
MasakhaNER is a named entity dataset consisting of PER, ORG, LOC, and DATE entities annotated by Masakhane for ten African languages:
- Amharic
- Hausa
- Igbo
- Kinyarwanda
- Luganda
- Luo
- Nigerian-Pidgin
- Swahili
- Wolof
- Yoruba
The train/validation/test sets are available for all the ten languages.
For more details see https://arxiv.org/abs/2103.11811
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 604 |
'train' | 2109 |
'validation' | 300 |
- Features:
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"tokens": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"ner_tags": {
"feature": {
"num_classes": 9,
"names": [
"O",
"B-PER",
"I-PER",
"B-ORG",
"I-ORG",
"B-LOC",
"I-LOC",
"B-DATE",
"I-DATE"
],
"names_file": null,
"id": null,
"_type": "ClassLabel"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
wol
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:masakhaner/wol')
- Description:
MasakhaNER is the first large publicly available high-quality dataset for named entity recognition (NER) in ten African languages.
Named entities are phrases that contain the names of persons, organizations, locations, times and quantities.
Example:
[PER Wolff] , currently a journalist in [LOC Argentina] , played with [PER Del Bosque] in the final years of the seventies in [ORG Real Madrid] .
MasakhaNER is a named entity dataset consisting of PER, ORG, LOC, and DATE entities annotated by Masakhane for ten African languages:
- Amharic
- Hausa
- Igbo
- Kinyarwanda
- Luganda
- Luo
- Nigerian-Pidgin
- Swahili
- Wolof
- Yoruba
The train/validation/test sets are available for all the ten languages.
For more details see https://arxiv.org/abs/2103.11811
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 539 |
'train' | 1871 |
'validation' | 267 |
- Features:
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"tokens": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"ner_tags": {
"feature": {
"num_classes": 9,
"names": [
"O",
"B-PER",
"I-PER",
"B-ORG",
"I-ORG",
"B-LOC",
"I-LOC",
"B-DATE",
"I-DATE"
],
"names_file": null,
"id": null,
"_type": "ClassLabel"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
yor
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:masakhaner/yor')
- Description:
MasakhaNER is the first large publicly available high-quality dataset for named entity recognition (NER) in ten African languages.
Named entities are phrases that contain the names of persons, organizations, locations, times and quantities.
Example:
[PER Wolff] , currently a journalist in [LOC Argentina] , played with [PER Del Bosque] in the final years of the seventies in [ORG Real Madrid] .
MasakhaNER is a named entity dataset consisting of PER, ORG, LOC, and DATE entities annotated by Masakhane for ten African languages:
- Amharic
- Hausa
- Igbo
- Kinyarwanda
- Luganda
- Luo
- Nigerian-Pidgin
- Swahili
- Wolof
- Yoruba
The train/validation/test sets are available for all the ten languages.
For more details see https://arxiv.org/abs/2103.11811
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 645 |
'train' | 2171 |
'validation' | 305 |
- Features:
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"tokens": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"ner_tags": {
"feature": {
"num_classes": 9,
"names": [
"O",
"B-PER",
"I-PER",
"B-ORG",
"I-ORG",
"B-LOC",
"I-LOC",
"B-DATE",
"I-DATE"
],
"names_file": null,
"id": null,
"_type": "ClassLabel"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
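The split tables for the ten configurations can be gathered into one mapping to compare sizes across languages. The sketch below copies the counts listed on this page; the names `SPLITS` and `load_name` are hypothetical helpers, and nothing is downloaded.

```python
# Example counts per split, copied from the split tables on this page.
SPLITS = {
    "amh": {"train": 1750, "validation": 250, "test": 500},
    "hau": {"train": 1912, "validation": 276, "test": 552},
    "ibo": {"train": 2235, "validation": 320, "test": 638},
    "kin": {"train": 2116, "validation": 302, "test": 605},
    "lug": {"train": 1428, "validation": 200, "test": 407},
    "luo": {"train": 644, "validation": 92, "test": 186},
    "pcm": {"train": 2124, "validation": 306, "test": 600},
    "swa": {"train": 2109, "validation": 300, "test": 604},
    "wol": {"train": 1871, "validation": 267, "test": 539},
    "yor": {"train": 2171, "validation": 305, "test": 645},
}


def load_name(config):
    """Build the name string passed to tfds.load for one configuration."""
    return f"huggingface:masakhaner/{config}"


# Total number of examples per split across all configurations.
totals = {
    split: sum(sizes[split] for sizes in SPLITS.values())
    for split in ("train", "validation", "test")
}

# Configuration with the fewest examples overall.
smallest = min(SPLITS, key=lambda config: sum(SPLITS[config].values()))

for config, sizes in SPLITS.items():
    print(load_name(config), sizes)
print("totals:", totals)
print("smallest configuration:", smallest)
```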