References:
bokmaal
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:norne/bokmaal')
- Description:
NorNE is a manually annotated
corpus of named entities which extends the annotation of the existing
Norwegian Dependency Treebank. Comprising both of the official standards of
written Norwegian (Bokmål and Nynorsk), the corpus contains around 600,000
tokens and annotates a rich set of entity types including persons,
organizations, locations, geo-political entities, products, and events,
in addition to a class corresponding to nominals derived from names.
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 1939 |
'train' | 15696 |
'validation' | 2410 |
- Features:
{
"idx": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"lang": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"text": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"tokens": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"lemmas": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"pos_tags": {
"feature": {
"num_classes": 17,
"names": [
"NOUN",
"PUNCT",
"ADP",
"NUM",
"SYM",
"SCONJ",
"ADJ",
"PART",
"DET",
"CCONJ",
"PROPN",
"PRON",
"X",
"ADV",
"INTJ",
"VERB",
"AUX"
],
"names_file": null,
"id": null,
"_type": "ClassLabel"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"ner_tags": {
"feature": {
"num_classes": 19,
"names": [
"O",
"B-PER",
"I-PER",
"B-ORG",
"I-ORG",
"B-GPE_LOC",
"I-GPE_LOC",
"B-PROD",
"I-PROD",
"B-LOC",
"I-LOC",
"B-GPE_ORG",
"I-GPE_ORG",
"B-DRV",
"I-DRV",
"B-EVT",
"I-EVT",
"B-MISC",
"I-MISC"
],
"names_file": null,
"id": null,
"_type": "ClassLabel"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
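The `ner_tags` feature stores integer class ids, and the position of each name in the `names` list above is its id. A minimal sketch of decoding and encoding, assuming that list order; the example id sequence is hypothetical and not taken from the corpus:

```python
# Label names copied from the "ner_tags" feature above; list position = integer id.
NER_NAMES = [
    "O", "B-PER", "I-PER", "B-ORG", "I-ORG", "B-GPE_LOC", "I-GPE_LOC",
    "B-PROD", "I-PROD", "B-LOC", "I-LOC", "B-GPE_ORG", "I-GPE_ORG",
    "B-DRV", "I-DRV", "B-EVT", "I-EVT", "B-MISC", "I-MISC",
]


def decode_ner(tag_ids):
    """Map integer class ids to their string labels."""
    return [NER_NAMES[i] for i in tag_ids]


def encode_ner(tag_names):
    """Map string labels back to integer class ids."""
    index = {name: i for i, name in enumerate(NER_NAMES)}
    return [index[name] for name in tag_names]


# Hypothetical id sequence for a four-token sentence.
example_ids = [1, 2, 0, 5]
print(decode_ner(example_ids))  # ['B-PER', 'I-PER', 'O', 'B-GPE_LOC']
```

The same pattern applies to `pos_tags` with its 17-name list.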
nynorsk
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:norne/nynorsk')
- Description:
NorNE is a manually annotated
corpus of named entities which extends the annotation of the existing
Norwegian Dependency Treebank. Comprising both of the official standards of
written Norwegian (Bokmål and Nynorsk), the corpus contains around 600,000
tokens and annotates a rich set of entity types including persons,
organizations, locations, geo-political entities, products, and events,
in addition to a class corresponding to nominals derived from names.
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 1511 |
'train' | 14174 |
'validation' | 1890 |
- Features:
{
"idx": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"lang": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"text": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"tokens": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"lemmas": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"pos_tags": {
"feature": {
"num_classes": 17,
"names": [
"NOUN",
"PUNCT",
"ADP",
"NUM",
"SYM",
"SCONJ",
"ADJ",
"PART",
"DET",
"CCONJ",
"PROPN",
"PRON",
"X",
"ADV",
"INTJ",
"VERB",
"AUX"
],
"names_file": null,
"id": null,
"_type": "ClassLabel"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"ner_tags": {
"feature": {
"num_classes": 19,
"names": [
"O",
"B-PER",
"I-PER",
"B-ORG",
"I-ORG",
"B-GPE_LOC",
"I-GPE_LOC",
"B-PROD",
"I-PROD",
"B-LOC",
"I-LOC",
"B-GPE_ORG",
"I-GPE_ORG",
"B-DRV",
"I-DRV",
"B-EVT",
"I-EVT",
"B-MISC",
"I-MISC"
],
"names_file": null,
"id": null,
"_type": "ClassLabel"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
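The `B-`/`I-` prefixes in the label names follow the usual begin/inside tagging convention, so decoded tags can be grouped into entity spans. The helper below is one possible sketch; `bio_to_spans` is a hypothetical name, the sample sentence is invented for illustration, and treating a stray `I-` tag as the start of a span is a design choice rather than something the dataset prescribes:

```python
def bio_to_spans(tokens, tags):
    """Return (entity_type, start, end, text) tuples; end is exclusive."""
    spans = []
    start, current = None, None
    for i, tag in enumerate(tags):
        prefix, _, etype = tag.partition("-")
        # An I- tag continues the open span only if the type matches.
        if prefix == "I" and etype == current:
            continue
        # Anything else closes the open span first.
        if current is not None:
            spans.append((current, start, i, " ".join(tokens[start:i])))
            start, current = None, None
        # B- opens a new span; a stray I- is treated as an opening tag too.
        if prefix in ("B", "I"):
            start, current = i, etype
    if current is not None:
        spans.append((current, start, len(tags), " ".join(tokens[start:])))
    return spans


# Invented example sentence with string tags.
tokens = ["Kari", "Nordmann", "bor", "i", "Oslo", "."]
tags = ["B-PER", "I-PER", "O", "O", "B-GPE_LOC", "O"]
print(bio_to_spans(tokens, tags))
# [('PER', 0, 2, 'Kari Nordmann'), ('GPE_LOC', 4, 5, 'Oslo')]
```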
combined
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:norne/combined')
- Description:
NorNE is a manually annotated
corpus of named entities which extends the annotation of the existing
Norwegian Dependency Treebank. Comprising both of the official standards of
written Norwegian (Bokmål and Nynorsk), the corpus contains around 600,000
tokens and annotates a rich set of entity types including persons,
organizations, locations, geo-political entities, products, and events,
in addition to a class corresponding to nominals derived from names.
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 3450 |
'train' | 29870 |
'validation' | 4300 |
- Features:
{
"idx": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"lang": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"text": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"tokens": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"lemmas": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"pos_tags": {
"feature": {
"num_classes": 17,
"names": [
"NOUN",
"PUNCT",
"ADP",
"NUM",
"SYM",
"SCONJ",
"ADJ",
"PART",
"DET",
"CCONJ",
"PROPN",
"PRON",
"X",
"ADV",
"INTJ",
"VERB",
"AUX"
],
"names_file": null,
"id": null,
"_type": "ClassLabel"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"ner_tags": {
"feature": {
"num_classes": 19,
"names": [
"O",
"B-PER",
"I-PER",
"B-ORG",
"I-ORG",
"B-GPE_LOC",
"I-GPE_LOC",
"B-PROD",
"I-PROD",
"B-LOC",
"I-LOC",
"B-GPE_ORG",
"I-GPE_ORG",
"B-DRV",
"I-DRV",
"B-EVT",
"I-EVT",
"B-MISC",
"I-MISC"
],
"names_file": null,
"id": null,
"_type": "ClassLabel"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
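The split sizes listed for `combined` equal the per-split sums of the `bokmaal` and `nynorsk` sizes. This suggests, though the page does not state it, that `combined` is the union of the two. A quick check using only the numbers from the split tables:

```python
# Example counts copied from the split tables on this page.
SPLITS = {
    "bokmaal": {"test": 1939, "train": 15696, "validation": 2410},
    "nynorsk": {"test": 1511, "train": 14174, "validation": 1890},
    "combined": {"test": 3450, "train": 29870, "validation": 4300},
}

for split, n_combined in SPLITS["combined"].items():
    n_sum = SPLITS["bokmaal"][split] + SPLITS["nynorsk"][split]
    # Prints True for every split if the counts add up.
    print(split, n_sum, n_combined, n_sum == n_combined)
```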
bokmaal-7
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:norne/bokmaal-7')
- Description:
NorNE is a manually annotated
corpus of named entities which extends the annotation of the existing
Norwegian Dependency Treebank. Comprising both of the official standards of
written Norwegian (Bokmål and Nynorsk), the corpus contains around 600,000
tokens and annotates a rich set of entity types including persons,
organizations, locations, geo-political entities, products, and events,
in addition to a class corresponding to nominals derived from names.
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 1939 |
'train' | 15696 |
'validation' | 2410 |
- Features:
{
"idx": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"lang": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"text": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"tokens": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"lemmas": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"pos_tags": {
"feature": {
"num_classes": 17,
"names": [
"NOUN",
"PUNCT",
"ADP",
"NUM",
"SYM",
"SCONJ",
"ADJ",
"PART",
"DET",
"CCONJ",
"PROPN",
"PRON",
"X",
"ADV",
"INTJ",
"VERB",
"AUX"
],
"names_file": null,
"id": null,
"_type": "ClassLabel"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"ner_tags": {
"feature": {
"num_classes": 15,
"names": [
"O",
"B-PER",
"I-PER",
"B-ORG",
"I-ORG",
"B-PROD",
"I-PROD",
"B-LOC",
"I-LOC",
"B-DRV",
"I-DRV",
"B-EVT",
"I-EVT",
"B-MISC",
"I-MISC"
],
"names_file": null,
"id": null,
"_type": "ClassLabel"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
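Compared with the 19-label tagset of the `bokmaal` config, the `-7` tagset has no `GPE_LOC` or `GPE_ORG` labels. The sketch below assumes, without confirmation from this page, that `GPE_LOC` is folded into `LOC` and `GPE_ORG` into `ORG`. Under that assumption, collapsing the full tagset yields exactly the 15 labels listed above; `collapse` and `MERGE_7` are hypothetical names:

```python
# Full 19-label tagset (from the bokmaal config) and the 15-label "-7" tagset.
FULL = [
    "O", "B-PER", "I-PER", "B-ORG", "I-ORG", "B-GPE_LOC", "I-GPE_LOC",
    "B-PROD", "I-PROD", "B-LOC", "I-LOC", "B-GPE_ORG", "I-GPE_ORG",
    "B-DRV", "I-DRV", "B-EVT", "I-EVT", "B-MISC", "I-MISC",
]
SEVEN = [
    "O", "B-PER", "I-PER", "B-ORG", "I-ORG", "B-PROD", "I-PROD",
    "B-LOC", "I-LOC", "B-DRV", "I-DRV", "B-EVT", "I-EVT", "B-MISC", "I-MISC",
]

# Assumption: the two GPE subtypes merge into LOC and ORG respectively.
MERGE_7 = {"GPE_LOC": "LOC", "GPE_ORG": "ORG"}


def collapse(tag, merge):
    """Rewrite the entity type of a BIO tag, keeping its B-/I- prefix."""
    if tag == "O":
        return tag
    prefix, etype = tag.split("-", 1)
    return f"{prefix}-{merge.get(etype, etype)}"


collapsed = {collapse(tag, MERGE_7) for tag in FULL}
print(collapsed == set(SEVEN))  # True
```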
nynorsk-7
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:norne/nynorsk-7')
- Description:
NorNE is a manually annotated
corpus of named entities which extends the annotation of the existing
Norwegian Dependency Treebank. Comprising both of the official standards of
written Norwegian (Bokmål and Nynorsk), the corpus contains around 600,000
tokens and annotates a rich set of entity types including persons,
organizations, locations, geo-political entities, products, and events,
in addition to a class corresponding to nominals derived from names.
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 1511 |
'train' | 14174 |
'validation' | 1890 |
- Features:
{
"idx": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"lang": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"text": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"tokens": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"lemmas": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"pos_tags": {
"feature": {
"num_classes": 17,
"names": [
"NOUN",
"PUNCT",
"ADP",
"NUM",
"SYM",
"SCONJ",
"ADJ",
"PART",
"DET",
"CCONJ",
"PROPN",
"PRON",
"X",
"ADV",
"INTJ",
"VERB",
"AUX"
],
"names_file": null,
"id": null,
"_type": "ClassLabel"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"ner_tags": {
"feature": {
"num_classes": 15,
"names": [
"O",
"B-PER",
"I-PER",
"B-ORG",
"I-ORG",
"B-PROD",
"I-PROD",
"B-LOC",
"I-LOC",
"B-DRV",
"I-DRV",
"B-EVT",
"I-EVT",
"B-MISC",
"I-MISC"
],
"names_file": null,
"id": null,
"_type": "ClassLabel"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
combined-7
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:norne/combined-7')
- Description:
NorNE is a manually annotated
corpus of named entities which extends the annotation of the existing
Norwegian Dependency Treebank. Comprising both of the official standards of
written Norwegian (Bokmål and Nynorsk), the corpus contains around 600,000
tokens and annotates a rich set of entity types including persons,
organizations, locations, geo-political entities, products, and events,
in addition to a class corresponding to nominals derived from names.
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 3450 |
'train' | 29870 |
'validation' | 4300 |
- Features:
{
"idx": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"lang": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"text": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"tokens": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"lemmas": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"pos_tags": {
"feature": {
"num_classes": 17,
"names": [
"NOUN",
"PUNCT",
"ADP",
"NUM",
"SYM",
"SCONJ",
"ADJ",
"PART",
"DET",
"CCONJ",
"PROPN",
"PRON",
"X",
"ADV",
"INTJ",
"VERB",
"AUX"
],
"names_file": null,
"id": null,
"_type": "ClassLabel"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"ner_tags": {
"feature": {
"num_classes": 15,
"names": [
"O",
"B-PER",
"I-PER",
"B-ORG",
"I-ORG",
"B-PROD",
"I-PROD",
"B-LOC",
"I-LOC",
"B-DRV",
"I-DRV",
"B-EVT",
"I-EVT",
"B-MISC",
"I-MISC"
],
"names_file": null,
"id": null,
"_type": "ClassLabel"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
bokmaal-8
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:norne/bokmaal-8')
- Description:
NorNE is a manually annotated
corpus of named entities which extends the annotation of the existing
Norwegian Dependency Treebank. Comprising both of the official standards of
written Norwegian (Bokmål and Nynorsk), the corpus contains around 600,000
tokens and annotates a rich set of entity types including persons,
organizations, locations, geo-political entities, products, and events,
in addition to a class corresponding to nominals derived from names.
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 1939 |
'train' | 15696 |
'validation' | 2410 |
- Features:
{
"idx": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"lang": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"text": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"tokens": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"lemmas": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"pos_tags": {
"feature": {
"num_classes": 17,
"names": [
"NOUN",
"PUNCT",
"ADP",
"NUM",
"SYM",
"SCONJ",
"ADJ",
"PART",
"DET",
"CCONJ",
"PROPN",
"PRON",
"X",
"ADV",
"INTJ",
"VERB",
"AUX"
],
"names_file": null,
"id": null,
"_type": "ClassLabel"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"ner_tags": {
"feature": {
"num_classes": 17,
"names": [
"O",
"B-PER",
"I-PER",
"B-ORG",
"I-ORG",
"B-PROD",
"I-PROD",
"B-LOC",
"I-LOC",
"B-GPE",
"I-GPE",
"B-DRV",
"I-DRV",
"B-EVT",
"I-EVT",
"B-MISC",
"I-MISC"
],
"names_file": null,
"id": null,
"_type": "ClassLabel"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
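Because the `-8` tagset orders its labels differently from the 19-label tagset, integer ids are not interchangeable between configs. The sketch below builds an id lookup table from the full tagset to the `-8` tagset, assuming (this page does not confirm it) that both `GPE_LOC` and `GPE_ORG` map to the single `GPE` type; `to_eight`, `ID_MAP`, and `remap` are hypothetical names:

```python
# Full 19-label tagset (from the bokmaal config) and the 17-label "-8" tagset.
FULL = [
    "O", "B-PER", "I-PER", "B-ORG", "I-ORG", "B-GPE_LOC", "I-GPE_LOC",
    "B-PROD", "I-PROD", "B-LOC", "I-LOC", "B-GPE_ORG", "I-GPE_ORG",
    "B-DRV", "I-DRV", "B-EVT", "I-EVT", "B-MISC", "I-MISC",
]
EIGHT = [
    "O", "B-PER", "I-PER", "B-ORG", "I-ORG", "B-PROD", "I-PROD",
    "B-LOC", "I-LOC", "B-GPE", "I-GPE", "B-DRV", "I-DRV",
    "B-EVT", "I-EVT", "B-MISC", "I-MISC",
]


def to_eight(name):
    """Assumption: both GPE subtypes collapse to plain GPE."""
    return name.replace("GPE_LOC", "GPE").replace("GPE_ORG", "GPE")


# ID_MAP[i] is the "-8" id corresponding to id i in the full tagset.
ID_MAP = [EIGHT.index(to_eight(name)) for name in FULL]


def remap(tag_ids):
    """Convert a sequence of full-tagset ids to "-8" ids."""
    return [ID_MAP[i] for i in tag_ids]


# Hypothetical id sequence: B-PER, I-PER, O, B-GPE_LOC in the full tagset.
print(remap([1, 2, 0, 5]))  # [1, 2, 0, 9]
```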
nynorsk-8
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:norne/nynorsk-8')
- Description:
NorNE is a manually annotated
corpus of named entities which extends the annotation of the existing
Norwegian Dependency Treebank. Comprising both of the official standards of
written Norwegian (Bokmål and Nynorsk), the corpus contains around 600,000
tokens and annotates a rich set of entity types including persons,
organizations, locations, geo-political entities, products, and events,
in addition to a class corresponding to nominals derived from names.
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 1511 |
'train' | 14174 |
'validation' | 1890 |
- Features:
{
"idx": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"lang": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"text": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"tokens": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"lemmas": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"pos_tags": {
"feature": {
"num_classes": 17,
"names": [
"NOUN",
"PUNCT",
"ADP",
"NUM",
"SYM",
"SCONJ",
"ADJ",
"PART",
"DET",
"CCONJ",
"PROPN",
"PRON",
"X",
"ADV",
"INTJ",
"VERB",
"AUX"
],
"names_file": null,
"id": null,
"_type": "ClassLabel"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"ner_tags": {
"feature": {
"num_classes": 17,
"names": [
"O",
"B-PER",
"I-PER",
"B-ORG",
"I-ORG",
"B-PROD",
"I-PROD",
"B-LOC",
"I-LOC",
"B-GPE",
"I-GPE",
"B-DRV",
"I-DRV",
"B-EVT",
"I-EVT",
"B-MISC",
"I-MISC"
],
"names_file": null,
"id": null,
"_type": "ClassLabel"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
combined-8
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:norne/combined-8')
- Description:
NorNE is a manually annotated
corpus of named entities which extends the annotation of the existing
Norwegian Dependency Treebank. Comprising both of the official standards of
written Norwegian (Bokmål and Nynorsk), the corpus contains around 600,000
tokens and annotates a rich set of entity types including persons,
organizations, locations, geo-political entities, products, and events,
in addition to a class corresponding to nominals derived from names.
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 3450 |
'train' | 29870 |
'validation' | 4300 |
- Features:
{
"idx": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"lang": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"text": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"tokens": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"lemmas": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"pos_tags": {
"feature": {
"num_classes": 17,
"names": [
"NOUN",
"PUNCT",
"ADP",
"NUM",
"SYM",
"SCONJ",
"ADJ",
"PART",
"DET",
"CCONJ",
"PROPN",
"PRON",
"X",
"ADV",
"INTJ",
"VERB",
"AUX"
],
"names_file": null,
"id": null,
"_type": "ClassLabel"
},
"length": -1,
"id": null,
"_type": "Sequence"
},
"ner_tags": {
"feature": {
"num_classes": 17,
"names": [
"O",
"B-PER",
"I-PER",
"B-ORG",
"I-ORG",
"B-PROD",
"I-PROD",
"B-LOC",
"I-LOC",
"B-GPE",
"I-GPE",
"B-DRV",
"I-DRV",
"B-EVT",
"I-EVT",
"B-MISC",
"I-MISC"
],
"names_file": null,
"id": null,
"_type": "ClassLabel"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
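Across the configs, `num_classes` for `ner_tags` is 19, 15, or 17. Each entity type contributes a `B-` and an `I-` label, plus one shared `O` label, so the number of entity types is `(num_classes - 1) / 2`. The arithmetic below gives 9, 7, and 8 types, which is consistent with reading the `-7` and `-8` suffixes as entity-type counts; that reading is inferred from the numbers here rather than stated on this page:

```python
# num_classes values for "ner_tags" as listed in the feature specs on this page.
NUM_CLASSES = {"unsuffixed": 19, "-7": 15, "-8": 17}

for suffix, n_labels in NUM_CLASSES.items():
    # One O label plus a B-/I- pair per entity type.
    n_types = (n_labels - 1) // 2
    assert 2 * n_types + 1 == n_labels
    print(suffix, n_types)
# unsuffixed 9
# -7 7
# -8 8
```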