indic_glue

References:

wnli.en

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/wnli.en')
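Every configuration on this page is loaded with a name of the form `huggingface:indic_glue/<task>.<language>`. A minimal sketch of building those names programmatically follows; `indic_glue_name` is a hypothetical helper written for this illustration, not part of TFDS.

```python
# Hypothetical helper (not a TFDS function): builds the dataset name string
# that is passed to tfds.load for a given IndicGLUE task and language code.
def indic_glue_name(task: str, lang: str) -> str:
    """Return e.g. 'huggingface:indic_glue/wnli.en' for task='wnli', lang='en'."""
    return f"huggingface:indic_glue/{task}.{lang}"


# Language codes of the WNLI configurations listed on this page.
wnli_names = [indic_glue_name("wnli", lang) for lang in ("en", "hi", "gu", "mr")]
```

Each resulting string can be passed to `tfds.load` in the same way as the command above.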

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
    variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


The Winograd Schema Challenge (Levesque et al., 2011) is a reading comprehension task
in which a system must read a sentence with a pronoun and select the referent of that pronoun from
a list of choices. The examples are manually constructed to foil simple statistical methods: Each
one is contingent on contextual information provided by a single word or phrase in the sentence.
To convert the problem into sentence pair classification, we construct sentence pairs by replacing
the ambiguous pronoun with each possible referent. The task is to predict if the sentence with the
pronoun substituted is entailed by the original sentence. We use a small evaluation set consisting of
new examples derived from fiction books that was shared privately by the authors of the original
corpus. While the included training set is balanced between two classes, the test set is imbalanced
between them (65% not entailment). Also, due to a data quirk, the development set is adversarial:
hypotheses are sometimes shared between training and development examples, so if a model memorizes the
training examples, it will predict the wrong label on the corresponding development set
example. As with QNLI, each example is evaluated separately, so there is not a systematic correspondence
between a model's score on this task and its score on the unconverted original task. We
call the converted dataset WNLI (Winograd NLI). This dataset is translated and publicly released for 3
Indian languages by AI4Bharat.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	146
`'train'`	635
`'validation'`	71

Features:

{
    "hypothesis": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "premise": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "label": {
        "num_classes": 3,
        "names": [
            "not_entailment",
            "entailment",
            "None"
        ],
        "names_file": null,
        "id": null,
        "_type": "ClassLabel"
    }
}
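The `label` feature is a `ClassLabel`, so examples carry an integer that indexes into the `names` list. A small sketch of converting between the integer and its name, using the list copied from the schema above (the function names are illustrative, not library calls):

```python
# Label names copied from the "label" feature above; the list position is the class index.
WNLI_LABEL_NAMES = ["not_entailment", "entailment", "None"]


def label_to_name(label: int) -> str:
    """Map an integer class index to its label name."""
    return WNLI_LABEL_NAMES[label]


def name_to_label(name: str) -> int:
    """Map a label name back to its integer class index."""
    return WNLI_LABEL_NAMES.index(name)
```

For instance, `label_to_name(0)` gives `"not_entailment"`, the majority class of the test set according to the description.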

wnli.hi

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/wnli.hi')

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
    variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


The Winograd Schema Challenge (Levesque et al., 2011) is a reading comprehension task
in which a system must read a sentence with a pronoun and select the referent of that pronoun from
a list of choices. The examples are manually constructed to foil simple statistical methods: Each
one is contingent on contextual information provided by a single word or phrase in the sentence.
To convert the problem into sentence pair classification, we construct sentence pairs by replacing
the ambiguous pronoun with each possible referent. The task is to predict if the sentence with the
pronoun substituted is entailed by the original sentence. We use a small evaluation set consisting of
new examples derived from fiction books that was shared privately by the authors of the original
corpus. While the included training set is balanced between two classes, the test set is imbalanced
between them (65% not entailment). Also, due to a data quirk, the development set is adversarial:
hypotheses are sometimes shared between training and development examples, so if a model memorizes the
training examples, it will predict the wrong label on the corresponding development set
example. As with QNLI, each example is evaluated separately, so there is not a systematic correspondence
between a model's score on this task and its score on the unconverted original task. We
call the converted dataset WNLI (Winograd NLI). This dataset is translated and publicly released for 3
Indian languages by AI4Bharat.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	146
`'train'`	635
`'validation'`	71

Features:

{
    "hypothesis": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "premise": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "label": {
        "num_classes": 3,
        "names": [
            "not_entailment",
            "entailment",
            "None"
        ],
        "names_file": null,
        "id": null,
        "_type": "ClassLabel"
    }
}

wnli.gu

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/wnli.gu')

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
    variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


The Winograd Schema Challenge (Levesque et al., 2011) is a reading comprehension task
in which a system must read a sentence with a pronoun and select the referent of that pronoun from
a list of choices. The examples are manually constructed to foil simple statistical methods: Each
one is contingent on contextual information provided by a single word or phrase in the sentence.
To convert the problem into sentence pair classification, we construct sentence pairs by replacing
the ambiguous pronoun with each possible referent. The task is to predict if the sentence with the
pronoun substituted is entailed by the original sentence. We use a small evaluation set consisting of
new examples derived from fiction books that was shared privately by the authors of the original
corpus. While the included training set is balanced between two classes, the test set is imbalanced
between them (65% not entailment). Also, due to a data quirk, the development set is adversarial:
hypotheses are sometimes shared between training and development examples, so if a model memorizes the
training examples, it will predict the wrong label on the corresponding development set
example. As with QNLI, each example is evaluated separately, so there is not a systematic correspondence
between a model's score on this task and its score on the unconverted original task. We
call the converted dataset WNLI (Winograd NLI). This dataset is translated and publicly released for 3
Indian languages by AI4Bharat.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	146
`'train'`	635
`'validation'`	71

Features:

{
    "hypothesis": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "premise": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "label": {
        "num_classes": 3,
        "names": [
            "not_entailment",
            "entailment",
            "None"
        ],
        "names_file": null,
        "id": null,
        "_type": "ClassLabel"
    }
}

wnli.mr

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/wnli.mr')

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
    variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


The Winograd Schema Challenge (Levesque et al., 2011) is a reading comprehension task
in which a system must read a sentence with a pronoun and select the referent of that pronoun from
a list of choices. The examples are manually constructed to foil simple statistical methods: Each
one is contingent on contextual information provided by a single word or phrase in the sentence.
To convert the problem into sentence pair classification, we construct sentence pairs by replacing
the ambiguous pronoun with each possible referent. The task is to predict if the sentence with the
pronoun substituted is entailed by the original sentence. We use a small evaluation set consisting of
new examples derived from fiction books that was shared privately by the authors of the original
corpus. While the included training set is balanced between two classes, the test set is imbalanced
between them (65% not entailment). Also, due to a data quirk, the development set is adversarial:
hypotheses are sometimes shared between training and development examples, so if a model memorizes the
training examples, it will predict the wrong label on the corresponding development set
example. As with QNLI, each example is evaluated separately, so there is not a systematic correspondence
between a model's score on this task and its score on the unconverted original task. We
call the converted dataset WNLI (Winograd NLI). This dataset is translated and publicly released for 3
Indian languages by AI4Bharat.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	146
`'train'`	635
`'validation'`	71

Features:

{
    "hypothesis": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "premise": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "label": {
        "num_classes": 3,
        "names": [
            "not_entailment",
            "entailment",
            "None"
        ],
        "names_file": null,
        "id": null,
        "_type": "ClassLabel"
    }
}

copa.en

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/copa.en')

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
    variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


The Choice Of Plausible Alternatives (COPA) evaluation provides researchers with a tool for assessing
progress in open-domain commonsense causal reasoning. COPA consists of 1000 questions, split equally
into development and test sets of 500 questions each. Each question is composed of a premise and two
alternatives, where the task is to select the alternative that more plausibly has a causal relation
with the premise. The correct alternative is randomized so that the expected performance of randomly
guessing is 50%. This dataset is translated and publicly released for 3 languages by AI4Bharat.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	500
`'train'`	400
`'validation'`	100

Features:

{
    "premise": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "choice1": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "choice2": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "question": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "label": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    }
}
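As a rough illustration of how a COPA record with these fields might be consumed, the sketch below pairs the premise with each alternative and picks the gold one. Two points are assumptions not stated on this page: that `label` 0 selects `choice1` and 1 selects `choice2`, and that `question` holds a value such as `"cause"` or `"effect"`. The record itself is made up for the example.

```python
from typing import Dict, List, Tuple


def copa_pairs(example: Dict) -> List[Tuple[str, str]]:
    """Pair the premise with each of the two alternatives."""
    return [
        (example["premise"], example["choice1"]),
        (example["premise"], example["choice2"]),
    ]


def gold_choice(example: Dict) -> str:
    # Assumption: label 0 selects choice1 and label 1 selects choice2.
    return example["choice1"] if example["label"] == 0 else example["choice2"]


# Made-up record using the field names from the schema above.
example = {
    "premise": "The ground was wet.",
    "choice1": "The sun was shining.",
    "choice2": "It had rained overnight.",
    "question": "cause",  # assumed value format
    "label": 1,
}
```

A model would typically score each pair from `copa_pairs` and predict the higher-scoring alternative; `gold_choice` gives the reference answer to compare against.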

copa.hi

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/copa.hi')

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
    variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


The Choice Of Plausible Alternatives (COPA) evaluation provides researchers with a tool for assessing
progress in open-domain commonsense causal reasoning. COPA consists of 1000 questions, split equally
into development and test sets of 500 questions each. Each question is composed of a premise and two
alternatives, where the task is to select the alternative that more plausibly has a causal relation
with the premise. The correct alternative is randomized so that the expected performance of randomly
guessing is 50%. This dataset is translated and publicly released for 3 languages by AI4Bharat.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	449
`'train'`	362
`'validation'`	88

Features:

{
    "premise": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "choice1": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "choice2": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "question": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "label": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    }
}

copa.gu

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/copa.gu')

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
    variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


The Choice Of Plausible Alternatives (COPA) evaluation provides researchers with a tool for assessing
progress in open-domain commonsense causal reasoning. COPA consists of 1000 questions, split equally
into development and test sets of 500 questions each. Each question is composed of a premise and two
alternatives, where the task is to select the alternative that more plausibly has a causal relation
with the premise. The correct alternative is randomized so that the expected performance of randomly
guessing is 50%. This dataset is translated and publicly released for 3 languages by AI4Bharat.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	448
`'train'`	362
`'validation'`	88

Features:

{
    "premise": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "choice1": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "choice2": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "question": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "label": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    }
}

copa.mr

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/copa.mr')

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
    variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


The Choice Of Plausible Alternatives (COPA) evaluation provides researchers with a tool for assessing
progress in open-domain commonsense causal reasoning. COPA consists of 1000 questions, split equally
into development and test sets of 500 questions each. Each question is composed of a premise and two
alternatives, where the task is to select the alternative that more plausibly has a causal relation
with the premise. The correct alternative is randomized so that the expected performance of randomly
guessing is 50%. This dataset is translated and publicly released for 3 languages by AI4Bharat.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	449
`'train'`	362
`'validation'`	88

Features:

{
    "premise": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "choice1": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "choice2": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "question": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "label": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    }
}

sna.bn

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/sna.bn')

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
    variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


This dataset is a collection of Bengali News articles. The dataset is used for classifying articles into
5 different classes namely international, state, kolkata, entertainment and sports.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	1411
`'train'`	11284
`'validation'`	1411

Features:

{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "label": {
        "num_classes": 6,
        "names": [
            "kolkata",
            "state",
            "national",
            "sports",
            "entertainment",
            "international"
        ],
        "names_file": null,
        "id": null,
        "_type": "ClassLabel"
    }
}
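For a quick sense of how the sna.bn examples are divided, the split sizes listed for this configuration can be turned into fractions. This is plain arithmetic on the numbers in the table, sketched in Python:

```python
# Split sizes copied from the sna.bn table on this page.
split_sizes = {"test": 1411, "train": 11284, "validation": 1411}

# Total number of examples and the share of each split.
total = sum(split_sizes.values())
fractions = {name: size / total for name, size in split_sizes.items()}

for name, share in fractions.items():
    print(f"{name}: {share:.1%}")
```

The result is close to an 80/10/10 division between train, test, and validation.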

csqa.as

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/csqa.as')

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
    variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


Given a text with an entity randomly masked, the task is to predict that masked entity from a list of 4
candidate entities. The dataset contains around 239k examples across 11 languages.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	2942

Features:

{
    "question": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "answer": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "category": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "title": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "options": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "out_of_context_options": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}
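The cloze-style task described above reduces to choosing one entry of `options`. A minimal sketch of recovering the gold position for a record with this schema is shown below. It assumes the `answer` string appears verbatim in `options`, which this page does not state; the record and the `<MASK>` placeholder are made up for illustration, and `answer_index` is a hypothetical helper.

```python
from typing import Dict


def answer_index(example: Dict) -> int:
    """Return the position of the gold answer within the candidate options.

    Assumption: the answer string occurs verbatim in example["options"].
    """
    return example["options"].index(example["answer"])


# Made-up English record using the field names from the schema above;
# real records are in the configuration's language.
example = {
    "question": "The capital of <MASK> is Dispur.",  # mask token is assumed
    "answer": "Assam",
    "category": "",
    "title": "",
    "options": ["Kerala", "Assam", "Punjab", "Odisha"],
    "out_of_context_options": [],
}

gold = answer_index(example)
```

The integer in `gold` can serve as the target for a four-way classifier over the candidates.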

csqa.bn

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/csqa.bn')

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
    variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


Given a text with an entity randomly masked, the task is to predict that masked entity from a list of 4
candidate entities. The dataset contains around 239k examples across 11 languages.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	38845

Features:

{
    "question": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "answer": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "category": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "title": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "options": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "out_of_context_options": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

csqa.gu

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/csqa.gu')

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
    variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


Given a text with an entity randomly masked, the task is to predict that masked entity from a list of 4
candidate entities. The dataset contains around 239k examples across 11 languages.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	22861

Features:

{
    "question": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "answer": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "category": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "title": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "options": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "out_of_context_options": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

csqa.hi

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/csqa.hi')

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
    variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


Given a text with an entity randomly masked, the task is to predict that masked entity from a list of 4
candidate entities. The dataset contains around 239k examples across 11 languages.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	35140

Features:

{
    "question": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "answer": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "category": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "title": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "options": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "out_of_context_options": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

csqa.kn

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/csqa.kn')

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
    variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


Given a text with an entity randomly masked, the task is to predict that masked entity from a list of 4
candidate entities. The dataset contains around 239k examples across 11 languages.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	13666

Features:

{
    "question": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "answer": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "category": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "title": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "options": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "out_of_context_options": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

csqa.ml

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/csqa.ml')

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
    variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


Given a text with an entity randomly masked, the task is to predict that masked entity from a list of 4
candidate entities. The dataset contains around 239k examples across 11 languages.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	26537

Features:

{
    "question": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "answer": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "category": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "title": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "options": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "out_of_context_options": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

csqa.mr

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/csqa.mr')

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
    variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


Given a text with an entity randomly masked, the task is to predict that masked entity from a list of 4
candidate entities. The dataset contains around 239k examples across 11 languages.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	11370

Features:

{
    "question": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "answer": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "category": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "title": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "options": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "out_of_context_options": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

csqa.or

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/csqa.or')

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
    variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


Given a text with an entity randomly masked, the task is to predict that masked entity from a list of 4
candidate entities. The dataset contains around 239k examples across 11 languages.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	1975

Features:

{
    "question": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "answer": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "category": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "title": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "options": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "out_of_context_options": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

csqa.pa

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/csqa.pa')

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
    variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


Given a text with an entity randomly masked, the task is to predict that masked entity from a list of 4
candidate entities. The dataset contains around 239k examples across 11 languages.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	5667

Features:

{
    "question": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "answer": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "category": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "title": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "options": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "out_of_context_options": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

csqa.ta

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/csqa.ta')

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
    variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


Given a text with an entity randomly masked, the task is to predict that masked entity from a list of 4
candidate entities. The dataset contains around 239k examples across 11 languages.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	38590

Features:

{
    "question": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "answer": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "category": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "title": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "options": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "out_of_context_options": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

csqa.te

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/csqa.te')

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
    variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


Given a text with an entity randomly masked, the task is to predict that masked entity from a list of 4
candidate entities. The dataset contains around 239k examples across 11 languages.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	41338

Features:

{
    "question": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "answer": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "category": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "title": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "options": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "out_of_context_options": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

wstp.as

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/wstp.as')

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
    variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


Predict the correct title for a Wikipedia section from a given list of four candidate titles.
The dataset has 400k examples across 11 Indian languages.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	626
`'train'`	5000
`'validation'`	625

Features:

{
    "sectionText": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "correctTitle": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "titleA": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "titleB": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "titleC": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "titleD": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}
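
One way to turn a record with the features above into a standard multiple-choice instance is sketched below. It assumes `correctTitle` holds the text of the correct title (rather than a letter such as "A"), which the listing does not confirm; the toy record and the placeholder URL are invented for illustration.

```python
from typing import Dict, List, Tuple

TITLE_KEYS = ("titleA", "titleB", "titleC", "titleD")


def to_multiple_choice(example: Dict[str, str]) -> Tuple[str, List[str], int]:
    """Return (section text, candidate titles, index of the correct title).

    Assumes `correctTitle` matches one candidate exactly; raises ValueError
    if it matches none of them.
    """
    candidates = [example[key] for key in TITLE_KEYS]
    label = candidates.index(example["correctTitle"])
    return example["sectionText"], candidates, label


# Invented record shaped like the features above.
toy_example = {
    "sectionText": "The city was founded in the 12th century.",
    "correctTitle": "History",
    "titleA": "Geography",
    "titleB": "Economy",
    "titleC": "History",
    "titleD": "Culture",
    "url": "https://example.org/wiki/placeholder",
}

text, candidates, label = to_multiple_choice(toy_example)
print(label)  # 2
```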

wstp.bn

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/wstp.bn')

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
    variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


Predict the correct title for a Wikipedia section from a given list of four candidate titles.
The dataset has 400k examples across 11 Indian languages.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	5948
`'train'`	47580
`'validation'`	5947
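
The split sizes in this table can be checked for their relative proportions with a few lines of arithmetic. The sketch below uses only the three counts listed for this configuration.

```python
# Split sizes copied from the table above.
split_sizes = {"test": 5948, "train": 47580, "validation": 5947}

total = sum(split_sizes.values())
fractions = {name: count / total for name, count in split_sizes.items()}

for name, fraction in fractions.items():
    print(f"{name}: {fraction:.1%}")
# test: 10.0%
# train: 80.0%
# validation: 10.0%
```

The counts therefore correspond to roughly an 80/10/10 train/validation/test partition.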

Features:

{
    "sectionText": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "correctTitle": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "titleA": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "titleB": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "titleC": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "titleD": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

wstp.gu

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/wstp.gu')

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
    variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


Predict the correct title for a Wikipedia section from a given list of four candidate titles.
The dataset has 400k examples across 11 Indian languages.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	1251
`'train'`	10004
`'validation'`	1251

Features:

{
    "sectionText": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "correctTitle": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "titleA": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "titleB": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "titleC": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "titleD": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

wstp.hi

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/wstp.hi')

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
    variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


Predict the correct title for a Wikipedia section from a given list of four candidate titles.
The dataset has 400k examples across 11 Indian languages.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	5509
`'train'`	44069
`'validation'`	5509

Features:

{
    "sectionText": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "correctTitle": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "titleA": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "titleB": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "titleC": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "titleD": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

wstp.kn

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/wstp.kn')

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
    variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


Predict the correct title for a Wikipedia section from a given list of four candidate titles.
The dataset has 400k examples across 11 Indian languages.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	4423
`'train'`	35379
`'validation'`	4422

Features:

{
    "sectionText": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "correctTitle": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "titleA": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "titleB": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "titleC": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "titleD": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

wstp.ml

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/wstp.ml')

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
    variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


Predict the correct title for a Wikipedia section from a given list of four candidate titles.
The dataset has 400k examples across 11 Indian languages.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	3441
`'train'`	27527
`'validation'`	3441

Features:

{
    "sectionText": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "correctTitle": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "titleA": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "titleB": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "titleC": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "titleD": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

wstp.mr

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/wstp.mr')

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
    variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


Predict the correct title for a Wikipedia section from a given list of four candidate titles.
The dataset has 400k examples across 11 Indian languages.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	1306
`'train'`	10446
`'validation'`	1306

Features:

{
    "sectionText": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "correctTitle": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "titleA": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "titleB": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "titleC": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "titleD": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

wstp.or

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/wstp.or')

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
    variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


Predict the correct title for a Wikipedia section from a given list of four candidate titles.
The dataset has 400k examples across 11 Indian languages.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	502
`'train'`	4015
`'validation'`	502

Features:

{
    "sectionText": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "correctTitle": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "titleA": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "titleB": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "titleC": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "titleD": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

wstp.pa

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/wstp.pa')

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
    variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


Predict the correct title for a Wikipedia section from a given list of four candidate titles.
The dataset has 400k examples across 11 Indian languages.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	1097
`'train'`	8772
`'validation'`	1097

Features:

{
    "sectionText": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "correctTitle": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "titleA": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "titleB": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "titleC": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "titleD": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

wstp.ta

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/wstp.ta')

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
    variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


Predict the correct title for a Wikipedia section from a given list of four candidate titles.
The dataset has 400k examples across 11 Indian languages.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	6118
`'train'`	48940
`'validation'`	6117

Features:

{
    "sectionText": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "correctTitle": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "titleA": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "titleB": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "titleC": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "titleD": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

wstp.te

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/wstp.te')

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
    variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


Predict the correct title for a Wikipedia section from a given list of four candidate titles.
The dataset has 400k examples across 11 Indian languages.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	10000
`'train'`	80000
`'validation'`	10000

Features:

{
    "sectionText": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "correctTitle": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "titleA": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "titleB": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "titleC": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "titleD": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

inltkh.gu

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/inltkh.gu')

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
    variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


Obtained from the inltk project. The corpus is a collection of headlines tagged with their news category.
Available for languages: gu, ml, mr and ta.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	659
`'train'`	5269
`'validation'`	659

Features:

{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "label": {
        "num_classes": 10,
        "names": [
            "entertainment",
            "business",
            "tech",
            "sports",
            "state",
            "spirituality",
            "tamil-cinema",
            "positive",
            "negative",
            "neutral"
        ],
        "names_file": null,
        "id": null,
        "_type": "ClassLabel"
    }
}
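
A `ClassLabel` feature stores each label as an integer index into its `names` list. The sketch below mirrors that mapping in plain Python using the names listed above; the helper functions `int2str` and `str2int` are written here for illustration and stand in for whatever the loading library provides.

```python
# Names copied from the "label" feature above, in the listed order.
LABEL_NAMES = [
    "entertainment",
    "business",
    "tech",
    "sports",
    "state",
    "spirituality",
    "tamil-cinema",
    "positive",
    "negative",
    "neutral",
]


def int2str(label_id: int) -> str:
    """Map an integer class id to its name."""
    if not 0 <= label_id < len(LABEL_NAMES):
        raise ValueError(f"label id out of range: {label_id}")
    return LABEL_NAMES[label_id]


def str2int(name: str) -> int:
    """Map a class name back to its integer id."""
    return LABEL_NAMES.index(name)


print(int2str(3))       # sports
print(str2int("tech"))  # 2
```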

inltkh.ml

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/inltkh.ml')

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
    variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


Obtained from the inltk project. The corpus is a collection of headlines tagged with their news category.
Available for languages: gu, ml, mr and ta.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	630
`'train'`	5036
`'validation'`	630

Features:

{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "label": {
        "num_classes": 10,
        "names": [
            "entertainment",
            "business",
            "tech",
            "sports",
            "state",
            "spirituality",
            "tamil-cinema",
            "positive",
            "negative",
            "neutral"
        ],
        "names_file": null,
        "id": null,
        "_type": "ClassLabel"
    }
}

inltkh.mr

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/inltkh.mr')

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
    variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


Obtained from the inltk project. The corpus is a collection of headlines tagged with their news category.
Available for languages: gu, ml, mr and ta.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	1210
`'train'`	9672
`'validation'`	1210

Features:

{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "label": {
        "num_classes": 10,
        "names": [
            "entertainment",
            "business",
            "tech",
            "sports",
            "state",
            "spirituality",
            "tamil-cinema",
            "positive",
            "negative",
            "neutral"
        ],
        "names_file": null,
        "id": null,
        "_type": "ClassLabel"
    }
}

inltkh.ta

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/inltkh.ta')

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
    variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


Obtained from the inltk project. The corpus is a collection of headlines tagged with their news category.
Available for languages: gu, ml, mr and ta.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	669
`'train'`	5346
`'validation'`	669

Features:

{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "label": {
        "num_classes": 10,
        "names": [
            "entertainment",
            "business",
            "tech",
            "sports",
            "state",
            "spirituality",
            "tamil-cinema",
            "positive",
            "negative",
            "neutral"
        ],
        "names_file": null,
        "id": null,
        "_type": "ClassLabel"
    }
}

inltkh.te

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/inltkh.te')

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
    variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


Obtained from the inltk project. The corpus is a collection of headlines tagged with their news category.
Available for languages: gu, ml, mr and ta.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	541
`'train'`	4328
`'validation'`	541

Features:

{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "label": {
        "num_classes": 10,
        "names": [
            "entertainment",
            "business",
            "tech",
            "sports",
            "state",
            "spirituality",
            "tamil-cinema",
            "positive",
            "negative",
            "neutral"
        ],
        "names_file": null,
        "id": null,
        "_type": "ClassLabel"
    }
}

bbca.hi

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/bbca.hi')

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
    variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


This release consists of 4335 Hindi documents with tags from the BBC Hindi News website.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	866
`'train'`	3467

Features:

{
    "label": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}
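
Unlike the `ClassLabel` features used elsewhere on this page, `label` here is a plain string, so a classifier needs its own string-to-integer mapping. The following is a minimal sketch of building one; the tag values are invented for illustration and may not match the tags in the dataset.

```python
from typing import Dict, Iterable, List


def build_label_vocab(labels: Iterable[str]) -> Dict[str, int]:
    """Assign a stable integer id to each distinct label (sorted order)."""
    return {name: idx for idx, name in enumerate(sorted(set(labels)))}


# Hypothetical tag strings, used only to show the mapping.
toy_labels: List[str] = ["india", "sport", "india", "international"]

vocab = build_label_vocab(toy_labels)
encoded = [vocab[name] for name in toy_labels]

print(vocab)    # {'india': 0, 'international': 1, 'sport': 2}
print(encoded)  # [0, 2, 0, 1]
```

Sorting the distinct labels keeps the ids reproducible across runs; the vocabulary would normally be built from the `'train'` split and reused for `'test'`.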

cvit-mkb-clsr.en-bn

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/cvit-mkb-clsr.en-bn')

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
    variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


CVIT Maan ki Baat Dataset - Given a sentence in language $L_1$ the task is to retrieve its translation
from a set of candidate sentences in language $L_2$.
The dataset contains around 39k parallel sentence pairs across 8 Indian languages.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	5522

Features:

{
    "sentence1": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "sentence2": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}
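
The retrieval task described above can be evaluated, for instance, by embedding both sides and checking whether each `sentence1` retrieves its own `sentence2` as the nearest candidate. The sketch below assumes pair `i` of the queries corresponds to candidate `i`, uses cosine similarity, and works on made-up two-dimensional vectors; the embedding model, the similarity measure, and the metric are all assumptions, since the description does not specify them.

```python
import math
from typing import List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def precision_at_1(
    query_vecs: List[Sequence[float]], candidate_vecs: List[Sequence[float]]
) -> float:
    """Share of queries whose most similar candidate is their own translation."""
    hits = 0
    for i, query in enumerate(query_vecs):
        best = max(
            range(len(candidate_vecs)),
            key=lambda j: cosine(query, candidate_vecs[j]),
        )
        hits += int(best == i)
    return hits / len(query_vecs)


# Hypothetical embeddings: row i of each list stands for pair i.
toy_queries = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.2]]
toy_candidates = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.7]]

score = precision_at_1(toy_queries, toy_candidates)
print(round(score, 3))  # 0.667
```

In this toy data the third query is closest to the first candidate, so two of three queries retrieve the right sentence.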

cvit-mkb-clsr.en-gu

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/cvit-mkb-clsr.en-gu')

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
    variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


CVIT Maan ki Baat Dataset - Given a sentence in language $L_1$ the task is to retrieve its translation
from a set of candidate sentences in language $L_2$.
The dataset contains around 39k parallel sentence pairs across 8 Indian languages.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	6463

Features:

{
    "sentence1": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "sentence2": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

cvit-mkb-clsr.en-hi

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/cvit-mkb-clsr.en-hi')

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
    variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


CVIT Maan ki Baat Dataset - Given a sentence in language $L_1$ the task is to retrieve its translation
from a set of candidate sentences in language $L_2$.
The dataset contains around 39k parallel sentence pairs across 8 Indian languages.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	5169

Features:

{
    "sentence1": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "sentence2": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

cvit-mkb-clsr.en-ml

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/cvit-mkb-clsr.en-ml')

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
    variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


CVIT Maan ki Baat Dataset - Given a sentence in language $L_1$ the task is to retrieve its translation
from a set of candidate sentences in language $L_2$.
The dataset contains around 39k parallel sentence pairs across 8 Indian languages.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	4886

Features:

{
    "sentence1": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "sentence2": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

cvit-mkb-clsr.en-mr

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/cvit-mkb-clsr.en-mr')

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
    variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


CVIT Maan ki Baat Dataset - Given a sentence in language $L_1$ the task is to retrieve its translation
from a set of candidate sentences in language $L_2$.
The dataset contains around 39k parallel sentence pairs across 8 Indian languages.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	5760

Features:

{
    "sentence1": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "sentence2": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

cvit-mkb-clsr.en-or

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/cvit-mkb-clsr.en-or')

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
    variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


CVIT Maan ki Baat Dataset - Given a sentence in language $L_1$ the task is to retrieve its translation
from a set of candidate sentences in language $L_2$.
The dataset contains around 39k parallel sentence pairs across 8 Indian languages.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	752

Features:

{
    "sentence1": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "sentence2": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

cvit-mkb-clsr.en-ta

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/cvit-mkb-clsr.en-ta')

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
    variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


CVIT Maan ki Baat Dataset - Given a sentence in language $L_1$ the task is to retrieve its translation
from a set of candidate sentences in language $L_2$.
The dataset contains around 39k parallel sentence pairs across 8 Indian languages.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	5637

Features:

{
    "sentence1": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "sentence2": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

cvit-mkb-clsr.en-te

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/cvit-mkb-clsr.en-te')

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
    variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


CVIT Maan ki Baat Dataset - Given a sentence in language $L_1$ the task is to retrieve its translation
from a set of candidate sentences in language $L_2$.
The dataset contains around 39k parallel sentence pairs across 8 Indian languages.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	5049

Features:

{
    "sentence1": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "sentence2": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

cvit-mkb-clsr.en-ur

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/cvit-mkb-clsr.en-ur')

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
    variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


CVIT Maan ki Baat Dataset - Given a sentence in language $L_1$ the task is to retrieve its translation
from a set of candidate sentences in language $L_2$.
The dataset contains around 39k parallel sentence pairs across 8 Indian languages.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	1006

Features:

{
    "sentence1": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "sentence2": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

iitp-mr.hi

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/iitp-mr.hi')

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
    variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


IIT Patna Movie Reviews: Sentiment analysis corpus for movie reviews posted in Hindi.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	310
`'train'`	2480
`'validation'`	310

Features:

{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "label": {
        "num_classes": 3,
        "names": [
            "negative",
            "neutral",
            "positive"
        ],
        "names_file": null,
        "id": null,
        "_type": "ClassLabel"
    }
}

iitp-pr.hi

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/iitp-pr.hi')

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
    variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


IIT Patna Product Reviews: Sentiment analysis corpus for product reviews posted in Hindi.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	523
`'train'`	4182
`'validation'`	523

Features:

{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "label": {
        "num_classes": 3,
        "names": [
            "negative",
            "neutral",
            "positive"
        ],
        "names_file": null,
        "id": null,
        "_type": "ClassLabel"
    }
}

actsa-sc.te

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/actsa-sc.te')

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


ACTSA Corpus: Sentiment analysis corpus for Telugu sentences.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	541
`'train'`	4328
`'validation'`	541

Features:

{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "label": {
        "num_classes": 2,
        "names": [
            "positive",
            "negative"
        ],
        "names_file": null,
        "id": null,
        "_type": "ClassLabel"
    }
}
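
Note that the label names here are listed with "positive" first, whereas the iitp-mr.hi and iitp-pr.hi configs list "negative" first, so integer labels are not directly comparable across these configs. The sketch below decodes labels per config, assuming the integer label indexes into the `names` list shown in each feature description. `CONFIG_LABELS` and `label_name` are hypothetical helper names, not part of TFDS.

```python
# Label name lists copied from the feature descriptions of each config.
CONFIG_LABELS = {
    "iitp-mr.hi": ["negative", "neutral", "positive"],
    "iitp-pr.hi": ["negative", "neutral", "positive"],
    "actsa-sc.te": ["positive", "negative"],
}


def label_name(config, label_id):
    """Map an integer label to its name for the given config.

    Assumes label_id is the position of the class in the `names` list.
    """
    return CONFIG_LABELS[config][label_id]


# The same integer means different things in different configs.
print(label_name("actsa-sc.te", 0))
print(label_name("iitp-pr.hi", 0))
```

Comparing predictions by name rather than by integer avoids silently flipping sentiment when moving between the Hindi and Telugu configs.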

md.hi

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/md.hi')

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


The Hindi Discourse Analysis dataset is a corpus for analyzing the discourse modes present in its sentences.
It contains sentences from stories written by 11 famous authors of the 20th century. For each author, 4-5 stories
available in the public domain were selected, resulting in a collection of 53 stories.
Most of these short stories were originally written in Hindi, but some were written in other Indian languages
and later translated into Hindi.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	997
`'train'`	7974
`'validation'`	997

Features:

{
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "discourse_mode": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "story_number": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "id": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    }
}
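
Because every sentence carries a `story_number`, records can be regrouped into their source stories. The sketch below works on plain dictionaries shaped like the features above. It assumes, without confirmation from the dataset documentation, that sorting by `id` gives a sensible sentence order. The helper `group_by_story`, the sample sentences, and the `"<mode>"` placeholder are hypothetical.

```python
from collections import defaultdict


def group_by_story(records):
    """Collect sentences per story_number, ordered by id (assumed ordering)."""
    stories = defaultdict(list)
    for rec in sorted(records, key=lambda r: r["id"]):
        stories[rec["story_number"]].append(rec["sentence"])
    return dict(stories)


# Toy records following the feature schema; values are placeholders.
records = [
    {"sentence": "s2", "discourse_mode": "<mode>", "story_number": 1, "id": 2},
    {"sentence": "s1", "discourse_mode": "<mode>", "story_number": 1, "id": 1},
    {"sentence": "t1", "discourse_mode": "<mode>", "story_number": 2, "id": 3},
]
grouped = group_by_story(records)
```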

wiki-ner.as

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/wiki-ner.as')

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


The WikiANN dataset (Pan et al. 2017) is a dataset with NER annotations for PER, ORG and LOC. It has been constructed using
the linked entities in Wikipedia pages for 282 different languages including Danish.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	160
`'train'`	1021
`'validation'`	157

Features:

{
    "tokens": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "ner_tags": {
        "feature": {
            "num_classes": 7,
            "names": [
                "B-LOC",
                "B-ORG",
                "B-PER",
                "I-LOC",
                "I-ORG",
                "I-PER",
                "O"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "additional_info": {
        "feature": {
            "feature": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "length": -1,
            "id": null,
            "_type": "Sequence"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}
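
The `tokens` and `ner_tags` sequences are meant to be read in parallel, with each tag ID indexing into the seven class names above. The sketch below turns one such record into entity spans under the usual BIO reading of those names. It assumes both sequences have equal length and treats an `I-` tag with no matching open span as outside. The helper `extract_spans` and the sample tokens are hypothetical.

```python
# Class names copied from the ner_tags feature description.
TAG_NAMES = ["B-LOC", "B-ORG", "B-PER", "I-LOC", "I-ORG", "I-PER", "O"]


def extract_spans(tokens, tag_ids):
    """Return (entity_type, text) pairs decoded from BIO tag IDs."""
    spans = []
    current_type, current_tokens = None, []
    for token, tag_id in zip(tokens, tag_ids):
        tag = TAG_NAMES[tag_id]
        if tag.startswith("B-"):
            # A new entity begins; flush any span still open.
            if current_tokens:
                spans.append((current_type, current_tokens))
            current_type, current_tokens = tag[2:], [token]
        elif tag.startswith("I-") and current_type == tag[2:]:
            # Continuation of the currently open entity.
            current_tokens.append(token)
        else:
            # "O" or a stray I- tag: close the open span, if any.
            if current_tokens:
                spans.append((current_type, current_tokens))
            current_type, current_tokens = None, []
    if current_tokens:
        spans.append((current_type, current_tokens))
    return [(etype, " ".join(toks)) for etype, toks in spans]


# Toy record (English tokens used only for readability).
tokens = ["Asha", "Rao", "visited", "Guwahati"]
tag_ids = [2, 5, 6, 0]  # B-PER, I-PER, O, B-LOC
spans = extract_spans(tokens, tag_ids)
```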

wiki-ner.bn

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/wiki-ner.bn')

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


The WikiANN dataset (Pan et al. 2017) is a dataset with NER annotations for PER, ORG and LOC. It has been constructed using
the linked entities in Wikipedia pages for 282 different languages including Danish.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	2690
`'train'`	20223
`'validation'`	2985

Features:

{
    "tokens": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "ner_tags": {
        "feature": {
            "num_classes": 7,
            "names": [
                "B-LOC",
                "B-ORG",
                "B-PER",
                "I-LOC",
                "I-ORG",
                "I-PER",
                "O"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "additional_info": {
        "feature": {
            "feature": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "length": -1,
            "id": null,
            "_type": "Sequence"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}
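
With a split of this size it can be useful to inspect how often each tag occurs before training. The following sketch counts tag names over a list of records shaped like the features above, assuming each tag ID indexes into the `names` list. The helper `tag_counts` and the toy records are hypothetical and do not reflect actual dataset statistics.

```python
from collections import Counter

# Class names copied from the ner_tags feature description.
TAG_NAMES = ["B-LOC", "B-ORG", "B-PER", "I-LOC", "I-ORG", "I-PER", "O"]


def tag_counts(examples):
    """Count occurrences of each tag name across all examples."""
    counts = Counter()
    for ex in examples:
        counts.update(TAG_NAMES[i] for i in ex["ner_tags"])
    return counts


# Toy records; tokens are placeholders.
toy = [
    {"tokens": ["a", "b", "c"], "ner_tags": [2, 5, 6]},
    {"tokens": ["d", "e"], "ner_tags": [6, 0]},
]
counts = tag_counts(toy)
```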

wiki-ner.gu

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/wiki-ner.gu')

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


The WikiANN dataset (Pan et al. 2017) is a dataset with NER annotations for PER, ORG and LOC. It has been constructed using
the linked entities in Wikipedia pages for 282 different languages including Danish.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	255
`'train'`	2343
`'validation'`	297

Features:

{
    "tokens": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "ner_tags": {
        "feature": {
            "num_classes": 7,
            "names": [
                "B-LOC",
                "B-ORG",
                "B-PER",
                "I-LOC",
                "I-ORG",
                "I-PER",
                "O"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "additional_info": {
        "feature": {
            "feature": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "length": -1,
            "id": null,
            "_type": "Sequence"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

wiki-ner.hi

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/wiki-ner.hi')

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


The WikiANN dataset (Pan et al. 2017) is a dataset with NER annotations for PER, ORG and LOC. It has been constructed using
the linked entities in Wikipedia pages for 282 different languages including Danish.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	1256
`'train'`	9463
`'validation'`	1114

Features:

{
    "tokens": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "ner_tags": {
        "feature": {
            "num_classes": 7,
            "names": [
                "B-LOC",
                "B-ORG",
                "B-PER",
                "I-LOC",
                "I-ORG",
                "I-PER",
                "O"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "additional_info": {
        "feature": {
            "feature": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "length": -1,
            "id": null,
            "_type": "Sequence"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

wiki-ner.kn

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/wiki-ner.kn')

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


The WikiANN dataset (Pan et al. 2017) is a dataset with NER annotations for PER, ORG and LOC. It has been constructed using
the linked entities in Wikipedia pages for 282 different languages including Danish.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	476
`'train'`	2679
`'validation'`	412

Features:

{
    "tokens": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "ner_tags": {
        "feature": {
            "num_classes": 7,
            "names": [
                "B-LOC",
                "B-ORG",
                "B-PER",
                "I-LOC",
                "I-ORG",
                "I-PER",
                "O"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "additional_info": {
        "feature": {
            "feature": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "length": -1,
            "id": null,
            "_type": "Sequence"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

wiki-ner.ml

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/wiki-ner.ml')

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


The WikiANN dataset (Pan et al. 2017) is a dataset with NER annotations for PER, ORG and LOC. It has been constructed using
the linked entities in Wikipedia pages for 282 different languages including Danish.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	2042
`'train'`	15620
`'validation'`	2067

Features:

{
    "tokens": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "ner_tags": {
        "feature": {
            "num_classes": 7,
            "names": [
                "B-LOC",
                "B-ORG",
                "B-PER",
                "I-LOC",
                "I-ORG",
                "I-PER",
                "O"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "additional_info": {
        "feature": {
            "feature": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "length": -1,
            "id": null,
            "_type": "Sequence"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

wiki-ner.mr

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/wiki-ner.mr')

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


The WikiANN dataset (Pan et al. 2017) is a dataset with NER annotations for PER, ORG and LOC. It has been constructed using
the linked entities in Wikipedia pages for 282 different languages including Danish.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	1329
`'train'`	12151
`'validation'`	1498

Features:

{
    "tokens": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "ner_tags": {
        "feature": {
            "num_classes": 7,
            "names": [
                "B-LOC",
                "B-ORG",
                "B-PER",
                "I-LOC",
                "I-ORG",
                "I-PER",
                "O"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "additional_info": {
        "feature": {
            "feature": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "length": -1,
            "id": null,
            "_type": "Sequence"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

wiki-ner.or

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/wiki-ner.or')

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


The WikiANN dataset (Pan et al. 2017) is a dataset with NER annotations for PER, ORG and LOC. It has been constructed using
the linked entities in Wikipedia pages for 282 different languages including Danish.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	153
`'train'`	1077
`'validation'`	132

Features:

{
    "tokens": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "ner_tags": {
        "feature": {
            "num_classes": 7,
            "names": [
                "B-LOC",
                "B-ORG",
                "B-PER",
                "I-LOC",
                "I-ORG",
                "I-PER",
                "O"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "additional_info": {
        "feature": {
            "feature": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "length": -1,
            "id": null,
            "_type": "Sequence"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

wiki-ner.pa

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/wiki-ner.pa')

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


The WikiANN dataset (Pan et al. 2017) is a dataset with NER annotations for PER, ORG and LOC. It has been constructed using
the linked entities in Wikipedia pages for 282 different languages including Danish.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	179
`'train'`	1408
`'validation'`	186

Features:

{
    "tokens": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "ner_tags": {
        "feature": {
            "num_classes": 7,
            "names": [
                "B-LOC",
                "B-ORG",
                "B-PER",
                "I-LOC",
                "I-ORG",
                "I-PER",
                "O"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "additional_info": {
        "feature": {
            "feature": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "length": -1,
            "id": null,
            "_type": "Sequence"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

wiki-ner.ta

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/wiki-ner.ta')

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


The WikiANN dataset (Pan et al. 2017) is a dataset with NER annotations for PER, ORG and LOC. It has been constructed using
the linked entities in Wikipedia pages for 282 different languages including Danish.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	2611
`'train'`	20466
`'validation'`	2586

Features:

{
    "tokens": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "ner_tags": {
        "feature": {
            "num_classes": 7,
            "names": [
                "B-LOC",
                "B-ORG",
                "B-PER",
                "I-LOC",
                "I-ORG",
                "I-PER",
                "O"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "additional_info": {
        "feature": {
            "feature": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "length": -1,
            "id": null,
            "_type": "Sequence"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

wiki-ner.te

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:indic_glue/wiki-ner.te')

Description:

IndicGLUE is a natural language understanding benchmark for Indian languages. It contains a wide
variety of tasks and covers 11 major Indian languages - as, bn, gu, hi, kn, ml, mr, or, pa, ta, te.


The WikiANN dataset (Pan et al. 2017) is a dataset with NER annotations for PER, ORG and LOC. It has been constructed using
the linked entities in Wikipedia pages for 282 different languages including Danish.

License: No known license
Version: 1.0.0
Splits:

Split	Examples
`'test'`	1110
`'train'`	7978
`'validation'`	841

Features:

{
    "tokens": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "ner_tags": {
        "feature": {
            "num_classes": 7,
            "names": [
                "B-LOC",
                "B-ORG",
                "B-PER",
                "I-LOC",
                "I-ORG",
                "I-PER",
                "O"
            ],
            "names_file": null,
            "id": null,
            "_type": "ClassLabel"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "additional_info": {
        "feature": {
            "feature": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "length": -1,
            "id": null,
            "_type": "Sequence"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}