References:
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:cmrc2018')
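A minimal sketch of loading the dataset and inspecting its structure, assuming `tensorflow_datasets` is installed with support for the `huggingface:` community namespace:

```python
import tensorflow_datasets as tfds

# Load all splits as a dict of tf.data.Dataset objects
# (assumes the Hugging Face community catalog is available in this TFDS install).
ds = tfds.load('huggingface:cmrc2018')

train_ds = ds['train']
print(train_ds.element_spec)  # inspect the feature structure of one example
```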
- Description:
A span-extraction dataset for Chinese machine reading comprehension, intended to add
linguistic diversity to this area. The dataset is composed of nearly 20,000 real questions
annotated on Wikipedia paragraphs by human experts. We also annotated a challenge set,
which contains questions that require comprehensive understanding and multi-sentence
inference throughout the context.
- License: No known license
- Version: 0.1.0
- Splits:
Split | Examples |
---|---|
'test' | 1002 |
'train' | 10142 |
'validation' | 3219 |
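A single split can also be requested by name using the split names from the table above; a minimal sketch:

```python
import tensorflow_datasets as tfds

# Load only the training split (10142 examples per the table above).
train_ds = tfds.load('huggingface:cmrc2018', split='train')
```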
- Features:
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"context": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"question": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"answers": {
"feature": {
"text": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"answer_start": {
"dtype": "int32",
"id": null,
"_type": "Value"
}
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
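A short sketch of reading one example, assuming the nested keys match the feature dictionary above (`answers` is a Sequence, so `text` and `answer_start` come back as tensors of values per example):

```python
import tensorflow_datasets as tfds

ds = tfds.load('huggingface:cmrc2018', split='validation')

for example in ds.take(1):
    # 'context' and 'question' are string scalars; decode from bytes to str.
    context = example['context'].numpy().decode('utf-8')
    question = example['question'].numpy().decode('utf-8')
    # 'answers' is a Sequence feature: parallel lists of answer text and
    # int32 character offsets into the context.
    answer_texts = [t.decode('utf-8') for t in example['answers']['text'].numpy()]
    answer_starts = example['answers']['answer_start'].numpy()
    print(question, answer_texts, answer_starts)
```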