¡TFDS ahora admite el formato Croissant 🥐 ! Lea la documentación para saber más.

Esta página se ha traducido con Cloud Translation API.

wiki_dpr

Referencias:

psgs_w100.nq.exact

Utilice el siguiente comando para cargar este conjunto de datos en TFDS:

ds = tfds.load('huggingface:wiki_dpr/psgs_w100.nq.exact')

Descripción :

This is the wikipedia split used to evaluate the Dense Passage Retrieval (DPR) model.
It contains 21M passages from wikipedia along with their DPR embeddings.
The wikipedia articles were split into multiple, disjoint text blocks of 100 words as passages.

Licencia : Ninguna licencia conocida
Versión : 0.0.0
Divisiones :

Dividir	Ejemplos
`'train'`	21015300

Características :

{
    "id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "title": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "embeddings": {
        "feature": {
            "dtype": "float32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

psgs_w100.nq.comprimido

Utilice el siguiente comando para cargar este conjunto de datos en TFDS:

ds = tfds.load('huggingface:wiki_dpr/psgs_w100.nq.compressed')

Descripción :

This is the wikipedia split used to evaluate the Dense Passage Retrieval (DPR) model.
It contains 21M passages from wikipedia along with their DPR embeddings.
The wikipedia articles were split into multiple, disjoint text blocks of 100 words as passages.

Licencia : Ninguna licencia conocida
Versión : 0.0.0
Divisiones :

Dividir	Ejemplos
`'train'`	21015300

Características :

{
    "id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "title": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "embeddings": {
        "feature": {
            "dtype": "float32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

psgs_w100.nq.no_index

Utilice el siguiente comando para cargar este conjunto de datos en TFDS:

ds = tfds.load('huggingface:wiki_dpr/psgs_w100.nq.no_index')

Descripción :

This is the wikipedia split used to evaluate the Dense Passage Retrieval (DPR) model.
It contains 21M passages from wikipedia along with their DPR embeddings.
The wikipedia articles were split into multiple, disjoint text blocks of 100 words as passages.

Licencia : Ninguna licencia conocida
Versión : 0.0.0
Divisiones :

Dividir	Ejemplos
`'train'`	21015300

Características :

{
    "id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "title": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "embeddings": {
        "feature": {
            "dtype": "float32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

psgs_w100.multiset.exact

Utilice el siguiente comando para cargar este conjunto de datos en TFDS:

ds = tfds.load('huggingface:wiki_dpr/psgs_w100.multiset.exact')

Descripción :

This is the wikipedia split used to evaluate the Dense Passage Retrieval (DPR) model.
It contains 21M passages from wikipedia along with their DPR embeddings.
The wikipedia articles were split into multiple, disjoint text blocks of 100 words as passages.

Licencia : Ninguna licencia conocida
Versión : 0.0.0
Divisiones :

Dividir	Ejemplos
`'train'`	21015300

Características :

{
    "id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "title": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "embeddings": {
        "feature": {
            "dtype": "float32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

psgs_w100.multiset.comprimido

Utilice el siguiente comando para cargar este conjunto de datos en TFDS:

ds = tfds.load('huggingface:wiki_dpr/psgs_w100.multiset.compressed')

Descripción :

This is the wikipedia split used to evaluate the Dense Passage Retrieval (DPR) model.
It contains 21M passages from wikipedia along with their DPR embeddings.
The wikipedia articles were split into multiple, disjoint text blocks of 100 words as passages.

Licencia : Ninguna licencia conocida
Versión : 0.0.0
Divisiones :

Dividir	Ejemplos
`'train'`	21015300

Características :

{
    "id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "title": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "embeddings": {
        "feature": {
            "dtype": "float32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

psgs_w100.multiset.no_index

Utilice el siguiente comando para cargar este conjunto de datos en TFDS:

ds = tfds.load('huggingface:wiki_dpr/psgs_w100.multiset.no_index')

Descripción :

This is the wikipedia split used to evaluate the Dense Passage Retrieval (DPR) model.
It contains 21M passages from wikipedia along with their DPR embeddings.
The wikipedia articles were split into multiple, disjoint text blocks of 100 words as passages.

Licencia : Ninguna licencia conocida
Versión : 0.0.0
Divisiones :

Dividir	Ejemplos
`'train'`	21015300

Características :

{
    "id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "title": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "embeddings": {
        "feature": {
            "dtype": "float32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}