wiki_dpr

References:

psgs_w100.nq.exact

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wiki_dpr/psgs_w100.nq.exact')
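The configuration names on this page combine one of two first tags (`nq`, `multiset`) with one of three index tags (`exact`, `compressed`, `no_index`). As a minimal sketch, the TFDS names can be generated instead of typed by hand. The helper `wiki_dpr_name` and the constant names are hypothetical, and the commented-out `tfds.load` call assumes `tensorflow_datasets` is installed and the dataset is reachable.

```python
from itertools import product

# Tags copied from the configuration names listed on this page.
SOURCE_TAGS = ("nq", "multiset")
INDEX_TAGS = ("exact", "compressed", "no_index")


def wiki_dpr_name(source_tag: str, index_tag: str) -> str:
    """Build the TFDS name of one wiki_dpr configuration (hypothetical helper)."""
    if source_tag not in SOURCE_TAGS or index_tag not in INDEX_TAGS:
        raise ValueError(f"unknown configuration: {source_tag}.{index_tag}")
    return f"huggingface:wiki_dpr/psgs_w100.{source_tag}.{index_tag}"


ALL_NAMES = [wiki_dpr_name(s, i) for s, i in product(SOURCE_TAGS, INDEX_TAGS)]

# Usage (not executed here because it downloads data):
#   import tensorflow_datasets as tfds
#   ds = tfds.load(wiki_dpr_name("nq", "exact"), split="train")
```

Validating the tags up front means a typo fails immediately rather than after a failed download attempt.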

Description:

This is the Wikipedia split used to evaluate the Dense Passage Retrieval (DPR) model.
It contains 21M passages from Wikipedia along with their DPR embeddings.
The Wikipedia articles were split into multiple, disjoint text blocks of 100 words as passages.
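The splitting into disjoint 100-word blocks can be illustrated with a short sketch. It assumes plain whitespace tokenization, which may differ from the preprocessing actually used to build the dataset, and the function name `split_into_passages` is hypothetical.

```python
def split_into_passages(text: str, words_per_passage: int = 100) -> list[str]:
    """Split text into consecutive, non-overlapping blocks of at most N words."""
    if words_per_passage <= 0:
        raise ValueError("words_per_passage must be positive")
    words = text.split()  # assumption: words are separated by whitespace
    return [
        " ".join(words[start:start + words_per_passage])
        for start in range(0, len(words), words_per_passage)
    ]


# A synthetic 250-word "article" yields two full passages and one remainder.
article = " ".join(f"w{i}" for i in range(250))
passages = split_into_passages(article)
print([len(p.split()) for p in passages])  # [100, 100, 50]
```

Because the blocks do not overlap, joining the passages with spaces gives back the original word sequence.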

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'train'`	21015300
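For capacity planning, the split size gives a rough lower bound on the memory needed to hold every embedding at once. The sketch below assumes 768-dimensional vectors. That dimension is not stated on this page (the schema lists the sequence length as -1), so adjust it if your data differs.

```python
NUM_PASSAGES = 21_015_300  # size of the 'train' split listed above
EMBEDDING_DIM = 768        # assumption: not stated in the feature schema
BYTES_PER_FLOAT32 = 4      # the schema lists float32 values

total_bytes = NUM_PASSAGES * EMBEDDING_DIM * BYTES_PER_FLOAT32
total_gib = round(total_bytes / 2**30, 1)

print(total_bytes)  # 64559001600
print(total_gib)    # 60.1
```

Under that assumption the raw vectors alone take about 60 GiB, before counting the passage text and titles.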

Features:

{
    "id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "title": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "embeddings": {
        "feature": {
            "dtype": "float32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}
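As a minimal sketch of how the schema above maps to Python values, the check below treats a record as a plain dictionary with three strings and a variable-length list of numbers. The function name `validate_record` and the sample record are hypothetical. Records returned by TFDS may arrive as tensors or byte strings and would need converting first.

```python
import math


def validate_record(record: dict) -> None:
    """Raise if the record does not match the feature schema listed above."""
    for key in ("id", "text", "title"):
        if not isinstance(record.get(key), str):
            raise TypeError(f"{key!r} must be a string")
    embeddings = record.get("embeddings")
    if not isinstance(embeddings, (list, tuple)):
        raise TypeError("'embeddings' must be a sequence of numbers")
    for value in embeddings:
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise TypeError("'embeddings' must contain only numbers")
        if not math.isfinite(value):
            raise ValueError("'embeddings' contains a non-finite value")


# Hypothetical record; the values are made up for illustration.
example = {
    "id": "1",
    "text": "Some passage text.",
    "title": "Some title",
    "embeddings": [0.1, -0.2, 0.3],
}
validate_record(example)  # passes silently
```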

psgs_w100.nq.compressed

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wiki_dpr/psgs_w100.nq.compressed')

Description:

This is the Wikipedia split used to evaluate the Dense Passage Retrieval (DPR) model.
It contains 21M passages from Wikipedia along with their DPR embeddings.
The Wikipedia articles were split into multiple, disjoint text blocks of 100 words as passages.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'train'`	21015300

Features:

{
    "id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "title": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "embeddings": {
        "feature": {
            "dtype": "float32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

psgs_w100.nq.no_index

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wiki_dpr/psgs_w100.nq.no_index')

Description:

This is the Wikipedia split used to evaluate the Dense Passage Retrieval (DPR) model.
It contains 21M passages from Wikipedia along with their DPR embeddings.
The Wikipedia articles were split into multiple, disjoint text blocks of 100 words as passages.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'train'`	21015300

Features:

{
    "id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "title": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "embeddings": {
        "feature": {
            "dtype": "float32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}
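The name `no_index` suggests that this configuration comes without a prebuilt search index, although this page does not say so. DPR ranks passages by the inner product between a question embedding and each passage embedding, so a brute-force search over a small sample can be sketched as follows. The toy two-dimensional vectors and the function names are hypothetical, and a scan like this would be far too slow for all 21M passages.

```python
import heapq


def inner_product(a, b):
    """Inner product of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


def top_k(query, records, k=2):
    """Return the k (score, id) pairs with the largest inner product."""
    scored = ((inner_product(query, r["embeddings"]), r["id"]) for r in records)
    return heapq.nlargest(k, scored)


# Toy records that reuse the "id" and "embeddings" fields of the schema above.
records = [
    {"id": "a", "embeddings": [1.0, 0.0]},
    {"id": "b", "embeddings": [0.0, 1.0]},
    {"id": "c", "embeddings": [1.0, 1.0]},
]
results = top_k([2.0, 1.0], records, k=2)
print(results)  # [(3.0, 'c'), (2.0, 'a')]
```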

psgs_w100.multiset.exact

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wiki_dpr/psgs_w100.multiset.exact')

Description:

This is the Wikipedia split used to evaluate the Dense Passage Retrieval (DPR) model.
It contains 21M passages from Wikipedia along with their DPR embeddings.
The Wikipedia articles were split into multiple, disjoint text blocks of 100 words as passages.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'train'`	21015300

Features:

{
    "id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "title": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "embeddings": {
        "feature": {
            "dtype": "float32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}
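Because each article was split into several passages, records that share a `title` presumably come from the same article. That is an assumption, since this page does not say whether titles are unique. Under it, a small sample of records can be regrouped per article as sketched below; the function name `group_by_title` and the sample records are hypothetical.

```python
from collections import defaultdict


def group_by_title(records):
    """Collect passage texts per title, keeping the order in which they appear."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["title"]].append(record["text"])
    return dict(grouped)


# Hypothetical records with only the fields this sketch needs.
sample = [
    {"id": "1", "title": "A", "text": "p1"},
    {"id": "2", "title": "A", "text": "p2"},
    {"id": "3", "title": "B", "text": "p3"},
]
grouped = group_by_title(sample)
print(grouped)  # {'A': ['p1', 'p2'], 'B': ['p3']}
```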

psgs_w100.multiset.compressed

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wiki_dpr/psgs_w100.multiset.compressed')

Description:

This is the Wikipedia split used to evaluate the Dense Passage Retrieval (DPR) model.
It contains 21M passages from Wikipedia along with their DPR embeddings.
The Wikipedia articles were split into multiple, disjoint text blocks of 100 words as passages.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'train'`	21015300

Features:

{
    "id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "title": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "embeddings": {
        "feature": {
            "dtype": "float32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

psgs_w100.multiset.no_index

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wiki_dpr/psgs_w100.multiset.no_index')

Description:

This is the Wikipedia split used to evaluate the Dense Passage Retrieval (DPR) model.
It contains 21M passages from Wikipedia along with their DPR embeddings.
The Wikipedia articles were split into multiple, disjoint text blocks of 100 words as passages.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'train'`	21015300

Features:

{
    "id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "title": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "embeddings": {
        "feature": {
            "dtype": "float32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}