psgs_w100.nq.exact
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:wiki_dpr/psgs_w100.nq.exact')
- Description:
This is the Wikipedia split used to evaluate the Dense Passage Retrieval (DPR) model.
It contains 21M passages from Wikipedia along with their DPR embeddings.
The Wikipedia articles were split into multiple, disjoint text blocks of 100 words, which serve as the passages.
- License: No known license
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'train' | 21015300 |
- Features:
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"text": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"title": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"embeddings": {
"feature": {
"dtype": "float32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
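The `embeddings` feature holds one float32 vector per passage (`"length": -1` in the schema simply means the length is not declared). DPR ranks passages by the inner product between a query embedding and these passage embeddings. A minimal NumPy sketch of that scoring step — the 768-dimensional size and the random vectors are illustrative assumptions, not values taken from the dataset:

```python
import numpy as np

# Hypothetical stand-ins for real DPR outputs: in practice the passage
# vectors come from this dataset's "embeddings" feature and the query
# vector from a DPR question encoder.
rng = np.random.default_rng(0)
passage_embeddings = rng.standard_normal((5, 768)).astype(np.float32)  # 5 passages
# Make the query a slightly perturbed copy of passage 3, so it should win.
query_embedding = passage_embeddings[3] + 0.01 * rng.standard_normal(768).astype(np.float32)

# DPR retrieval scores passages by maximum inner product with the query.
scores = passage_embeddings @ query_embedding
best = int(np.argmax(scores))
print(best)  # index of the passage whose embedding best matches the query
```

With the real dataset you would score a query against the stored passage embeddings in the same way, rather than against random vectors as in this sketch.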
psgs_w100.nq.compressed
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:wiki_dpr/psgs_w100.nq.compressed')
- Description:
This is the Wikipedia split used to evaluate the Dense Passage Retrieval (DPR) model.
It contains 21M passages from Wikipedia along with their DPR embeddings.
The Wikipedia articles were split into multiple, disjoint text blocks of 100 words, which serve as the passages.
- License: No known license
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'train' | 21015300 |
- Features:
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"text": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"title": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"embeddings": {
"feature": {
"dtype": "float32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
psgs_w100.nq.no_index
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:wiki_dpr/psgs_w100.nq.no_index')
- Description:
This is the Wikipedia split used to evaluate the Dense Passage Retrieval (DPR) model.
It contains 21M passages from Wikipedia along with their DPR embeddings.
The Wikipedia articles were split into multiple, disjoint text blocks of 100 words, which serve as the passages.
- License: No known license
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'train' | 21015300 |
- Features:
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"text": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"title": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"embeddings": {
"feature": {
"dtype": "float32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
psgs_w100.multiset.exact
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:wiki_dpr/psgs_w100.multiset.exact')
- Description:
This is the Wikipedia split used to evaluate the Dense Passage Retrieval (DPR) model.
It contains 21M passages from Wikipedia along with their DPR embeddings.
The Wikipedia articles were split into multiple, disjoint text blocks of 100 words, which serve as the passages.
- License: No known license
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'train' | 21015300 |
- Features:
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"text": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"title": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"embeddings": {
"feature": {
"dtype": "float32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
psgs_w100.multiset.compressed
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:wiki_dpr/psgs_w100.multiset.compressed')
- Description:
This is the Wikipedia split used to evaluate the Dense Passage Retrieval (DPR) model.
It contains 21M passages from Wikipedia along with their DPR embeddings.
The Wikipedia articles were split into multiple, disjoint text blocks of 100 words, which serve as the passages.
- License: No known license
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'train' | 21015300 |
- Features:
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"text": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"title": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"embeddings": {
"feature": {
"dtype": "float32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
psgs_w100.multiset.no_index
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:wiki_dpr/psgs_w100.multiset.no_index')
- Description:
This is the Wikipedia split used to evaluate the Dense Passage Retrieval (DPR) model.
It contains 21M passages from Wikipedia along with their DPR embeddings.
The Wikipedia articles were split into multiple, disjoint text blocks of 100 words, which serve as the passages.
- License: No known license
- Version: 0.0.0
- Splits:
Split | Examples |
---|---|
'train' | 21015300 |
- Features:
{
"id": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"text": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"title": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"embeddings": {
"feature": {
"dtype": "float32",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}