tmu_gfm_dataset

Use the following command to load this dataset in TFDS:

import tensorflow_datasets as tfds
ds = tfds.load('huggingface:tmu_gfm_dataset')
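
To work with a single split rather than the full dictionary of splits, the same call can take a `split` argument. The sketch below wraps it in a hypothetical helper, `load_train_split`. Its `load_fn` parameter is an assumption added here so that the call can be exercised without downloading anything; by default it falls back to `tfds.load`.

```python
def load_train_split(load_fn=None):
    """Load the 'train' split of tmu_gfm_dataset.

    load_fn is a hypothetical injection point (not part of TFDS); when it is
    None, the real tfds.load is imported and used.
    """
    if load_fn is None:
        import tensorflow_datasets as tfds  # imported lazily; requires TFDS
        load_fn = tfds.load
    # Same dataset name as in the command above, restricted to one split.
    return load_fn('huggingface:tmu_gfm_dataset', split='train')
```

Calling `load_train_split()` with no arguments requires `tensorflow_datasets` to be installed.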
  • Description
A dataset for grammatical error correction (GEC) metrics, with manual evaluations of grammaticality, fluency, and meaning preservation for system outputs. More details about the creation of the dataset can be found in Yoshimura et al. (2020).
  • License: No known license
  • Version: 1.1.0
  • Splits:
Split      Examples
'train'    4221
  • Features
{
    "source": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "output": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "grammer": {
        "feature": {
            "dtype": "int32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "fluency": {
        "feature": {
            "dtype": "int32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "meaning": {
        "feature": {
            "dtype": "int32",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "system": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "ave_g": {
        "dtype": "float32",
        "id": null,
        "_type": "Value"
    },
    "ave_f": {
        "dtype": "float32",
        "id": null,
        "_type": "Value"
    },
    "ave_m": {
        "dtype": "float32",
        "id": null,
        "_type": "Value"
    }
}
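
The schema above suggests that each record carries integer rating lists in `grammer`, `fluency`, and `meaning`, plus float summaries in `ave_g`, `ave_f`, and `ave_m`. The following sketch assumes, without confirmation from the source, that each `ave_*` field is the arithmetic mean of the matching rating list, and recomputes those averages for a hand-written record. The record values, the number of ratings per list, and the helper name `mean_scores` are hypothetical and serve only as illustration.

```python
from statistics import mean

# Hypothetical record shaped like the feature schema; all values are made up.
record = {
    "source": "He go to school yesterday.",
    "output": "He went to school yesterday.",
    "grammer": [4, 4, 3, 4, 4],  # key spelled as in the dataset schema
    "fluency": [3, 4, 4, 4, 3],
    "meaning": [4, 4, 4, 4, 4],
    "system": "example_system",
    "ave_g": 3.8,
    "ave_f": 3.6,
    "ave_m": 4.0,
}

# Map each rating list to the field assumed to hold its average.
RATING_TO_AVERAGE = {"grammer": "ave_g", "fluency": "ave_f", "meaning": "ave_m"}


def mean_scores(rec):
    """Return the arithmetic mean of each rating list, keyed by its ave_* name."""
    return {
        avg_key: mean(rec[list_key])
        for list_key, avg_key in RATING_TO_AVERAGE.items()
    }


recomputed = mean_scores(record)
for key, value in recomputed.items():
    # Compare with a tolerance because the stored averages are float32.
    print(key, value, abs(value - record[key]) < 1e-6)
```

If the assumption about `ave_*` does not hold for the real data, the comparison in the loop would report `False` for the affected fields, which makes this a quick consistency check rather than a statement about how the averages were produced.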