wikihow

References:

all

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wikihow/all')
  • Description:
WikiHow is a new large-scale dataset using the online WikiHow
(http://www.wikihow.com/) knowledge base.

There are two main features:
  - text: the WikiHow answer text.
  - headline: the bold lines, used as the summary.

There are two separate versions:
  - all: consisting of the concatenation of all paragraphs as the articles and
         the bold lines as the reference summaries.
  - sep: consisting of each paragraph and its summary.
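
The relationship between the two versions can be sketched as below. This is an illustration only, not the authors' preprocessing: the toy records, the `build_all_from_sep` helper, and the choice of joining paragraphs with a newline are all assumptions made for the example.

```python
from collections import OrderedDict


def build_all_from_sep(sep_records):
    """Group 'sep'-style records by title into 'all'-style records.

    Hypothetical helper: the newline separator is an assumption.
    """
    grouped = OrderedDict()
    for rec in sep_records:
        entry = grouped.setdefault(rec["title"], {"text": [], "headline": []})
        entry["text"].append(rec["text"])
        entry["headline"].append(rec["headline"])
    return [
        {
            "title": title,
            # Article: concatenation of all paragraphs.
            "text": "\n".join(parts["text"]),
            # Reference summary: concatenation of the bold lines.
            "headline": "\n".join(parts["headline"]),
        }
        for title, parts in grouped.items()
    ]


# Toy records, made up for illustration.
sep_records = [
    {"title": "How to Boil an Egg", "headline": "Fill a pot with water.",
     "text": "Use enough water to cover the egg."},
    {"title": "How to Boil an Egg", "headline": "Boil for ten minutes.",
     "text": "Start the timer once the water bubbles."},
    {"title": "How to Fold a Shirt", "headline": "Lay the shirt flat.",
     "text": "Smooth out any wrinkles first."},
]

all_records = build_all_from_sep(sep_records)
```

Each `sep` record pairs one paragraph with its bold line, while each `all` record pairs a whole article with all of its bold lines.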

Download "wikihowAll.csv" and "wikihowSep.csv" from
https://github.com/mahnazkoupaee/WikiHow-Dataset and place them in the manual folder
(see https://www.tensorflow.org/datasets/api_docs/python/tfds/download/DownloadConfig).
Train/validation/test splits are provided by the authors.
Preprocessing is applied to remove short articles
(abstract length < 0.75 article length) and clean up extra commas.
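
A minimal sketch of the manual-download step, assuming the two CSV files have already been downloaded by hand. The helper names `missing_manual_files` and `load_wikihow` are hypothetical, and it is an assumption that the `huggingface:` wrapper honors `DownloadConfig.manual_dir`; check the linked DownloadConfig page before relying on it.

```python
import os

# File names come from the dataset description.
REQUIRED_FILES = ("wikihowAll.csv", "wikihowSep.csv")


def missing_manual_files(manual_dir):
    """Return the required CSV files that are not present in manual_dir."""
    return [
        name for name in REQUIRED_FILES
        if not os.path.isfile(os.path.join(manual_dir, name))
    ]


def load_wikihow(config, manual_dir):
    """Load one config ('all' or 'sep'), pointing TFDS at the manual folder.

    Assumption: the 'huggingface:' wrapper reads files from manual_dir.
    """
    import tensorflow_datasets as tfds  # imported lazily; only needed here

    missing = missing_manual_files(manual_dir)
    if missing:
        raise FileNotFoundError(f"Missing in {manual_dir}: {missing}")

    download_config = tfds.download.DownloadConfig(manual_dir=manual_dir)
    return tfds.load(
        f"huggingface:wikihow/{config}",
        download_and_prepare_kwargs={"download_config": download_config},
    )
```

Checking for the files before calling `tfds.load` gives a clearer error than a failure partway through dataset preparation.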
  • License: No known license
  • Version: 1.2.0
  • Splits:
Split    Examples
'test' 5577
'train' 157252
'validation' 5599
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "headline": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "title": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}
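
As a rough illustration of how the schema above could be used, the following sketch checks that a record carries every listed field as a string. The `schema_errors` helper and the sample record are hypothetical; records read through TFDS may arrive as bytes and would need decoding before a check like this.

```python
import json

# Feature schema of the 'all' config, copied from the listing above.
SCHEMA_JSON = """
{
    "text": {"dtype": "string", "id": null, "_type": "Value"},
    "headline": {"dtype": "string", "id": null, "_type": "Value"},
    "title": {"dtype": "string", "id": null, "_type": "Value"}
}
"""

schema = json.loads(SCHEMA_JSON)


def schema_errors(record, schema):
    """Return a list of problems found when checking record against schema."""
    errors = []
    for name, spec in schema.items():
        if name not in record:
            errors.append(f"missing field: {name}")
        elif spec["dtype"] == "string" and not isinstance(record[name], str):
            errors.append(f"field {name} is not a string")
    return errors


# Made-up record for illustration.
record = {
    "title": "How to Boil an Egg",
    "headline": "Fill a pot with water.",
    "text": "Use enough water to cover the egg.",
}
print(schema_errors(record, schema))
```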

sep

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wikihow/sep')
  • Description:
WikiHow is a new large-scale dataset using the online WikiHow
(http://www.wikihow.com/) knowledge base.

There are two main features:
  - text: the WikiHow answer text.
  - headline: the bold lines, used as the summary.

There are two separate versions:
  - all: consisting of the concatenation of all paragraphs as the articles and
         the bold lines as the reference summaries.
  - sep: consisting of each paragraph and its summary.

Download "wikihowAll.csv" and "wikihowSep.csv" from
https://github.com/mahnazkoupaee/WikiHow-Dataset and place them in the manual folder
(see https://www.tensorflow.org/datasets/api_docs/python/tfds/download/DownloadConfig).
Train/validation/test splits are provided by the authors.
Preprocessing is applied to remove short articles
(abstract length < 0.75 article length) and clean up extra commas.
  • License: No known license
  • Version: 1.2.0
  • Splits:
Split    Examples
'test' 37800
'train' 1060732
'validation' 37932
  • Features:
{
    "text": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "headline": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "title": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "overview": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "sectionLabel": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}