wiki_lingua

References:

Arabic

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wiki_lingua/arabic')
  • Description:
WikiLingua is a large-scale multilingual dataset for the evaluation of
crosslingual abstractive summarization systems. The dataset includes ~770k
article and summary pairs in 18 languages from WikiHow. The gold-standard
article-summary alignments across languages were produced by aligning the images
that are used to describe each how-to step in an article.
  • License: CC BY-NC-SA 3.0
  • Version: 1.1.1
  • Splits:
Split    Examples
'train'  9995
  • Features:
{
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "article": {
        "feature": {
            "section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "document": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "summary": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_url": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}
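
The schema above stores each article as a Sequence of per-section fields. A minimal sketch of how one might turn a single decoded example into one record per section follows. It assumes the sequence arrives as a dict of equal-length lists (one list per field) holding plain Python strings; with TFDS you would likely need to convert tensors and decode bytes first. The function name `iter_sections` and the sample values are illustrative and not part of the dataset.

```python
from typing import Any, Dict, Iterator


def iter_sections(example: Dict[str, Any]) -> Iterator[Dict[str, str]]:
    """Yield one flat record per how-to section of a single example.

    Assumption: example["article"] is a dict of parallel lists, one list
    per field of the Sequence feature shown in the schema.
    """
    article = example["article"]
    fields = list(article.keys())
    # zip walks the parallel lists in lockstep, one section at a time.
    for values in zip(*(article[name] for name in fields)):
        record = dict(zip(fields, values))
        record["url"] = example["url"]  # carry the article URL along
        yield record


# Hand-written stand-in for one decoded example (not real dataset content).
sample = {
    "url": "https://example.org/how-to",
    "article": {
        "section_name": ["Step A", "Step B"],
        "document": ["long text A", "long text B"],
        "summary": ["short A", "short B"],
        "english_url": ["https://example.org/en", "https://example.org/en"],
        "english_section_name": ["Part A", "Part B"],
    },
}

sections = list(iter_sections(sample))
for section in sections:
    print(section["section_name"], "->", section["summary"])
```

Because the helper reads whatever keys are present, the same sketch should also work for a configuration whose schema has fewer fields.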

Chinese

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wiki_lingua/chinese')
  • Description:
WikiLingua is a large-scale multilingual dataset for the evaluation of
crosslingual abstractive summarization systems. The dataset includes ~770k
article and summary pairs in 18 languages from WikiHow. The gold-standard
article-summary alignments across languages were produced by aligning the images
that are used to describe each how-to step in an article.
  • License: CC BY-NC-SA 3.0
  • Version: 1.1.1
  • Splits:
Split    Examples
'train'  6541
  • Features:
{
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "article": {
        "feature": {
            "section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "document": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "summary": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_url": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

Czech

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wiki_lingua/czech')
  • Description:
WikiLingua is a large-scale multilingual dataset for the evaluation of
crosslingual abstractive summarization systems. The dataset includes ~770k
article and summary pairs in 18 languages from WikiHow. The gold-standard
article-summary alignments across languages were produced by aligning the images
that are used to describe each how-to step in an article.
  • License: CC BY-NC-SA 3.0
  • Version: 1.1.1
  • Splits:
Split    Examples
'train'  2520
  • Features:
{
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "article": {
        "feature": {
            "section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "document": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "summary": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_url": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

Dutch

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wiki_lingua/dutch')
  • Description:
WikiLingua is a large-scale multilingual dataset for the evaluation of
crosslingual abstractive summarization systems. The dataset includes ~770k
article and summary pairs in 18 languages from WikiHow. The gold-standard
article-summary alignments across languages were produced by aligning the images
that are used to describe each how-to step in an article.
  • License: CC BY-NC-SA 3.0
  • Version: 1.1.1
  • Splits:
Split    Examples
'train'  10862
  • Features:
{
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "article": {
        "feature": {
            "section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "document": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "summary": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_url": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

English

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wiki_lingua/english')
  • Description:
WikiLingua is a large-scale multilingual dataset for the evaluation of
crosslingual abstractive summarization systems. The dataset includes ~770k
article and summary pairs in 18 languages from WikiHow. The gold-standard
article-summary alignments across languages were produced by aligning the images
that are used to describe each how-to step in an article.
  • License: CC BY-NC-SA 3.0
  • Version: 1.1.1
  • Splits:
Split    Examples
'train'  57945
  • Features:
{
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "article": {
        "feature": {
            "section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "document": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "summary": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}
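
Unlike the other configurations, the English schema has no `english_url` or `english_section_name` fields. A rough sketch of how one might resolve those fields from a non-English section to its English counterpart follows. It rests on an assumption not stated on this page: that `english_url` equals the `url` of an English example and `english_section_name` equals one of its `section_name` values. The helper names and the sample data are hypothetical.

```python
from typing import Any, Dict, Iterable, Optional, Tuple

SectionKey = Tuple[str, str]  # (article URL, section name)


def build_english_index(
    english_examples: Iterable[Dict[str, Any]],
) -> Dict[SectionKey, Dict[str, str]]:
    """Index English sections by (url, section_name).

    Assumption: each example's "article" is a dict of parallel lists.
    """
    index: Dict[SectionKey, Dict[str, str]] = {}
    for example in english_examples:
        article = example["article"]
        for name, document, summary in zip(
            article["section_name"], article["document"], article["summary"]
        ):
            index[(example["url"], name)] = {
                "document": document,
                "summary": summary,
            }
    return index


def lookup_english(
    index: Dict[SectionKey, Dict[str, str]],
    english_url: str,
    english_section_name: str,
) -> Optional[Dict[str, str]]:
    """Return the matching English section, or None if there is no match."""
    return index.get((english_url, english_section_name))


# Hand-written stand-in for English examples (not real dataset content).
english_examples = [
    {
        "url": "https://example.org/en",
        "article": {
            "section_name": ["Part A", "Part B"],
            "document": ["english text A", "english text B"],
            "summary": ["english short A", "english short B"],
        },
    }
]

index = build_english_index(english_examples)
match = lookup_english(index, "https://example.org/en", "Part B")
missing = lookup_english(index, "https://example.org/en", "Part Z")
```

Returning `None` for a missing key keeps the caller in control of how to handle sections without an English counterpart.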

French

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wiki_lingua/french')
  • Description:
WikiLingua is a large-scale multilingual dataset for the evaluation of
crosslingual abstractive summarization systems. The dataset includes ~770k
article and summary pairs in 18 languages from WikiHow. The gold-standard
article-summary alignments across languages were produced by aligning the images
that are used to describe each how-to step in an article.
  • License: CC BY-NC-SA 3.0
  • Version: 1.1.1
  • Splits:
Split    Examples
'train'  21690
  • Features:
{
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "article": {
        "feature": {
            "section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "document": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "summary": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_url": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

German

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wiki_lingua/german')
  • Description:
WikiLingua is a large-scale multilingual dataset for the evaluation of
crosslingual abstractive summarization systems. The dataset includes ~770k
article and summary pairs in 18 languages from WikiHow. The gold-standard
article-summary alignments across languages were produced by aligning the images
that are used to describe each how-to step in an article.
  • License: CC BY-NC-SA 3.0
  • Version: 1.1.1
  • Splits:
Split    Examples
'train'  20103
  • Features:
{
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "article": {
        "feature": {
            "section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "document": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "summary": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_url": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

Hindi

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wiki_lingua/hindi')
  • Description:
WikiLingua is a large-scale multilingual dataset for the evaluation of
crosslingual abstractive summarization systems. The dataset includes ~770k
article and summary pairs in 18 languages from WikiHow. The gold-standard
article-summary alignments across languages were produced by aligning the images
that are used to describe each how-to step in an article.
  • License: CC BY-NC-SA 3.0
  • Version: 1.1.1
  • Splits:
Split    Examples
'train'  3402
  • Features:
{
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "article": {
        "feature": {
            "section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "document": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "summary": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_url": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

Indonesian

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wiki_lingua/indonesian')
  • Description:
WikiLingua is a large-scale multilingual dataset for the evaluation of
crosslingual abstractive summarization systems. The dataset includes ~770k
article and summary pairs in 18 languages from WikiHow. The gold-standard
article-summary alignments across languages were produced by aligning the images
that are used to describe each how-to step in an article.
  • License: CC BY-NC-SA 3.0
  • Version: 1.1.1
  • Splits:
Split    Examples
'train'  16308
  • Features:
{
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "article": {
        "feature": {
            "section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "document": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "summary": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_url": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

Italian

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wiki_lingua/italian')
  • Description:
WikiLingua is a large-scale multilingual dataset for the evaluation of
crosslingual abstractive summarization systems. The dataset includes ~770k
article and summary pairs in 18 languages from WikiHow. The gold-standard
article-summary alignments across languages were produced by aligning the images
that are used to describe each how-to step in an article.
  • License: CC BY-NC-SA 3.0
  • Version: 1.1.1
  • Splits:
Split    Examples
'train'  17673
  • Features:
{
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "article": {
        "feature": {
            "section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "document": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "summary": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_url": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

Japanese

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wiki_lingua/japanese')
  • Description:
WikiLingua is a large-scale multilingual dataset for the evaluation of
crosslingual abstractive summarization systems. The dataset includes ~770k
article and summary pairs in 18 languages from WikiHow. The gold-standard
article-summary alignments across languages were produced by aligning the images
that are used to describe each how-to step in an article.
  • License: CC BY-NC-SA 3.0
  • Version: 1.1.1
  • Splits:
Split    Examples
'train'  4372
  • Features:
{
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "article": {
        "feature": {
            "section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "document": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "summary": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_url": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

Korean

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wiki_lingua/korean')
  • Description:
WikiLingua is a large-scale multilingual dataset for the evaluation of
crosslingual abstractive summarization systems. The dataset includes ~770k
article and summary pairs in 18 languages from WikiHow. The gold-standard
article-summary alignments across languages were produced by aligning the images
that are used to describe each how-to step in an article.
  • License: CC BY-NC-SA 3.0
  • Version: 1.1.1
  • Splits:
Split    Examples
'train'  4111
  • Features:
{
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "article": {
        "feature": {
            "section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "document": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "summary": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_url": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

Portuguese

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wiki_lingua/portuguese')
  • Description:
WikiLingua is a large-scale multilingual dataset for the evaluation of
crosslingual abstractive summarization systems. The dataset includes ~770k
article and summary pairs in 18 languages from WikiHow. The gold-standard
article-summary alignments across languages were produced by aligning the images
that are used to describe each how-to step in an article.
  • License: CC BY-NC-SA 3.0
  • Version: 1.1.1
  • Splits:
Split    Examples
'train'  28143
  • Features:
{
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "article": {
        "feature": {
            "section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "document": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "summary": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_url": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

Russian

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wiki_lingua/russian')
  • Description:
WikiLingua is a large-scale multilingual dataset for the evaluation of
crosslingual abstractive summarization systems. The dataset includes ~770k
article and summary pairs in 18 languages from WikiHow. The gold-standard
article-summary alignments across languages were produced by aligning the images
that are used to describe each how-to step in an article.
  • License: CC BY-NC-SA 3.0
  • Version: 1.1.1
  • Splits:
Split    Examples
'train'  18143
  • Features:
{
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "article": {
        "feature": {
            "section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "document": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "summary": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_url": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

Spanish

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wiki_lingua/spanish')
  • Description:
WikiLingua is a large-scale multilingual dataset for the evaluation of
crosslingual abstractive summarization systems. The dataset includes ~770k
article and summary pairs in 18 languages from WikiHow. The gold-standard
article-summary alignments across languages were produced by aligning the images
that are used to describe each how-to step in an article.
  • License: CC BY-NC-SA 3.0
  • Version: 1.1.1
  • Splits:
Split    Examples
'train'  38795
  • Features:
{
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "article": {
        "feature": {
            "section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "document": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "summary": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_url": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

Thai

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wiki_lingua/thai')
  • Description:
WikiLingua is a large-scale multilingual dataset for the evaluation of
crosslingual abstractive summarization systems. The dataset includes ~770k
article and summary pairs in 18 languages from WikiHow. The gold-standard
article-summary alignments across languages were produced by aligning the images
that are used to describe each how-to step in an article.
  • License: CC BY-NC-SA 3.0
  • Version: 1.1.1
  • Splits:
Split    Examples
'train'  5093
  • Features:
{
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "article": {
        "feature": {
            "section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "document": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "summary": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_url": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

Turkish

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wiki_lingua/turkish')
  • Description:
WikiLingua is a large-scale multilingual dataset for the evaluation of
crosslingual abstractive summarization systems. The dataset includes ~770k
article and summary pairs in 18 languages from WikiHow. The gold-standard
article-summary alignments across languages were produced by aligning the images
that are used to describe each how-to step in an article.
  • License: CC BY-NC-SA 3.0
  • Version: 1.1.1
  • Splits:
Split    Examples
'train'  1512
  • Features:
{
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "article": {
        "feature": {
            "section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "document": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "summary": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_url": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

Vietnamese

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wiki_lingua/vietnamese')
  • Description:
WikiLingua is a large-scale multilingual dataset for the evaluation of
crosslingual abstractive summarization systems. The dataset includes ~770k
article and summary pairs in 18 languages from WikiHow. The gold-standard
article-summary alignments across languages were produced by aligning the images
that are used to describe each how-to step in an article.
  • License: CC BY-NC-SA 3.0
  • Version: 1.1.1
  • Splits:
Split    Examples
'train'  6616
  • Features:
{
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "article": {
        "feature": {
            "section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "document": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "summary": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_url": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}
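
As a quick cross-check of the split tables on this page, the per-configuration 'train' counts can be summed. The counts below are copied from the tables above; the dictionary name `TRAIN_EXAMPLES` is illustrative. Each example holds a sequence of sections, so this total counts examples rather than individual article-summary pairs, which may explain why it is lower than the ~770k pairs mentioned in the description.

```python
# 'train' example counts per configuration, copied from the split tables.
TRAIN_EXAMPLES = {
    "arabic": 9995,
    "chinese": 6541,
    "czech": 2520,
    "dutch": 10862,
    "english": 57945,
    "french": 21690,
    "german": 20103,
    "hindi": 3402,
    "indonesian": 16308,
    "italian": 17673,
    "japanese": 4372,
    "korean": 4111,
    "portuguese": 28143,
    "russian": 18143,
    "spanish": 38795,
    "thai": 5093,
    "turkish": 1512,
    "vietnamese": 6616,
}

num_configs = len(TRAIN_EXAMPLES)
total_examples = sum(TRAIN_EXAMPLES.values())
largest = max(TRAIN_EXAMPLES, key=TRAIN_EXAMPLES.get)
smallest = min(TRAIN_EXAMPLES, key=TRAIN_EXAMPLES.get)

print(f"{num_configs} configurations, {total_examples} train examples in total")
print(f"largest: {largest}, smallest: {smallest}")
```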