wiki_lingua

References:

arabic

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wiki_lingua/arabic')
  • Description:
WikiLingua is a large-scale multilingual dataset for the evaluation of
crosslingual abstractive summarization systems. The dataset includes ~770k
article and summary pairs in 18 languages from WikiHow. The gold-standard
article-summary alignments across languages were done by aligning the images
that are used to describe each how-to step in an article.
  • License: CC BY-NC-SA 3.0
  • Version: 1.1.1
  • Splits:
Split Examples
'train' 9995
  • Features:
{
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "article": {
        "feature": {
            "section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "document": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "summary": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_url": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}
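
The feature spec above nests every per-section field under "article", a Sequence of variable length ("length": -1). As a minimal sketch, the snippet below flattens one record into (section_name, document, summary) triples. It assumes the Sequence is exposed column-wise, as a dictionary of equal-length lists, which is how Hugging Face datasets usually represent a Sequence of dictionaries. The `example` record and the `iter_pairs` helper are hypothetical and use invented placeholder values, not real dataset content.

```python
from typing import Iterator, Tuple

# Hypothetical record shaped like the feature spec above. The values are
# invented placeholders, not taken from the dataset.
example = {
    "url": "https://example.org/how-to-placeholder",
    "article": {
        "section_name": ["Step one", "Step two"],
        "document": ["Long text of step one.", "Long text of step two."],
        "summary": ["Short one.", "Short two."],
        "english_url": ["https://example.org/en", "https://example.org/en"],
        "english_section_name": ["Step one (en)", "Step two (en)"],
    },
}


def iter_pairs(record) -> Iterator[Tuple[str, str, str]]:
    """Yield (section_name, document, summary) for each section of one record."""
    article = record["article"]
    # Assumption: the Sequence is stored column-wise (a dict of equal-length
    # lists), so zipping the columns recovers one tuple per section.
    yield from zip(article["section_name"], article["document"], article["summary"])


pairs = list(iter_pairs(example))
for name, document, summary in pairs:
    print(f"{name}: {len(document)} chars -> {len(summary)} chars")
```

If the loader returns tensors rather than Python lists, the same column-wise zip should still apply after converting each column to a list.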

chinese

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wiki_lingua/chinese')
  • Description:
WikiLingua is a large-scale multilingual dataset for the evaluation of
crosslingual abstractive summarization systems. The dataset includes ~770k
article and summary pairs in 18 languages from WikiHow. The gold-standard
article-summary alignments across languages were done by aligning the images
that are used to describe each how-to step in an article.
  • License: CC BY-NC-SA 3.0
  • Version: 1.1.1
  • Splits:
Split Examples
'train' 6541
  • Features:
{
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "article": {
        "feature": {
            "section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "document": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "summary": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_url": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

czech

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wiki_lingua/czech')
  • Description:
WikiLingua is a large-scale multilingual dataset for the evaluation of
crosslingual abstractive summarization systems. The dataset includes ~770k
article and summary pairs in 18 languages from WikiHow. The gold-standard
article-summary alignments across languages were done by aligning the images
that are used to describe each how-to step in an article.
  • License: CC BY-NC-SA 3.0
  • Version: 1.1.1
  • Splits:
Split Examples
'train' 2520
  • Features:
{
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "article": {
        "feature": {
            "section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "document": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "summary": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_url": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

dutch

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wiki_lingua/dutch')
  • Description:
WikiLingua is a large-scale multilingual dataset for the evaluation of
crosslingual abstractive summarization systems. The dataset includes ~770k
article and summary pairs in 18 languages from WikiHow. The gold-standard
article-summary alignments across languages were done by aligning the images
that are used to describe each how-to step in an article.
  • License: CC BY-NC-SA 3.0
  • Version: 1.1.1
  • Splits:
Split Examples
'train' 10862
  • Features:
{
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "article": {
        "feature": {
            "section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "document": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "summary": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_url": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

english

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wiki_lingua/english')
  • Description:
WikiLingua is a large-scale multilingual dataset for the evaluation of
crosslingual abstractive summarization systems. The dataset includes ~770k
article and summary pairs in 18 languages from WikiHow. The gold-standard
article-summary alignments across languages were done by aligning the images
that are used to describe each how-to step in an article.
  • License: CC BY-NC-SA 3.0
  • Version: 1.1.1
  • Splits:
Split Examples
'train' 57945
  • Features:
{
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "article": {
        "feature": {
            "section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "document": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "summary": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}
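
Unlike the other configurations on this page, the english feature spec above lists no "english_url" or "english_section_name" fields. The sketch below shows one way those two fields might be used to look up the English counterpart of a non-English section. It assumes, without confirmation from this page, that "english_url" matches the "url" of a record in the english configuration and that "english_section_name" matches one of its "section_name" values. All records and both helper functions are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical English records, shaped like the 'english' feature spec.
english_records = [
    {
        "url": "https://example.org/en/placeholder",
        "article": {
            "section_name": ["Part A", "Part B"],
            "document": ["English text A.", "English text B."],
            "summary": ["Summary A.", "Summary B."],
        },
    }
]

# Hypothetical non-English record, shaped like the other feature specs.
other_record = {
    "url": "https://example.org/xx/placeholder",
    "article": {
        "section_name": ["Parte B"],
        "document": ["Testo B."],
        "summary": ["Riassunto B."],
        "english_url": ["https://example.org/en/placeholder"],
        "english_section_name": ["Part B"],
    },
}


def build_english_index(records) -> Dict[Tuple[str, str], str]:
    """Map (url, section_name) to the English summary of that section."""
    index: Dict[Tuple[str, str], str] = {}
    for record in records:
        article = record["article"]
        for name, summary in zip(article["section_name"], article["summary"]):
            index[(record["url"], name)] = summary
    return index


def english_summaries(record, index) -> List[Optional[str]]:
    """Return the English summary for each section, or None when unmatched."""
    article = record["article"]
    keys = zip(article["english_url"], article["english_section_name"])
    return [index.get(key) for key in keys]


index = build_english_index(english_records)
matched = english_summaries(other_record, index)
print(matched)
```

Pairing a non-English "document" with the matched English summary in this way would give one cross-lingual summarization example per section, under the stated assumptions.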

french

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wiki_lingua/french')
  • Description:
WikiLingua is a large-scale multilingual dataset for the evaluation of
crosslingual abstractive summarization systems. The dataset includes ~770k
article and summary pairs in 18 languages from WikiHow. The gold-standard
article-summary alignments across languages were done by aligning the images
that are used to describe each how-to step in an article.
  • License: CC BY-NC-SA 3.0
  • Version: 1.1.1
  • Splits:
Split Examples
'train' 21690
  • Features:
{
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "article": {
        "feature": {
            "section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "document": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "summary": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_url": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

german

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wiki_lingua/german')
  • Description:
WikiLingua is a large-scale multilingual dataset for the evaluation of
crosslingual abstractive summarization systems. The dataset includes ~770k
article and summary pairs in 18 languages from WikiHow. The gold-standard
article-summary alignments across languages were done by aligning the images
that are used to describe each how-to step in an article.
  • License: CC BY-NC-SA 3.0
  • Version: 1.1.1
  • Splits:
Split Examples
'train' 20103
  • Features:
{
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "article": {
        "feature": {
            "section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "document": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "summary": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_url": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

hindi

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wiki_lingua/hindi')
  • Description:
WikiLingua is a large-scale multilingual dataset for the evaluation of
crosslingual abstractive summarization systems. The dataset includes ~770k
article and summary pairs in 18 languages from WikiHow. The gold-standard
article-summary alignments across languages were done by aligning the images
that are used to describe each how-to step in an article.
  • License: CC BY-NC-SA 3.0
  • Version: 1.1.1
  • Splits:
Split Examples
'train' 3402
  • Features:
{
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "article": {
        "feature": {
            "section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "document": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "summary": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_url": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

indonesian

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wiki_lingua/indonesian')
  • Description:
WikiLingua is a large-scale multilingual dataset for the evaluation of
crosslingual abstractive summarization systems. The dataset includes ~770k
article and summary pairs in 18 languages from WikiHow. The gold-standard
article-summary alignments across languages were done by aligning the images
that are used to describe each how-to step in an article.
  • License: CC BY-NC-SA 3.0
  • Version: 1.1.1
  • Splits:
Split Examples
'train' 16308
  • Features:
{
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "article": {
        "feature": {
            "section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "document": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "summary": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_url": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

italian

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wiki_lingua/italian')
  • Description:
WikiLingua is a large-scale multilingual dataset for the evaluation of
crosslingual abstractive summarization systems. The dataset includes ~770k
article and summary pairs in 18 languages from WikiHow. The gold-standard
article-summary alignments across languages were done by aligning the images
that are used to describe each how-to step in an article.
  • License: CC BY-NC-SA 3.0
  • Version: 1.1.1
  • Splits:
Split Examples
'train' 17673
  • Features:
{
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "article": {
        "feature": {
            "section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "document": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "summary": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_url": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

japanese

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wiki_lingua/japanese')
  • Description:
WikiLingua is a large-scale multilingual dataset for the evaluation of
crosslingual abstractive summarization systems. The dataset includes ~770k
article and summary pairs in 18 languages from WikiHow. The gold-standard
article-summary alignments across languages were done by aligning the images
that are used to describe each how-to step in an article.
  • License: CC BY-NC-SA 3.0
  • Version: 1.1.1
  • Splits:
Split Examples
'train' 4372
  • Features:
{
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "article": {
        "feature": {
            "section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "document": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "summary": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_url": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

korean

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wiki_lingua/korean')
  • Description:
WikiLingua is a large-scale multilingual dataset for the evaluation of
crosslingual abstractive summarization systems. The dataset includes ~770k
article and summary pairs in 18 languages from WikiHow. The gold-standard
article-summary alignments across languages were done by aligning the images
that are used to describe each how-to step in an article.
  • License: CC BY-NC-SA 3.0
  • Version: 1.1.1
  • Splits:
Split Examples
'train' 4111
  • Features:
{
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "article": {
        "feature": {
            "section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "document": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "summary": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_url": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

portuguese

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wiki_lingua/portuguese')
  • Description:
WikiLingua is a large-scale multilingual dataset for the evaluation of
crosslingual abstractive summarization systems. The dataset includes ~770k
article and summary pairs in 18 languages from WikiHow. The gold-standard
article-summary alignments across languages were done by aligning the images
that are used to describe each how-to step in an article.
  • License: CC BY-NC-SA 3.0
  • Version: 1.1.1
  • Splits:
Split Examples
'train' 28143
  • Features:
{
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "article": {
        "feature": {
            "section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "document": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "summary": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_url": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

russian

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wiki_lingua/russian')
  • Description:
WikiLingua is a large-scale multilingual dataset for the evaluation of
crosslingual abstractive summarization systems. The dataset includes ~770k
article and summary pairs in 18 languages from WikiHow. The gold-standard
article-summary alignments across languages were done by aligning the images
that are used to describe each how-to step in an article.
  • License: CC BY-NC-SA 3.0
  • Version: 1.1.1
  • Splits:
Split Examples
'train' 18143
  • Features:
{
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "article": {
        "feature": {
            "section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "document": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "summary": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_url": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

spanish

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wiki_lingua/spanish')
  • Description:
WikiLingua is a large-scale multilingual dataset for the evaluation of
crosslingual abstractive summarization systems. The dataset includes ~770k
article and summary pairs in 18 languages from WikiHow. The gold-standard
article-summary alignments across languages were done by aligning the images
that are used to describe each how-to step in an article.
  • License: CC BY-NC-SA 3.0
  • Version: 1.1.1
  • Splits:
Split Examples
'train' 38795
  • Features:
{
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "article": {
        "feature": {
            "section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "document": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "summary": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_url": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

thai

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wiki_lingua/thai')
  • Description:
WikiLingua is a large-scale multilingual dataset for the evaluation of
crosslingual abstractive summarization systems. The dataset includes ~770k
article and summary pairs in 18 languages from WikiHow. The gold-standard
article-summary alignments across languages were done by aligning the images
that are used to describe each how-to step in an article.
  • License: CC BY-NC-SA 3.0
  • Version: 1.1.1
  • Splits:
Split Examples
'train' 5093
  • Features:
{
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "article": {
        "feature": {
            "section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "document": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "summary": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_url": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

turkish

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wiki_lingua/turkish')
  • Description:
WikiLingua is a large-scale multilingual dataset for the evaluation of
crosslingual abstractive summarization systems. The dataset includes ~770k
article and summary pairs in 18 languages from WikiHow. The gold-standard
article-summary alignments across languages were done by aligning the images
that are used to describe each how-to step in an article.
  • License: CC BY-NC-SA 3.0
  • Version: 1.1.1
  • Splits:
Split Examples
'train' 1512
  • Features:
{
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "article": {
        "feature": {
            "section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "document": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "summary": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_url": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

vietnamese

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:wiki_lingua/vietnamese')
  • Description:
WikiLingua is a large-scale multilingual dataset for the evaluation of
crosslingual abstractive summarization systems. The dataset includes ~770k
article and summary pairs in 18 languages from WikiHow. The gold-standard
article-summary alignments across languages were done by aligning the images
that are used to describe each how-to step in an article.
  • License: CC BY-NC-SA 3.0
  • Version: 1.1.1
  • Splits:
Split Examples
'train' 6616
  • Features:
{
    "url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "article": {
        "feature": {
            "section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "document": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "summary": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_url": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "english_section_name": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}
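
The split tables above list one 'train' count per configuration. As a small worked sum, the snippet below adds up those counts, copied from this page. Each example holds a Sequence of sections under "article", so the number of examples need not equal the ~770k article and summary pairs quoted in the description; that reading is an interpretation of the schema, not something this page states.

```python
# 'train' example counts per configuration, copied from the split tables above.
train_examples = {
    "arabic": 9995,
    "chinese": 6541,
    "czech": 2520,
    "dutch": 10862,
    "english": 57945,
    "french": 21690,
    "german": 20103,
    "hindi": 3402,
    "indonesian": 16308,
    "italian": 17673,
    "japanese": 4372,
    "korean": 4111,
    "portuguese": 28143,
    "russian": 18143,
    "spanish": 38795,
    "thai": 5093,
    "turkish": 1512,
    "vietnamese": 6616,
}

total = sum(train_examples.values())
largest = max(train_examples, key=train_examples.get)
smallest = min(train_examples, key=train_examples.get)

print(f"{len(train_examples)} configurations, {total} examples in total")
print(f"largest: {largest}, smallest: {smallest}")
```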