opus100

References:

af-en

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/af-en')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	275512
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "af",
            "en"
        ],
        "id": null,
        "_type": "Translation"
    }
}
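
The configuration names on this page appear to list the two language codes in alphabetical order (`af-en`, but `en-eo`). A minimal sketch of building the name string under that assumption; `opus100_config` and `opus100_tfds_name` are hypothetical helpers for illustration, not part of TFDS:

```python
def opus100_config(lang_a: str, lang_b: str) -> str:
    """Join two language codes in alphabetical order, e.g. 'af-en' or 'en-eo'.

    Assumption: every OPUS-100 config name follows this ordering.
    """
    if lang_a == lang_b:
        raise ValueError("a language pair needs two different codes")
    if "en" not in (lang_a, lang_b):
        # The description states that every pair includes English.
        raise ValueError("OPUS-100 pairs always include English ('en')")
    return "-".join(sorted((lang_a, lang_b)))


def opus100_tfds_name(lang_a: str, lang_b: str) -> str:
    """Build the dataset name string in the form used by the load commands."""
    return f"huggingface:opus100/{opus100_config(lang_a, lang_b)}"


print(opus100_tfds_name("en", "af"))
print(opus100_tfds_name("en", "eo"))
```

The resulting string has the same form as the argument passed to `tfds.load` in the load commands on this page.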

am-en

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/am-en')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	89027
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "am",
            "en"
        ],
        "id": null,
        "_type": "Translation"
    }
}
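
The `Translation` feature lists the two language codes of the pair. A sketch of reading one example, assuming each example is a dictionary with a `translation` entry keyed by those codes, as the feature spec suggests. The sentences below are invented placeholders, not rows from the dataset, and `to_pair` is a hypothetical helper. String values may arrive as bytes when read through TFDS, so the helper decodes them if needed:

```python
def to_pair(example, src, tgt):
    """Return (source_text, target_text) from one translation example."""
    translation = example["translation"]

    def as_text(value):
        # Decode bytes to str; leave str values untouched.
        return value.decode("utf-8") if isinstance(value, bytes) else value

    return as_text(translation[src]), as_text(translation[tgt])


# Placeholder example (assumed layout, made-up content).
example = {"translation": {"am": b"<Amharic sentence>", "en": "<English sentence>"}}
print(to_pair(example, "am", "en"))
```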

an-en

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/an-en')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'train'`	6961

Features:

{
    "translation": {
        "languages": [
            "an",
            "en"
        ],
        "id": null,
        "_type": "Translation"
    }
}
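
This configuration lists only a `train` split. One way to hold out evaluation data is the TFDS split-slicing syntax (for example `train[:90%]`). A sketch, assuming that syntax applies to `huggingface:` datasets the same way it does to native TFDS datasets; `holdout_splits` is a hypothetical helper that only builds the split strings:

```python
def holdout_splits(eval_percent: int = 10):
    """Return (train_split, eval_split) strings that partition 'train'."""
    if not 0 < eval_percent < 100:
        raise ValueError("eval_percent must be between 1 and 99")
    cut = 100 - eval_percent
    return f"train[:{cut}%]", f"train[{cut}%:]"


train_split, eval_split = holdout_splits(10)
print(train_split, eval_split)

# Assumed usage (requires tensorflow_datasets; not executed here):
# train_ds = tfds.load('huggingface:opus100/an-en', split=train_split)
# eval_ds = tfds.load('huggingface:opus100/an-en', split=eval_split)
```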

ar-en

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/ar-en')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	1000000
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "ar",
            "en"
        ],
        "id": null,
        "_type": "Translation"
    }
}
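
The description groups language pairs by training-set size (1M, at least 100k, at least 10k). A small sketch that applies those thresholds to the `train` counts of the four configurations listed so far. `size_tier` is a hypothetical helper, and its tier labels are chosen for illustration; unlike the cumulative counts in the description, each pair lands in exactly one tier here:

```python
# Train split sizes copied from the split tables above.
TRAIN_SIZES = {
    "af-en": 275512,
    "am-en": 89027,
    "an-en": 6961,
    "ar-en": 1000000,
}


def size_tier(num_train: int) -> str:
    """Map a train split size to the largest threshold it reaches."""
    if num_train >= 1_000_000:
        return "1M"
    if num_train >= 100_000:
        return ">=100k"
    if num_train >= 10_000:
        return ">=10k"
    return "<10k"


for config, num_train in TRAIN_SIZES.items():
    print(config, size_tier(num_train))
```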

as-en

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/as-en')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	138479
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "as",
            "en"
        ],
        "id": null,
        "_type": "Translation"
    }
}

az-en

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/az-en')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	262089
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "az",
            "en"
        ],
        "id": null,
        "_type": "Translation"
    }
}

be-en

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/be-en')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	67312
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "be",
            "en"
        ],
        "id": null,
        "_type": "Translation"
    }
}

bg-en

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/bg-en')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	1000000
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "bg",
            "en"
        ],
        "id": null,
        "_type": "Translation"
    }
}

bn-en

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/bn-en')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	1000000
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "bn",
            "en"
        ],
        "id": null,
        "_type": "Translation"
    }
}

br-en

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/br-en')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	153447
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "br",
            "en"
        ],
        "id": null,
        "_type": "Translation"
    }
}

bs-en

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/bs-en')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	1000000
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "bs",
            "en"
        ],
        "id": null,
        "_type": "Translation"
    }
}

ca-en

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/ca-en')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	1000000
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "ca",
            "en"
        ],
        "id": null,
        "_type": "Translation"
    }
}

cs-en

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/cs-en')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	1000000
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "cs",
            "en"
        ],
        "id": null,
        "_type": "Translation"
    }
}

cy-en

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/cy-en')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	289521
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "cy",
            "en"
        ],
        "id": null,
        "_type": "Translation"
    }
}

da-en

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/da-en')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	1000000
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "da",
            "en"
        ],
        "id": null,
        "_type": "Translation"
    }
}

de-en

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/de-en')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	1000000
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "de",
            "en"
        ],
        "id": null,
        "_type": "Translation"
    }
}

dz-en

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/dz-en')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'train'`	624

Features:

{
    "translation": {
        "languages": [
            "dz",
            "en"
        ],
        "id": null,
        "_type": "Translation"
    }
}

el-en

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/el-en')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	1000000
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "el",
            "en"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-eo

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-eo')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	337106
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "eo"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-es

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-es')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	1000000
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "es"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-et

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-et')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	1000000
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "et"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-eu

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-eu')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	1000000
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "eu"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-fa

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-fa')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	1000000
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "fa"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-fi

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-fi')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	1000000
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "fi"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-fr

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-fr')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	1000000
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "fr"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-fy

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-fy')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	54342
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "fy"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-ga

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-ga')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	289524
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "ga"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-gd

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-gd')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	1606
`'train'`	16316
`'validation'`	1605

Features:

{
    "translation": {
        "languages": [
            "en",
            "gd"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-gl

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-gl')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	515344
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "gl"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-gu

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-gu')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	318306
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "gu"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-ha

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-ha')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	97983
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "ha"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-he

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-he')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	1000000
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "he"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-hi

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-hi')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	534319
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "hi"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-hr

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-hr')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	1000000
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "hr"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-hu

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-hu')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	1000000
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "hu"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-hy

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-hy')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'train'`	7059

Features:

{
    "translation": {
        "languages": [
            "en",
            "hy"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-id

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-id')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	1000000
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "id"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-ig

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-ig')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	1843
`'train'`	18415
`'validation'`	1843

Features:

{
    "translation": {
        "languages": [
            "en",
            "ig"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-is

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-is')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	1000000
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "is"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-it

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-it')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	1000000
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "it"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-ja

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-ja')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	1000000
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "ja"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-ka

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-ka')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	377306
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "ka"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-kk

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-kk')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	79927
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "kk"
        ],
        "id": null,
        "_type": "Translation"
    }
}
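
The description groups language pairs by training-set size (1M, at least 100k, at least 10k). The sketch below applies those thresholds to the train counts of the three entries just above, assigning each pair to the highest threshold it meets. The function name `size_tier` and the tier labels are illustrative assumptions.

```python
# Train-split sizes copied from the en-ja, en-ka and en-kk entries on this page.
TRAIN_SIZES = {
    "en-ja": 1_000_000,
    "en-ka": 377_306,
    "en-kk": 79_927,
}


def size_tier(num_train):
    """Map a train-split size to the highest threshold named in the description."""
    if num_train >= 1_000_000:
        return "1M"
    if num_train >= 100_000:
        return ">=100k"
    if num_train >= 10_000:
        return ">=10k"
    return "<10k"


tiers = {pair: size_tier(n) for pair, n in TRAIN_SIZES.items()}
print(tiers)
```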

en-km

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-km')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	111483
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "km"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-ko

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-ko')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	1000000
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "ko"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-kn

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-kn')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	918
`'train'`	14537
`'validation'`	917

Features:

{
    "translation": {
        "languages": [
            "en",
            "kn"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-ku

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-ku')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	144844
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "ku"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-ky

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-ky')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	27215
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "ky"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-li

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-li')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	25535
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "li"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-lt

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-lt')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	1000000
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "lt"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-lv

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-lv')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	1000000
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "lv"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-mg

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-mg')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	590771
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "mg"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-mk

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-mk')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	1000000
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "mk"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-ml

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-ml')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	822746
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "ml"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-mn

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-mn')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'train'`	4294

Features:

{
    "translation": {
        "languages": [
            "en",
            "mn"
        ],
        "id": null,
        "_type": "Translation"
    }
}
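
This config lists only a `'train'` split. One possible workaround, offered here as an assumption rather than anything the dataset authors recommend, is to carve a held-out portion out of `'train'` with TFDS percent-slicing split strings. The helper name `holdout_splits` and the 10% default are illustrative, and whether slicing behaves as expected for `huggingface:`-prefixed datasets should be checked before relying on it.

```python
def holdout_splits(holdout_percent=10, split="train"):
    """Return (train_spec, holdout_spec) split strings in TFDS slicing syntax."""
    if not 0 < holdout_percent < 100:
        raise ValueError("holdout_percent must be strictly between 0 and 100")
    cut = 100 - holdout_percent
    return f"{split}[:{cut}%]", f"{split}[{cut}%:]"


train_spec, holdout_spec = holdout_splits()
print(train_spec, holdout_spec)

# Usage sketch (requires tensorflow_datasets and network access, so it is
# left commented out here):
# import tensorflow_datasets as tfds
# train_ds, holdout_ds = tfds.load(
#     'huggingface:opus100/en-mn', split=[train_spec, holdout_spec])
```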

en-mr

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-mr')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	27007
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "mr"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-ms

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-ms')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	1000000
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "ms"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-mt

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-mt')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	1000000
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "mt"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-my

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-my')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	24594
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "my"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-nb

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-nb')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	142906
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "nb"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-ne

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-ne')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	406381
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "ne"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-nl

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-nl')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	1000000
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "nl"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-nn

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-nn')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	486055
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "nn"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-no

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-no')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	1000000
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "no"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-oc

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-oc')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	35791
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "oc"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-or

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-or')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	1318
`'train'`	14273
`'validation'`	1317

Features:

{
    "translation": {
        "languages": [
            "en",
            "or"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-pa

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-pa')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	107296
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "pa"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-pl

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-pl')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	1000000
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "pl"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-ps

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-ps')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	79127
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "ps"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-pt

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-pt')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	1000000
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "pt"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-ro

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-ro')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	1000000
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "ro"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-ru

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-ru')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	1000000
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "ru"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-rw

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-rw')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	173823
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "rw"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-se

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-se')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	35907
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "se"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-sh

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-sh')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	267211
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "sh"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-si

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-si')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	979109
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "si"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-sk

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-sk')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	1000000
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "sk"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-sl

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-sl')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	1000000
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "sl"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-sq

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-sq')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	1000000
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "sq"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-sr

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-sr')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	1000000
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "sr"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-sv

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-sv')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	1000000
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "sv"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-ta

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-ta')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	227014
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "ta"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-te

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-te')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	64352
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "te"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-tg

Sử dụng lệnh sau để tải tập dữ liệu này trong TFDS:

ds = tfds.load('huggingface:opus100/en-tg')

Sự miêu tả :

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English).OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

Giấy phép : Không có giấy phép được biết đến
Phiên bản : 0.0.0
Chia tách :

Tách ra	Ví dụ
`'test'`	2000
`'train'`	193882
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "tg"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-th

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-th')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	1000000
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "th"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-tk

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-tk')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	1852
`'train'`	13110
`'validation'`	1852
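
Unlike the other configs listed here, en-tk has 1852 test and validation examples rather than 2000. As a quick worked check of how the data is divided, the sketch below computes each split's share from the counts in the table. `split_fractions` is an illustrative helper, not a TFDS function:

```python
# Example counts copied from the en-tk split table above.
EN_TK_SPLITS = {"test": 1852, "train": 13110, "validation": 1852}


def split_fractions(counts):
    """Map each split name to its share of all examples."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


fractions = split_fractions(EN_TK_SPLITS)
for name, share in fractions.items():
    # Print each share as a percentage with one decimal place.
    print(f"{name}: {share:.1%}")
```

The three counts sum to 16814 examples, so the held-out splits make up a noticeably larger share here than in the 1M-pair configs.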

Features:

{
    "translation": {
        "languages": [
            "en",
            "tk"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-tr

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-tr')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	1000000
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "tr"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-tt

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-tt')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	100843
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "tt"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-ug

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-ug')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	72170
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "ug"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-uk

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-uk')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	1000000
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "uk"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-ur

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-ur')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	753913
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "ur"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-uz

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-uz')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	173157
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "uz"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-vi

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-vi')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	1000000
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "vi"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-wa

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-wa')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	104496
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "wa"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-xh

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-xh')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	439671
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "xh"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-yi

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-yi')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	15010
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "yi"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-yo

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-yo')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'train'`	10375
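
en-yo lists only a train split, so any evaluation set has to be carved out by the user. One option is the TFDS split-slicing syntax. The sketch below only builds the split strings; the 90/10 ratio and the helper name `holdout_specs` are arbitrary choices for illustration, and whether percent slicing behaves as expected for `huggingface:` datasets should be verified before relying on it:

```python
def holdout_specs(train_percent: int = 90):
    """Build split strings that divide 'train' into a training and a held-out part."""
    if not 0 < train_percent < 100:
        raise ValueError("train_percent must be between 1 and 99")
    return f"train[:{train_percent}%]", f"train[{train_percent}%:]"


train_spec, heldout_spec = holdout_specs(90)
print(train_spec, heldout_spec)

# Intended use (not executed here; needs tensorflow_datasets and network access):
#   import tensorflow_datasets as tfds
#   train_ds = tfds.load('huggingface:opus100/en-yo', split=train_spec)
#   heldout_ds = tfds.load('huggingface:opus100/en-yo', split=heldout_spec)
```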

Features:

{
    "translation": {
        "languages": [
            "en",
            "yo"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-zh

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-zh')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	1000000
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "zh"
        ],
        "id": null,
        "_type": "Translation"
    }
}

en-zu

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/en-zu')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000
`'train'`	38616
`'validation'`	2000

Features:

{
    "translation": {
        "languages": [
            "en",
            "zu"
        ],
        "id": null,
        "_type": "Translation"
    }
}
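
The size tiers quoted in the description (1M, at least 100k, at least 10k) can be illustrated with the train counts listed above for en-ta through en-zu. This is a sketch over those 18 configs only, not the full set of 99 pairs, and `tier_counts` is an illustrative helper:

```python
# Train-split sizes copied from the en-ta ... en-zu split tables above.
TRAIN_SIZES = {
    "en-ta": 227014, "en-te": 64352, "en-tg": 193882, "en-th": 1000000,
    "en-tk": 13110, "en-tr": 1000000, "en-tt": 100843, "en-ug": 72170,
    "en-uk": 1000000, "en-ur": 753913, "en-uz": 173157, "en-vi": 1000000,
    "en-wa": 104496, "en-xh": 439671, "en-yi": 15010, "en-yo": 10375,
    "en-zh": 1000000, "en-zu": 38616,
}


def tier_counts(sizes, thresholds=(1_000_000, 100_000, 10_000)):
    """Count how many configs have at least each threshold of training pairs."""
    return {t: sum(1 for n in sizes.values() if n >= t) for t in thresholds}


print(tier_counts(TRAIN_SIZES))
# → {1000000: 5, 100000: 12, 10000: 18}
```

The tiers are cumulative, which matches how the description counts them: a config with 1M pairs is also counted under "at least 100k" and "at least 10k".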

ar-de

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/ar-de')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000

Features:

{
    "translation": {
        "languages": [
            "ar",
            "de"
        ],
        "id": null,
        "_type": "Translation"
    }
}
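
ar-de is the first config in this listing without English, and it has only a test split, which is consistent with the description's statement that all training pairs include English. The config names shown here also appear to put the two language codes in alphabetical order (en-zu, ar-de, ru-zh). Below is a small sketch built on those two observations. Both rules are inferred from this listing rather than documented guarantees, and the helper names are hypothetical:

```python
def config_name(lang_a: str, lang_b: str) -> str:
    """Build the TFDS name for a language pair, with codes sorted alphabetically."""
    first, second = sorted([lang_a, lang_b])
    return f"huggingface:opus100/{first}-{second}"


def includes_english(name: str) -> bool:
    """True if the pair contains 'en'; in this listing those configs have a train split."""
    pair = name.rsplit("/", 1)[-1]
    return "en" in pair.split("-")


for a, b in [("zu", "en"), ("de", "ar")]:
    name = config_name(a, b)
    print(name, includes_english(name))
```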

ar-fr

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/ar-fr')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000

Features:

{
    "translation": {
        "languages": [
            "ar",
            "fr"
        ],
        "id": null,
        "_type": "Translation"
    }
}

ar-nl

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/ar-nl')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000

Features:

{
    "translation": {
        "languages": [
            "ar",
            "nl"
        ],
        "id": null,
        "_type": "Translation"
    }
}

ar-ru

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/ar-ru')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000

Features:

{
    "translation": {
        "languages": [
            "ar",
            "ru"
        ],
        "id": null,
        "_type": "Translation"
    }
}

ar-zh

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/ar-zh')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000

Features:

{
    "translation": {
        "languages": [
            "ar",
            "zh"
        ],
        "id": null,
        "_type": "Translation"
    }
}

de-fr

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/de-fr')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000

Features:

{
    "translation": {
        "languages": [
            "de",
            "fr"
        ],
        "id": null,
        "_type": "Translation"
    }
}

de-nl

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/de-nl')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000

Features:

{
    "translation": {
        "languages": [
            "de",
            "nl"
        ],
        "id": null,
        "_type": "Translation"
    }
}

de-ru

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/de-ru')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000

Features:

{
    "translation": {
        "languages": [
            "de",
            "ru"
        ],
        "id": null,
        "_type": "Translation"
    }
}

de-zh

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/de-zh')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000

Features:

{
    "translation": {
        "languages": [
            "de",
            "zh"
        ],
        "id": null,
        "_type": "Translation"
    }
}

fr-nl

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/fr-nl')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000

Features:

{
    "translation": {
        "languages": [
            "fr",
            "nl"
        ],
        "id": null,
        "_type": "Translation"
    }
}

fr-ru

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/fr-ru')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000

Features:

{
    "translation": {
        "languages": [
            "fr",
            "ru"
        ],
        "id": null,
        "_type": "Translation"
    }
}

fr-zh

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/fr-zh')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000

Features:

{
    "translation": {
        "languages": [
            "fr",
            "zh"
        ],
        "id": null,
        "_type": "Translation"
    }
}

nl-ru

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/nl-ru')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000

Features:

{
    "translation": {
        "languages": [
            "nl",
            "ru"
        ],
        "id": null,
        "_type": "Translation"
    }
}

nl-zh

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/nl-zh')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000

Features:

{
    "translation": {
        "languages": [
            "nl",
            "zh"
        ],
        "id": null,
        "_type": "Translation"
    }
}

ru-zh

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:opus100/ru-zh')

Description:

OPUS-100 is English-centric, meaning that all training pairs include English on either the source or target side.
The corpus covers 100 languages (including English). OPUS-100 contains approximately 55M sentence pairs.
Of the 99 language pairs, 44 have 1M sentence pairs of training data, 73 have at least 100k, and 95 have at least 10k.

License: No known license
Version: 0.0.0
Splits:

Split	Examples
`'test'`	2000

Features:

{
    "translation": {
        "languages": [
            "ru",
            "zh"
        ],
        "id": null,
        "_type": "Translation"
    }
}