conceptual_12m

References:

Use the following command to load this dataset in TFDS:

import tensorflow_datasets as tfds

ds = tfds.load('huggingface:conceptual_12m')
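
As a rough sketch of a possible next step, the helper below takes the first few records from any iterable of example dictionaries and returns them as (image_url, caption) text pairs. The helper name `preview_pairs` and the sample records are hypothetical and are not part of TFDS. The sketch assumes each record carries this dataset's `image_url` and `caption` fields, and that values may arrive as bytes, as string features typically do after conversion with `tfds.as_numpy`.

```python
from itertools import islice


def _to_text(value):
    """Decode bytes to str; leave any other value unchanged."""
    return value.decode("utf-8") if isinstance(value, bytes) else value


def preview_pairs(examples, n=3):
    """Return the first n (image_url, caption) pairs from an iterable of dicts."""
    return [
        (_to_text(ex["image_url"]), _to_text(ex["caption"]))
        for ex in islice(examples, n)
    ]


# Hypothetical stand-in records. With TFDS installed, one might instead pass
# tfds.as_numpy(ds["train"]), where ds is the object returned by tfds.load.
sample = [
    {"image_url": b"https://example.com/a.jpg", "caption": b"a red bicycle"},
    {"image_url": "https://example.com/b.jpg", "caption": "two dogs on a beach"},
]
pairs = preview_pairs(sample, n=2)
```

Note that the records hold image URLs rather than image bytes, so any pipeline built on this would still need to fetch the images separately.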

Description:

Conceptual 12M is a large-scale dataset of 12 million image-text pairs
specifically meant to be used for vision-and-language pre-training.
Its data collection pipeline is a relaxed version of the one used in Conceptual Captions 3M.

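License: The dataset may be freely used for any purpose, although acknowledgement of Google LLC ("Google") as the data source would be appreciated. The dataset is provided "AS IS" without any warranty, express or implied. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.
Version: 0.0.0
Splits: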

Split	Examples
`'train'`	12423374

Features:

{
    "image_url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "caption": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}
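
To make the schema concrete, here is a minimal sketch that checks whether a plain Python record matches the two string features listed above. The function name `matches_schema` and the sample records are hypothetical; the check is an illustration of the schema's shape, not how TFDS or Hugging Face validate examples internally.

```python
import json

# Feature schema copied from the listing above.
FEATURES = json.loads("""
{
    "image_url": {"dtype": "string", "id": null, "_type": "Value"},
    "caption": {"dtype": "string", "id": null, "_type": "Value"}
}
""")


def matches_schema(record, features=FEATURES):
    """True if record has exactly the schema's keys and every string field holds a str."""
    if set(record) != set(features):
        return False
    return all(
        isinstance(record[name], str)
        for name, spec in features.items()
        if spec["dtype"] == "string"
    )


# Hypothetical records for illustration only.
ok = matches_schema({"image_url": "https://example.com/a.jpg", "caption": "a red bicycle"})
missing = matches_schema({"caption": "no url here"})
wrong_type = matches_schema({"image_url": 42, "caption": "a red bicycle"})
```

A check like this can be handy when records have been exported to JSON or CSV and need a quick sanity pass before training.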