open_images_v4

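Description:

Open Images is a dataset of about 9 million images annotated with image-level labels and object bounding boxes.

The V4 training set contains 14.6 million bounding boxes for 600 object classes on 1.74 million images, making it the largest existing dataset with object location annotations. Most boxes were drawn by hand by professional annotators to ensure accuracy and consistency. The images are very diverse and often show complex scenes containing several objects (8.4 per image on average). In addition, the dataset is annotated with image-level labels spanning thousands of classes.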

Additional documentation: Explore on Papers With Code
Homepage: https://storage.googleapis.com/openimages/web/index.html
Source code: `tfds.datasets.open_images_v4.Builder`
Versions:
- 2.0.0 (default): New split API (https://tensorflow.org/datasets/splits)
Download size: 565.11 GiB
Auto-cached (documentation): No
Splits:

Split	Examples
`'test'`	125,436
`'train'`	1,743,042
`'validation'`	41,620
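
The split table above can be used to size experiments before touching the data. The following is a minimal sketch, not part of the official catalog entry: the helper names `split_fractions`, `head_of_split`, and `load_head` are hypothetical, and `load_head` assumes the dataset has already been downloaded and prepared under `data_dir` (slicing a split selects examples from the prepared data and should not be expected to shrink the 565.11 GiB download).

```python
# Example counts copied from the split table above.
SPLIT_SIZES = {"test": 125_436, "train": 1_743_042, "validation": 41_620}


def split_fractions(sizes):
    """Return each split's share of the total number of examples."""
    total = sum(sizes.values())
    return {name: count / total for name, count in sizes.items()}


def head_of_split(split, n_examples):
    """Build a TFDS split string selecting the first n_examples of a split."""
    if not 0 < n_examples <= SPLIT_SIZES[split]:
        raise ValueError(f"{split} has only {SPLIT_SIZES[split]} examples")
    return f"{split}[:{n_examples}]"


def load_head(split="validation", n_examples=100, data_dir=None):
    """Load a small slice of one split (hypothetical convenience wrapper)."""
    # Imported lazily because tensorflow_datasets is a heavy dependency.
    import tensorflow_datasets as tfds

    return tfds.load(
        "open_images_v4:2.0.0",  # pin the version listed above
        split=head_of_split(split, n_examples),
        data_dir=data_dir,
    )
```

For instance, `head_of_split("validation", 100)` produces the split string `validation[:100]`, which follows the slicing syntax of the split API linked in the version notes.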

Feature structure:

FeaturesDict({
    'bobjects': Sequence({
        'bbox': BBoxFeature(shape=(4,), dtype=float32),
        'is_depiction': int8,
        'is_group_of': int8,
        'is_inside': int8,
        'is_occluded': int8,
        'is_truncated': int8,
        'label': ClassLabel(shape=(), dtype=int64, num_classes=601),
        'source': ClassLabel(shape=(), dtype=int64, num_classes=6),
    }),
    'image': Image(shape=(None, None, 3), dtype=uint8),
    'image/filename': Text(shape=(), dtype=string),
    'objects': Sequence({
        'confidence': int32,
        'label': ClassLabel(shape=(), dtype=int64, num_classes=19995),
        'source': ClassLabel(shape=(), dtype=int64, num_classes=6),
    }),
    'objects_trainable': Sequence({
        'confidence': int32,
        'label': ClassLabel(shape=(), dtype=int64, num_classes=7186),
        'source': ClassLabel(shape=(), dtype=int64, num_classes=6),
    }),
})
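
To illustrate how the `bobjects` entries in the structure above might be consumed, here is a small sketch in plain Python. It assumes the usual TFDS convention that a `BBoxFeature` holds normalized `(ymin, xmin, ymax, xmax)` coordinates, and it assumes that the `is_*` flags use 1 for "set" and 0 for "not set"; the second assumption is not stated in this page and should be checked against the data. The function names are hypothetical.

```python
def to_pixel_box(bbox, height, width):
    """Convert a normalized (ymin, xmin, ymax, xmax) box to integer pixels."""
    ymin, xmin, ymax, xmax = bbox
    return (
        round(ymin * height),
        round(xmin * width),
        round(ymax * height),
        round(xmax * width),
    )


def keep_plain_boxes(bobjects):
    """Return indices of boxes whose group and depiction flags are both 0.

    Assumption: 0 means the flag is not set. Any other value, including a
    possible "unknown" marker, causes the box to be dropped.
    """
    return [
        i
        for i, (group, depiction) in enumerate(
            zip(bobjects["is_group_of"], bobjects["is_depiction"])
        )
        if group == 0 and depiction == 0
    ]
```

Because a `Sequence` of a dictionary is exposed as a dictionary of per-field sequences, `bobjects["bbox"][i]` and `bobjects["label"][i]` describe the same box, so the indices returned by `keep_plain_boxes` can be used to select matching entries from every field.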

Feature documentation:

Feature	Class	Shape	Dtype
	FeaturesDict
bobjects	Sequence
bobjects/bbox	BBoxFeature	(4,)	float32
bobjects/is_depiction	Tensor		int8
bobjects/is_group_of	Tensor		int8
bobjects/is_inside	Tensor		int8
bobjects/is_occluded	Tensor		int8
bobjects/is_truncated	Tensor		int8
bobjects/label	ClassLabel		int64
bobjects/source	ClassLabel		int64
image	Image	(None, None, 3)	uint8
image/filename	Text		string
objects	Sequence
objects/confidence	Tensor		int32
objects/label	ClassLabel		int64
objects/source	ClassLabel		int64
objects_trainable	Sequence
objects_trainable/confidence	Tensor		int32
objects_trainable/label	ClassLabel		int64
objects_trainable/source	ClassLabel		int64

Supervised keys (see `as_supervised` doc): `None`
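
Since no supervised keys are defined, examples arrive as feature dictionaries and any `(input, target)` pairing has to be chosen by the user. One possible pairing, shown purely as an illustration, takes the image as input and the bounding-box class labels as target. The function name `to_supervised` is hypothetical, and the choice of `bobjects/label` as the target is an assumption; `objects` or `objects_trainable` labels could be used instead.

```python
def to_supervised(example):
    """Map one feature dictionary to an (image, box_labels) pair.

    The choice of target is an assumption made for this sketch; the dataset
    itself does not define a supervised (input, target) pairing.
    """
    return example["image"], example["bobjects"]["label"]
```

The function only indexes into the dictionary, so it should also be usable as the argument to a `tf.data.Dataset.map` call on a dataset loaded without `as_supervised`.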
Citation:

@article{OpenImages,
  author = {Alina Kuznetsova and
            Hassan Rom and
            Neil Alldrin and
            Jasper Uijlings and
            Ivan Krasin and
            Jordi Pont-Tuset and
            Shahab Kamali and
            Stefan Popov and
            Matteo Malloci and
            Tom Duerig and
            Vittorio Ferrari},
  title = {The Open Images Dataset V4: Unified image classification,
           object detection, and visual relationship detection at scale},
  year = {2018},
  journal = {arXiv:1811.00982}
}
@article{OpenImages2,
  author = {Krasin, Ivan and
            Duerig, Tom and
            Alldrin, Neil and
            Ferrari, Vittorio and
            Abu-El-Haija, Sami and
            Kuznetsova, Alina and
            Rom, Hassan and
            Uijlings, Jasper and
            Popov, Stefan and
            Kamali, Shahab and
            Malloci, Matteo and
            Pont-Tuset, Jordi and
            Veit, Andreas and
            Belongie, Serge and
            Gomes, Victor and
            Gupta, Abhinav and
            Sun, Chen and
            Chechik, Gal and
            Cai, David and
            Feng, Zheyun and
            Narayanan, Dhyanesh and
            Murphy, Kevin},
  title = {OpenImages: A public dataset for large-scale multi-label and
           multi-class image classification.},
  journal = {Dataset available from
             https://storage.googleapis.com/openimages/web/index.html},
  year = {2017}
}