tao

Description:

The TAO dataset is a large-scale video object detection dataset consisting of 2,907 high-resolution videos and 833 object categories. Storing this dataset requires at least 300 GB of free space.
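
As a quick pre-flight check for the storage requirement above, a minimal sketch using only the Python standard library might look like the following. The helper name `has_enough_space` and the reading of 300 GB as decimal gigabytes are assumptions made for illustration; they are not part of the dataset tooling.

```python
import shutil

# Minimum quoted in the description, interpreted here as decimal gigabytes
# (an assumption; the description does not say GB vs. GiB precisely).
REQUIRED_BYTES = 300 * 10**9


def has_enough_space(path=".", required_bytes=REQUIRED_BYTES):
    """Return True if the filesystem holding `path` has at least `required_bytes` free."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_bytes
```

Pointing `path` at the directory where the dataset will be prepared gives the most relevant answer, since the download and the prepared data may live on different volumes.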

Additional Documentation: Explore on Papers With Code
Homepage: https://taodataset.org/
Source code: tfds.video.tao.Tao
Versions:
- 1.0.0 (default): No release notes.
- 1.1.0: Added test split.
Download size: 113.96 GiB
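Manual download instructions: This dataset requires you to download the source data manually into download_config.manual_dir (defaults to ~/tensorflow_datasets/downloads/manual/).
Some TAO files (HVACS and AVA videos) must be downloaded manually because they require a login to MOT. Download this data by following the instructions at https://motchallenge.net/tao_download.php.

Then move the resulting .zip files to ~/tensorflow_datasets/downloads/manual/.

If the data that requires manual download is not present, it is skipped and only the data that does not require manual download is used.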
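
The move step can be scripted. The sketch below is one possible way to do it with the standard library; the function name `stage_manual_zips` is hypothetical, and it simply moves every `.zip` file it finds, because the exact archive names are not listed here.

```python
import shutil
from pathlib import Path

# Default manual directory quoted in the instructions above.
DEFAULT_MANUAL_DIR = Path.home() / "tensorflow_datasets" / "downloads" / "manual"


def stage_manual_zips(source_dir, manual_dir=DEFAULT_MANUAL_DIR):
    """Move every .zip file found in `source_dir` into `manual_dir`.

    Returns the sorted names of the files that were moved.
    """
    source_dir = Path(source_dir)
    manual_dir = Path(manual_dir)
    manual_dir.mkdir(parents=True, exist_ok=True)

    moved = []
    for zip_path in sorted(source_dir.glob("*.zip")):
        shutil.move(str(zip_path), str(manual_dir / zip_path.name))
        moved.append(zip_path.name)
    return moved
```

For example, `stage_manual_zips(Path.home() / "Downloads")` would move any archives from a browser download folder into the default manual directory; that folder is only an illustration.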

Auto-cached (documentation): No
Splits:

Split	Examples
`'train'`	500
`'validation'`	988
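
A split from the table above could be loaded roughly as follows. This is a sketch, not the canonical recipe: the helper names `tao_name` and `load_tao` are hypothetical, `tensorflow_datasets` is assumed to be installed, and calling `load_tao` triggers the full download and preparation on first use.

```python
# Config names as they appear on this page.
CONFIGS = ("480_640", "full_resolution")


def tao_name(config="480_640"):
    """Build the `name/config` string understood by tfds.load."""
    if config not in CONFIGS:
        raise ValueError(f"Unknown TAO config: {config!r}")
    return f"tao/{config}"


def load_tao(split="validation", config="480_640", manual_dir=None):
    """Load one split; `manual_dir` overrides the default manual download folder."""
    # Imported lazily so the name helper above works without TFDS installed.
    import tensorflow_datasets as tfds

    prepare_kwargs = {}
    if manual_dir is not None:
        prepare_kwargs["download_config"] = tfds.download.DownloadConfig(
            manual_dir=manual_dir
        )
    return tfds.load(
        tao_name(config),
        split=split,
        download_and_prepare_kwargs=prepare_kwargs,
    )
```

Because there are no supervised keys, each element of the returned dataset is a dictionary of features rather than an `(input, label)` pair.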

Supervised keys (see the as_supervised doc): None
Figure (tfds.show_examples): Not supported.
Citation:

@article{Dave_2020,
   title={TAO: A Large-Scale Benchmark for Tracking Any Object},
   ISBN={9783030585587},
   ISSN={1611-3349},
   url={http://dx.doi.org/10.1007/978-3-030-58558-7_26},
   DOI={10.1007/978-3-030-58558-7_26},
   journal={Lecture Notes in Computer Science},
   publisher={Springer International Publishing},
   author={Dave, Achal and Khurana, Tarasha and Tokmakov, Pavel and Schmid, Cordelia and Ramanan, Deva},
   year={2020},
   pages={436-454}
}

tao/480_640 (default config)

Config description: All images are bilinearly resized to 480 x 640.
Dataset size: 482.30 GiB
Feature structure:

FeaturesDict({
    'metadata': FeaturesDict({
        'dataset': string,
        'height': int32,
        'neg_category_ids': Tensor(shape=(None,), dtype=int32),
        'not_exhaustive_category_ids': Tensor(shape=(None,), dtype=int32),
        'num_frames': int32,
        'video_name': string,
        'width': int32,
    }),
    'tracks': Sequence({
        'bboxes': Sequence(BBoxFeature(shape=(4,), dtype=float32)),
        'category': ClassLabel(shape=(), dtype=int64, num_classes=363),
        'frames': Sequence(int32),
        'is_crowd': bool,
        'scale_category': string,
        'track_id': int32,
    }),
    'video': Video(Image(shape=(480, 640, 3), dtype=uint8)),
})
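
To draw or crop the boxes in `tracks/bboxes`, they usually need to be converted to pixel coordinates. The sketch below assumes the usual TFDS `BBoxFeature` convention of normalized `(ymin, xmin, ymax, xmax)` values in the range 0 to 1; the helper name `bbox_to_pixels` is hypothetical.

```python
def bbox_to_pixels(bbox, height, width):
    """Convert a normalized (ymin, xmin, ymax, xmax) box to pixel corners.

    Returns (left, top, right, bottom) as integers.
    """
    ymin, xmin, ymax, xmax = bbox
    return (
        round(xmin * width),   # left
        round(ymin * height),  # top
        round(xmax * width),   # right
        round(ymax * height),  # bottom
    )


# For this config every frame is 480 x 640 (height x width).
print(bbox_to_pixels((0.25, 0.5, 0.75, 1.0), height=480, width=640))
# (320, 120, 640, 360)
```

Taking `height` and `width` from the shape of the decoded frame is the safer choice here, since the page does not say whether the `metadata` values refer to the original or the resized video.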

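Feature documentation:

Feature	Class	Shape	Dtype
	FeaturesDict
metadata	FeaturesDict
metadata/dataset	Tensor		string
metadata/height	Tensor		int32
metadata/neg_category_ids	Tensor	(None,)	int32
metadata/not_exhaustive_category_ids	Tensor	(None,)	int32
metadata/num_frames	Tensor		int32
metadata/video_name	Tensor		string
metadata/width	Tensor		int32
tracks	Sequence
tracks/bboxes	Sequence(BBoxFeature)	(None, 4)	float32
tracks/category	ClassLabel		int64
tracks/frames	Sequence(Tensor)	(None,)	int32
tracks/is_crowd	Tensor		bool
tracks/scale_category	Tensor		string
tracks/track_id	Tensor		int32
video	Video(Image)	(None, 480, 640, 3)	uint8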


tao/full_resolution

Config description: The full-resolution version of the dataset.
Dataset size: 171.24 GiB
Feature structure:

FeaturesDict({
    'metadata': FeaturesDict({
        'dataset': string,
        'height': int32,
        'neg_category_ids': Tensor(shape=(None,), dtype=int32),
        'not_exhaustive_category_ids': Tensor(shape=(None,), dtype=int32),
        'num_frames': int32,
        'video_name': string,
        'width': int32,
    }),
    'tracks': Sequence({
        'bboxes': Sequence(BBoxFeature(shape=(4,), dtype=float32)),
        'category': ClassLabel(shape=(), dtype=int64, num_classes=363),
        'frames': Sequence(int32),
        'is_crowd': bool,
        'scale_category': string,
        'track_id': int32,
    }),
    'video': Video(Image(shape=(None, None, 3), dtype=uint8)),
})
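
Annotations are stored per track, while many pipelines consume them per frame. The sketch below regroups them, under two assumptions that this page does not confirm: that `frames[i]` is the frame index for `bboxes[i]` within a track, and that the example has already been converted to plain Python lists (nested sequences may arrive as ragged tensors and need conversion first). The helper name `boxes_by_frame` is hypothetical.

```python
from collections import defaultdict


def boxes_by_frame(tracks):
    """Regroup per-track annotations into {frame_index: [(track_id, bbox), ...]}.

    `tracks` is a dict of lists mirroring the 'tracks' feature above:
    'track_id' has one entry per track, while 'frames' and 'bboxes'
    hold one list per track.
    """
    per_frame = defaultdict(list)
    for track_id, frames, bboxes in zip(
        tracks["track_id"], tracks["frames"], tracks["bboxes"]
    ):
        # Assumption: frames and bboxes are aligned element by element.
        for frame_index, bbox in zip(frames, bboxes):
            box = tuple(float(v) for v in bbox)
            per_frame[int(frame_index)].append((int(track_id), box))
    return dict(per_frame)
```

Only frames that carry at least one annotation appear as keys in the result, so a lookup for an unannotated frame should use `.get(frame_index, [])`.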

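Feature documentation:

Feature	Class	Shape	Dtype
	FeaturesDict
metadata	FeaturesDict
metadata/dataset	Tensor		string
metadata/height	Tensor		int32
metadata/neg_category_ids	Tensor	(None,)	int32
metadata/not_exhaustive_category_ids	Tensor	(None,)	int32
metadata/num_frames	Tensor		int32
metadata/video_name	Tensor		string
metadata/width	Tensor		int32
tracks	Sequence
tracks/bboxes	Sequence(BBoxFeature)	(None, 4)	float32
tracks/category	ClassLabel		int64
tracks/frames	Sequence(Tensor)	(None,)	int32
tracks/is_crowd	Tensor		bool
tracks/scale_category	Tensor		string
tracks/track_id	Tensor		int32
video	Video(Image)	(None, None, None, 3)	uint8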
