imagenet2012_multilabel

Description:

This dataset contains ILSVRC-2012 (ImageNet) validation images annotated with multi-class labels from "Evaluating Machine Accuracy on ImageNet", ICML, 2020. The multi-class labels were reviewed by a panel of experts extensively trained in the intricacies of fine-grained class distinctions in the ImageNet class hierarchy (see the paper for more details). Compared to the original labels, these expert-reviewed multi-class labels enable a more semantically coherent evaluation of accuracy.

Version 3.0.0 of this dataset contains more corrected labels from "When does dough become a bagel? Analyzing the remaining mistakes on ImageNet".

Only 20,000 of the 50,000 ImageNet validation images have multi-label annotations. The set of multi-labels was first generated by a testbed of 67 trained ImageNet models, and then each individual model prediction was manually annotated by the experts as either correct (the label is correct for the image), wrong (the label is incorrect for the image), or unclear (no consensus was reached among the experts).
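To make the three annotation outcomes concrete, here is a minimal sketch that looks up how the experts judged a given class for one image. The helper `annotation_status` and the example values are hypothetical; only the three field names come from the dataset's feature structure. The sketch assumes that a class which appears in none of the three lists was simply never reviewed for that image.

```python
def annotation_status(example, class_index):
    """Return how the expert panel judged `class_index` for this image.

    `example` is assumed to be a plain dict whose values are lists of ints,
    mirroring the dataset fields of the same names (hypothetical helper).
    """
    if class_index in example["correct_multi_labels"]:
        return "correct"
    if class_index in example["unclear_multi_labels"]:
        return "unclear"
    if class_index in example["wrong_multi_labels"]:
        return "wrong"
    # Assumption: the class is in none of the lists, so it was not reviewed.
    return "unannotated"


# Made-up label values, for illustration only.
example = {
    "correct_multi_labels": [409, 530],
    "unclear_multi_labels": [892],
    "wrong_multi_labels": [826],
}

print(annotation_status(example, 530))
print(annotation_status(example, 892))
print(annotation_status(example, 7))
```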

Additionally, during annotation, the expert panel identified a set of problematic images. An image was considered problematic if it met any of the following criteria:

The original ImageNet label (top-1 label) was incorrect or unclear
The image was a drawing, painting, sketch, cartoon, or computer-rendered
The image was excessively edited
The image had inappropriate content

The problematic images are included in this dataset and should be ignored when computing multi-label accuracy. Additionally, since the initial set of 20,000 annotations is class-balanced but the set of problematic images is not, we recommend computing the per-class accuracies and then averaging them. We also recommend counting a prediction as correct if it is marked as correct or unclear (i.e., being lenient with the unclear labels).

One possible way of doing this is with the following NumPy code:

import tensorflow_datasets as tfds

ds = tfds.load('imagenet2012_multilabel', split='validation')

# We assume that predictions is a dictionary from file_name to a class index between 0 and 999

num_correct_per_class = {}
num_images_per_class = {}

for example in ds:
    # We ignore all problematic images
    if example['is_problematic'].numpy():
        continue

    # The label of the image in ImageNet
    cur_class = example['original_label'].numpy()

    # If we haven't processed this class yet, set the counters to 0
    if cur_class not in num_correct_per_class:
        num_correct_per_class[cur_class] = 0
        assert cur_class not in num_images_per_class
        num_images_per_class[cur_class] = 0

    num_images_per_class[cur_class] += 1

    # Get the predictions for this image
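    # file_name is a string tensor, so .numpy() returns bytes; we decode it
    # on the assumption that the keys of predictions are Python str objects
    cur_pred = predictions[example['file_name'].numpy().decode('utf-8')]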

    # We count a prediction as correct if it is marked as correct or unclear
    # (i.e., we are lenient with the unclear labels)
    if (cur_pred in example['correct_multi_labels'].numpy()
            or cur_pred in example['unclear_multi_labels'].numpy()):
        num_correct_per_class[cur_class] += 1

# Check that we have collected accuracy data for each of the 1,000 classes
num_classes = 1000
assert len(num_correct_per_class) == num_classes
assert len(num_images_per_class) == num_classes

# Compute the per-class accuracies and then average them
final_avg = 0
for cid in range(num_classes):
    assert cid in num_correct_per_class
    assert cid in num_images_per_class
    final_avg += num_correct_per_class[cid] / num_images_per_class[cid]
final_avg /= num_classes
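The same recipe can also be sketched as a small function over plain Python data, which is handy for checking the logic without downloading the dataset. The function name `lenient_per_class_accuracy` and the toy examples are hypothetical. Unlike the code above, this sketch averages over whichever classes are present rather than asserting that all 1,000 appear.

```python
from collections import defaultdict


def lenient_per_class_accuracy(examples, predictions):
    """Average of per-class accuracies, skipping problematic images.

    `examples` is an iterable of plain dicts using the dataset's field names;
    `predictions` maps file_name -> predicted class index (both assumed).
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for ex in examples:
        # Problematic images are ignored entirely.
        if ex["is_problematic"]:
            continue
        cls = ex["original_label"]
        total[cls] += 1
        pred = predictions[ex["file_name"]]
        # Lenient rule: "correct" and "unclear" both count as correct.
        if pred in ex["correct_multi_labels"] or pred in ex["unclear_multi_labels"]:
            correct[cls] += 1
    if not total:
        raise ValueError("no non-problematic examples to evaluate")
    # Per-class accuracy first, then the mean over classes.
    return sum(correct[c] / total[c] for c in total) / len(total)


# Toy data, for illustration only.
toy_examples = [
    {"file_name": "a.JPEG", "original_label": 0, "is_problematic": False,
     "correct_multi_labels": [0], "unclear_multi_labels": []},
    {"file_name": "b.JPEG", "original_label": 0, "is_problematic": False,
     "correct_multi_labels": [0], "unclear_multi_labels": [5]},
    {"file_name": "c.JPEG", "original_label": 1, "is_problematic": False,
     "correct_multi_labels": [1], "unclear_multi_labels": []},
    {"file_name": "d.JPEG", "original_label": 1, "is_problematic": True,
     "correct_multi_labels": [2], "unclear_multi_labels": []},
]
toy_predictions = {"a.JPEG": 0, "b.JPEG": 5, "c.JPEG": 3, "d.JPEG": 2}

print(lenient_per_class_accuracy(toy_examples, toy_predictions))  # prints 0.5
```

In the toy data, class 0 scores 2/2 (one correct and one unclear prediction), class 1 scores 0/1 (its second image is problematic and skipped), so the mean is 0.5.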

Homepage: https://github.com/modestyachts/evaluating_machine_accuracy_on_imagenet
Source code: tfds.datasets.imagenet2012_multilabel.Builder
Versions:
- 1.0.0: Initial release.
- 2.0.0: Fix ILSVRC2012_img_val.tar file.
- 3.0.0 (default): Corrected labels and ImageNet-M split.
Download size: 191.13 MiB
Dataset size: 2.50 GiB
Manual download instructions: This dataset requires you to download the source data manually into download_config.manual_dir (defaults to ~/tensorflow_datasets/downloads/manual/):
manual_dir should contain the ILSVRC2012_img_val.tar file. You need to register at http://www.image-net.org/download-images to get the link to download the dataset.
Auto-cached (documentation): No
Splits:

Split	Examples
`'imagenet_m'`	68
`'validation'`	20000
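As a rough sketch of how the manual download step and the split names above fit together, the helper below checks that the archive is in place before asking TFDS to load a split. The names `has_manual_archive`, `load_multilabel` and `SPLIT_SIZES` are hypothetical; the TFDS calls used are `tfds.load` and `tfds.download.DownloadConfig`, and TFDS is imported lazily so the check itself runs without it.

```python
from pathlib import Path

REQUIRED_ARCHIVE = "ILSVRC2012_img_val.tar"

# Split names and example counts as listed in the table above.
SPLIT_SIZES = {"imagenet_m": 68, "validation": 20000}


def has_manual_archive(manual_dir):
    """Return True if the validation archive is present in `manual_dir`."""
    return (Path(manual_dir).expanduser() / REQUIRED_ARCHIVE).is_file()


def load_multilabel(manual_dir, split="validation"):
    """Load one split, pointing TFDS at a custom manual_dir (sketch)."""
    if split not in SPLIT_SIZES:
        raise ValueError(f"unknown split: {split!r}")
    if not has_manual_archive(manual_dir):
        raise FileNotFoundError(f"{REQUIRED_ARCHIVE} not found in {manual_dir}")

    # Imported here so the checks above work without TFDS installed.
    import tensorflow_datasets as tfds

    config = tfds.download.DownloadConfig(
        manual_dir=str(Path(manual_dir).expanduser())
    )
    return tfds.load(
        "imagenet2012_multilabel",
        split=split,
        download_and_prepare_kwargs={"download_config": config},
    )
```

For instance, `load_multilabel("~/tensorflow_datasets/downloads/manual/", split="imagenet_m")` would request the 68-example split, provided the archive has been downloaded.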

Feature structure:

FeaturesDict({
    'correct_multi_labels': Sequence(ClassLabel(shape=(), dtype=int64, num_classes=1000)),
    'file_name': Text(shape=(), dtype=string),
    'image': Image(shape=(None, None, 3), dtype=uint8),
    'is_problematic': bool,
    'original_label': ClassLabel(shape=(), dtype=int64, num_classes=1000),
    'unclear_multi_labels': Sequence(ClassLabel(shape=(), dtype=int64, num_classes=1000)),
    'wrong_multi_labels': Sequence(ClassLabel(shape=(), dtype=int64, num_classes=1000)),
})

Feature documentation:

Feature	Class	Shape	Dtype
	FeaturesDict
correct_multi_labels	Sequence(ClassLabel)	(None,)	int64
file_name	Text		string
image	Image	(None, None, 3)	uint8
is_problematic	Tensor		bool
original_label	ClassLabel		int64
unclear_multi_labels	Sequence(ClassLabel)	(None,)	int64
wrong_multi_labels	Sequence(ClassLabel)	(None,)	int64
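Because the three label fields are variable-length sequences of class indices, it can be convenient to turn them into fixed-size vectors before batching or computing metrics. The following is a minimal sketch in plain Python; the helper `to_multi_hot` is hypothetical and not part of TFDS, and the label values are made up.

```python
NUM_CLASSES = 1000  # matches num_classes in the feature structure


def to_multi_hot(label_indices, num_classes=NUM_CLASSES):
    """Convert a list of class indices into a 0/1 vector of length num_classes."""
    vec = [0] * num_classes
    for idx in label_indices:
        if not 0 <= idx < num_classes:
            raise ValueError(f"class index out of range: {idx}")
        vec[idx] = 1
    return vec


# Made-up label values, for illustration only.
target = to_multi_hot([409, 530])
print(len(target), sum(target), target[409], target[530])  # prints 1000 2 1 1
```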

Supervised keys (see the as_supervised doc): ('image', 'correct_multi_labels')
Figure (tfds.show_examples):

Visualization

Examples (tfds.as_dataframe):

Citation:

@article{shankar2019evaluating,
  title={Evaluating Machine Accuracy on ImageNet},
  author={Vaishaal Shankar* and Rebecca Roelofs* and Horia Mania and Alex Fang and Benjamin Recht and Ludwig Schmidt},
  journal={ICML},
  year={2020},
  note={\url{http://proceedings.mlr.press/v119/shankar20c.html} }
}
@article{ImageNetChallenge,
  title={ {ImageNet} large scale visual recognition challenge},
  author={Olga Russakovsky and Jia Deng and Hao Su and Jonathan Krause
   and Sanjeev Satheesh and Sean Ma and Zhiheng Huang and Andrej Karpathy and Aditya Khosla and Michael Bernstein and
   Alexander C. Berg and Fei-Fei Li},
  journal={International Journal of Computer Vision},
  year={2015},
  note={\url{https://arxiv.org/abs/1409.0575} }
}
@inproceedings{ImageNet,
   author={Jia Deng and Wei Dong and Richard Socher and Li-Jia Li and Kai Li and Li Fei-Fei},
   booktitle={Conference on Computer Vision and Pattern Recognition (CVPR)},
   title={ {ImageNet}: A large-scale hierarchical image database},
   year={2009},
   note={\url{http://www.image-net.org/papers/imagenet_cvpr09.pdf} }
}
@article{vasudevan2022does,
  title={When does dough become a bagel? Analyzing the remaining mistakes on ImageNet},
  author={Vasudevan, Vijay and Caine, Benjamin and Gontijo-Lopes, Raphael and Fridovich-Keil, Sara and Roelofs, Rebecca},
  journal={arXiv preprint arXiv:2205.04596},
  year={2022}
}