omniglot

Description:

Omniglot dataset for one-shot learning. It contains 1,623 different handwritten characters from 50 different alphabets.

Additional Documentation: Explore on Papers With Code
Homepage: https://github.com/brendenlake/omniglot/
Source code: tfds.image_classification.Omniglot
Versions:
- 3.0.0 (default): New split API (https://tensorflow.org/datasets/splits)
Download size: 17.95 MiB
Dataset size: 12.29 MiB
Auto-cached (documentation): Yes
Splits:

Split	Examples
`'small1'`	2,720
`'small2'`	3,120
`'test'`	13,180
`'train'`	19,280
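
The split names above are the values passed to `tfds.load`. The following is a minimal sketch, assuming `tensorflow_datasets` is installed; the `SPLIT_SIZES` table and the `load_split` helper are illustrative names and not part of TFDS.

```python
# Split names and example counts, copied from the table above.
SPLIT_SIZES = {
    "small1": 2_720,
    "small2": 3_120,
    "test": 13_180,
    "train": 19_280,
}


def load_split(split="train"):
    """Load one Omniglot split as a tf.data.Dataset of feature dictionaries.

    `load_split` is a hypothetical helper; the only TFDS call is `tfds.load`.
    """
    if split not in SPLIT_SIZES:
        raise ValueError(
            f"unknown split {split!r}; expected one of {sorted(SPLIT_SIZES)}"
        )
    # Imported lazily so the size table stays usable without TensorFlow.
    import tensorflow_datasets as tfds

    return tfds.load("omniglot", split=split)


# Divide the combined 'train' + 'test' count by the 1623 characters.
total = SPLIT_SIZES["train"] + SPLIT_SIZES["test"]
examples_per_character, remainder = divmod(total, 1623)
```

The division is exact: 32,460 examples over 1,623 characters gives 20 per character, which is consistent with each character having been drawn by 20 people in the original Omniglot release.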

Feature structure:

FeaturesDict({
    'alphabet': ClassLabel(shape=(), dtype=int64, num_classes=50),
    'alphabet_char_id': int64,
    'image': Image(shape=(105, 105, 3), dtype=uint8),
    'label': ClassLabel(shape=(), dtype=int64, num_classes=1623),
})
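
To make the declared ranges concrete, the sketch below checks one decoded example against the feature structure above. It assumes the example is a dictionary whose `image` exposes a `.shape` attribute, as the NumPy arrays yielded by `tfds.as_numpy` do. `check_example` is a hypothetical helper, not a TFDS function, and the stand-in example carries no real pixel data.

```python
from types import SimpleNamespace

NUM_ALPHABETS = 50        # num_classes of 'alphabet'
NUM_CHARACTERS = 1623     # num_classes of 'label'
IMAGE_SHAPE = (105, 105, 3)
EXPECTED_KEYS = {"alphabet", "alphabet_char_id", "image", "label"}


def check_example(example):
    """Return True if `example` matches the feature structure, else raise."""
    if set(example) != EXPECTED_KEYS:
        raise ValueError(f"unexpected keys: {sorted(example)}")
    if tuple(example["image"].shape) != IMAGE_SHAPE:
        raise ValueError(f"bad image shape: {example['image'].shape}")
    # ClassLabel values are integer ids in the range [0, num_classes).
    if not 0 <= int(example["alphabet"]) < NUM_ALPHABETS:
        raise ValueError(f"alphabet id out of range: {example['alphabet']}")
    if not 0 <= int(example["label"]) < NUM_CHARACTERS:
        raise ValueError(f"label id out of range: {example['label']}")
    return True


# Stand-in example: only the shape of the image is modelled here.
fake_example = {
    "alphabet": 3,
    "alphabet_char_id": 1,
    "image": SimpleNamespace(shape=IMAGE_SHAPE),
    "label": 42,
}
is_valid = check_example(fake_example)
```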

Feature documentation:

Feature	Class	Shape	Dtype
	FeaturesDict
alphabet	ClassLabel		int64
alphabet_char_id	Tensor		int64
image	Image	(105, 105, 3)	uint8
label	ClassLabel		int64

Supervised keys (See as_supervised doc): ('image', 'label')
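
With `as_supervised=True`, `tfds.load` yields `(image, label)` tuples instead of full feature dictionaries, following the supervised keys above. The following is a minimal sketch, assuming `tensorflow_datasets` is installed; `to_supervised` and `load_supervised` are hypothetical helpers, and `to_supervised` only illustrates the mapping in plain Python.

```python
SUPERVISED_KEYS = ("image", "label")


def to_supervised(example, keys=SUPERVISED_KEYS):
    """Pick the (input, target) pair out of a feature dictionary."""
    input_key, target_key = keys
    return example[input_key], example[target_key]


def load_supervised(split="train"):
    """Load Omniglot as (image, label) pairs via tfds.load."""
    # Imported lazily so `to_supervised` can be used without TensorFlow.
    import tensorflow_datasets as tfds

    return tfds.load("omniglot", split=split, as_supervised=True)
```

The `alphabet` and `alphabet_char_id` features are not part of the supervised pair, so load the dataset without `as_supervised` when they are needed.
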
Figure (tfds.show_examples): [visualization]

Examples (tfds.as_dataframe):

Citation:

@article{lake2015human,
  title={Human-level concept learning through probabilistic program induction},
  author={Lake, Brenden M and Salakhutdinov, Ruslan and Tenenbaum, Joshua B},
  journal={Science},
  volume={350},
  number={6266},
  pages={1332--1338},
  year={2015},
  publisher={American Association for the Advancement of Science}
}