
glove100_angular

Description:

Pre-trained Global Vectors for Word Representation (GloVe) embeddings for approximate nearest neighbor search. This dataset consists of two splits:

'database': 1,183,514 data points, each with the features 'embedding' (100 floats), 'index' (int64), and 'neighbors' (an empty list).
'test': 10,000 data points, each with the features 'embedding' (100 floats), 'index' (int64), and 'neighbors' (a list of the 'index' and 'distance' of the nearest neighbors in the database).
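
As a minimal sketch of how the two splits might be read, the snippet below wraps the standard `tfds.load` and `tfds.as_numpy` calls. The helper names `load_glove100_angular` and `summarize_example` are hypothetical and exist only for illustration. The TFDS import is kept inside the loader because the first call downloads the full dataset.

```python
import numpy as np


def load_glove100_angular(split):
    """Return an iterable of NumPy example dicts for 'database' or 'test'.

    Hypothetical helper. The first call downloads and prepares the dataset
    (about 463 MiB), so TFDS is imported lazily here.
    """
    import tensorflow_datasets as tfds

    return tfds.as_numpy(tfds.load("glove100_angular", split=split))


def summarize_example(example):
    """Return (index, embedding length, number of stored neighbors).

    Hypothetical helper. Assumes 'neighbors' arrives as a dict of parallel
    arrays, which is how TFDS exposes a Sequence of dict features.
    """
    return (
        int(example["index"]),
        int(np.shape(example["embedding"])[0]),
        int(len(example["neighbors"]["index"])),
    )
```

A typical use would be `for ex in load_glove100_angular("test"): print(summarize_example(ex))`. For 'database' examples the last value should be 0, since that split stores no neighbors.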

Homepage: https://nlp.stanford.edu/projects/glove/
Source code: tfds.nearest_neighbors.glove_100_angular.Glove100Angular
Versions:
- 1.0.0 (default): Initial release.
Download size: 462.93 MiB
Dataset size: 567.90 MiB
Auto-cached (documentation): No
Splits:

Split	Examples
`'database'`	1,183,514
`'test'`	10,000

Feature structure:

FeaturesDict({
    'embedding': Tensor(shape=(100,), dtype=float32),
    'index': Scalar(shape=(), dtype=int64, description=Index within the split.),
    'neighbors': Sequence({
        'distance': Scalar(shape=(), dtype=float32, description=Neighbor distance.),
        'index': Scalar(shape=(), dtype=int64, description=Neighbor index.),
    }),
})
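
To make the 'neighbors' feature concrete, here is a brute-force sketch that produces neighbor indices and distances in the same shape as that feature. This page does not state how 'distance' is computed. The sketch assumes cosine distance (1 minus cosine similarity), a common reading of "angular" in nearest neighbor benchmarks, so its values may not match the stored ones exactly. The function name `angular_top_k` is hypothetical.

```python
import numpy as np


def angular_top_k(queries, database, k):
    """Exact top-k neighbors under cosine distance (an assumed metric).

    queries:  array of shape (num_queries, dim)
    database: array of shape (num_points, dim)
    Returns (indices, distances), each of shape (num_queries, k).
    Rows must be non-zero vectors, otherwise normalization divides by zero.
    """
    queries = np.asarray(queries, dtype=np.float64)
    database = np.asarray(database, dtype=np.float64)

    # Normalize rows so that the dot product equals cosine similarity.
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    d = database / np.linalg.norm(database, axis=1, keepdims=True)

    # Cosine distance between every query and every database point.
    dist = 1.0 - q @ d.T

    # Sort each row by distance and keep the k closest points.
    idx = np.argsort(dist, axis=1, kind="stable")[:, :k]
    return idx, np.take_along_axis(dist, idx, axis=1)
```

This materializes a full query-by-database distance matrix, so on the real 'database' split it would need to be run over batches of queries. It is meant as a correctness baseline, not as an approximate search method.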

Feature documentation:

Feature	Class	Shape	Dtype	Description
	FeaturesDict
embedding	Tensor	(100,)	float32
index	Scalar		int64	Index within the split.
neighbors	Sequence			The computed neighbors, which are only available for the test split.
neighbors/distance	Scalar		float32	Neighbor distance.
neighbors/index	Scalar		int64	Neighbor index.
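
Because the 'test' split ships precomputed neighbors, one plausible use is to score an approximate search method against them. The sketch below computes recall@k from plain lists of indices. The function name `recall_at_k` is hypothetical, and the sketch assumes that each ground-truth list is ordered from nearest to farthest and holds at least k entries, which this page does not confirm.

```python
def recall_at_k(predicted, ground_truth, k):
    """Fraction of the true top-k neighbors recovered, averaged over queries.

    predicted:    one sequence of database indices per query, best first
    ground_truth: one sequence of true neighbor indices per query
                  (assumed sorted nearest first, with at least k entries)
    """
    if len(predicted) != len(ground_truth):
        raise ValueError("predicted and ground_truth must have equal length")
    if not predicted:
        raise ValueError("at least one query is required")

    hits = 0
    for pred, true in zip(predicted, ground_truth):
        # Compare as sets so the order within the top k does not matter.
        pred_top = {int(i) for i in pred[:k]}
        true_top = {int(i) for i in true[:k]}
        hits += len(pred_top & true_top)
    return hits / (k * len(predicted))
```

Here `ground_truth` would typically be built from each test example's `neighbors/index` values, and `predicted` from the index returned by the method under evaluation.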

Supervised keys (See as_supervised doc): None
Figure (tfds.show_examples): Not supported.

Citation:

@inproceedings{pennington2014glove,
  author = {Jeffrey Pennington and Richard Socher and Christopher D. Manning},
  booktitle = {Empirical Methods in Natural Language Processing (EMNLP)},
  title = {GloVe: Global Vectors for Word Representation},
  year = {2014},
  pages = {1532--1543},
  url = {http://www.aclweb.org/anthology/D14-1162},
}