sem_eval_2018_task_1

subtask5.english

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:sem_eval_2018_task_1/subtask5.english')

Description:

SemEval-2018 Task 1: Affect in Tweets: SubTask 5: Emotion Classification.
This is a dataset for multilabel emotion classification for tweets.
"Given a tweet, classify it as 'neutral or no emotion' or as one, or more, of eleven given emotions that best represent the mental state of the tweeter."
It contains 22467 tweets in three languages manually annotated by crowdworkers using Best–Worst Scaling.

License: No known license
Version: 1.1.0
Splits:

Split	Examples
`'test'`	3259
`'train'`	6838
`'validation'`	886

Features:

{
    "ID": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "Tweet": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "anger": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "anticipation": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "disgust": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "fear": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "joy": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "love": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "optimism": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "pessimism": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "sadness": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "surprise": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "trust": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    }
}
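
As an illustration of how the eleven boolean emotion columns in this feature spec might be consumed, the following minimal sketch packs one example into a multi-hot label vector. The `EMOTIONS` ordering, the `to_multi_hot` helper, and the sample record are assumptions made for illustration; they are not part of the dataset or of TFDS.

```python
# Emotion columns in the order they appear in the feature spec.
# The ordering is a choice made for this sketch, not something the dataset mandates.
EMOTIONS = [
    "anger", "anticipation", "disgust", "fear", "joy", "love",
    "optimism", "pessimism", "sadness", "surprise", "trust",
]


def to_multi_hot(example):
    """Return a list of 0/1 ints, one per emotion, from a dict of booleans."""
    return [int(bool(example[name])) for name in EMOTIONS]


# Hypothetical record shaped like the feature spec (not a real tweet from the dataset).
sample = {"ID": "example-0", "Tweet": "placeholder text"}
sample.update({name: False for name in EMOTIONS})
sample["joy"] = True
sample["optimism"] = True

labels = to_multi_hot(sample)

# A row with every emotion set to False would presumably be the
# 'neutral or no emotion' case named in the description.
is_neutral = not any(labels)
```

Because the task is multilabel, more than one position in the vector can be set at once, which is why a single class index would not be enough here.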

subtask5.spanish

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:sem_eval_2018_task_1/subtask5.spanish')

Description:

SemEval-2018 Task 1: Affect in Tweets: SubTask 5: Emotion Classification.
This is a dataset for multilabel emotion classification for tweets.
"Given a tweet, classify it as 'neutral or no emotion' or as one, or more, of eleven given emotions that best represent the mental state of the tweeter."
It contains 22467 tweets in three languages manually annotated by crowdworkers using Best–Worst Scaling.

License: No known license
Version: 1.1.0
Splits:

Split	Examples
`'test'`	2854
`'train'`	3561
`'validation'`	679

Features:

{
    "ID": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "Tweet": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "anger": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "anticipation": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "disgust": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "fear": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "joy": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "love": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "optimism": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "pessimism": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "sadness": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "surprise": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "trust": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    }
}

subtask5.arabic

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:sem_eval_2018_task_1/subtask5.arabic')

Description:

SemEval-2018 Task 1: Affect in Tweets: SubTask 5: Emotion Classification.
This is a dataset for multilabel emotion classification for tweets.
"Given a tweet, classify it as 'neutral or no emotion' or as one, or more, of eleven given emotions that best represent the mental state of the tweeter."
It contains 22467 tweets in three languages manually annotated by crowdworkers using Best–Worst Scaling.

License: No known license
Version: 1.1.0
Splits:

Split	Examples
`'test'`	1518
`'train'`	2278
`'validation'`	585
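
The split sizes listed for the three configs can be tallied with a few lines of Python. This is only a convenience sketch: the `SPLITS` dictionary is hand-copied from the split tables on this page, and the variable names are illustrative.

```python
# Split sizes copied by hand from the three "Splits" tables on this page.
SPLITS = {
    "subtask5.english": {"test": 3259, "train": 6838, "validation": 886},
    "subtask5.spanish": {"test": 2854, "train": 3561, "validation": 679},
    "subtask5.arabic": {"test": 1518, "train": 2278, "validation": 585},
}

# Total number of examples per config, then across all configs.
per_config = {name: sum(sizes.values()) for name, sizes in SPLITS.items()}
total = sum(per_config.values())
```

With these numbers the per-config totals are 10983 (English), 7094 (Spanish), and 4381 (Arabic), for 22458 examples overall. That is slightly below the 22467 tweets quoted in the description; the page does not explain the difference.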

Features:

{
    "ID": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "Tweet": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "anger": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "anticipation": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "disgust": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "fear": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "joy": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "love": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "optimism": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "pessimism": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "sadness": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "surprise": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    },
    "trust": {
        "dtype": "bool",
        "id": null,
        "_type": "Value"
    }
}