References:
To load this dataset in TFDS, use the following command:
import tensorflow_datasets as tfds
ds = tfds.load('huggingface:hate_speech_pl')
- Description:
The current version of the HateSpeech corpus contains over 2000 posts crawled from the public Polish web. They represent various types and degrees of offensive language directed at minorities (e.g., ethnic, racial). The data were annotated manually.
- License: CC BY-NC-SA
- Version: 1.1.0
- Splits:
Split | Examples |
---|---|
'train' | 13887 |
- Features:
{
"id": {
"dtype": "uint16",
"id": null,
"_type": "Value"
},
"text_id": {
"dtype": "uint32",
"id": null,
"_type": "Value"
},
"annotator_id": {
"dtype": "uint8",
"id": null,
"_type": "Value"
},
"minority_id": {
"dtype": "uint8",
"id": null,
"_type": "Value"
},
"negative_emotions": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"call_to_action": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"source_of_knowledge": {
"dtype": "uint8",
"id": null,
"_type": "Value"
},
"irony_sarcasm": {
"dtype": "bool",
"id": null,
"_type": "Value"
},
"topic": {
"dtype": "uint8",
"id": null,
"_type": "Value"
},
"text": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"rating": {
"dtype": "uint8",
"id": null,
"_type": "Value"
}
}
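The feature spec above fixes a dtype for every field, which makes it possible to sanity-check individual records before using them. The following is a minimal sketch in plain Python, not part of TFDS or the Hugging Face library: the `validate_record` helper and the `example` record are hypothetical and exist only for illustration, and the value ranges are simply those implied by the unsigned integer dtypes.

```python
# Maximum values implied by the unsigned integer dtypes in the feature spec.
UINT_MAX = {"uint8": 2**8 - 1, "uint16": 2**16 - 1, "uint32": 2**32 - 1}

# Feature name -> dtype, copied from the feature spec above.
FEATURES = {
    "id": "uint16",
    "text_id": "uint32",
    "annotator_id": "uint8",
    "minority_id": "uint8",
    "negative_emotions": "bool",
    "call_to_action": "bool",
    "source_of_knowledge": "uint8",
    "irony_sarcasm": "bool",
    "topic": "uint8",
    "text": "string",
    "rating": "uint8",
}


def validate_record(record):
    """Hypothetical helper: list the problems in one record (empty if none)."""
    problems = []
    for name, dtype in FEATURES.items():
        if name not in record:
            problems.append(f"missing field: {name}")
            continue
        value = record[name]
        if dtype == "bool":
            ok = isinstance(value, bool)
        elif dtype == "string":
            ok = isinstance(value, str)
        else:
            # bool is a subclass of int in Python, so exclude it explicitly.
            ok = (
                isinstance(value, int)
                and not isinstance(value, bool)
                and 0 <= value <= UINT_MAX[dtype]
            )
        if not ok:
            problems.append(f"bad value for {name} ({dtype}): {value!r}")
    return problems


# Made-up record for illustration only; the values are not from the dataset.
example = {
    "id": 1,
    "text_id": 121713,
    "annotator_id": 1,
    "minority_id": 0,
    "negative_emotions": False,
    "call_to_action": False,
    "source_of_knowledge": 2,
    "irony_sarcasm": False,
    "topic": 3,
    "text": "placeholder text",
    "rating": 0,
}

print(validate_record(example))
```

A record that conforms to the spec yields an empty list. An out-of-range value, such as a `rating` above 255, or a missing field is reported by name.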