tunizi

References:

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:tunizi')
  • Description:
On social media, Arabic speakers tend to express themselves in their own local dialect. To do so, Tunisians use "Tunisian Arabizi", which is written in the Latin script supplemented with numerals rather than in the Arabic alphabet. TUNIZI is the first Tunisian Arabizi dataset. It includes 3K sentences that are balanced, cover different topics, and are preprocessed and annotated as positive or negative.
  • License: No known license
  • Version: 0.9.1
  • Splits:
Split Examples
'train' 3000
  • Features:
{
    "id": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "sentence": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "target": {
        "num_classes": 2,
        "names": [
            "1",
            "-1"
        ],
        "names_file": null,
        "id": null,
        "_type": "ClassLabel"
    }
}
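
The "target" feature above is a ClassLabel, so each example stores an integer index into the "names" list rather than the string itself. The following is a minimal sketch of decoding that index in plain Python. The helper `decode_target` and the mapping of "1" to positive and "-1" to negative are assumptions made for illustration; the schema above lists only the names, not their meaning.

```python
# Label names copied in order from the "target" ClassLabel feature.
TARGET_NAMES = ["1", "-1"]

# Assumption (not stated in the schema): "1" marks a positive sentence
# and "-1" a negative one.
SENTIMENT_BY_NAME = {"1": "positive", "-1": "negative"}


def decode_target(class_index: int) -> str:
    """Map a ClassLabel integer index to a readable sentiment string.

    Hypothetical helper; it is not part of TFDS or Hugging Face datasets.
    """
    if not 0 <= class_index < len(TARGET_NAMES):
        raise ValueError(f"class index out of range: {class_index}")
    return SENTIMENT_BY_NAME[TARGET_NAMES[class_index]]


if __name__ == "__main__":
    for index in range(len(TARGET_NAMES)):
        print(index, TARGET_NAMES[index], decode_target(index))
```

Because the index follows the order of "names", index 0 corresponds to "1" and index 1 to "-1"; it is worth checking a few decoded examples against the raw sentences before relying on the polarity assumed here.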