References:
simplification
Use the following command to load this dataset in TFDS:
import tensorflow_datasets as tfds
ds = tfds.load('huggingface:turk/simplification')
- Description:
TURKCorpus is a dataset for evaluating sentence simplification systems that focus on lexical paraphrasing,
as described in "Optimizing Statistical Machine Translation for Text Simplification". The corpus is composed of 2000 validation and 359 test original sentences that were each simplified 8 times by different annotators.
- License: GNU General Public License v3.0
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'test' | 359 |
'validation' | 2000 |
- Features:
{
"original": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"simplifications": {
"feature": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"length": -1,
"id": null,
"_type": "Sequence"
}
}
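The feature specification above describes each record as one `original` string plus a variable-length list of `simplifications` strings (a `length` of -1 marks the sequence as unbounded). The following is a minimal sketch of how such a record might be checked and expanded into (original, simplification) pairs. It assumes records arrive as plain Python dictionaries; the sample record and the helper names `matches_schema` and `to_pairs` are hypothetical, invented for illustration, and the sample is shortened to two simplifications rather than the eight the corpus provides.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical record, invented for illustration; it is not taken from the corpus
# and carries only two simplifications instead of eight.
example: Dict[str, Any] = {
    "original": "The committee deliberated for several hours.",
    "simplifications": [
        "The committee talked for several hours.",
        "The group discussed it for many hours.",
    ],
}


def matches_schema(record: Dict[str, Any]) -> bool:
    """Return True if 'original' is a string and 'simplifications' is a list of strings."""
    original = record.get("original")
    simplifications = record.get("simplifications")
    return (
        isinstance(original, str)
        and isinstance(simplifications, list)
        and all(isinstance(s, str) for s in simplifications)
    )


def to_pairs(record: Dict[str, Any]) -> List[Tuple[str, str]]:
    """Pair the original sentence with each of its reference simplifications."""
    return [(record["original"], s) for s in record["simplifications"]]


if matches_schema(example):
    for source, reference in to_pairs(example):
        print(f"{source!r} -> {reference!r}")
```

Keeping the references as a list per sentence, rather than flattening them up front, suits multi-reference evaluation; the pair form is only a convenience for inspection.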