References:
german_insertions
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:wiki_atomic_edits/german_insertions')
- Description:
A dataset of atomic wikipedia edits containing insertions and deletions of a contiguous chunk of text in a sentence. This dataset contains ~43 million edits across 8 languages.
An atomic edit is defined as an edit e applied to a natural language expression S as the insertion, deletion, or substitution of a sub-expression P such that both the original expression S and the resulting expression e(S) are well-formed semantic constituents (MacCartney, 2009). In this corpus, we release such atomic insertions and deletions made to sentences in wikipedia.
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'train' | 3343403 |
- Features:
{
"id": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"base_sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"phrase": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"edited_sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
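The relationship between the three text fields of an insertion record can be sketched as follows. This is a minimal illustration, not part of the dataset tooling: it assumes the sentences are whitespace-tokenized, the helper name `insertion_positions` is hypothetical, and the sample record is made up rather than taken from the corpus.

```python
from typing import List


def insertion_positions(base_sentence: str, phrase: str, edited_sentence: str) -> List[int]:
    """Return every token index at which inserting `phrase` into
    `base_sentence` reproduces `edited_sentence`.

    Assumption: tokens are separated by whitespace.
    """
    base = base_sentence.split()
    inserted = phrase.split()
    edited = edited_sentence.split()
    return [
        i
        for i in range(len(base) + 1)
        if base[:i] + inserted + base[i:] == edited
    ]


# Made-up record with the four fields listed in the schema above.
record = {
    "id": 0,
    "base_sentence": "Sie zog nach Berlin .",
    "phrase": "1999",
    "edited_sentence": "Sie zog 1999 nach Berlin .",
}

positions = insertion_positions(
    record["base_sentence"], record["phrase"], record["edited_sentence"]
)
print(positions)
```

An empty result would mean the record is not a single contiguous insertion under the whitespace-tokenization assumption.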
german_deletions
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:wiki_atomic_edits/german_deletions')
- Description:
A dataset of atomic wikipedia edits containing insertions and deletions of a contiguous chunk of text in a sentence. This dataset contains ~43 million edits across 8 languages.
An atomic edit is defined as an edit e applied to a natural language expression S as the insertion, deletion, or substitution of a sub-expression P such that both the original expression S and the resulting expression e(S) are well-formed semantic constituents (MacCartney, 2009). In this corpus, we release such atomic insertions and deletions made to sentences in wikipedia.
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
---|---|
'train' | 1994329 |
- Features:
{
"id": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"base_sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"phrase": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"edited_sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
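For a deletion record, the mirror-image check might look like the sketch below. It rests on two assumptions that the page itself does not state: that `phrase` is the contiguous chunk removed from `base_sentence` to give `edited_sentence`, and that tokens are separated by whitespace. The helper name `deletion_positions` and the sample record are hypothetical.

```python
from typing import List


def deletion_positions(base_sentence: str, phrase: str, edited_sentence: str) -> List[int]:
    """Return every token index at which removing `phrase` from
    `base_sentence` reproduces `edited_sentence`.

    Assumptions: whitespace tokenization, and `phrase` is the deleted chunk.
    """
    base = base_sentence.split()
    removed = phrase.split()
    edited = edited_sentence.split()
    n = len(removed)
    return [
        i
        for i in range(len(base) - n + 1)
        if base[i:i + n] == removed and base[:i] + base[i + n:] == edited
    ]


# Made-up record, not taken from the corpus.
record = {
    "id": 0,
    "base_sentence": "Das Haus ist sehr alt .",
    "phrase": "sehr",
    "edited_sentence": "Das Haus ist alt .",
}

positions = deletion_positions(
    record["base_sentence"], record["phrase"], record["edited_sentence"]
)
print(positions)
```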
english_insertions
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:wiki_atomic_edits/english_insertions')
- Description:
A dataset of atomic wikipedia edits containing insertions and deletions of a contiguous chunk of text in a sentence. This dataset contains ~43 million edits across 8 languages.
An atomic edit is defined as an edit e applied to a natural language expression S as the insertion, deletion, or substitution of a sub-expression P such that both the original expression S and the resulting expression e(S) are well-formed semantic constituents (MacCartney, 2009). In this corpus, we release such atomic insertions and deletions made to sentences in wikipedia.
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
---|---|
'train' | 13737796 |
- Features:
{
"id": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"base_sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"phrase": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"edited_sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
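The feature listing is plain JSON, so it can be parsed and used to sanity-check a record. The sketch below is illustrative only: the mapping from `dtype` names to Python types and the helper `matches_schema` are assumptions made for this example, not part of TFDS or the dataset.

```python
import json

# Copy of the feature schema listed above.
FEATURES_JSON = """
{
  "id": {"dtype": "int32", "id": null, "_type": "Value"},
  "base_sentence": {"dtype": "string", "id": null, "_type": "Value"},
  "phrase": {"dtype": "string", "id": null, "_type": "Value"},
  "edited_sentence": {"dtype": "string", "id": null, "_type": "Value"}
}
"""

# Assumed mapping from schema dtype names to Python types (illustrative).
PYTHON_TYPES = {"int32": int, "string": str}
INT32_MIN, INT32_MAX = -(2 ** 31), 2 ** 31 - 1

features = json.loads(FEATURES_JSON)
dtypes = {name: spec["dtype"] for name, spec in features.items()}


def matches_schema(record: dict) -> bool:
    """Check that a record has exactly the schema's fields with matching types."""
    if set(record) != set(dtypes):
        return False
    for name, dtype in dtypes.items():
        value = record[name]
        if not isinstance(value, PYTHON_TYPES[dtype]):
            return False
        if dtype == "int32" and not INT32_MIN <= value <= INT32_MAX:
            return False
    return True


print(dtypes)
```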
english_deletions
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:wiki_atomic_edits/english_deletions')
- Description:
A dataset of atomic wikipedia edits containing insertions and deletions of a contiguous chunk of text in a sentence. This dataset contains ~43 million edits across 8 languages.
An atomic edit is defined as an edit e applied to a natural language expression S as the insertion, deletion, or substitution of a sub-expression P such that both the original expression S and the resulting expression e(S) are well-formed semantic constituents (MacCartney, 2009). In this corpus, we release such atomic insertions and deletions made to sentences in wikipedia.
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
---|---|
'train' | 9352389 |
- Features:
{
"id": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"base_sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"phrase": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"edited_sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
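All configuration names on this page follow the pattern `<language>_<edit type>`, so the load strings can be generated instead of typed by hand. The following is a sketch under stated assumptions: `tfds_name` and `load_config` are hypothetical helpers, and `load_config` assumes that `tensorflow_datasets` (and whatever it needs to read `huggingface:` datasets) is installed. The import is deferred so that the name helper works without it.

```python
from itertools import product

# Languages and edit types as they appear in the load commands on this page.
LANGUAGES = (
    "german", "english", "spanish", "french",
    "italian", "japanese", "russian", "chinese",
)
EDIT_TYPES = ("insertions", "deletions")


def tfds_name(language: str, edit_type: str) -> str:
    """Build the TFDS name for one configuration (hypothetical helper)."""
    if language not in LANGUAGES:
        raise ValueError(f"unknown language: {language!r}")
    if edit_type not in EDIT_TYPES:
        raise ValueError(f"unknown edit type: {edit_type!r}")
    return f"huggingface:wiki_atomic_edits/{language}_{edit_type}"


ALL_NAMES = [tfds_name(lang, edit) for lang, edit in product(LANGUAGES, EDIT_TYPES)]


def load_config(language: str, edit_type: str, split: str = "train"):
    """Load one configuration; needs tensorflow_datasets at call time."""
    import tensorflow_datasets as tfds  # deferred so tfds_name() has no dependency

    return tfds.load(tfds_name(language, edit_type), split=split)


print(len(ALL_NAMES), ALL_NAMES[0])
```

Calling `load_config("english", "deletions")` would then be equivalent to the load command shown above with `split='train'` added.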
spanish_insertions
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:wiki_atomic_edits/spanish_insertions')
- Description:
A dataset of atomic wikipedia edits containing insertions and deletions of a contiguous chunk of text in a sentence. This dataset contains ~43 million edits across 8 languages.
An atomic edit is defined as an edit e applied to a natural language expression S as the insertion, deletion, or substitution of a sub-expression P such that both the original expression S and the resulting expression e(S) are well-formed semantic constituents (MacCartney, 2009). In this corpus, we release such atomic insertions and deletions made to sentences in wikipedia.
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
---|---|
'train' | 1380934 |
- Features:
{
"id": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"base_sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"phrase": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"edited_sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
spanish_deletions
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:wiki_atomic_edits/spanish_deletions')
- Description:
A dataset of atomic wikipedia edits containing insertions and deletions of a contiguous chunk of text in a sentence. This dataset contains ~43 million edits across 8 languages.
An atomic edit is defined as an edit e applied to a natural language expression S as the insertion, deletion, or substitution of a sub-expression P such that both the original expression S and the resulting expression e(S) are well-formed semantic constituents (MacCartney, 2009). In this corpus, we release such atomic insertions and deletions made to sentences in wikipedia.
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
---|---|
'train' | 908276 |
- Features:
{
"id": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"base_sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"phrase": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"edited_sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
french_insertions
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:wiki_atomic_edits/french_insertions')
- Description:
A dataset of atomic wikipedia edits containing insertions and deletions of a contiguous chunk of text in a sentence. This dataset contains ~43 million edits across 8 languages.
An atomic edit is defined as an edit e applied to a natural language expression S as the insertion, deletion, or substitution of a sub-expression P such that both the original expression S and the resulting expression e(S) are well-formed semantic constituents (MacCartney, 2009). In this corpus, we release such atomic insertions and deletions made to sentences in wikipedia.
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
---|---|
'train' | 2038305 |
- Features:
{
"id": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"base_sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"phrase": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"edited_sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
french_deletions
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:wiki_atomic_edits/french_deletions')
- Description:
A dataset of atomic wikipedia edits containing insertions and deletions of a contiguous chunk of text in a sentence. This dataset contains ~43 million edits across 8 languages.
An atomic edit is defined as an edit e applied to a natural language expression S as the insertion, deletion, or substitution of a sub-expression P such that both the original expression S and the resulting expression e(S) are well-formed semantic constituents (MacCartney, 2009). In this corpus, we release such atomic insertions and deletions made to sentences in wikipedia.
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
---|---|
'train' | 2060242 |
- Features:
{
"id": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"base_sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"phrase": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"edited_sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
italian_insertions
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:wiki_atomic_edits/italian_insertions')
- Description:
A dataset of atomic wikipedia edits containing insertions and deletions of a contiguous chunk of text in a sentence. This dataset contains ~43 million edits across 8 languages.
An atomic edit is defined as an edit e applied to a natural language expression S as the insertion, deletion, or substitution of a sub-expression P such that both the original expression S and the resulting expression e(S) are well-formed semantic constituents (MacCartney, 2009). In this corpus, we release such atomic insertions and deletions made to sentences in wikipedia.
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
---|---|
'train' | 1078814 |
- Features:
{
"id": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"base_sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"phrase": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"edited_sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
italian_deletions
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:wiki_atomic_edits/italian_deletions')
- Description:
A dataset of atomic wikipedia edits containing insertions and deletions of a contiguous chunk of text in a sentence. This dataset contains ~43 million edits across 8 languages.
An atomic edit is defined as an edit e applied to a natural language expression S as the insertion, deletion, or substitution of a sub-expression P such that both the original expression S and the resulting expression e(S) are well-formed semantic constituents (MacCartney, 2009). In this corpus, we release such atomic insertions and deletions made to sentences in wikipedia.
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
---|---|
'train' | 583316 |
- Features:
{
"id": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"base_sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"phrase": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"edited_sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
japanese_insertions
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:wiki_atomic_edits/japanese_insertions')
- Description:
A dataset of atomic wikipedia edits containing insertions and deletions of a contiguous chunk of text in a sentence. This dataset contains ~43 million edits across 8 languages.
An atomic edit is defined as an edit e applied to a natural language expression S as the insertion, deletion, or substitution of a sub-expression P such that both the original expression S and the resulting expression e(S) are well-formed semantic constituents (MacCartney, 2009). In this corpus, we release such atomic insertions and deletions made to sentences in wikipedia.
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
---|---|
'train' | 2249527 |
- Features:
{
"id": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"base_sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"phrase": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"edited_sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
japanese_deletions
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:wiki_atomic_edits/japanese_deletions')
- Description:
A dataset of atomic wikipedia edits containing insertions and deletions of a contiguous chunk of text in a sentence. This dataset contains ~43 million edits across 8 languages.
An atomic edit is defined as an edit e applied to a natural language expression S as the insertion, deletion, or substitution of a sub-expression P such that both the original expression S and the resulting expression e(S) are well-formed semantic constituents (MacCartney, 2009). In this corpus, we release such atomic insertions and deletions made to sentences in wikipedia.
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
---|---|
'train' | 1352162 |
- Features:
{
"id": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"base_sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"phrase": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"edited_sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
russian_insertions
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:wiki_atomic_edits/russian_insertions')
- Description:
A dataset of atomic wikipedia edits containing insertions and deletions of a contiguous chunk of text in a sentence. This dataset contains ~43 million edits across 8 languages.
An atomic edit is defined as an edit e applied to a natural language expression S as the insertion, deletion, or substitution of a sub-expression P such that both the original expression S and the resulting expression e(S) are well-formed semantic constituents (MacCartney, 2009). In this corpus, we release such atomic insertions and deletions made to sentences in wikipedia.
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
---|---|
'train' | 1471638 |
- Features:
{
"id": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"base_sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"phrase": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"edited_sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
russian_deletions
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:wiki_atomic_edits/russian_deletions')
- Description:
A dataset of atomic wikipedia edits containing insertions and deletions of a contiguous chunk of text in a sentence. This dataset contains ~43 million edits across 8 languages.
An atomic edit is defined as an edit e applied to a natural language expression S as the insertion, deletion, or substitution of a sub-expression P such that both the original expression S and the resulting expression e(S) are well-formed semantic constituents (MacCartney, 2009). In this corpus, we release such atomic insertions and deletions made to sentences in wikipedia.
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
---|---|
'train' | 960976 |
- Features:
{
"id": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"base_sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"phrase": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"edited_sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
chinese_insertions
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:wiki_atomic_edits/chinese_insertions')
- Description:
A dataset of atomic wikipedia edits containing insertions and deletions of a contiguous chunk of text in a sentence. This dataset contains ~43 million edits across 8 languages.
An atomic edit is defined as an edit e applied to a natural language expression S as the insertion, deletion, or substitution of a sub-expression P such that both the original expression S and the resulting expression e(S) are well-formed semantic constituents (MacCartney, 2009). In this corpus, we release such atomic insertions and deletions made to sentences in wikipedia.
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
---|---|
'train' | 746509 |
- Features:
{
"id": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"base_sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"phrase": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"edited_sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
chinese_deletions
Use the following command to load this dataset in TFDS:
ds = tfds.load('huggingface:wiki_atomic_edits/chinese_deletions')
- Description:
A dataset of atomic wikipedia edits containing insertions and deletions of a contiguous chunk of text in a sentence. This dataset contains ~43 million edits across 8 languages.
An atomic edit is defined as an edit e applied to a natural language expression S as the insertion, deletion, or substitution of a sub-expression P such that both the original expression S and the resulting expression e(S) are well-formed semantic constituents (MacCartney, 2009). In this corpus, we release such atomic insertions and deletions made to sentences in wikipedia.
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
---|---|
'train' | 467271 |
- Features:
{
"id": {
"dtype": "int32",
"id": null,
"_type": "Value"
},
"base_sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"phrase": {
"dtype": "string",
"id": null,
"_type": "Value"
},
"edited_sentence": {
"dtype": "string",
"id": null,
"_type": "Value"
}
}
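The per-configuration `'train'` counts listed on this page can be added up to compare against the "~43 million edits across 8 languages" figure in the description. The sketch below only does arithmetic on the numbers shown in the split tables; the dictionary and variable names are illustrative.

```python
# 'train' example counts copied from the split tables on this page.
TRAIN_EXAMPLES = {
    "german_insertions": 3343403,
    "german_deletions": 1994329,
    "english_insertions": 13737796,
    "english_deletions": 9352389,
    "spanish_insertions": 1380934,
    "spanish_deletions": 908276,
    "french_insertions": 2038305,
    "french_deletions": 2060242,
    "italian_insertions": 1078814,
    "italian_deletions": 583316,
    "japanese_insertions": 2249527,
    "japanese_deletions": 1352162,
    "russian_insertions": 1471638,
    "russian_deletions": 960976,
    "chinese_insertions": 746509,
    "chinese_deletions": 467271,
}

total = sum(TRAIN_EXAMPLES.values())

# Aggregate per language by stripping the "_insertions" / "_deletions" suffix.
per_language = {}
for config, count in TRAIN_EXAMPLES.items():
    language = config.rsplit("_", 1)[0]
    per_language[language] = per_language.get(language, 0) + count

largest = max(TRAIN_EXAMPLES, key=TRAIN_EXAMPLES.get)

print(f"total edits: {total:,}")
print(f"largest configuration: {largest}")
```

The sum comes to 43,725,887 examples, which agrees with the rounded figure in the description.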