en
To load this dataset in TFDS, use the following command:
import tensorflow_datasets as tfds
ds = tfds.load('huggingface:medical_dialog/en')
- Description:
The MedDialog dataset (English) contains conversations (in English) between doctors and patients. It has 0.26 million dialogues. The data is continuously growing and more dialogues will be added. The raw dialogues are from healthcaremagic.com and icliniq.com.
All copyrights of the data belong to healthcaremagic.com and icliniq.com.
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'train' | 229674 |
- Features:
{
    "file_name": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "dialogue_id": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "dialogue_url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "dialogue_turns": {
        "feature": {
            "speaker": {
                "num_classes": 2,
                "names": [
                    "Patient",
                    "Doctor"
                ],
                "id": null,
                "_type": "ClassLabel"
            },
            "utterance": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}
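The feature spec above stores each speaker as a `ClassLabel` integer, so printing a dialogue requires mapping ids back to names. The following is a minimal sketch of that mapping, not part of the dataset tooling. It assumes `dialogue_turns` arrives as a dict of parallel sequences (`speaker` and `utterance`), and the helper `format_dialogue` and the `sample` record are hypothetical, invented purely for illustration.

```python
# Speaker names in ClassLabel order, copied from the "names" list in the feature spec.
SPEAKER_NAMES = ["Patient", "Doctor"]


def format_dialogue(example):
    """Render one record as 'Speaker: utterance' lines.

    Assumption: `dialogue_turns` is a dict of parallel sequences, with
    "speaker" holding integer class ids and "utterance" holding text.
    """
    turns = example["dialogue_turns"]
    lines = []
    for speaker_id, utterance in zip(turns["speaker"], turns["utterance"]):
        # String values may surface as bytes after conversion from tensors.
        if isinstance(utterance, bytes):
            utterance = utterance.decode("utf-8")
        lines.append(f"{SPEAKER_NAMES[int(speaker_id)]}: {utterance}")
    return "\n".join(lines)


# Hypothetical record, invented for illustration (not taken from the dataset).
sample = {
    "file_name": "example.txt",
    "dialogue_id": 0,
    "dialogue_url": "https://example.com/dialogue/0",
    "dialogue_turns": {
        "speaker": [0, 1],
        "utterance": [
            "I have had a headache for two days.",
            b"How severe is the pain?",
        ],
    },
}

print(format_dialogue(sample))
```

For the zh configuration, the same sketch would apply with `SPEAKER_NAMES` replaced by the two names listed in that configuration's `ClassLabel`.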
zh
To load this dataset in TFDS, use the following command:
ds = tfds.load('huggingface:medical_dialog/zh')
- Description:
The MedDialog dataset (English) contains conversations (in English) between doctors and patients. It has 0.26 million dialogues. The data is continuously growing and more dialogues will be added. The raw dialogues are from healthcaremagic.com and icliniq.com.
All copyrights of the data belong to healthcaremagic.com and icliniq.com.
- License: No known license
- Version: 1.0.0
- Splits:
Split | Examples |
---|---|
'train' | 1921127 |
- Features:
{
    "file_name": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "dialogue_id": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "dialogue_url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "dialogue_turns": {
        "feature": {
            "speaker": {
                "num_classes": 2,
                "names": [
                    "\u75c5\u4eba",
                    "\u533b\u751f"
                ],
                "id": null,
                "_type": "ClassLabel"
            },
            "utterance": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}
processed.en
To load this dataset in TFDS, use the following command:
ds = tfds.load('huggingface:medical_dialog/processed.en')
- Description:
The MedDialog dataset (English) contains conversations (in English) between doctors and patients. It has 0.26 million dialogues. The data is continuously growing and more dialogues will be added. The raw dialogues are from healthcaremagic.com and icliniq.com.
All copyrights of the data belong to healthcaremagic.com and icliniq.com.
- License: Copyright
- Version: 2.0.0
- Splits:
Split | Examples |
---|---|
'test' | 61 |
'train' | 482 |
'validation' | 60 |
- Features:
{
    "description": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "utterances": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}
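The processed.en spec above describes a string `description` and a variable-length sequence of string `utterances`. As a rough sketch of checking a decoded record against that shape, the hypothetical helper `is_valid_processed_en` below assumes records have already been converted to plain Python types; the sample records are invented for illustration.

```python
def is_valid_processed_en(record):
    """Return True if `record` matches the processed.en feature spec.

    Assumption: the record is a plain dict whose values were already
    decoded to Python `str` and `list` (not tensors or bytes).
    """
    description = record.get("description")
    utterances = record.get("utterances")
    if not isinstance(description, str):
        return False
    # "length": -1 in the spec means the sequence may have any length.
    if not isinstance(utterances, (list, tuple)):
        return False
    return all(isinstance(u, str) for u in utterances)


# Hypothetical records, invented for illustration (not taken from the dataset).
good_record = {
    "description": "Question about a persistent cough",
    "utterances": ["I have been coughing for a week.", "Do you have a fever?"],
}
bad_record = {"description": "Missing the utterances field"}

print(is_valid_processed_en(good_record))
print(is_valid_processed_en(bad_record))
```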
processed.zh
To load this dataset in TFDS, use the following command:
ds = tfds.load('huggingface:medical_dialog/processed.zh')
- Description:
The MedDialog dataset (English) contains conversations (in English) between doctors and patients. It has 0.26 million dialogues. The data is continuously growing and more dialogues will be added. The raw dialogues are from healthcaremagic.com and icliniq.com.
All copyrights of the data belong to healthcaremagic.com and icliniq.com.
- License: Copyright
- Version: 2.0.0
- Splits:
Split | Examples |
---|---|
'test' | 340754 |
'train' | 2725989 |
'validation' | 340748 |
- Features:
{
    "utterances": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}
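The split sizes listed for processed.zh can be turned into proportions to see how the examples are divided. The short sketch below does only that arithmetic; the counts are copied from the splits table for this configuration, and the helper name `split_fractions` is hypothetical.

```python
# Example counts copied from the processed.zh splits table.
split_sizes = {"test": 340754, "train": 2725989, "validation": 340748}


def split_fractions(sizes):
    """Return each split's share of the total number of examples."""
    total = sum(sizes.values())
    return {name: count / total for name, count in sizes.items()}


total_examples = sum(split_sizes.values())
fractions = split_fractions(split_sizes)

print(f"total: {total_examples}")
for name, fraction in fractions.items():
    print(f"{name}: {fraction:.1%}")
```

With these counts the division comes out close to 80% train, 10% validation, and 10% test.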