


Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:medical_dialog/en')
  • Description:
The MedDialog dataset (English) contains conversations (in English) between doctors and patients. It has 0.26 million dialogues. The data is continuously growing and more dialogues will be added. The raw dialogues are from healthcaremagic.com and icliniq.com.
All copyrights of the data belong to healthcaremagic.com and icliniq.com.
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split     Examples
'train'   229674
  • Features:
{
    "file_name": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "dialogue_id": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "dialogue_url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "dialogue_turns": {
        "feature": {
            "speaker": {
                "num_classes": 2,
                "names": [
                ],
                "id": null,
                "_type": "ClassLabel"
            },
            "utterance": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

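As a rough illustration of the feature structure above, the sketch below turns one example into a readable transcript. It assumes the example has already been converted to plain Python or NumPy values (for instance with `tfds.as_numpy`), that `dialogue_turns` arrives as a dictionary of parallel sequences, and that string values may be `bytes`. The helper `format_dialogue`, its `speaker_names` argument, and the placeholder labels are hypothetical, since the class names are not listed here; the sample record is hand-written, not real data.

```python
from typing import Any, Dict, Sequence


def _to_text(value: Any) -> str:
    """Decode bytes (how TensorFlow usually returns strings) to str."""
    if isinstance(value, bytes):
        return value.decode("utf-8")
    return str(value)


def format_dialogue(
    example: Dict[str, Any],
    speaker_names: Sequence[str] = ("speaker_0", "speaker_1"),  # hypothetical labels
) -> str:
    """Render the turns of one example as 'label: utterance' lines."""
    turns = example["dialogue_turns"]
    lines = []
    for speaker_id, utterance in zip(turns["speaker"], turns["utterance"]):
        lines.append(f"{speaker_names[int(speaker_id)]}: {_to_text(utterance)}")
    return "\n".join(lines)


# Hand-written placeholder record in the documented shape (not real data).
sample = {
    "file_name": "example.txt",
    "dialogue_id": 0,
    "dialogue_url": "https://example.com/0",
    "dialogue_turns": {
        "speaker": [0, 1],
        "utterance": [b"first utterance", "second utterance"],
    },
}

transcript = format_dialogue(sample)
print(transcript)
```

Passing the real class names through `speaker_names` (for example, read from the dataset's feature metadata) would replace the placeholder labels.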

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:medical_dialog/zh')
  • Description:
The MedDialog dataset (English) contains conversations (in English) between doctors and patients. It has 0.26 million dialogues. The data is continuously growing and more dialogues will be added. The raw dialogues are from healthcaremagic.com and icliniq.com.
All copyrights of the data belong to healthcaremagic.com and icliniq.com.
  • License: No known license
  • Version: 1.0.0
  • Splits:
Split     Examples
'train'   1921127
  • Features:
{
    "file_name": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "dialogue_id": {
        "dtype": "int32",
        "id": null,
        "_type": "Value"
    },
    "dialogue_url": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "dialogue_turns": {
        "feature": {
            "speaker": {
                "num_classes": 2,
                "names": [
                ],
                "id": null,
                "_type": "ClassLabel"
            },
            "utterance": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

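With roughly 1.9 million training examples in this configuration, it may be convenient to start from a small slice. The sketch below builds a TFDS split-slicing string such as `train[:1%]` and passes it to `tfds.load` through the `split` argument. The helper names `percent_split` and `load_zh_sample` are hypothetical, and the loading function is only defined, not called, because it needs network access and downloads data.

```python
def percent_split(split: str, percent: int) -> str:
    """Build a TFDS slicing string for the first `percent` percent of a split."""
    if not 0 < percent <= 100:
        raise ValueError("percent must be between 1 and 100")
    return f"{split}[:{percent}%]"


def load_zh_sample(percent: int = 1):
    """Load the first `percent` percent of 'train' (requires network access)."""
    # Imported lazily so that percent_split has no third-party dependency.
    import tensorflow_datasets as tfds

    return tfds.load(
        "huggingface:medical_dialog/zh",
        split=percent_split("train", percent),
    )


print(percent_split("train", 1))  # train[:1%]
```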

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:medical_dialog/processed.en')
  • Description:
The MedDialog dataset (English) contains conversations (in English) between doctors and patients. It has 0.26 million dialogues. The data is continuously growing and more dialogues will be added. The raw dialogues are from healthcaremagic.com and icliniq.com.
All copyrights of the data belong to healthcaremagic.com and icliniq.com.
  • License: Copyright
  • Version: 2.0.0
  • Splits:
Split          Examples
'test'         61
'train'        482
'validation'   60
  • Features:
{
    "description": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "utterances": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}

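One possible way to consume the `utterances` sequence is to pair each utterance with the turns that precede it, for example when preparing data for response generation. This is an illustrative assumption rather than something the dataset prescribes; `context_response_pairs` is a hypothetical helper and the sample strings are placeholders.

```python
from typing import List, Sequence, Tuple


def context_response_pairs(
    utterances: Sequence[str],
) -> List[Tuple[List[str], str]]:
    """Pair every utterance after the first with all utterances before it."""
    return [
        (list(utterances[:i]), utterances[i])
        for i in range(1, len(utterances))
    ]


pairs = context_response_pairs(["turn 1", "turn 2", "turn 3"])
for history, reply in pairs:
    print(history, "->", reply)
```

A dialogue with a single utterance yields no pairs, so such examples would simply be skipped under this scheme.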

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:medical_dialog/processed.zh')
  • Description:
The MedDialog dataset (English) contains conversations (in English) between doctors and patients. It has 0.26 million dialogues. The data is continuously growing and more dialogues will be added. The raw dialogues are from healthcaremagic.com and icliniq.com.
All copyrights of the data belong to healthcaremagic.com and icliniq.com.
  • License: Copyright
  • Version: 2.0.0
  • Splits:
Split          Examples
'test'         340754
'train'        2725989
'validation'   340748
  • Features:
{
    "utterances": {
        "feature": {
            "dtype": "string",
            "id": null,
            "_type": "Value"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    }
}
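
As a quick sanity check on the split sizes listed for this configuration, the short sketch below computes each split's share of the total. The counts are copied from the split table; the variable names are illustrative only. The result is roughly an 80/10/10 division between train, test, and validation.

```python
# Example counts copied from the split table for processed.zh.
split_sizes = {"test": 340754, "train": 2725989, "validation": 340748}

total = sum(split_sizes.values())
fractions = {name: count / total for name, count in split_sizes.items()}

for name, fraction in fractions.items():
    print(f"{name}: {fraction:.1%}")
```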