kd_conv

travel_dialogues

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:kd_conv/travel_dialogues')
  • Description:
KdConv is a Chinese multi-domain Knowledge-driven Conversation dataset, grounding the topics in multi-turn conversations to knowledge graphs. KdConv contains 4.5K conversations from three domains (film, music, and travel), and 86K utterances with an average turn number of 19.0. These conversations contain in-depth discussions on related topics and natural transitions between multiple topics, while the corpus can also be used for exploration of transfer learning and domain adaptation.
  • License: Apache License 2.0
  • Version: 0.0.0
  • Splits:
Split         Examples
'test'        150
'train'       1200
'validation'  150
  • Features:
{
    "messages": {
        "feature": {
            "message": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "attrs": {
                "feature": {
                    "attrname": {
                        "dtype": "string",
                        "id": null,
                        "_type": "Value"
                    },
                    "attrvalue": {
                        "dtype": "string",
                        "id": null,
                        "_type": "Value"
                    },
                    "name": {
                        "dtype": "string",
                        "id": null,
                        "_type": "Value"
                    }
                },
                "length": -1,
                "id": null,
                "_type": "Sequence"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "name": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "domain": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}
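
The `messages` feature above is a sequence of turns, each with a `message` string and an `attrs` sequence of knowledge attributes (`name`, `attrname`, `attrvalue`). A minimal sketch of regrouping one example into per-turn records follows. It assumes the example has already been converted to plain Python objects and that each sequence of dicts arrives as a dict of parallel lists, which is how Hugging Face `datasets` usually decodes such features; the TFDS wrapper may hand back tensors instead. The helper `iter_turns` and the sample values are hypothetical, not part of the dataset or of TFDS.

```python
from typing import Any, Dict, Iterator, List, Tuple

Triple = Tuple[str, str, str]  # (name, attrname, attrvalue)


def iter_turns(example: Dict[str, Any]) -> Iterator[Tuple[str, List[Triple]]]:
    """Yield (utterance, knowledge triples) for each turn of one dialogue.

    Assumes the dict-of-lists layout: example["messages"]["message"] is a list
    of strings and example["messages"]["attrs"] is a parallel list of dicts,
    each holding parallel "name", "attrname", and "attrvalue" lists.
    """
    messages = example["messages"]
    for text, attrs in zip(messages["message"], messages["attrs"]):
        triples = list(zip(attrs["name"], attrs["attrname"], attrs["attrvalue"]))
        yield text, triples


# Hypothetical example shaped like the schema (not real KdConv data).
sample = {
    "name": "dialogue-0",
    "domain": "travel",
    "messages": {
        "message": ["Where is the Palace Museum?", "It is in Beijing."],
        "attrs": [
            {"name": [], "attrname": [], "attrvalue": []},
            {
                "name": ["Palace Museum"],
                "attrname": ["location"],
                "attrvalue": ["Beijing"],
            },
        ],
    },
}

turns = list(iter_turns(sample))
for text, triples in turns:
    print(text, triples)
```

Turns without grounding knowledge simply carry an empty triple list, so downstream code can treat every turn uniformly.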

travel_knowledge_base

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:kd_conv/travel_knowledge_base')
  • Description:
KdConv is a Chinese multi-domain Knowledge-driven Conversation dataset, grounding the topics in multi-turn conversations to knowledge graphs. KdConv contains 4.5K conversations from three domains (film, music, and travel), and 86K utterances with an average turn number of 19.0. These conversations contain in-depth discussions on related topics and natural transitions between multiple topics, while the corpus can also be used for exploration of transfer learning and domain adaptation.
  • License: Apache License 2.0
  • Version: 0.0.0
  • Splits:
Split         Examples
'train'       1154
  • Features:
{
    "head_entity": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "kb_triplets": {
        "feature": {
            "feature": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "length": -1,
            "id": null,
            "_type": "Sequence"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "domain": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}
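
Each knowledge-base record above pairs a `head_entity` with `kb_triplets`, a sequence of string sequences. The sketch below groups the tail values of one record by relation. It assumes that every inner sequence holds exactly three strings ordered as head, relation, tail, and that the record has been converted to plain Python lists; neither is stated on this page. The helper `index_triplets` and the sample record are hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, List


def index_triplets(record: Dict[str, Any]) -> Dict[str, List[str]]:
    """Map each relation to its tail values for one knowledge-base record.

    Assumes every inner list of record["kb_triplets"] has exactly three
    strings ordered as [head, relation, tail]; other lengths are skipped.
    """
    by_relation: Dict[str, List[str]] = defaultdict(list)
    for triplet in record["kb_triplets"]:
        if len(triplet) != 3:
            continue  # skip rows that do not match the assumed layout
        _head, relation, tail = triplet
        by_relation[relation].append(tail)
    return dict(by_relation)


# Hypothetical record shaped like the schema (not real KdConv data).
record = {
    "head_entity": "Palace Museum",
    "domain": "travel",
    "kb_triplets": [
        ["Palace Museum", "location", "Beijing"],
        ["Palace Museum", "ticket", "60 yuan"],
        ["Palace Museum", "location", "Dongcheng District"],
    ],
}

index = index_triplets(record)
print(index)
```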

music_dialogues

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:kd_conv/music_dialogues')
  • Description:
KdConv is a Chinese multi-domain Knowledge-driven Conversation dataset, grounding the topics in multi-turn conversations to knowledge graphs. KdConv contains 4.5K conversations from three domains (film, music, and travel), and 86K utterances with an average turn number of 19.0. These conversations contain in-depth discussions on related topics and natural transitions between multiple topics, while the corpus can also be used for exploration of transfer learning and domain adaptation.
  • License: Apache License 2.0
  • Version: 0.0.0
  • Splits:
Split         Examples
'test'        150
'train'       1200
'validation'  150
  • Features:
{
    "messages": {
        "feature": {
            "message": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "attrs": {
                "feature": {
                    "attrname": {
                        "dtype": "string",
                        "id": null,
                        "_type": "Value"
                    },
                    "attrvalue": {
                        "dtype": "string",
                        "id": null,
                        "_type": "Value"
                    },
                    "name": {
                        "dtype": "string",
                        "id": null,
                        "_type": "Value"
                    }
                },
                "length": -1,
                "id": null,
                "_type": "Sequence"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "name": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "domain": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

music_knowledge_base

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:kd_conv/music_knowledge_base')
  • Description:
KdConv is a Chinese multi-domain Knowledge-driven Conversation dataset, grounding the topics in multi-turn conversations to knowledge graphs. KdConv contains 4.5K conversations from three domains (film, music, and travel), and 86K utterances with an average turn number of 19.0. These conversations contain in-depth discussions on related topics and natural transitions between multiple topics, while the corpus can also be used for exploration of transfer learning and domain adaptation.
  • License: Apache License 2.0
  • Version: 0.0.0
  • Splits:
Split         Examples
'train'       4441
  • Features:
{
    "head_entity": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "kb_triplets": {
        "feature": {
            "feature": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "length": -1,
            "id": null,
            "_type": "Sequence"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "domain": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

film_dialogues

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:kd_conv/film_dialogues')
  • Description:
KdConv is a Chinese multi-domain Knowledge-driven Conversation dataset, grounding the topics in multi-turn conversations to knowledge graphs. KdConv contains 4.5K conversations from three domains (film, music, and travel), and 86K utterances with an average turn number of 19.0. These conversations contain in-depth discussions on related topics and natural transitions between multiple topics, while the corpus can also be used for exploration of transfer learning and domain adaptation.
  • License: Apache License 2.0
  • Version: 0.0.0
  • Splits:
Split         Examples
'test'        150
'train'       1200
'validation'  150
  • Features:
{
    "messages": {
        "feature": {
            "message": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "attrs": {
                "feature": {
                    "attrname": {
                        "dtype": "string",
                        "id": null,
                        "_type": "Value"
                    },
                    "attrvalue": {
                        "dtype": "string",
                        "id": null,
                        "_type": "Value"
                    },
                    "name": {
                        "dtype": "string",
                        "id": null,
                        "_type": "Value"
                    }
                },
                "length": -1,
                "id": null,
                "_type": "Sequence"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "name": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "domain": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

film_knowledge_base

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:kd_conv/film_knowledge_base')
  • Description:
KdConv is a Chinese multi-domain Knowledge-driven Conversation dataset, grounding the topics in multi-turn conversations to knowledge graphs. KdConv contains 4.5K conversations from three domains (film, music, and travel), and 86K utterances with an average turn number of 19.0. These conversations contain in-depth discussions on related topics and natural transitions between multiple topics, while the corpus can also be used for exploration of transfer learning and domain adaptation.
  • License: Apache License 2.0
  • Version: 0.0.0
  • Splits:
Split         Examples
'train'       8090
  • Features:
{
    "head_entity": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "kb_triplets": {
        "feature": {
            "feature": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "length": -1,
            "id": null,
            "_type": "Sequence"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "domain": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}

all_dialogues

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:kd_conv/all_dialogues')
  • Description:
KdConv is a Chinese multi-domain Knowledge-driven Conversation dataset, grounding the topics in multi-turn conversations to knowledge graphs. KdConv contains 4.5K conversations from three domains (film, music, and travel), and 86K utterances with an average turn number of 19.0. These conversations contain in-depth discussions on related topics and natural transitions between multiple topics, while the corpus can also be used for exploration of transfer learning and domain adaptation.
  • License: Apache License 2.0
  • Version: 0.0.0
  • Splits:
Split         Examples
'test'        450
'train'       3600
'validation'  450
  • Features:
{
    "messages": {
        "feature": {
            "message": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "attrs": {
                "feature": {
                    "attrname": {
                        "dtype": "string",
                        "id": null,
                        "_type": "Value"
                    },
                    "attrvalue": {
                        "dtype": "string",
                        "id": null,
                        "_type": "Value"
                    },
                    "name": {
                        "dtype": "string",
                        "id": null,
                        "_type": "Value"
                    }
                },
                "length": -1,
                "id": null,
                "_type": "Sequence"
            }
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "name": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "domain": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}
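
The `all_dialogues` split sizes (450 test, 3600 train, 450 validation) are exactly three times the per-domain sizes of 150, 1200, and 150 listed for the travel, music, and film dialogue configs. That is consistent with `all_dialogues` being their concatenation, although this page does not say so explicitly. The sketch below checks that arithmetic and shows one way to tally examples by the `domain` field. The helper `count_by_domain`, the domain strings, and the sample rows are assumptions for illustration only.

```python
from collections import Counter
from typing import Any, Dict, Iterable

# Split sizes as listed on this page.
PER_DOMAIN = {"train": 1200, "validation": 150, "test": 150}
ALL_DIALOGUES = {"train": 3600, "validation": 450, "test": 450}

# Assumed values of the "domain" field, taken from the dataset description.
DOMAINS = ("film", "music", "travel")


def count_by_domain(examples: Iterable[Dict[str, Any]]) -> Counter:
    """Count examples per value of their "domain" field."""
    return Counter(example["domain"] for example in examples)


# True when every combined split equals (number of domains) x (per-domain size).
consistent = all(
    ALL_DIALOGUES[split] == len(DOMAINS) * PER_DOMAIN[split] for split in PER_DOMAIN
)
print("split sizes consistent:", consistent)

# Hypothetical rows; real examples also carry "name" and "messages".
sample = [{"domain": "film"}, {"domain": "travel"}, {"domain": "film"}]
counts = count_by_domain(sample)
print(counts)
```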

all_knowledge_base

Use the following command to load this dataset in TFDS:

ds = tfds.load('huggingface:kd_conv/all_knowledge_base')
  • Description:
KdConv is a Chinese multi-domain Knowledge-driven Conversation dataset, grounding the topics in multi-turn conversations to knowledge graphs. KdConv contains 4.5K conversations from three domains (film, music, and travel), and 86K utterances with an average turn number of 19.0. These conversations contain in-depth discussions on related topics and natural transitions between multiple topics, while the corpus can also be used for exploration of transfer learning and domain adaptation.
  • License: Apache License 2.0
  • Version: 0.0.0
  • Splits:
Split         Examples
'train'       13685
  • Features:
{
    "head_entity": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    },
    "kb_triplets": {
        "feature": {
            "feature": {
                "dtype": "string",
                "id": null,
                "_type": "Value"
            },
            "length": -1,
            "id": null,
            "_type": "Sequence"
        },
        "length": -1,
        "id": null,
        "_type": "Sequence"
    },
    "domain": {
        "dtype": "string",
        "id": null,
        "_type": "Value"
    }
}
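
The `all_knowledge_base` train size of 13685 equals the sum of the three per-domain knowledge bases listed on this page (1154 + 4441 + 8090). The sketch below verifies that sum and wraps the load command in a small helper that builds the `huggingface:kd_conv/<config>` name. The helpers `tfds_name` and `load_config` are hypothetical conveniences; `load_config` imports `tensorflow_datasets` lazily and is not called here, since calling it requires TFDS to be installed and the dataset to be downloadable.

```python
from typing import Any

DATASET = "huggingface:kd_conv"

# Train split sizes as listed on this page.
KB_TRAIN_SIZES = {
    "travel_knowledge_base": 1154,
    "music_knowledge_base": 4441,
    "film_knowledge_base": 8090,
}
ALL_KB_TRAIN_SIZE = 13685


def tfds_name(config: str) -> str:
    """Build the TFDS name for one kd_conv config."""
    return f"{DATASET}/{config}"


def load_config(config: str, split: str = "train") -> Any:
    """Load one split of one config (needs TFDS and network access)."""
    import tensorflow_datasets as tfds  # lazy import keeps the sketch importable

    return tfds.load(tfds_name(config), split=split)


kb_total = sum(KB_TRAIN_SIZES.values())
print(kb_total == ALL_KB_TRAIN_SIZE)
print(tfds_name("all_knowledge_base"))
```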