clinc_oos

Description:

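Task-oriented dialog systems need to know when a query falls outside their range of supported intents, but current text classification corpora only define label sets that cover every example. We introduce a new dataset that includes out-of-scope (OOS) queries, i.e., queries that do not fall into any of the system's supported intents. This poses a new challenge because models cannot assume that every query at inference time belongs to a system-supported intent class. Our dataset also covers 150 intent classes over 10 domains, capturing the breadth that a production task-oriented agent must handle. It offers a way of more rigorously and realistically benchmarking text classification in task-driven dialog systems.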
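
One common way to handle the situation described above is to reject low-confidence predictions as out-of-scope. The following is a minimal sketch under stated assumptions, not a method taken from this page: `OOS_LABEL`, `predict_with_rejection`, and the threshold value are hypothetical names and settings chosen for illustration.

```python
import math
from typing import List, Sequence

# Hypothetical sentinel meaning "out of scope"; it is not a label id from the dataset.
OOS_LABEL = -1


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw scores into probabilities (numerically stable form)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_with_rejection(logits: Sequence[float], threshold: float = 0.7) -> int:
    """Return the most likely intent index, or OOS_LABEL if confidence is low.

    The threshold is an assumed value; in practice it would be tuned on
    held-out data that contains out-of-scope queries.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best if probs[best] >= threshold else OOS_LABEL
```

A classifier trained only on in-scope intents can be wrapped this way so that queries with no clearly dominant intent are routed to a fallback response instead of a wrong intent.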

Additional Documentation: Explore on Papers With Code
Homepage: https://github.com/clinc/oos-eval/
Source code: tfds.text.ClincOOS
Versions:
- 0.1.0 (default): No release notes.
Download size: 256.01 KiB
Dataset size: 3.40 MiB
Auto-cached (documentation): Yes
Splits:

Split	Examples
`'test'`	4,500
`'test_oos'`	1,000
`'train'`	15,000
`'train_oos'`	100
`'validation'`	3,000
`'validation_oos'`	100
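
Because the out-of-scope queries live in separate `*_oos` splits, they have to be requested explicitly. The sketch below builds a combined split string using the `+` syntax of the TFDS split API and computes the share of OOS examples from the counts in the table above. The helper names (`combined_split`, `oos_fraction`, `load_split`) are hypothetical, and `load_split` assumes `tensorflow_datasets` is installed and the dataset can be downloaded.

```python
# Example counts copied from the splits table above.
SPLIT_SIZES = {
    "train": 15_000,
    "train_oos": 100,
    "validation": 3_000,
    "validation_oos": 100,
    "test": 4_500,
    "test_oos": 1_000,
}

BASE_SPLITS = ("train", "validation", "test")


def combined_split(name: str, include_oos: bool = True) -> str:
    """Build a TFDS split string such as 'train+train_oos'."""
    if name not in BASE_SPLITS:
        raise ValueError(f"unknown base split: {name!r}")
    return f"{name}+{name}_oos" if include_oos else name


def oos_fraction(name: str) -> float:
    """Fraction of out-of-scope examples when a split is merged with its OOS part."""
    oos = SPLIT_SIZES[f"{name}_oos"]
    return oos / (SPLIT_SIZES[name] + oos)


def load_split(name: str, include_oos: bool = True):
    """Load one split, optionally merged with its OOS counterpart."""
    import tensorflow_datasets as tfds  # imported lazily; needs TFDS installed

    return tfds.load("clinc_oos", split=combined_split(name, include_oos))
```

Note how unevenly the OOS examples are distributed: they make up less than 1% of the merged training data but roughly 18% of the merged test data.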

Feature structure:

FeaturesDict({
    'domain': int32,
    'domain_name': Text(shape=(), dtype=string),
    'intent': int32,
    'intent_name': Text(shape=(), dtype=string),
    'text': Text(shape=(), dtype=string),
})
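
Each example is therefore a dictionary with two integer ids and three strings. As a rough sketch of how such an example might be turned into plain Python values, the helper below decodes the byte strings that `tfds.as_numpy` returns for `Text` features. `decode_example` and `iter_decoded` are hypothetical names, and `iter_decoded` assumes `tensorflow_datasets` is installed and the dataset can be downloaded.

```python
from typing import Any, Dict, Iterator, Mapping


def _as_str(value: Any) -> str:
    """Decode bytes to str; pass other values through str()."""
    return value.decode("utf-8") if isinstance(value, bytes) else str(value)


def decode_example(example: Mapping[str, Any]) -> Dict[str, Any]:
    """Convert one example with the feature structure above to Python types."""
    return {
        "text": _as_str(example["text"]),
        "intent": int(example["intent"]),
        "intent_name": _as_str(example["intent_name"]),
        "domain": int(example["domain"]),
        "domain_name": _as_str(example["domain_name"]),
    }


def iter_decoded(split: str = "validation", limit: int = 3) -> Iterator[Dict[str, Any]]:
    """Yield the first few decoded examples of a split."""
    import tensorflow_datasets as tfds  # imported lazily; needs TFDS installed

    ds = tfds.load("clinc_oos", split=split)
    for example in tfds.as_numpy(ds.take(limit)):
        yield decode_example(example)
```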

Feature documentation:

Feature	Class	Dtype
	FeaturesDict
domain	Tensor	int32
domain_name	Text	string
intent	Tensor	int32
intent_name	Text	string
text	Text	string

Supervised keys (see the as_supervised documentation): ('text', 'intent')
Figure (tfds.show_examples): Not supported.
Examples (tfds.as_dataframe):
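
The supervised keys and the `tfds.as_dataframe` entry above can be exercised roughly as follows. This is a sketch that assumes `tensorflow_datasets` is installed and the dataset can be downloaded; the function names `supervised_pairs` and `preview` are hypothetical.

```python
# Supervised keys as listed on this page: (input, label).
SUPERVISED_KEYS = ("text", "intent")


def supervised_pairs(split: str = "train"):
    """Load a split as (text, intent) tuples instead of feature dictionaries."""
    import tensorflow_datasets as tfds  # imported lazily; needs TFDS installed

    return tfds.load("clinc_oos", split=split, as_supervised=True)


def preview(split: str = "train", n: int = 5):
    """Return the first n examples of a split as a pandas DataFrame."""
    import tensorflow_datasets as tfds  # imported lazily; needs TFDS installed

    ds, info = tfds.load("clinc_oos", split=split, with_info=True)
    return tfds.as_dataframe(ds.take(n), info)
```

Calling `preview("test_oos")` is a quick way to inspect what the out-of-scope queries look like.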

Citation:

@inproceedings{larson-etal-2019-evaluation,
    title = "An Evaluation Dataset for Intent Classification and Out-of-Scope Prediction",
    author = "Larson, Stefan  and
      Mahendran, Anish  and
      Peper, Joseph J.  and
      Clarke, Christopher  and
      Lee, Andrew  and
      Hill, Parker  and
      Kummerfeld, Jonathan K.  and
      Leach, Kevin  and
      Laurenzano, Michael A.  and
      Tang, Lingjia  and
      Mars, Jason",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)",
    month = nov,
    year = "2019",
    address = "Hong Kong, China",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/D19-1131",
    doi = "10.18653/v1/D19-1131",
    pages = "1311--1316",
}