References:
ollie_lemmagrep
To load this dataset in TFDS, use the following commands:
import tensorflow_datasets as tfds
ds = tfds.load('huggingface:ollie/ollie_lemmagrep')
- Description:
The Ollie dataset includes two configs for the data
used to train the Ollie information extraction algorithm, for 18M
sentences and 3M sentences respectively.
This data is for academic use only. From the authors:
Ollie is a program that automatically identifies and extracts binary
relationships from English sentences. Ollie is designed for Web-scale
information extraction, where target relations are not specified in
advance.
Ollie is our second-generation information extraction system. Whereas
ReVerb operates on flat sequences of tokens, Ollie works with the
tree-like (graph with only small cycles) representation using
Stanford's compression of the dependencies. This allows Ollie to
capture expressions that ReVerb misses, such as long-range relations.
Ollie also captures context that modifies a binary relation. Presently
Ollie handles attribution (He said/she believes) and enabling
conditions (if X then).
More information is available at the Ollie homepage:
https://knowitall.github.io/ollie/
- License: The University of Washington academic license: https://raw.githubusercontent.com/knowitall/ollie/master/LICENSE
- Version: 1.1.0
- Splits:
Split | Examples |
---|---|
'train' | 18674630 |
- Features:
{
  "arg1": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  },
  "arg2": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  },
  "rel": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  },
  "search_query": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  },
  "sentence": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  },
  "words": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  },
  "pos": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  },
  "chunk": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  },
  "sentence_cnt": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  }
}
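The feature listing for ollie_lemmagrep can be mirrored in code as a small record type. The following is a minimal sketch, not part of TFDS or the Ollie tooling: `LemmagrepRecord` and `to_record` are hypothetical names, the sample values are invented for illustration, and the bytes decoding assumes examples may arrive as byte strings (as is typical when TensorFlow string features are converted to NumPy).

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class LemmagrepRecord:
    """One example; every feature is declared as a string in the schema."""
    arg1: str
    arg2: str
    rel: str
    search_query: str
    sentence: str
    words: str
    pos: str
    chunk: str
    sentence_cnt: str


def _to_str(value):
    """Decode bytes to str (assumed UTF-8); convert anything else with str()."""
    return value.decode("utf-8") if isinstance(value, bytes) else str(value)


def to_record(example):
    """Build a LemmagrepRecord from a dict-like example, checking the schema."""
    expected = [f.name for f in fields(LemmagrepRecord)]
    missing = [name for name in expected if name not in example]
    if missing:
        raise KeyError(f"missing features: {missing}")
    return LemmagrepRecord(**{name: _to_str(example[name]) for name in expected})


# Hypothetical values, invented purely for illustration (not real dataset rows).
sample = {
    "arg1": b"Alice",
    "arg2": "Paris",
    "rel": "be bear in",
    "search_query": "alice be bear in paris",
    "sentence": "Alice was born in Paris.",
    "words": "placeholder",
    "pos": "placeholder",
    "chunk": "placeholder",
    "sentence_cnt": "1",
}

record = to_record(sample)
print(record.arg1, "|", record.rel, "|", record.arg2)
```

Keeping `sentence_cnt` as a string follows the declared dtype; whether it can safely be parsed as an integer is not stated in the schema.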
ollie_patterned
To load this dataset in TFDS, use the following commands:
import tensorflow_datasets as tfds
ds = tfds.load('huggingface:ollie/ollie_patterned')
- Description:
The Ollie dataset includes two configs for the data
used to train the Ollie information extraction algorithm, for 18M
sentences and 3M sentences respectively.
This data is for academic use only. From the authors:
Ollie is a program that automatically identifies and extracts binary
relationships from English sentences. Ollie is designed for Web-scale
information extraction, where target relations are not specified in
advance.
Ollie is our second-generation information extraction system. Whereas
ReVerb operates on flat sequences of tokens, Ollie works with the
tree-like (graph with only small cycles) representation using
Stanford's compression of the dependencies. This allows Ollie to
capture expressions that ReVerb misses, such as long-range relations.
Ollie also captures context that modifies a binary relation. Presently
Ollie handles attribution (He said/she believes) and enabling
conditions (if X then).
More information is available at the Ollie homepage:
https://knowitall.github.io/ollie/
- License: The University of Washington academic license: https://raw.githubusercontent.com/knowitall/ollie/master/LICENSE
- Version: 1.1.0
- Splits:
Split | Examples |
---|---|
'train' | 3048961 |
- Features:
{
  "rel": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  },
  "arg1": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  },
  "arg2": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  },
  "slot0": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  },
  "search_query": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  },
  "pattern": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  },
  "sentence": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  },
  "parse": {
    "dtype": "string",
    "id": null,
    "_type": "Value"
  }
}
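Because the two configs expose different feature sets, code that handles both may want to know which fields overlap. The sketch below compares the feature names listed on this page; `compare_features` is a hypothetical helper written for illustration, not a TFDS function.

```python
# Feature names copied from the two schemas listed on this page.
LEMMAGREP_FEATURES = (
    "arg1", "arg2", "rel", "search_query", "sentence",
    "words", "pos", "chunk", "sentence_cnt",
)
PATTERNED_FEATURES = (
    "rel", "arg1", "arg2", "slot0", "search_query",
    "pattern", "sentence", "parse",
)


def compare_features(first, second):
    """Return the shared and config-specific feature names, sorted."""
    first_set, second_set = set(first), set(second)
    return {
        "shared": sorted(first_set & second_set),
        "only_first": sorted(first_set - second_set),
        "only_second": sorted(second_set - first_set),
    }


diff = compare_features(LEMMAGREP_FEATURES, PATTERNED_FEATURES)
print(diff["shared"])       # ['arg1', 'arg2', 'rel', 'search_query', 'sentence']
print(diff["only_first"])   # ['chunk', 'pos', 'sentence_cnt', 'words']
print(diff["only_second"])  # ['parse', 'pattern', 'slot0']
```

Only the five shared fields can be relied on when processing examples from either config interchangeably.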