Open information extraction

In natural language processing, open information extraction (OIE) is the task of generating a structured, machine-readable representation of the information in text, usually in the form of triples or n-ary propositions.

Overview

A proposition can be understood as a truth-bearer, a textual expression of a potential fact (e.g., "Dante wrote the Divine Comedy"), represented in a structure amenable to computers. An OIE extraction normally consists of a relation and a set of arguments. For instance, ("Dante", "passed away in", "Ravenna") is a proposition formed by the relation "passed away in" and the arguments "Dante" and "Ravenna". The first argument is usually referred to as the subject, while the second is considered to be the object.

The extraction is said to be a textual representation of a potential fact because its elements are not linked to a knowledge base. Furthermore, the factual nature of the proposition has not yet been established. In the above example, transforming the extraction into a full-fledged fact would first require linking, if possible, the relation and the arguments to a knowledge base. Second, the truth of the extraction would need to be determined. In computer science, transforming OIE extractions into ontological facts is known as relation extraction.

In fact, OIE can be seen as the first step to a wide range of deeper text understanding tasks such as relation extraction, knowledge-base construction, question answering, and semantic role labeling. The extracted propositions can also be used directly for end-user applications such as structured search (e.g., retrieve all propositions with "Dante" as subject).

OIE was first introduced by TextRunner, developed at the University of Washington Turing Center headed by Oren Etzioni. Other methods introduced later, such as Reverb, OLLIE, ClausIE and CSD, helped to shape the OIE task by characterizing some of its aspects. At a high level, all of these approaches make use of a set of patterns to generate the extractions. Depending on the particular approach, these patterns are either hand-crafted or learned.

OIE systems and contributions

Reverb suggested the necessity of producing meaningful relations to capture the information in the input text more accurately. For instance, given the sentence "Faust made a pact with the devil", it would be erroneous to produce only the extraction ("Faust", "made", "a pact"), since it would not be adequately informative. A more precise extraction would be ("Faust", "made a pact with", "the devil"). Reverb also argued against the generation of overspecific relations.

OLLIE stressed two important aspects of OIE. First, it pointed to the lack of factuality of the propositions. For instance, in a sentence like "If John studies hard, he will pass the exam", it would be inaccurate to consider ("John", "will pass", "the exam") as a fact. Additionally, the authors indicated that an OIE system should be able to extract non-verb-mediated relations, which account for a significant portion of the information expressed in natural language text. For instance, in the sentence "Obama, the former US president, was born in Hawaii", an OIE system should be able to recognize the proposition ("Obama", "is", "former US president").

ClausIE introduced the connection between grammatical clauses, propositions, and OIE extractions. The authors stated that, as each grammatical clause expresses a proposition, each verb-mediated proposition can be identified solely by recognizing the set of clauses expressed in each sentence. This implies that to correctly recognize the set of propositions in an input sentence, it is necessary to understand its grammatical structure. The authors studied the case of English, which admits only seven clause types, meaning that identifying each proposition requires defining only seven grammatical patterns.

The finding also established a separation between the recognition of the propositions and their materialization. In a first step, the proposition can be identified without any consideration of its final form, in a domain-independent and unsupervised way, mostly based on linguistic principles. In a second step, the information can be represented according to the requirements of the underlying application, without conditioning the identification phase.

Consider the sentence "Albert Einstein was born in Ulm and died in Princeton". The first step will recognize the two propositions ("Albert Einstein", "was born", "in Ulm") and ("Albert Einstein", "died", "in Princeton"). Once the information has been correctly identified, the propositions can take the particular form required by the underlying application.

CSD introduced the idea of minimality in OIE. It considers that computers can make better use of the extractions if they are expressed in a compact way. This is especially important in sentences with subordinate clauses. In these cases, CSD suggests the generation of nested extractions. For example, consider the sentence "The Embassy said that 6,700 Americans were in Pakistan". CSD generates two extractions, ("6,700 Americans", "were", "in Pakistan") and ("The Embassy", "said", "that ..."), with the subordinate clause captured by the first extraction rather than repeated inside the second. This is usually known as reification.
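
The triple structure and the structured-search use case described in the overview can be sketched in a few lines of Python. This is only an illustrative sketch, not the representation used by TextRunner, Reverb, OLLIE, ClausIE or CSD: the `Extraction` class, the `by_subject` helper and the choice to model a nested extraction by storing one extraction as the object of another are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Extraction:
    """A binary OIE extraction: (subject, relation, object).

    Hypothetical structure: the object may itself be an Extraction,
    which is one assumed way to model nested extractions.
    """
    subject: str
    relation: str
    obj: Union[str, "Extraction"]


def by_subject(extractions: List[Extraction], subject: str) -> List[Extraction]:
    """Structured search: return every extraction with the given subject."""
    return [e for e in extractions if e.subject == subject]


# Sample propositions based on the example sentences in the text.
inner = Extraction("6,700 Americans", "were", "in Pakistan")
extractions = [
    Extraction("Dante", "wrote", "the Divine Comedy"),
    Extraction("Dante", "passed away in", "Ravenna"),
    Extraction("Faust", "made a pact with", "the devil"),
    inner,
    Extraction("The Embassy", "said", inner),  # assumed nesting of `inner`
]

# Retrieve all propositions with "Dante" as subject.
dante = by_subject(extractions, "Dante")
```

Here `dante` holds the two propositions about Dante. Note that nothing in this structure says whether a proposition is true or links its elements to a knowledge base, which matches the point that an extraction is only a potential fact.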
References

Del Corro, Luciano. "Methods for Open Information Extraction and Sense Disambiguation on Natural Language Text".
Banko, Michele; Cafarella, Michael; Soderland, Stephen; Broadhead, Matt; Etzioni, Oren (2007). "Open Information Extraction from the Web". Conference on Artificial Intelligence.
Fader, Anthony; Soderland, Stephen; Etzioni, Oren (2011). "Identifying relations for open information extraction". EMNLP.
Mausam; Schmitz, Michael; Soderland, Stephen; Bart, Robert; Etzioni, Oren (2012). "Open language learning for information extraction". EMNLP.
Del Corro, Luciano; Gemulla, Rainer (2013). "ClausIE: clause-based open information extraction". WWW.
Bast, Hannah; Haussmann, Elmar (2013). "Open Information Extraction via Contextual Sentence Decomposition". ICSC.

Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.