
Document structuring


Document structuring is a subtask of natural language generation (NLG), which involves deciding the order and grouping (for example into paragraphs) of sentences in a generated text. It is closely related to the content determination NLG task.

Example

Assume we have four sentences which we want to include in a generated text:

(1) It will rain on Saturday.
(2) It will be sunny on Sunday.
(3) Max temperature will be 10 °C on Saturday.
(4) Max temperature will be 15 °C on Sunday.

There are 24 (4!) orderings of these messages, including:

(1234) It will rain on Saturday. It will be sunny on Sunday. Max temperature will be 10 °C on Saturday. Max temperature will be 15 °C on Sunday.
(2341) It will be sunny on Sunday. Max temperature will be 10 °C on Saturday. Max temperature will be 15 °C on Sunday. It will rain on Saturday.
(4321) Max temperature will be 15 °C on Sunday. Max temperature will be 10 °C on Saturday. It will be sunny on Sunday. It will rain on Saturday.

Some of these orderings are better than others. For example, of the texts shown above, human readers prefer (1234) over (2341) and (4321).

For any ordering, there are also many ways in which sentences can be grouped into paragraphs and higher-level structures such as sections. For example, there are 8 (2**3) ways in which the sentences in (1234) can be grouped into paragraphs, because each of the three boundaries between adjacent sentences either is or is not a paragraph break. These groupings include:

(12)(34)
It will rain on Saturday. It will be sunny on Sunday.
Max temperature will be 10 °C on Saturday. Max temperature will be 15 °C on Sunday.

(1)(23)(4)
It will rain on Saturday.
It will be sunny on Sunday. Max temperature will be 10 °C on Saturday.
Max temperature will be 15 °C on Sunday.

As with ordering, human readers prefer some groupings over others; for example, (12)(34) is preferred over (1)(23)(4).

The document structuring task is to choose an ordering and grouping of sentences which results in a coherent and well-organised text from the reader's perspective.

Algorithms and models

There are three basic approaches to document structuring: schemas, corpus-based, and heuristic.

Schemas are templates which explicitly specify sentence ordering and grouping for a document (as well as content determination information). Typically they are constructed by manually analysing a corpus of human-written texts in the target genre, and extracting a document template from these texts. Schemas work well in practice for texts which are short (five sentences or fewer) and/or have a standardised structure, but have problems in generating texts which are longer and do not have a fixed structure.

Corpus-based structuring techniques use statistical corpus analysis to automatically build ordering and/or grouping models. Such techniques are common in automatic summarisation, where a computer program automatically generates a summary of a textual document. In principle they could be applied to text generated from non-linguistic data, but this work is in its infancy; part of the challenge is that texts generated by natural language generation systems are generally expected to be of fairly high quality, which is not always the case for texts generated by automatic summarisation systems.

The final approach is heuristic-based structuring. Such algorithms perform the structuring task based on heuristic rules, which can come from theories of rhetoric, psycholinguistic models, and/or a combination of intuition and feedback from pilot experiments with potential users. Heuristic-based structuring is appealing intellectually, but it can be difficult to get it to work well in practice, in part because heuristics often depend on semantic information (how sentences relate to each other) which is not always available. On the other hand, heuristic rules can focus on what is best for text readers, whereas the other approaches focus on imitating authors (and many human-authored texts are not well-structured).

Narrative

Perhaps the ultimate document structuring challenge is to generate a good narrative: in other words, a text which starts by setting the scene and giving an introduction/overview; then describes a set of events in a clear fashion, so readers can easily see how the individual events are related and link together; and concludes with a summary/ending. Note that narrative in this sense applies to factual texts as well as stories. Current NLG systems do not do a good job of generating narratives, and this is a major source of user criticism.

Generating good narratives is a challenge for all aspects of NLG, but the most fundamental challenge is probably in document structuring.

References

K McKeown (1985). Text Generation. Cambridge University Press.
M Lapata (2003). Probabilistic Text Structuring: Experiments with Sentence Ordering. Proceedings of ACL-2003.
D Scott and C de Souza (1990). Getting the message across in RST-based text generation. In Dale, Mellish, Zock (eds), Current research in natural language generation, pages 47-73.
N Karamanis, M Poesio, C Mellish, J Oberlander (2004). Evaluating Centering-based metrics of coherence for text structuring using a reliably annotated corpus. Proceedings of ACL-2004.
S Williams and E Reiter. Generating basic skills reports for low-skilled readers. Natural Language Engineering 14:495-535.
M Raue and S G Scholl (2018). The Use of Heuristics in Decision Making Under Risk and Uncertainty. In Raue, Lermer, Streicher (eds), Psychological Perspectives on Risk and Risk Analysis: Theory, Models, and Applications, pp. 153–179. Cham: Springer International Publishing. doi:10.1007/978-3-319-92478-6_7. ISBN 978-3-319-92478-6. Retrieved 2023-05-08.
E Reiter, A Gatt, F Portet, M van der Meulen (2008). The Importance of Narrative and Other Lessons from an Evaluation of an NLG System that Summarises Clinical Data. In Proceedings of INLG-2008.


Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.