Text simplification

Text simplification is an operation used in natural language processing to change, enhance, classify, or otherwise process an existing body of human-readable text so that its grammar and structure are greatly simplified while the underlying meaning and information remain the same. Text simplification is an important area of research because of communication needs in an increasingly complex and interconnected world that is more dominated by science, technology, and new media. Natural human languages pose major problems, however, because they ordinarily contain large vocabularies and complex constructions that machines, no matter how fast and well programmed, cannot easily process. To reduce linguistic diversity, researchers have found that they can use methods of semantic compression to limit and simplify the set of words used in given texts.

Example

Text simplification is illustrated with an example used by Siddharthan (2006). The first sentence contains two relative clauses and one conjoined verb phrase. A text simplification system aims to change the first sentence into a group of simpler sentences, as seen just below the first sentence.

Also contributing to the firmness in copper, the analyst noted, was a report by Chicago purchasing agents, which precedes the full purchasing agents report that is due out today and gives an indication of what the full report might hold.

Also contributing to the firmness in copper, the analyst noted, was a report by Chicago purchasing agents. The Chicago report precedes the full purchasing agents report. The Chicago report gives an indication of what the full report might hold. The full report is due out today.

One approach to text simplification is lexical simplification via lexical substitution, a two-step process of first identifying complex words and then replacing them with simpler synonyms. A key challenge here is identifying complex words, which is performed by a machine learning classifier trained on labeled data. Researchers, frustrated by the problems of the classical method of asking research subjects to label words as either simple or complex, have found that they get higher consistency across more levels of complexity if they ask labelers to sort the words presented to them in order of complexity (Gooding et al., 2019).
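
The two-step lexical simplification process described above, together with the idea of having labelers sort words by complexity, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the word rankings, the synonym table, the function names `mean_rank_scores` and `simplify`, and the `threshold` parameter are all hypothetical. In particular, a simple threshold on the mean rank stands in for the machine learning classifier mentioned in the text; it is not the method used in the cited work.

```python
import re
from collections import defaultdict

# Hypothetical annotations: each labeler sorts the same words from
# simplest to most complex (comparative judgments, not binary labels).
RANKINGS = [
    ["use", "help", "utilize", "facilitate"],
    ["help", "use", "utilize", "facilitate"],
    ["use", "help", "facilitate", "utilize"],
]

# Hypothetical synonym table; a real system might consult a thesaurus.
SYNONYMS = {
    "utilize": ["use"],
    "facilitate": ["help"],
}


def mean_rank_scores(rankings):
    """Average each word's position across labelers (0 = simplest)."""
    positions = defaultdict(list)
    for ranking in rankings:
        for position, word in enumerate(ranking):
            positions[word].append(position)
    return {word: sum(p) / len(p) for word, p in positions.items()}


def simplify(text, scores, synonyms, threshold=1.5):
    """Replace words scored above `threshold` with their simplest synonym."""

    def replace(match):
        word = match.group(0)
        key = word.lower()
        # Step 1: identify complex words (the threshold is a stand-in
        # for a trained classifier; unknown words count as simple).
        if scores.get(key, 0.0) <= threshold:
            return word
        # Step 2: substitute the candidate with the lowest complexity score.
        candidates = synonyms.get(key, [])
        if not candidates:
            return word
        best = min(candidates, key=lambda c: scores.get(c, 0.0))
        return best.capitalize() if word[0].isupper() else best

    return re.sub(r"[A-Za-z]+", replace, text)


SCORES = mean_rank_scores(RANKINGS)
RESULT = simplify(
    "Researchers utilize tools that facilitate reading.", SCORES, SYNONYMS
)
print(RESULT)  # Researchers use tools that help reading.
```

Averaging rank positions is only one way to turn sorted lists into graded complexity scores; it was chosen here because it yields more than two levels of complexity from the labelers' orderings, which is the property the text attributes to comparative judgments.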
See also

Automated paraphrasing
Controlled natural language
Language reform
Lexical simplification
Lexical substitution
Semantic compression
Text normalization
Simplified English
Basic English

References

Gooding, Sian; Kochmar, Ekaterina; Sarkar, Advait; Blackwell, Alan (August 2019). "Comparative judgments are more consistent than binary classification for labelling word complexity". Proceedings of the 13th Linguistic Annotation Workshop: 208–214. doi:10.18653/v1/W19-4024. Retrieved 22 November 2019.
Siddharthan, Advaith (28 March 2006). "Syntactic Simplification and Text Cohesion". Research on Language and Computation. 4 (1): 77–109. Springer Science, the Netherlands. doi:10.1007/s11168-006-9011-1. S2CID 14619244.
Wei Xu, Chris Callison-Burch and Courtney Napoles. "Problems in Current Text Simplification Research". In Transactions of the Association for Computational Linguistics (TACL), Volume 3, 2015, Pages 283–297.
Siddhartha Jonnalagadda, Luis Tari, Joerg Hakenberg, Chitta Baral and Graciela Gonzalez. Towards Effective Sentence Simplification for Automatic Processing of Biomedical Text. In Proc. of the NAACL-HLT 2009, Boulder, USA, June.

External links

Automatic Induction of Rules for Text Simplification, 1996
Text Simplification for Information-Seeking Applications, 2004


Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.