Corpus of Linguistic Acceptability

The Corpus of Linguistic Acceptability (CoLA) is a dataset whose primary purpose is to serve as a benchmark for evaluating the ability of artificial neural networks, including large language models, to judge the grammatical correctness of sentences. It consists of 10,657 English sentences from published linguistics literature that were manually labeled as either grammatical or ungrammatical.

Public version

The publicly available version of CoLA contains 9,594 sentences that belong to the training and development sets. It excludes 1,063 sentences reserved for a held-out test set.
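The sizes quoted for CoLA (10,657 sentences in total, 9,594 public, 1,063 held out) and the idea of scoring a model against binary grammaticality labels can be illustrated with a minimal Python sketch. Everything named here is hypothetical: `LabeledSentence`, `agreement`, and the two toy sentences are invented for illustration and are not taken from the corpus or its file format, and plain accuracy is used only as an assumed stand-in for whatever metric a given evaluation actually reports.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

# Sizes as stated in the article text.
TOTAL_SENTENCES = 10_657
PUBLIC_SENTENCES = 9_594          # training and development sets
HELD_OUT_TEST_SENTENCES = 1_063   # reserved test set, not public


@dataclass(frozen=True)
class LabeledSentence:
    """Hypothetical record type; not CoLA's actual file schema."""
    text: str
    grammatical: bool


def agreement(examples: Sequence[LabeledSentence],
              judge: Callable[[str], bool]) -> float:
    """Fraction of examples where the judge matches the manual label."""
    if not examples:
        raise ValueError("need at least one example")
    hits = sum(judge(ex.text) == ex.grammatical for ex in examples)
    return hits / len(examples)


# Invented toy sentences, NOT drawn from the corpus.
TOY_EXAMPLES = [
    LabeledSentence("The cat sat on the mat.", True),
    LabeledSentence("Cat the mat on sat the.", False),
]


def always_grammatical(sentence: str) -> bool:
    """Trivial baseline judge that accepts every sentence."""
    return True


if __name__ == "__main__":
    # The public and held-out parts together account for the whole corpus.
    assert PUBLIC_SENTENCES + HELD_OUT_TEST_SENTENCES == TOTAL_SENTENCES
    print(agreement(TOY_EXAMPLES, always_grammatical))  # prints 0.5
```

A real evaluation would replace `always_grammatical` with a model's prediction function and `TOY_EXAMPLES` with sentences read from the public training or development set.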

External links

Warstadt, Alex. "CoLA - The Corpus of Linguistic Acceptability".

References

Warstadt, Alex; Singh, Amanpreet; Bowman, Samuel R. (2019). "Neural Network Acceptability Judgments". Transactions of the Association for Computational Linguistics. 7 (4): 625–641. arXiv:1805.12471. doi:10.1162/tacl_a_00290.
