The Corpus of Linguistic Acceptability (CoLA) is a dataset whose primary purpose is to serve as a benchmark for evaluating the ability of artificial neural networks, including large language models, to judge the grammatical correctness of sentences. It consists of 10,657 English sentences from published linguistics literature that were manually labeled as either grammatical or ungrammatical.

Public version

The publicly available version of CoLA contains 9,594 sentences that belong to the training and development sets. It excludes 1,063 sentences reserved for a held-out test set.
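
The sentence counts and the binary labeling described above can be illustrated with a minimal Python sketch. The `LabeledSentence` record, the `accuracy` helper, and the three sample sentences are hypothetical and written only for this illustration; they are not the corpus's file format or contents. Plain accuracy is used for simplicity and is an assumption, not necessarily the metric the benchmark reports.

```python
from dataclasses import dataclass

# Sentence counts stated in the text above.
TOTAL_SENTENCES = 10_657
PUBLIC_SENTENCES = 9_594         # training and development sets
HELD_OUT_TEST_SENTENCES = 1_063  # reserved test set


@dataclass(frozen=True)
class LabeledSentence:
    """Hypothetical record: one sentence with its manual acceptability label."""
    text: str
    grammatical: bool


def accuracy(gold: list[LabeledSentence], predictions: list[bool]) -> float:
    """Fraction of sentences whose predicted label matches the manual label."""
    if len(gold) != len(predictions):
        raise ValueError("gold and predictions must have the same length")
    if not gold:
        raise ValueError("cannot score an empty sample")
    correct = sum(item.grammatical == pred for item, pred in zip(gold, predictions))
    return correct / len(gold)


# Illustrative sentences written for this sketch, not taken from the corpus.
sample = [
    LabeledSentence("The cat sat on the mat.", True),
    LabeledSentence("Cat the mat on sat the.", False),
    LabeledSentence("She has finished the report.", True),
]

# The public and held-out portions together account for the full corpus.
assert PUBLIC_SENTENCES + HELD_OUT_TEST_SENTENCES == TOTAL_SENTENCES

public_share = PUBLIC_SENTENCES / TOTAL_SENTENCES
score = accuracy(sample, [True, False, False])  # hypothetical model output
print(f"public share of corpus: {public_share:.1%}")
print(f"sample accuracy: {score:.2f}")
```

The split check confirms that the 9,594 public sentences and the 1,063 held-out sentences sum to the 10,657 sentences of the full corpus.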

External links

Warstadt, Alex. "CoLA - The Corpus of Linguistic Acceptability".

References

Warstadt, Alex; Singh, Amanpreet; Bowman, Samuel R. (2019). "Neural Network Acceptability Judgments". Transactions of the Association for Computational Linguistics. 7 (4): 625–641. arXiv:1805.12471. doi:10.1162/tacl_a_00290.