
Talk:Language model

Criticism section is misleading

GPT-2 is not a recurrent neural network; it is based on the Transformer attention architecture. It would be nice if somebody provided a truthful critical view, because there are plenty of issues with the idea of posing language learning as a purely statistical problem. There is a real danger that common
Unigram models -- why FSA?

The section on unigram models is needlessly complicated: these are simple Bernoulli models, and there is no need to bring in finite state automata at all. But before removing the unnecessary complexity, I'd like to ask whether anybody recalls why it was put there in the first place; maybe I'm missing something. — Preceding unsigned comment added by SnoTraveller (talk • contribs) 21:44, 16 March 2022 (UTC)
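To illustrate the point above (this sketch is an editor's illustration, not part of the original comment): a unigram model is just a per-word probability table, and a text's probability is the product of independent per-word probabilities — no automaton machinery is involved. The toy corpus is invented for the example.

```python
from collections import Counter

def train_unigram(tokens):
    """Maximum-likelihood unigram estimates: P(w) = count(w) / N.
    The whole model is a flat probability table."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def text_prob(model, tokens):
    """Words are treated as independent draws, so the probability of
    a text is the product of the per-word unigram probabilities."""
    p = 1.0
    for w in tokens:
        p *= model.get(w, 0.0)  # unseen words get probability 0 (no smoothing)
    return p

corpus = "the cat sat on the mat".split()
model = train_unigram(corpus)
# "the" appears 2 times out of 6 tokens, so P(the) = 1/3
```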
"Neuronal" language models?!

Recent changes in the page have replaced the word "neural" (as in "neural net language models") with "neuronal", on the grounds that the latter is the adjective form of "neuron". While that might be true, the change is completely wrong on several accounts:
I do not wish to start an edit war, so I would like to ask the editors to step in and change "neuronal" back to "neural". As I understand, WP aims to be an impartial encyclopedia, and certainly, using the established terms is part of that.
Even in biology, where the inspiration comes from, the network is called neural; that it is made up of neurons is a secondary detail.
External links modified

Hello fellow Wikipedians,

I have just modified one external link on Language model. Please take a moment to review my edit. If you have any questions, or need the bot to ignore the links, or the page altogether, please visit this simple FaQ for additional information. I made the following changes:

Added archive https://web.archive.org/web/20120302151523/http://www-speech.sri.com/projects/srilm/ to http://www-speech.sri.com/projects/srilm

When you have finished reviewing my changes, you may follow the instructions on the template below to fix any issues with the URLs.

This message was posted before February 2018. After February 2018, "External links modified" talk page sections are no longer generated or monitored by InternetArchiveBot. No special action is required regarding these talk page notices, other than regular verification using the archive tool instructions below. Editors have permission to delete these "External links modified" talk page sections if they want to de-clutter talk pages, but see the RfC before doing mass systematic removals. This message is updated dynamically through the template {{source check}} (last update: 5 June 2024).

If you have discovered URLs which were erroneously considered dead by the bot, you can report them with this tool.
If you found an error with any archives or the URLs themselves, you can fix them with this tool.

Cheers.—InternetArchiveBot (Report bug) 22:05, 16 December 2017 (UTC)
people will misinterpret the output of such models, as happens with almost every other deep learning architecture. 31.182.202.212 (talk) 21:10, 5 January 2023 (UTC)
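For context on the correction above (an editor's illustration, not part of the original comment): the Transformer's core operation is scaled dot-product attention, in which every position attends to every other position in a single step rather than through recurrence. A toy sketch of that one operation — not GPT-2's actual implementation, and the vectors are made up:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention: each output is a weighted average
    of the value vectors, with weights derived from query-key dot
    products. No recurrence is involved at any point."""
    d = len(keys[0])
    outputs = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        dim = len(values[0])
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(dim)]
        outputs.append(out)
    return outputs

# Toy example: one query over two key/value positions (numbers invented)
q = [[1.0, 0.0]]
k = [[1.0, 0.0], [0.0, 1.0]]
v = [[1.0, 2.0], [3.0, 4.0]]
out = attention(q, k, v)
```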
Language Models

Isn't the term "language model" in information retrieval used a little differently from the NLP interpretation? — Preceding unsigned comment added by GreenEdu (talk • contribs) 16:16, 1 March 2011 (UTC)
The non-RNN, attention-based Transformer model, as well as models based on it (e.g. BERT, GPT, GPT-3), are not covered in the article's text. Could anybody cover them accordingly please? Thank you in advance. -- Olexa Riznyk (talk) 20:31, 1 November 2020 (UTC)
Non statistical language models

What about non-statistical language models, like CFGs? 84.162.237.4 (talk) 20:44, 7 December 2008 (UTC)

A PCFG can also be used as a language model, and its performance is said to be worse than that of n-gram models, though I doubt it. Took (talk) 00:18, 29 January 2009 (UTC)
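To make the comparison concrete (an editor's illustration, not part of the original reply): an n-gram model conditions each word on the previous n−1 words. A minimal bigram (n = 2) sketch with maximum-likelihood estimates and an invented toy corpus:

```python
from collections import Counter, defaultdict

def train_bigram(tokens):
    """Maximum-likelihood bigram estimates:
    P(w_i | w_{i-1}) = count(w_{i-1}, w_i) / count(w_{i-1})."""
    pair_counts = Counter(zip(tokens, tokens[1:]))
    context_counts = Counter(tokens[:-1])
    model = defaultdict(dict)
    for (prev, cur), c in pair_counts.items():
        model[prev][cur] = c / context_counts[prev]
    return model

corpus = "the cat sat on the mat".split()
model = train_bigram(corpus)
# After "the" we saw "cat" once and "mat" once, so each gets probability 1/2
```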
Trimming/merging list of language models section

The "notable language models" section currently contains a number of models which are not language models per se, but rather involve a language component (including text-to-speech and text-to-image models). I'm removing these, and will probably merge the contents with the table at Large language model, since the list doesn't seem to include any LMs that aren't LLMs.
The generally accepted term is "neural net"; nobody uses "neuronal". The WP page is also titled "Artificial neural network", and the change to "neuronal" here is inconsistent with the wording there or elsewhere on WP. — Preceding unsigned comment added by 176.63.22.138 (talk) 09:51, 27 February 2020 (UTC)
For posterity, here's a permalink to the section as it existed before I gutted it. It might be useful if someone ever wants to create a list like List of natural language processing models or something. Colin M (talk) 18:24, 9 March 2023 (UTC)