Knowledge (XXG)

:Bots/Requests for approval/BoxCrawler - Knowledge (XXG)

The following discussion is an archived debate. Please do not modify it. Subsequent comments should be made in a new section. The result of the discussion was Approved.

BoxCrawler

Operator: Alanbly

Automatic or Manually Assisted: Automatic, Semi-Supervised

Programming Language(s): Python (using a slightly modified pywikipediabot framework)

Function Summary: Adding the correct "needs-infobox" parameter to talk pages of articles using the {{WPSchools}} banner

Edit period(s) (e.g. Continuous, daily, one time run): One major run, then sporadic runs at the request of Knowledge (XXG):WikiProject Schools

Edit rate requested: There are 7999 pages in the category, so 5,000 - 10,000 (maximum) edits per run (about twenty per day after the initial run). Will be limited to 6 edits (15 accesses) per minute.

Already has a bot flag (Y/N): N

Function Details: This bot will visit every article from the category Category:WikiProject Schools_articles_by_quality, and its subcategories (those with the {{WPSchools}}). It will examine the corresponding article for each in the Mainspace (if there is one), and determine whether each has an infobox. The bot will then place the correct "needs-infobox" parameter on the {{WPSchools}} template (on the talk page). It will only add this parameter if the template is empty or if the rating given corresponds to an article (Stub, Start, B, GA, A, FA). It will also record all infoboxes found (or not found) in a log at User:BoxCrawler/Log.

Discussion

Sounds fine to me. How many edits per minute are you going to limit it to? -- Android Mouse 04:09, 27 June 2007 (UTC)

I forgot to say, it's generally advised that you have the word 'bot' somewhere in the bot's username. Also, Alanbly, can you edit this page to confirm it is you? -- Android Mouse 04:12, 27 June 2007 (UTC)

"Crawler" sounds bot-like enough. MaxSem 09:17, 27 June 2007 (UTC)

This is indeed me, and I preferred Crawler to Bot. As far as edit frequency, I'm not sure. As this is my first bot and it is purpose built, I don't really have a benchmark. Could you point me to one? Adam McCormick 13:42, 27 June 2007 (UTC)

I guess crawler sounds bot-like enough, not sure what I was thinking. 5 to 15 or so edits per minute is generally acceptable. -- Android Mouse 17:25, 27 June 2007 (UTC)

We need this bot as the project has had problems with old templates reporting that there is an infobox when there is not (or vice versa). Obviously erroneous judgements undermine any correct assessments that are on the same template. One suggestion is that we may be putting "needs infobox" on very small stubs. Suggest that if they are small then also set to "stub" and importance="low" (let's save a human from evaluating pages that have only twenty to sixty words on them). Oh and rate ... there are < 8,000 articles that have this template at present. Victuallers 18:30, 27 June 2007 (UTC)

I was not aware that you could create bots Adam, looks like a nice piece of work. The WikiProject Schools Assessment Team do a lot of repetitive stuff manually so getting a bot would be a great help given the amount of school articles there are! I do also like the idea of advancing this bot later to give obvious assessments automatically. Camaron1 | Chris 18:40, 27 June 2007 (UTC)

OK, what kind of infobox detecting regexps will it use? I suppose that most have the word "infobox" in them, I'd assume. Voice-of-All 00:05, 4 July 2007 (UTC)

Since you ask (and from that I'll assume you know something of the) I use the following (applied to the wikitext of the page):

getInfobox = re.compile(r'{{nfobox++|({\||<div)\s*class="*infobox*"')

Basically anything with "{{infobox" or "{{Infobox" followed by a space then some characters that aren't any of "\|}:" Adam McCormick 16:39, 4 July 2007 (UTC)

The only issue I can see with this strategy is that it doesn't work for subst:'d templates, but I can't see a way to fix this. Adam McCormick 23:51, 5 July 2007 (UTC)

I've added an extra case, for div boxes that use class="infobox"; maybe this will increase the catchment. So far it's working alright. Adam McCormick 06:04, 8 July 2007 (UTC)

Approved for trial. Please provide a link to the relevant contributions and/or diffs when the trial is complete. (POC) I think we can get a better idea for how this will run with a proof of concept; if you are ready to run, can you run it on, say, 25 edits and let us see the results? Additional questions above should still be addressed, and additional questions may appear once testing has begun. -- xaosflux 14:02, 4 July 2007 (UTC)

I'm having trouble with the login but as for proof of concept (or at least that the bot correctly edits the template), the edits have gone through under 138.67.78.236. Adam McCormick 00:00, 6 July 2007 (UTC)

Looks like I've got the glitch worked out. I'm going to run the 25 edits and I'll post here (and at your talk page) when they are complete. Adam McCormick 05:38, 8 July 2007 (UTC)

Test Run is complete, edits may be seen here. Adam McCormick 06:00, 8 July 2007 (UTC)

Test run review

A random sampling of 20 edits is showing a high error rate. See below for possible errors:

Vestal Senior High School (has infobox)
    Has what looks like an infobox but is really just a table
Talk:Upper St. Clair High School (removed someone's assessment that it already has an infobox, and it does)
    Infobox template name used an underscore, I'm adding that to my regex
University of Louisville (already has infobox)
    Same as above, regex
Trinity High School (Euless, Texas) (already has infobox)
    Same as above, regex
Staples High School (already has infobox)
    Once again, only looks like an infobox
Presbyterian Ladies' College, Sydney (already has infobox)
    Same thing, not actually an infobox

It looks like it may be missing certain types of infoboxes, especially ones that have been subst'd. -- xaosflux 16:48, 8 July 2007 (UTC)

It was missing infoboxes if they were placed with an underscore, not a space; this has now been fixed. As for the others, they do need infoboxes as they are currently hard-coded tables and we want to standardize the use of infoboxes as templates (so that each logical grouping of schools looks approximately alike). I'm not sure that it's possible to check for subst'd infoboxes unless I hardcode in every infobox on the wiki and that seems a little extreme. I can do this if it's the only way to run the bot though. Adam McCormick 20:03, 8 July 2007 (UTC)

What if needing an infobox adds the page to a category? If these articles with subst'ed infoboxes get categorized, whoever goes through the category can transclude an infobox onto the page. Adding them to the category will draw necessary attention to them. What do you think? WODUP 20:36, 8 July 2007 (UTC)

Putting the "needs-infobox=yes" parameter on the banner adds a page to Category:WikiProject Schools articles needing infoboxes already. Adam McCormick 20:46, 8 July 2007 (UTC)

So, I don't see that adding articles with subst'ed infoboxes to that category (as it apparently does now) would be a problem. WODUP 21:07, 8 July 2007 (UTC)

As many of these look to the casual observer (e.g. myself) to be infoboxes, although they are not "infoboxes" they certainly are collections of facts, in a box, in the normal infobox location. Can you link to the project or other discussion where consensus on which particular type of infobox has been decided on for this entire category of pages? -- xaosflux 22:49, 8 July 2007 (UTC)

The listing of which infoboxes to use is here and here and there is ongoing discussion about condensing/homogenizing infoboxes. Some examples are here, here, and here. In many ways, a bot is much more suited to determining whether these are or are not infoboxes. I will ask the other folks from the project to weigh in if this isn't enough. Adam McCormick 23:39, 8 July 2007 (UTC)

You may re-trial up to 50 edits if you think you got the last bugs out. Thanks, -- xaosflux 00:45, 11 July 2007 (UTC)

Alright, I'm rerunning the trial. Adam McCormick 01:07, 12 July 2007 (UTC)

Trial is complete and the results can be seen here. Note that the {{SG}} (Singapore) template was recently changed so the pages found first are those with this template. Adam McCormick 04:22, 12 July 2007 (UTC)

Actually, anything matching class*\=*(?:infobox|*infobox*) should probably be marked with {{newinfobox}} whether it is a school or not. -- freak (talk) 01:41, 18 July 2007 (UTC)

I didn't know about that template, I may add it now that I do. I was hoping to get approved for just checking for infoboxes before I tried to add any more spiffy functionality (like editing mainspace). Thanks. Adam McCormick 04:26, 18 July 2007 (UTC)

Not to be impatient, but what else needs to happen to get this approved? I'm hesitant to post to every BAG member again but I will if I need to. Adam McCormick 02:30, 19 July 2007 (UTC)

Approved. I don't see any obvious errors in your edits. Go, finish your project and let us know if you decide to do something drastically different from your current task, otherwise good luck. If you want a bot flag, only a bureaucrat would be able to assign that, and in case you didn't notice, nobody here is one. FWIW, if you're editing only outside of article space at non-ludicrous speeds, I doubt a bot flag would be necessary or helpful. Have fun. -- freak (talk) 06:49, 19 July 2007 (UTC)

Thanks, I'll get started. And can you put this into the approved section then, or bots needing tags or such, as I will be extending the functionality once the major task is complete. Adam McCormick 17:01, 19 July 2007 (UTC)

The above discussion is preserved as an archive of the debate. Please do not modify it. Subsequent comments should be made in a new section.

Category: Approved Knowledge (XXG) bot requests for approval
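
The infobox check discussed above can be illustrated with a short Python sketch. The regex quoted in the discussion appears to have lost its bracketed character classes in this copy, so the patterns below are not the operator's actual expression. They are an assumed reconstruction from the prose description only: "{{infobox" or "{{Infobox" followed by a space (or, after the trial fix, an underscore) and then characters other than "\|}:", plus tables or div boxes with class="infobox". The names INFOBOX_TEMPLATE, INFOBOX_CLASS and has_infobox are hypothetical.

```python
import re

# Assumed reconstruction from the prose description, not the bot's real regex.
# Template form: "{{Infobox" or "{{infobox", then spaces/underscores, then a
# name made of characters other than backslash, pipe, closing brace or colon.
INFOBOX_TEMPLATE = re.compile(r"\{\{\s*[Ii]nfobox[ _]+[^\\|}:]+")

# Hard-coded form: a wiki table ("{|") or a <div> whose class attribute
# contains the word "infobox" on the same line.
INFOBOX_CLASS = re.compile(
    r'(?:\{\||<div)[^>\n]*\bclass\s*=\s*"?[^">\n]*infobox',
    re.IGNORECASE,
)


def has_infobox(wikitext: str) -> bool:
    """Return True if the page wikitext appears to contain an infobox."""
    return bool(
        INFOBOX_TEMPLATE.search(wikitext) or INFOBOX_CLASS.search(wikitext)
    )


if __name__ == "__main__":
    print(has_infobox("{{Infobox School\n| name = Example High}}"))
    print(has_infobox('{| class="wikitable"\n|-\n| Example\n|}'))
```

As noted in the discussion, a check of this kind cannot recognise a subst'd infobox by template name, and a plain table that merely looks like an infobox is not matched unless it carries the infobox class.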
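
The requested edit rate of 6 edits per minute works out to one edit every 10 seconds. The following is a minimal sketch of how such a limit could be enforced in plain Python. It does not show how the bot or its framework actually throttles; the EditThrottle class and its parameters are hypothetical.

```python
import time


class EditThrottle:
    """Space out calls so that at most max_per_minute happen per minute."""

    def __init__(self, max_per_minute=6, clock=time.monotonic, sleep=time.sleep):
        # 6 edits per minute means a minimum gap of 10 seconds between edits.
        self.interval = 60.0 / max_per_minute
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # no edit made yet

    def wait(self):
        """Block until the next edit is allowed, then reserve the slot."""
        now = self._clock()
        if self._next_allowed is not None and now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._next_allowed
        self._next_allowed = now + self.interval


if __name__ == "__main__":
    throttle = EditThrottle(max_per_minute=6)
    for title in ["Talk:Example School A", "Talk:Example School B"]:
        throttle.wait()  # the second call pauses for about 10 seconds
        print("would edit", title)
```

The clock and sleep functions are injectable only so the spacing can be checked without real delays.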
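
The rule stated in the function details (only add the parameter if the banner is empty or its rating is one of Stub, Start, B, GA, A, FA) can be sketched as a small decision function. This is an illustration under assumptions: "empty" is read here as "no rating given", the helper names are hypothetical, and only the value "yes" for needs-infobox is confirmed by the discussion; "no" is assumed as its counterpart.

```python
# Ratings that correspond to an article, as listed in the function details.
ARTICLE_RATINGS = {"stub", "start", "b", "ga", "a", "fa"}


def should_set_needs_infobox(class_rating):
    """Decide whether the banner may be edited at all.

    class_rating is the banner's rating value, or None/"" when the banner
    carries no rating (assumed meaning of an "empty" template).
    """
    if class_rating is None or not class_rating.strip():
        return True
    return class_rating.strip().lower() in ARTICLE_RATINGS


def needs_infobox_value(article_has_infobox):
    """Map the detection result to the parameter value ("no" is assumed)."""
    return "no" if article_has_infobox else "yes"


if __name__ == "__main__":
    print(should_set_needs_infobox("Start"), needs_infobox_value(False))
    print(should_set_needs_infobox("Category"))
```

Non-article ratings are skipped because an infobox request makes no sense for a page that is not an article.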

Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.