Knowledge (XXG):Bots/Requests for approval/TweetCiteBot

(Redirected from Knowledge (XXG):Bots/Requests for approval/TweetBot)

The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Approved.

TweetCiteBot

Operator: TheSandDoctor
Time filed: 02:18, Thursday, October 26, 2017 (UTC)
Automatic, Supervised, or Manual: automatic
Programming language(s): PHP
Source code available: https://github.com/TheSandDoctor/TweetBot
Function overview: This bot converts tweet references from either bare links (so <ref>URL</ref>) or {{cite web}} to {{cite tweet}}.
Links to relevant discussions (where appropriate): Knowledge (XXG):Bot_requests#Bare_Twitter_URL_bot
Edit period(s): Periodic
Estimated number of pages affected: Approximately 4600-5000 mainspace articles
Namespace(s): Mainspace only.
Exclusion compliant (Yes/No): Yes
Already has a bot flag (Yes/No): No

Function details: The bot goes through a predefined list of pages in the mainspace, generated by AWB and database dumps, that have tweet URLs within them in order to convert them to the appropriate template ({{cite tweet}}). The bot is intelligent enough to know the difference between bare URLs and {{cite web}}, responding accordingly in each case, and can deal with combinations of the two. This bot is both Assert and Exclusion Compliant. The bot can be toggled on and off at any point once started here. Of course, while it reads false at the moment, changing it to "true" won't change anything as the bot is still pointed to my testing environment installation (and will only be "pointed" to this MediaWiki if approved).

Given the large number of pages that would be affected by this change, I would recommend/suggest that the account be given the 'bot' flag/group if approved so as to avoid cluttering up watchlists and recent changes.

Discussion

When the bot parses bare links, is it also parsing bracketed links as well? Being involved in IABot's development, I've encountered this so many times. Does the bot handle the little exceptions that MW makes with certain plain and bracketed URLs? I ask this since this isn't using AWB to make the edits. -- CYBERPOWER (Trick or Treat) 02:27, 26 October 2017 (UTC)

Hi Cyberpower678, do you mean links like (example not preserved)? If so, it does not currently support that, however, that is something I will look into adding ASAP if you would like (I just think that Tweets are more likely to be refs than straight URL links in this case). It currently looks for a bare tweet URL, so the form of twitter.com/account_name/status/numeric_string (either with or without www/https/http or any combination of those prefixes), as well as their first <ref> tag, and swaps those out (leaving the ref tag alone of course) for {{cite tweet}}, using the Twitter API to gather the necessary information. To be exact about that particular process, it takes the numeric string, which is the ID of the tweet, and then uses the API to pull in the account handle (so @username), the account display name (for use on Knowledge (XXG), that is probably their real name), the text of the tweet, and the date/time it was created (which it then converts using PHP's native date/time class, DateTime).

For a specific example: take this tweet on Mick Jagger's Twitter and, for the sake of argument, assume it contains some vital information and is considered an adequate source (yadayada) that resulted in it being a suitable reference in the GA. The bot would pull the tweet's details from the Twitter API and replace either the bare ref or {{cite web}} with the following:

{{cite tweet |author=Mick Jagger |user=MickJagger |number=923241052252844033 |date=25 October 2017 |title=Last rehearsal of the tour! It’s been an amazing run #StonesNoFilter #StonesParis}}.

In the case of {{cite web}}, it would first check if there are any instances of {{cite web}} on the page; if not, it would move on. If so, it would then check if the URL parameter (|url=), regardless of where it is, so long as it is before the closing curly brackets and has a pipe just before it, contains a Tweet. If so, it would then go through basically the same process as above (analyzing the link with the Twitter API to pull relevant info out of it, then replacing it).

Hopefully this provides some answers and I do apologize for the long length of the response. If you have any more questions or concerns, please do let me know. The source code is available on GitHub (link above), with all files, except for username.php, included, so feel free to check it out. The source code is fairly well documented and I will continue to work on it and update it throughout this process to address any concerns/suggestions raised. (In case you were wondering, username.php, which is referenced in most, if not all, of the files, only contains login information and Twitter API stuff that I will NOT be making public for obvious reasons.) (As a side note, I love the "trick or treat" themed sig) -- TheSandDoctor (talk) 03:05, 26 October 2017 (UTC)

Cyberpower678, I have started work on improving the bot to recognize the use of bracketed Twitter links and have developed the regex to recognize them. With that said, I feel that that may potentially be outside the scope of the bot, as it is set to convert to the {{cite tweet}} template, which bracketed links to Tweets are most likely not intended as (thinking of external links sections); they are probably not intended to be citations. I have updated the application for clarity in that the bot only edits mainspace articles. Also, please note that I have requested on my talk page that xaosflux change TweetBot's username, as Jonesey95 pointed out to me that "TweetBot" could potentially be confused as being too close to WP:CORPNAME and I did not realize at the time of filing that it may be confused with a Mac app sharing the same name (ironically enough, I am writing this on a Mac and actually wrote the bot on one). -- TheSandDoctor (talk) 20:18, 26 October 2017 (UTC)

Please see Knowledge (XXG):Changing_username for how to request a rename. -- xaosflux 20:22, 26 October 2017 (UTC)

Approved for trial (50 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. -- CYBERPOWER (Trick or Treat) 00:25, 28 October 2017 (UTC)

Thank you Cyberpower678, I will start the trial as soon as possible. -- TheSandDoctor (talk) 03:55, 28 October 2017 (UTC)

So far so good overall, but I am pausing for the moment to work out one alarming bug discovered (this). By all accounts, it shouldn't have happened, so I am going to go back to my testing environment on my own MediaWiki installation and test with an export of that page to see how to stop this from happening again. Will continue testing/troubleshooting ASAP (just don't have any more time at this immediate moment). I am glad that I took this slow, only letting it do 1 or 2 edits at a time (and then manually reviewing before moving on); that might have stopped it causing inadvertent vandalism. Will keep this thread posted. -- TheSandDoctor (talk) 06:11, 28 October 2017 (UTC)

Trial complete. Hello, trial has been completed. After some initial errors (which I promptly corrected), the errors that the bot has made have reduced significantly (and on the pages where it did err, I re-ran it and it worked correctly). The only issue left is that, in fixing a rare issue where the bot places {{dead link}} in the wrong spot (due to a regex issue), it broke its ability to recognize the access date or access-date parameters. With that said, it is an issue I am looking to remedy as soon as possible; however, it is not an overly serious issue, even though the bot does (at present) remove the access date when converting. Unlike Facebook posts, tweets cannot be edited, so the version seen at the time (access date) is the same version as when the tweet was first tweeted. In the event that the tweet was deleted (causing a 404 error), the bot does attempt to add an archive link (if it can't, it tags appropriately) and does not convert (as it relies on the Twitter API for Tweet information). -- TheSandDoctor (talk) 05:43, 11 November 2017 (UTC)

@TheSandDoctor: Speaking from experience when developing IABot, I can say for certain that relying on regexes to handle templates is not easy. It took forever to get the perfect balance of string parsing and regexes. -- CYBERPOWER (Chat) 13:49, 16 November 2017 (UTC)

@Cyberpower678: I agree with you, however, I do like the challenge that it presents and have been able to overcome any issues that arise fairly quickly. Based on the edits to date, it appears that I have ironed out the majority (if not all) of the issues, save for the access date parameter. If you would be more comfortable with another trial, I am open/happy to do that. I will fix the access date param as soon as possible (just busy with my studies as finals are fast approaching). -- TheSandDoctor (talk) 23:52, 16 November 2017 (UTC)

Approved. -- CYBERPOWER (Merry Christmas) 17:59, 2 December 2017 (UTC)

The above discussion is preserved as an archive of the debate. Please do not modify it.


Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.