Shortcuts: WP:BRFA WP:RBA WP:B/RFA WP:RFBOT WP:RFBA

If you want to run a bot on the English Wikipedia, you must first get it approved. To do so, follow the instructions below to add a request. If you are not familiar with programming it may be a good idea to ask someone else to run a bot for you, rather than running your own.

Instructions for bot operators

Instructions for approvals group members

Current requests for approval

BU RoBOT 11

Operator: BU Rob13 (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 04:25, Monday, April 11, 2016 (UTC)

Automatic, Supervised, or Manual: Automatic

Programming language(s): AWB

Source code available: AWB

Function overview: Substitute transclusions of a template to complete merges or carry out other consensus decisions at WP:TFD

Links to relevant discussions (where appropriate): Varies by template, but substitution will always follow a close as "merge", "substitute and delete", or similar at WP:TFD

Edit period(s): Multiple runs, based on need at WP:TFD/H and future TfD discussions

Estimated number of pages affected: Depends entirely on the template(s) involved. Could be anywhere from a couple hundred to tens of thousands in a single template.

Exclusion compliant (Yes/No): Yes

Already has a bot flag (Yes/No): Yes

Function details: Implementing consensus at TfD discussions often require semi-automated or automated work to merge/substitute templates. I've filed BRFAs in the past for simple substitution of wrappers to implement the consensus of various TfD discussions. See Task 8 for the most recent (including an example of the regex that would be used for this task). I believe there were others in the past as well. I'm hoping to get permission to substitute templates in that way more broadly, mostly to avoid clogging up BRFA with non-controversial bot tasks that are technically trivial. {{About3}} is currently pending substitution, and I'll use that for any trial if necessary (although you could also treat Task 8 as a large trial, which had no issues whatsoever).

I want to be very clear on what this task is not. This task will not be construed to be approval for anything involving adding/removing/changing/editing of parameters within a template using regex. It will not be used to invoke modules, as was the case in Task 2, since the use of a module is an extra layer of technical complexity that warrants scrutiny. It will only substitute transclusions of templates to implement a TfD consensus.

Discussion

To demonstrate need for this, {{Unicode}} needs substituting now, too (over 50,000 transclusions). ~ Rob^Talk 13:21, 12 April 2016 (UTC)
- I note that, if simple substitution is what you're after, you could just put the template into Category:Wikipedia templates to be automatically substituted and add it to User:AnomieBOT/TemplateSubster force. Anomie⚔ 18:45, 12 April 2016 (UTC)
  - @Anomie: Hmm, didn't even know that was a feature. Still, I think it's best to get this task approved because there may be situations where I need to substitute only in certain situations (i.e. if a template doesn't use parameter X, then substitute. Otherwise, do not and humans will come through later and take care of it.) Also, how would your bot handle a situation where 50,000+ transclusions need substituting? Would it "crowd-out" the type of work your bot usually does? ~ Rob^Talk 19:03, 12 April 2016 (UTC)
    - It shouldn't crowd out other work. Even the tasks on the same task runner shouldn't be affected too badly, since I usually code the jobs to yield after running for about 5 minutes. Anomie ⚔ 19:22, 12 April 2016 (UTC)

This is asking for blanket approval, however the most recent bot trial on this account still contains errors that could be caught with dry-runs (e.g., this trial had this bug twice within the first 10 edits) 10 edits is the absolute bare-minimum to check for unforeseen edge cases before allowing a bot to make edits. If we're being asked here to grant blanket approval, I'd only be comfortable supporting such a thing when at least a couple of conditions are met:

Trial safety — That is, it's clear the owner's proactive with testing and debugging their own code before "putting it on production." This means there's dry-run testing and/or manually-approved-edits happening, and that there's no obvious evidence to the contrary. A reasonably sized sample (basically think 25-50 edits—the same as most official bot trials) before allowing the bot to edit unsupervised is a decent chunk.
BRFAs by the owner develop a clear trend of being low-maintenance — you're essentially saying "I'm confident that I can bypass BAG, who'd otherwise check that the bot's going to run reasonably safely and within the bounds of site policies and guidelines, because I've taken clear steps and have sufficient wisdom to avoid needing them." There needs to be a clear trend of evidence to support that.

I'm concerned these are not yet met. For example, the most recent bot trial on this account contained errors that clearly would have been caught with a reasonable supervised run (i.e., this trial had this bug twice within the first 10 edits). While I'm fully aware that bugs can happen even after extensive testing, I feel this is evidence toward the lack thereof and therefore evidence for a continued need to involve BAG in testing before edits are made live—at least until a few more BRFAs show otherwise. In the grand scheme of things, it's not a huge bug, but it's a reflection of bug-detecting/bug-prevention methodology, which is at the core of a blanket approval.

--slakr^\ talk / 05:22, 16 April 2016 (UTC)

@Slakr: The regex that I'm requesting to use here does not change, however, and I've published it in Task 8. The only thing that changes is the template name. I'm asking for blanket approval to use that exact same regex with zero modifications to its structure on more templates. There were zero errors in that bot run (and its trial). The only situation in which I might use different regex is to restrict the sample, but that doesn't change the edits on the page, only what pages are used. If a mistake is made there, it merely results in less edits. Ultimately, it depends how BAG wants to handle things. I'm happy to spam trivial tasks that all look the same here, but it's just going to crowd up this process even more.

As a side note, I think we have different opinions on what trials are. I consider the trial the supervised run. I directly supervise the first 10 edits or so and then, if all is well, let the next 40 run and check them immediately afterward. This is mostly because I don't care to have the bot edits on my account and I can't, under bot policy, do a supervised run on my bot account before the trial. Obviously, I would check the first edits made by my bot when I swapped out template names, although it really doesn't change anything substantive. Better safe than sorry. ~ Rob^Talk 12:49, 16 April 2016 (UTC)

By the way, if blanket approval for these simple substitutions is not possible, please approve a trial here for just substituting {{Unicode}} in this BRFA to save us all at least a little time. ~ Rob^Talk 15:29, 18 April 2016 (UTC)

A user has requested the attention of a member of the Bot Approvals Group. Once assistance has been rendered, please deactivate this tag. ~ Rob^Talk 06:41, 22 April 2016 (UTC)

Matthewrbot

Operator: Matthewrbowker (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 17:26, Friday, April 1, 2016 (UTC)

Automatic, Supervised, or Manual: Automatic

Programming language(s): PHP

Source code available: Bot Source Code, Web-based tool source code

Function overview: Takes requests from a web-based form and places it on the appropriate subpage of Wikipedia:Requested Articles.

Links to relevant discussions (where appropriate): WT:RA

Edit period(s): Every half-hour (If there are requests pending)

Estimated number of pages affected: Wikipedia:Requested articles and sub-pages

Exclusion compliant (Yes/No): Not for this task, there is no need for exclusion compliance

Already has a bot flag (Yes/No):

Function details: This bot will take requests posted on a web-based form. It will sanitize the input to work with {{article request}} then post the request directly above {{User:Matthewrbot/Requests}}. If the template is not found, the bot will place the request at the bottom of the page and add the pages to Category:Requested Articles Pages with no template.

It will not re-add a request once it has been removed. The form itself contains a honeypot and eventually a Captcha based on Mediawiki's system.

Discussion

Note: Bot is not yet complete, I am still working on building it. Wanted to start the BRFA because it is a non-traditional request and I wanted to give time to handle concerns.

External Loads

What is this sending to bootstrapcdn? — xaosflux ^Talk 20:32, 1 April 2016 (UTC)

Nothing is sent to bootstrapcdn. The bootstrap styling is retrieved from the bootstrap cdn, as bootstrap hasn't designed their repo to allow for git submoduling. ~ Matthewrbowker ^{Drop me a note} 20:56, 1 April 2016 (UTC)

@Xaosflux: As of this commit, BootstrapCDN is no longer used. ~ Matthewrbowker ^{Drop me a note} 00:30, 5 April 2016 (UTC)

Thank you! — xaosflux ^Talk 00:55, 5 April 2016 (UTC)

Matthewrbowker it looks like your landing web page is explicitly sending to third parties again (code.jquery.com , maxcdn.bootstrapcdn.com) - is this the long term solution? — xaosflux ^Talk 19:22, 5 April 2016 (UTC)

@Xaosflux: Which one are you looking at? I don't believe I've deployed the fix to the live form yet. ~ Matthewrbowker ^{Drop me a note} 19:59, 5 April 2016 (UTC)

This link. — xaosflux ^Talk 20:09, 5 April 2016 (UTC)

That is the live version of the tool. Updated. ~ Matthewrbowker ^{Drop me a note} 23:28, 5 April 2016 (UTC)

Page configurations

@Matthewrbowker: I'm a little concerned that anyone can edit the tool at User:Matthewrbot/Config/1/interface/all. The idea is cute, but it clearly appears to allow arbitrary html injection, which is probably a significant security and privacy risk to our users. --slakr^\ talk / 02:46, 2 April 2016 (UTC)

@Slakr: A concern of mine as well. I contacted an admin via IRC several months ago for cascading semi-protection, but was told that the protection is unlikely to be applied unless I can demonstrate vandalism. Would caching of the strings solve this concern? Alternatively, I can move them into xml files on the tool itself. P.S. Did I handle the template right? If not, my apologies ~ Matthewrbowker ^{Drop me a note} 03:25, 2 April 2016 (UTC)

@Matthewrbowker: Cascading semi protection is not permitted because it's a security hazard. A plain full protection may be better if that "control" page has security implications.Jo-Jo Eumerus (talk, contributions) 17:55, 3 April 2016 (UTC)

There's some precedent for this sort of thing (meta:www.wikipedia.org template), but it still makes me uncomfortable. Besides, if one of us full-protects the pages then you won't be able to edit them. Also, wouldn't this mean the tool is constantly fetching pages from on-wiki whenever it's loaded? While caching could help, that's still an inherently expensive operation. I suggest taking the configuration off-wiki. — Earwig ^talk 21:36, 3 April 2016 (UTC)

@Jo-Jo Eumerus: @The Earwig: Acknowledged. I'm working on a quick patch that should be pushed tonight. It will move the configuration local. ~ Matthewrbowker ^{Drop me a note} 03:47, 4 April 2016 (UTC)

@Slakr: @Jo-Jo Eumerus: @The Earwig: Fixed in this commit Fixed version has been pushed to the test version of the tool. ~ Matthewrbowker ^{Drop me a note} 07:10, 4 April 2016 (UTC)

We can change the content model of this page to .js then it will be protected - would that work? (re: User:Matthewrbot/Config/1/interface/all) — xaosflux ^Talk 20:09, 5 April 2016 (UTC)

Example User:Matthewrbot/Config/1/interface/all/2. — xaosflux ^Talk 20:12, 5 April 2016 (UTC)

Hmm, you will have to log on with the bot's account to change that now though - that locks it to page owner and admins. — xaosflux ^Talk 20:14, 5 April 2016 (UTC)

A thought perhaps, the concern with the editable pages was allowing experienced users to edit the tool. As of right now, the local configuration is functional. ~ Matthewrbowker ^{Drop me a note} 23:28, 5 April 2016 (UTC)

@Xaosflux: Whoa, how did you do that...? — Earwig ^talk 04:06, 13 April 2016 (UTC)

Off site privacy

What type of privacy policy is in place here? As you are soliciting usernames, and have access to request and address information. — xaosflux ^Talk 00:56, 5 April 2016 (UTC)

@Xaosflux: See Labs Terms of use. I do not have access to IP addresses (they are stripped from the logs), so only username and request data is stored. ~ Matthewrbowker ^{Drop me a note} 19:03, 5 April 2016 (UTC)

Sample outputs

New question: What will the output on to wiki look like, can you make a post manually for example purposes? — xaosflux ^Talk 20:09, 5 April 2016 (UTC)

Using {{Article request}}, see User:Matthewrbot/example1 (Headings have different examples) ~ Matthewrbowker ^{Drop me a note} 23:28, 5 April 2016 (UTC)

The web form seems to have an extensive category selector - will that be posted on wiki as well? — xaosflux ^Talk 21:32, 6 April 2016 (UTC)

@Xaosflux: Pages will be in the following form: "Wikipedia:Requested Articles/[category]/[sub-category]/sub-sub category]." If the sub-sub category is "other" it is chopped off. This does require re-structuring the existing RA sub pages. ~ Matthewrbowker ^{Drop me a note} 21:52, 6 April 2016 (UTC)

Are there any rate limits to prevent someone flooding the tool? --slakr^\ talk / 04:38, 12 April 2016 (UTC)

@Slakr: The web-based form has no rate limiting as of yet, as I don't have an ability to really distinguish different users (Again, I don't have access to IPs). The bot will edit at a rate of one request every five seconds. ~ Matthewrbowker ^{Drop me a note} 05:27, 12 April 2016 (UTC)

Any plans to add a captcha of some form? --slakr^\ talk / 05:52, 16 April 2016 (UTC)

Yes, it's in the works. I have to write my own solution, as there's currently no captcha solution for labs (specifically one that's compatible with the ToU, as far as I know). ~ Matthewrbowker ^{Drop me a note} 05:55, 16 April 2016 (UTC)

Bots in a trial period

Cyberbot II 5a

Operator: Cyberpower678 (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 01:46, Tuesday, March 15, 2016 (UTC)

Automatic, Supervised, or Manual: Automatic

Programming language(s): PHP

Source code available: here

Function overview: Addendum to 5th task. Cyberbot will now review links and check to see if they are dead. Based on the configuration on the config page, Cyberbot will look at a link and retrieve a live status from the given source. It will update a DB value, default 4, about that link.

Links to relevant discussions (where appropriate): none

Edit period(s): continuous

Estimated number of pages affected: Analyzes 5 million articles, the initial run will probably affect half of that.

Exclusion compliant (Yes/No): Yes

Already has a bot flag (Yes/No): Yes

Function details: When the bot checks a link it runs that value against the bot's DB, and assigns it a value from 0 to 4. 0 represents the site being dead, 1-3 represents the site being alive and 4 indicates an unknown state and is the default value. Every pass the bot makes over a URL, if the URL is found to be dead at that moment, the integer is decreased by 1. If found to be alive, the value gets reset to 3. If it is 0, it no longer checks if it is alive, as a dead site found to be dead at least 3 times, is most likely going to remain dead and thus the bot will conserve resources.—^cyberpower_Chat:Online 01:46, 15 March 2016 (UTC)

Discussion

Checking if a link is really dead or not is a million dollar question because of soft 404s which are common. There is a technique for solving this problem described here and code here (quote):

Basically, you fetch the URL in question. If you get a hard 404, it’s easy: the page is dead. But if it returns 200 OK with a page, then we don’t know if it’s a good page or a soft 404. So we fetch a known bad URL (the parent directory of the original URL plus some random chars). If that returns a hard 404 then we know the host returns hard 404s on errors, and since the original page fetched okay, we know it must be good. But if the known dead URL returns a 200 OK as well, we know it’s a host which gives out soft 404s. So then we need to test the contents of the two pages. If the content of the original URL is (almost) identical to the content of the known bad page, the original must be a dead page too. Otherwise, if the content of the original URL is different, it must be a good page.

-- GreenC 04:40, 27 March 2016 (UTC)

Hi @Green Cardamom:. That is a good point. We've discussed this and decided, for now, to not check for soft 404s. It's never going to be 100% reliable. So for now, we're checking for: hard 404s (and other bad response codes) and redirects to domain roots only. It's less than optimal, but at least we can be sure we don't end up tagging non-dead links as dead. It turns out it's quite easy for a search engine or big web scrapers to detect soft 404s and various other kinds of dead links (ones replaced by link farms etc.). For this reason, we're seeking Internet Archive's help on this problem. They've been very helpful so far and promised to look into this and share their code/open an API for doing this. -- NKohli (WMF) (talk) 03:16, 28 March 2016 (UTC)

That would be super to see when available as I could use it as well. Some other basic ways of detecting 404 redirects is to look for these strings in the new path (mix case): 404 (eg 404.htm, or /404/ etc), "not*found" (variations such as Not_Found etc), /error/, . I've built up a database of around 1000 probable soft 404 redirects and can see some repeating patterns across sites. It's very basic filtering, but catches some more beyond root domain. -- GreenC 04:10, 28 March 2016 (UTC)

Awesome, thanks! I'll add those filters to the checker. -- NKohli (WMF) (talk) 04:23, 28 March 2016 (UTC)

Are the checks spaced out a bit? Something could be down for a few days and then come back up for a while. Also, can we clarify the goal here; is this to add archival links to unmarked links, or to tag unmarked links as dead which have no archival links, or to untag marked-as-dead links? — Earwig ^talk 05:57, 29 March 2016 (UTC)

Cyberbot can do all three, but the onwiki configuration only allows for the first two. Since Cyberbot is processing a large wiki, the checks are naturally spaced out.—^cyberpower_Chat:Online 14:44, 29 March 2016 (UTC)
{{BAGAssistanceNeeded}} Can we move forward with this?—^cyberpower_Chat:Online 14:03, 5 April 2016 (UTC)

"naturally spaced out" I would want some sort of minimum time here in the system... ·addshore· ^{talk to me!} 06:58, 8 April 2016 (UTC)

I can program it wait at least a day or 3 before running the check again. That would give the link 3 or 9 days, in case it was temporarily down.—^cyberpower_Chat:Online 15:31, 8 April 2016 (UTC)

Let's try 3 days of spacing. Is it easy to trial this component as part of the bot's normal runtime? Can you have it start maintaining its database now and after a week or two we can come back and check what un-tagged links it would have added archival links for or marked as dead? — Earwig ^talk 23:28, 9 April 2016 (UTC)

Unfortunately the bot isn't designed that way. If the VERIFY_DEAD setting is off, it won't check anything, nor will it tag anything. If it's on it will do both of those things. I can create a special worker to run under a different bot account so we can monitor the edits more easily.—^cyberpower_{Chat:Limited Access} 23:36, 9 April 2016 (UTC)

How often does the bot pass over a URL? (Ignoring any 3-day limits.) In other words, are you traversing through all articles in some order? Following transclusions of some template? — Earwig ^talk 01:32, 10 April 2016 (UTC)

Ideally, given the large size of this wiki, there would be unique workers each handling a list of articles beginning with a specific letter. Due to technical complications, there is only one worker that traverses all of Wikipedia, and one that handles only articles with dead links. So it would likely hit each URL much longer than 3 days, until the technical complication is resolved. What I can do is startup the checking process, and compile a list of urls that have a dead status of 2 or 1, which mean the URL failed the first and/or seconds passes.—^cyberpower_{Chat:Limited Access} 02:10, 10 April 2016 (UTC)

That's similar to what I meant by "Can you have it start maintaining its database now...", though as you suggest it might make more sense to check what's been identified as dead at least once so we don't need to wait forever. Okay, let's try it. Approved for trial (14 days, 0 edits). — Earwig ^talk 19:50, 10 April 2016 (UTC)

DB Results

In an effort to more easily show what is going on in Cyberbot's memory, I have compiled a list of URLs with a live status of 2 or 1, which indicate they have failed their first, or second, pass respectively.

DB results
Wikipedia:Bots/Requests for approval/Cyberbot II 5a/DB Results

I've looked through the first chunk of these results. It looks like there are several false positives. The 2 most common types appear to be:

Redirects that add or remove the 'www' hostname. This is bug in the soft-404 detection code. I'll create a Phabricator ticket for it.
Timeouts. Several pages (and especially PDFs) seem to take longer than 3 seconds to load. We should consider increasing the timeout from 3 seconds to 5 or 10 seconds. We should also just exclude PDFs entirely. I gave up on http://www.la84foundation.org/6oic/OfficialReports/1924/1924.pdf after waiting 3 minutes for it to load.

There are also some weird cases I haven't figured out yet:

http://au.eonline.com/news/386489/2013-grammy-awards-winners-the-complete-list sometimes returns a 405 Method Not Allowed error and sometimes returns 200 OK when accessed via curl. In a browser, however, it seems to always return 200 OK.
http://gym.longinestiming.com/File/000002030000FFFFFFFFFFFFFFFFFF01 always returns a 404 Not Found error when accessed via curl, but always returns 200 OK from a browser.

I confirmed that these are not related to User Agent. Maybe there is some header or special cookie handling that we need to implement on the curl side. Kaldari (talk) 00:28, 14 April 2016 (UTC)

Accord to Cyberpower, the bot is actually using a 30 second timeout (and only loading headers). I'll retest with that. Kaldari (talk) 00:45, 14 April 2016 (UTC)

Timeouts should be handled in a sane and fail-safe way. If something times out, any number of things could be going on, including bot-side, host-side, and anything in between. Making a final "time to replace this with an archive link" is premature if you're not retrying these at least a couple of times over the course of several days. Also, you might try to check content-length headers when it comes to binaries like PDFs. If you get back a content-length that's over 1MB or content-type that matches the one you're asking for (obviously apart from things like text/html, application/json), chances are the file's there and valid—it's highly unlikely that it's a 404 masquerading as a 200. Similarly, if an image request returns something absurdly tiny (like a likely transparent pixel sorta thing), it might also be suspicious. --slakr^\ talk / 04:14, 16 April 2016 (UTC)

Actually, it looks like yields two back-to-back 301 redirects. Following 5 redirects is sufficiently enough for most likely 99.99% of links I would guess. For example, if you're using curl, it's most likely CURLOPT_FOLLOWLOCATION + CURLOPT_MAXREDIRS, or on the command line, curl -L --max-redirs 5. --slakr^\ talk / 06:03, 16 April 2016 (UTC)

I'm not sure I follow with the timeouts. If it is a temporary think, the second pass will likely not timeout, and the status resets. When the bot checks a URL, it needs to receive a TRUE response 3 times consecutively, where each check is spaced apart at least 3 days, for it to be officially classified as dead and the bot to act on it.—^cyberpower_Chat:Offline 04:24, 16 April 2016 (UTC)

@Cyberpower678 and Slakr: The timeouts were a result of me testing URLs with checkDeadlink() which was the wrong function to test with, and having a very slow internet connection (since I'm in Central America right now). There should be no timeout issue with the actual bot as it's using a 30 second timeout and only downloading the headers. It looks like the real issue with http://www.la84foundation.org/6oic/OfficialReports/1924/1924.pdf is the user agent string, which will be fixed by [1]. As soon as you pass it a spoofed user agent, it returns a 200. I still have no idea what's happening with http://www.eonline.com/au/news/386489/2013-grammy-awards-winners-the-complete-list, though. I'm not sure how it's returning a different status code for curl than for web browsers (although it isn't 100% consistent). Kaldari (talk) 15:56, 18 April 2016 (UTC)

There could be bot detection mechanisms at work. Google bot detection and mitigation. Some techniques to fool remote sites you are not a bot. A legitimate looking agent string helps, not making too many repeat requests of the same site, not too fast. -- GreenC 14:48, 23 April 2016 (UTC)

Cyberbot only scans each page every 3 days. That should be spaced apart far enough.—^cyberpower_{Chat:Limited Access} 15:28, 23 April 2016 (UTC)

KharBot

Operator: Kharkiv07 (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 23:50, Wednesday, March 30, 2016 (UTC)

Automatic, Supervised, or Manual: Supervised

Programming language(s): Python

Source code available: No... willing to provide if issues arise

Function overview: Publishing The Signpost

Links to relevant discussions (where appropriate): Fully approved and endorsed by The Signpost's Editorial Board

Edit period(s): Weekly

Estimated number of pages affected: 15-ish per week directly (depending on the article count), and a few thousand

Exclusion compliant (Yes/No): No, but obviously Category:Opted-out of message delivery works

Already has a bot flag (Yes/No): No

Function details: This bot is replacing LivingBot in publishing The Signpost, but with a few changes. It starts by moving articles in Category:Next Signpost issue do the weeks version and cleaning them up a bit, then it makes the new main page, edits a few pages in The Signpost's namespace, and sends out the issue (via mass message and email). The bot can't do much damage during a trial except for the mass message, so if you want it to output the message and have me send it by hand to make sure it's running ship-shape, that's perfectly fine with me. Kharkiv07 (T) 23:50, 30 March 2016 (UTC)

Discussion

Is there a link to the discussion elsewhere - it will be very important to ensure that this bot is not fully operational without shutting down the functions of the existing bot. — xaosflux ^Talk 00:49, 31 March 2016 (UTC)

Jarry1250 please provide your comments below. — xaosflux ^Talk 00:50, 31 March 2016 (UTC)

@Xaosflux: Jarry hasn't been around much (Editorial Board members have been in contact with him off-wiki), and his bot is run completely supervised and only when it's activated; everyone who has the password to activate it is aware of the switch. Pinging editors-in-chief Gamaliel and Go Phightins! to make them aware of this request. Kharkiv07 (T) 03:09, 31 March 2016 (UTC)

We're all completely in support of this. Gamaliel (talk) 03:12, 31 March 2016 (UTC)

Yup. Go Phightins! 21:05, 1 April 2016 (UTC)

@Xaosflux: Is there anything you can do for us? I need to get an issue out today and LivingBot is acting up... I can confine the bot completely to Wikipedia:Wikipedia Signpost and its subpages and do anything with mass effect (email, mass-message) by hand. I know these things shouldn't be rushed but the hour it takes for three people to manually publish is something I want to avoid :) Kharkiv07 (T) 16:16, 1 April 2016 (UTC)

Approved for trial (50 edits or 1 days). — xaosflux ^Talk 17:37, 1 April 2016 (UTC)
Kharkiv07 LIMITED Trial, may make up to 50 edits in Wikipedia space only, may send MMS to up to 50 recipients (I've enabled MMS for this test). If you have other MMS's send them traditionally. — xaosflux ^Talk 17:37, 1 April 2016 (UTC)

abuse filter stuff that is resolved
@Xaosflux: I'm getting trapped by abuse filter 10. Ideas? Kharkiv07 (T) 17:52, 1 April 2016 (UTC) Actually, I don't know what filter, just MediaWiki:Abusefilter-warning-badmove Kharkiv07 (T) 17:55, 1 April 2016 (UTC) Appears to be filter 68. Kharkiv07 (T) 17:57, 1 April 2016 (UTC) Make more than 12 edits with the account and try again. — xaosflux ^Talk 18:01, 1 April 2016 (UTC) Note had to be live edits, not just 'actions' - just have the bot edit it's sandbox a few times. — xaosflux ^Talk 18:03, 1 April 2016 (UTC)

abuse filter stuff that is resolved

@Xaosflux: I'm getting trapped by abuse filter 10. Ideas? Kharkiv07 (T) 17:52, 1 April 2016 (UTC)

Actually, I don't know what filter, just MediaWiki:Abusefilter-warning-badmove Kharkiv07 (T) 17:55, 1 April 2016 (UTC)

Appears to be filter 68. Kharkiv07 (T) 17:57, 1 April 2016 (UTC)

Make more than 12 edits with the account and try again. — xaosflux ^Talk 18:01, 1 April 2016 (UTC)

Note had to be live edits, not just 'actions' - just have the bot edit it's sandbox a few times. — xaosflux ^Talk 18:03, 1 April 2016 (UTC)

Trial complete. Okay... so things could have gone better, most of the framework worked right but the bot got tripped up on a few minor things, causing me to restart it a few times later in the code. While it did require some human helping hands to fix it, I can figure out the few mistakes that were really just stupid things on my part in time for another trial run next week. Feel free to pose any questions, and I'll post a detailed report of what went wrong once we evaluate the situation more completely. Kharkiv07 (T) 20:29, 1 April 2016 (UTC)

What went wrong:

99% of errors can be chalked up to me coding when tired and/or being a moron; for instance two escape characters used the wrong slash ("/" instead of "\"), and two variables had spelling errors. Unfortunately I wasn't able to do a full test run before I did this because of the (over)complex nature of The Signpost's templates, of which there are hundreds that interact with each other in odd ways. Fixing these things, of which there's a list, will fix most issues.
2 pages didn't get created, the code exists; however when I had to restart the bot due to typos I skipped those by mistake
The main page and issue page used the full page name (Wikipedia:Wikipedia Signpost/2016-04-01/News and notes) instead of what should have been done (News and notes). Easy fix.
Many of the templates assume that the date of publication is a Wednesday, which we're changing with KharBot. These templates are being re-done.
Finally the mass-message and emails (almost) worked, see the talk page of this page. The one flaw was the wrong date; which was simply the wrong variable by me.

All in all, this is a win for never have running this code in its entirety, I'm confident we can do this with no issue next week. Thanks for the quick response with the trial! Kharkiv07 (T) 23:48, 1 April 2016 (UTC)

Side note, when the moves were made redirects were created and subsequently deleted, but I knew this would be an issue until the bot can suppress redirects. Kharkiv07 ([[User talk:|T]]) 23:50, 1 April 2016 (UTC)

Okay well I'm thoroughly bemused as to the value of reimplementing LivingBot from scratch in a different programming language just to make a few tweaks, but since you've done that bit already I guess I don't have much to add! Though I do think you should publish the source code, as I did for LivingBot: it is important to maintain continuity of service when things go wrong. Just let me know when you're done and I'll archive my interface. (Although I'll keep the source on Github in case useful.) - Jarry1250 ^{[Vacation needed]} 14:25, 3 April 2016 (UTC)

Approved for extended trial (30 days). I'm extending your trial, but in the meantime will request the bot flag be added so you can complete trialing of the move w/o redirect task. Please report back after next run. — xaosflux ^Talk 13:56, 4 April 2016 (UTC)

WP:BN request for flag during extended trial made. — xaosflux ^Talk 14:01, 4 April 2016 (UTC)

Kharkiv07 A bot flag has been added for the extended trial, please report back after the next run. — xaosflux ^Talk 16:15, 4 April 2016 (UTC)

{{OperatorAssistanceNeeded|D}} @Kharkiv07: Any updates on this? — xaosflux ^Talk 21:14, 17 April 2016 (UTC)

I have a lot on my plate and will get back to you on this today or tomorrow. Kharkiv07 (T) 16:21, 18 April 2016 (UTC)

@Xaosflux: Sorry for the lack of response. I screwed up the last run by adapting the code for the first run and not changing it back. My bad. I ran a comprehensive trial run yesterday and everything looks 100% ready to go now. Request extension as the The Signpost's publication has been a little behind due to external circumstances. Kharkiv07 (T) 18:42, 21 April 2016 (UTC)

Certainly, I bumped the extension out - I was hoping this would be ready to close :D — xaosflux ^Talk 21:15, 21 April 2016 (UTC)

CheckBot 4

Operator: Omni Flames (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 09:07, Monday, April 4, 2016 (UTC)

Automatic, Supervised, or Manual: automatic

Programming language(s): AutoWikiBrowser

Source code available: AWB

Function overview: Removes any mentions of "(retired)" in the "position=" parameter, in the Infobox footballer template.

Links to relevant discussions (where appropriate): Wikipedia talk:WikiProject_Football/Archive_102#Playing_position_.28retired.29

Edit period(s): One time run

Estimated number of pages affected: Approximately 500-1000

Exclusion compliant (Yes/No): Yes

Already has a bot flag (Yes/No): No

Function details: Simple task. As per discussion linked above, the use of "(retired)" is unnecessary. This bot will use a list from Category:Footballers to complete the task.

Discussion

Approved for trial (50 edits). — xaosflux ^Talk 13:53, 4 April 2016 (UTC)

Thanks, will run this soon. — Omni Flames (talk contribs) 10:05, 5 April 2016 (UTC)

When I try to make a list from Category:Footballers, it takes a long time to load and I haven't actually been able to make a list from it yet. Even when I try to use some of it's individual subcategories it takes a long time. Is there a better way to do this? — Omni Flames (talk contribs) 22:16, 10 April 2016 (UTC)

Apparently there are around 130,000 transclusions, identified at Special:WhatLinksHere/Template:Infobox_football_biography. Could you use the data from there (you could list 5000 at a time using Special:WhatLinksHere/Template:Infobox_football_biography&limit=5000, if that helps)? C 679 08:28, 14 April 2016 (UTC)

@Omni Flames: You might try seeding some hints from a site search on the rendered text (like this). It's not 100%, but it'll get you close plus throw in some slightly different ones to investigate as well. --slakr^\ talk / 03:42, 16 April 2016 (UTC)

A user has requested the attention of the operator. Once the operator has seen this message and replied, please deactivate this tag. (user notified) @Omni Flames: Do you still intend to trial this? — xaosflux ^Talk 15:10, 22 April 2016 (UTC)

@Xaosflux: Sorry for the delay in reply, I was away for a few days. I do still intend to trial this. I'm working on the regexes now, hopefully I'll be able to trial it in one or two days. Omni Flames ^{let's talk about it} 05:05, 23 April 2016 (UTC)

CheckBot 2

Operator: Omni Flames (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 00:53, Sunday, March 27, 2016 (UTC)

Automatic, Supervised, or Manual: Automatic

Programming language(s): AutoWikiBrowser

Source code available: AWB

Function overview: Monitors Category:Living people and replaces reference maintenance templates on BLP pages with the correct BLP template (e.g {{refimprove}} -> {{BLP sources}})

Links to relevant discussions (where appropriate):

Edit period(s): Weekly

Estimated number of pages affected: Approximately 25-50 pages a week.

Exclusion compliant (Yes/No): Yes

Already has a bot flag (Yes/No): No

Function details: The bot will replace the following templates using a list of pages in Category:Living people:

Refimprove with BLP sources
Primary sources with BLP primary sources
Unreferenced section with BLP unsourced section
Refimprove section with BLP sources section
Self-published with BLP self-published

Note that the replacement of Unreferenced with BLP unsourced is already performed by BattyBot, however the other templates are not replaced. These tasks would be extremely tedious to do manually using AWB because the program only allows the creation of lists up to 25,000 pages long and Category:Living people has over 700,000 pages. This plugin allows the creation of larger lists, however it's only available to those with the "apihighlimits" permission (bots and admins). The bot will use AWB's find and replace feature to switch out the templates. For example, Template:Refimprove will be replaced Template:BLP sources by finding instances of Template:Refimprove using the regex {{refimprove(.+|)}}. It will replace these instances with {{BLP sources$1}} (which means that it will keep any parameters such as date=March 2016). It will skip any pages which do not contain any of the above templates or contain the {{Nobots}} template.

Discussion

Approved for trial (50 edits). Try to do ten per template. Also ping GoingBatty to take a look. — Earwig ^talk 20:18, 3 April 2016 (UTC)

@Omni Flames: The first four bullets above are included as part of AWB's general fixes, so you shouldn't need to use regex for those. I submitted a feature request for the developers to add the last one to general fixes. GoingBatty (talk) 21:40, 3 April 2016 (UTC)

@The Earwig:, Okay I'll try for 10 edits per template, and @GoingBatty: Oh, thanks, I didn't realize that :) — Omni Flames (talk contribs) 22:17, 3 April 2016 (UTC)

Okay, so I'm running the bot now and so far it's checked ~3000 pages. Yet it hasn't made a single edit. I'm not sure whether this is due to an error or my part or the fact that there aren't that many pages that need to be changed. Thoughts? — Omni Flames (talk contribs) 08:15, 6 April 2016 (UTC)

I think you pretty much answered your own question; if the problem exists and isn't being solved already, you might double check your logic. --slakr^\ talk / 03:35, 16 April 2016 (UTC)

FastilyBot 5

Operator: Fastily (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 07:14, Monday, March 7, 2016 (UTC)

Automatic, Supervised, or Manual: Automatic

Programming language(s): Java

Source code available: When I have written it

Function overview: Find files tagged with both a free and non-free license tag, and apply {{Wrong-license}} to the file description page.

Links to relevant discussions (where appropriate):

Edit period(s): Weekly

Estimated number of pages affected: 1-2k

Exclusion compliant (Yes/No): no

Already has a bot flag (Yes/No): yes

Function details: See above -FASTILY 07:14, 7 March 2016 (UTC)

Discussion

This will probably mostly return false positives: photos of non-free sculptures both need a non-free copyright tag for the sculpture and a licence from the photographer. Unfortunately, there currently doesn't seem to be a way to tag such files so that they do not end up in both Category:All free media and Category:All non-free media. --Stefan2 (talk) 11:25, 7 March 2016 (UTC)

True, but it should be fairly trivial to exempt tags which might lead to false positives (e.g. {{Non-free 3D art}}) from the result set -FASTILY 11:54, 7 March 2016 (UTC)

Which tags would appear in that list? --Stefan2 (talk) 12:43, 7 March 2016 (UTC)

I've started an ignore list here; please feel free to add any other titles you can think of! -FASTILY 04:43, 10 March 2016 (UTC)

Photos of non-free 3D artworks often (but not always) use {{photo of art}} to indicate the two copyright tags, so I added that template to the list. I don't know if the current list is complete. --Stefan2 (talk) 22:11, 12 March 2016 (UTC)

Should files tagged as {{NFUR not needed}} also be ignored? Jo-Jo Eumerus (talk, contributions) 18:31, 24 March 2016 (UTC)

I'd assume so. Those files have been identified as free, but the fair use rationale needs to be converted to {{information}} and remains on the page since it provides some information of the file. --Stefan2 (talk) 21:44, 24 March 2016 (UTC)

@Stefan2 and Fastily: Do we feel that we can keep the false positive number low if we ran this through a small trial or two for adjusting the ignore list? If by the second one it's looking abysmal, we could just abort the idea. --slakr^\ talk / 03:24, 24 March 2016 (UTC)

There are currently 278 files which appear in Category:All free media and Category:All non-free media and which do not have any of the tags in User:FastilyBot/Task5Ignore. I took 20 random files from that set and checked them.

These files should not appear in both categories, and the file information pages should be edited to remove them from one of the cats:

These files appear in both categories, but should not be tagged with {{wrong license}} by FastilyBot for one reason or another:

File:2 euro mo series1.gif - different copyright tags refer to different parts of the image
File:Internet Explorer 4.png - different copyright tags refer to different parts of the image
File:Opera 7.02.png - different copyright tags refer to different parts of the image
File:Netscape9.png - different copyright tags refer to different parts of the image
File:Peruvian Airlines White Logo.jpg - already has {{wrong license}}, no need to add a second one
File:Internet Explorer 8 InPrivate.png - different copyright tags refer to different parts of the image
File:Nintendo - 1950.png - different copyright tags refer to different countries

It looks as if the number of false positives will go down a lot of {{Copyright by Wikimedia}} is added to the list of exempted templates as it seems that many false positives are screenshots which show a non-free web browser and a Wikipedia page. I'll go through the first set of files in my list above and fix them. --Stefan2 (talk) 00:33, 25 March 2016 (UTC)

Approved for trial (50 edits). While I'm not too thrilled about the prospect of false positives, it looks like there's a belief it might be mitigated with proper whitelisting. I'd strongly recommend a dry run. --slakr^\ talk / 02:21, 2 April 2016 (UTC)

Agreed, I'll do a few dry runs and try to refine the rule set before actually making any edits. -FASTILY 23:45, 3 April 2016 (UTC)

Nyubot

Operator: Nyuszika7H (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 17:32, Monday, February 22, 2016 (UTC)

Automatic, Supervised, or Manual: Manual

Programming language(s): AutoWikiBrowser

Source code available: AWB

Function overview: Moving articles to the appropriate subcategory of Category:Lists of television series episodes, removing redundant parent categories

Links to relevant discussions (where appropriate): WT:TV#Category clean up recruitment - Category:Lists of television series episodes

Edit period(s): Continuous

Estimated number of pages affected: Probably around 1,000–2,000 at most

Exclusion compliant (Yes/No): No

Already has a bot flag (Yes/No): No

Function details:

Remove Category:Lists of television series episodes from articles that already contain one of its subcategories
~~Replace Category:Lists of children's television series episodes with Category:Lists of Disney shows' episodes‎ where applicable~~
Move pages containing both Category:Lists of American television series episodes and Category:Lists of comedy television series episodes to Category:Lists of American comedy television series episodes
Add pages to country-specific subcategories of Category:Lists of television series episodes

Discussion

I think it does not need to be exclusion compliant as all edits are manually checked before saving. But it can be if that's preferred. nyuszika7h (talk) 17:32, 22 February 2016 (UTC)

On #4 I have a feeling that the country-specific versions are likely non-diffusing (i.e., it's fine for the pages to be both in a genre / the parent and be categorized by country). It'd be a pain to try to navigate the category tree by country+genre instead of country and/or genre. #2 might be non-diffusing, as well, as a "Disney show" isn't inherently only for kids (e.g., take a peek at List of channels owned by Disney). --slakr^\ talk / 04:27, 25 February 2016 (UTC)

@Slakr: I wasn't suggesting removing country-specific ones from genre categories (it's only about adding a country category where none exists), but I don't think they need to be in the main category. As for Disney, I realized that most of them are actually in both the children's and Disney categories. The latter is currently a subcategory of a former, which is okay for most cases as the target audience tends to be kids (doesn't mean it's only for kids), but I guess that subcategorization could be removed if there are shows with a different target audience. nyuszika7h (talk) 09:47, 25 February 2016 (UTC)

How would you do it by bot, though? (1) should be fine, but I agree that (2) is going to be tricky; let's keep the genre and company subcats separate. — Earwig ^talk 04:03, 28 February 2016 (UTC)

{{OperatorAssistanceNeeded}} @Nyuszika7H: ^ --slakr^\ talk / 03:25, 24 March 2016 (UTC)

Actually, I'll just do #2 manually (at least add the Disney category if it's missing somewhere). For #4, I'm not sure yet, it was one of the requested things. I might have some ways to infer that, but it might end up having to be more manual checking and adding with HotCat – in the latter case, can I still use the bot account for that? Though #2 is kinda the same kind of inferring too, so I don't know. But if preferred, I can just use main account for those, I won't edit a lot at once like that anyway. nyuszika7h (talk) 10:25, 24 March 2016 (UTC)

Approved for trial (25 edits). You can use whatever account you'd like for hotcat so long as you're making supervised edits (i.e., you're clearly reviewing each one for accuracy before commiting the change). We're mainly concerned about ensuring the bot doesn't screw up. Humans hate having to clean up after bots, and humans who make bots hate having to make their own bots to undo the changes of a bad bot. --slakr^\ talk / 02:16, 2 April 2016 (UTC)

@Nyuszika7H: ^ --slakr^\ talk / 03:31, 16 April 2016 (UTC)

@Slakr: Sorry for the delay, I'll do the trial later today. nyuszika7h (talk) 08:13, 16 April 2016 (UTC)

FastilyBot 8

Operator: Fastily (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 00:42, Friday, April 1, 2016 (UTC)

Automatic, Supervised, or Manual: Automatic

Programming language(s): Java

Source code available: Once I have written it

Function overview: Finds files flagged with {{Nominated for deletion on Commons}} that have been deleted on Commons and applies {{Deleted on Commons}} accordingly.

Links to relevant discussions (where appropriate): n/a

Edit period(s): Bi-weekly

Estimated number of pages affected: 5-10 per run

Exclusion compliant (Yes/No): No

Already has a bot flag (Yes/No): Yes

Function details: As described above. -FASTILY 00:42, 1 April 2016 (UTC)

See also: Wikipedia:Bots/Requests for approval/FastilyBot 7 -FASTILY 00:42, 1 April 2016 (UTC)

Discussion

Approved for trial (100 edits). — xaosflux ^Talk 10:54, 1 April 2016 (UTC)

Please ensure that the {{deleted on Commons}} template always contains the file name. For example, FastilyClone removed the filename in Special:Diff/713468926. Since {{deleted on Commons}} may remain on pages for many years, we should not rely on the local file not being moved. --Stefan2 (talk) 17:50, 11 April 2016 (UTC)

APersonBot 6

Operator: APerson (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 22:10, Saturday, March 5, 2016 (UTC)

Automatic, Supervised, or Manual: automatic

Programming language(s): Python

Source code available: https://github.com/APerson241/APersonBot/blob/master/wp-go-archiver/wp-go-archiver.py

Function overview: The bot archives WP:GO to a subpage and clears it out for a new week.

Links to relevant discussions (where appropriate):

Edit period(s): Weekly

Estimated number of pages affected: 1

Exclusion compliant (Yes/No): n/a

Already has a bot flag (Yes/No): Yes

Function details: The bot follows the directions at Template:Editnotices/Page/Wikipedia:Goings-on to archive WP:GO.

Discussion

Approved for trial (15 days). Very straight forward, report back after running if there were any issues or complaints. — xaosflux ^Talk 00:24, 6 March 2016 (UTC)

@APerson: It would be nice if the bot would check whether the page is already archived (on a certain week) or not, so it could avoid double archiving. (like there) Also if the bot follows Template:Editnotices/Page/Wikipedia:Goings-on, than it should archive at 00:00:01 and not more than 2&1/2 hours later. Armbrust ^{The Homunculus} 13:28, 13 March 2016 (UTC)

Most recent run has completed without errors; the bot was 3 minutes late due to some MediaWiki-enforced timeouts that I only saw later in the logs. APerson (talk!) 00:07, 20 March 2016 (UTC)

Marking as

Trial complete. --slakr^\ talk / 03:33, 24 March 2016 (UTC)

@APerson: I take it the two moves on the 13th issue has been sorted? --slakr^\ talk / 02:52, 29 March 2016 (UTC)

Slakr, yes, I fixed it. The cron configuration was a bit screwy, which resulted in the task being run quite late. As you can see from the last two runs, it's been doing quite fine. APerson (talk!) 02:59, 29 March 2016 (UTC)

Approved for extended trial (30 days). Juuust in case. I think 14 days (on something that gets archived only twice during that period) was probably a little short. This should hopefully give a better sample, though I don't foresee any major issues if everything's fixed. =) --slakr^\ talk / 03:24, 29 March 2016 (UTC)

I guess I should post an explanation on here about why the most recent run was a bit late: everything went well (the cron job successfully found the file, which is an improvement over last time) except I forgot to chmod +x the actual shell file (which we're using for the first time this week). Anyway, everything should be working perfectly next time. APerson (talk!) 03:31, 3 April 2016 (UTC)
@APerson: For some reason the bot didn't archive the page today at all. Any idea, why? Armbrust ^{The Homunculus} 22:59, 17 April 2016 (UTC)

Considering that the bot worked perfectly last week, I have no idea. I haven't looked at the logs yet, because I'm away from a computer that can SSH at the moment. I'll be back with an answer tomorrow. I suspect there was something funny going on with login sessions on tools, but I have no idea. APerson (talk!) 03:24, 18 April 2016 (UTC)

Armbrust, I just confirmed that the fault was with the login session, not with the bot's code. I've logged it in again; it should be working fine for next week. Interestingly enough, task 5 also encountered some screwiness with login sessions, but there it was confirmed that login sessions were an entirely one-time issue. I hope that's also the case here. APerson (talk!) 02:01, 19 April 2016 (UTC)

Bots that have completed the trial period

GreenC bot

Operator: Green Cardamom (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 16:29, Sunday, March 13, 2016 (UTC)

Automatic, Supervised, or Manual: Automatic

Programming language(s): ~~AWK~~ Nim

Source code available: WaybackMedic on GitHub

Function overview: Fix known problems with Internet Archive wayback machine links and page formatting errors introduced by Cyberbot IABot between December 2015 and March 2016.

Links to relevant discussions (where appropriate):

Edit period(s): one time run

Estimated number of pages affected: est. 20k pages of ~100k checked (the corpus of all articles edited by Cyberbot IABot from 20151231 to 20160310).

Exclusion compliant (Yes/No): Yes

Already has a bot flag (Yes/No): No

Function details: User:Green Cardamom/WaybackMedic lists details

Discussion

Do to the large scope, this will likely require multiple trials, and a community response period. Sometimes these are easier to show as demonstrations, so your first small trial is approved, please post results below when ready. — xaosflux ^Talk 17:00, 13 March 2016 (UTC)

Approved for trial (50 edits). — xaosflux ^Talk 17:00, 13 March 2016 (UTC)

During development a trial run was made in full manual mode. Checked each edit and verified using offline tools, then uploaded via AWB. It processed the first 500 articles edited by Cyberbot (starting Dec 14 2015). Of those it found corrections were needed in 94. The 94 edits can be seen [2] starting 11 March at 7:21pm with a subject line "wayback medic using AWB". If this trial run is acceptable I can run the next 250 article batch (they have to be done in batches) which should roughly correspond to 50 edits. -- GreenC 19:31, 13 March 2016 (UTC)

OK, also - added AWB "bot" access. As your account is not yet flag for botedit, please limit use a AWB delay rate of 10. — xaosflux ^Talk 21:15, 13 March 2016 (UTC)

Alright thanks. I've been in touch with Internet Archive and they provided documentation on a new version of their API to use so once I get that coded and tested, will run the next batch of 250 articles (~50 edits) using GreenC bot. -- GreenC 21:50, 13 March 2016 (UTC)

@Green Cardamom: It looks like {{Dead link|bot=...}} is a thing. I dunno if that param is truly critical in the grand scheme of things, but I'd suggest supplying it with the bot's username. Also, pinging @Cyberpower678: into the loop. --slakr^\ talk / 02:31, 16 March 2016 (UTC)

Cyberbot should not be tagging external links as dead yet.—^cyberpower_{Chat:Limited Access} 03:05, 16 March 2016 (UTC)

I was not aware of the bot param and can easily add it in case someone wants a record trail. WaybackMedic is re-adding the dead link template after it was removed by Cyberbot so the decision to tag the source dead is not original to WaybackMedic (or Cyberbot). That distinction may or may not matter. -- GreenC 04:13, 16 March 2016 (UTC)

GreenC bot has completed it's trial run. The edits are view-able here, ending with the Moscow theater hostage crisis. -- GreenC 20:35, 18 March 2016 (UTC)

The bot has gone through a major overhaul to incorporate new API, features. I added an additional 25 edits to the previous 33 show some of it. -- GreenC 21:57, 20 March 2016 (UTC)

Marking as

Trial complete. so someone will drop by to check the diffs (possibly even me a little later). Either way, another trial is probably a good idea due to the overhaul. --slakr^\ talk / 03:29, 24 March 2016 (UTC)

Slackr; There was a bug in this trial run. A number of articles were dropping the query portion of the URL due to a problem with urlencoding in the post to the API. Example. This is fixed. -- GreenC 15:59, 24 March 2016 (UTC)

Approved for extended trial (50 edits). --slakr^\ talk / 02:47, 29 March 2016 (UTC)

Trial complete. Trial complete. -- GreenC 18:59, 29 March 2016 (UTC)

{{BAGAssistanceNeeded}} - there were no errors with the trial. -- GreenC 20:47, 10 April 2016 (UTC)

I've never seen a bot in Awk before. I think you deserve some kind of award... — Earwig ^talk 02:49, 11 April 2016 (UTC)

I agree with Earwig. →Σ σς. (Sigma) 02:55, 11 April 2016 (UTC)

Thanks. I'm happy with awk, fun and easy. It's specialized for text processing so is ideal for wiki text processing (with some external programs for any networking etc) -- GreenC 04:59, 11 April 2016 (UTC)

This edit broke a citation template, unfortunately. — Earwig ^talk 02:58, 11 April 2016 (UTC)
This edit replaces a broken archive with a... incorrect archive? I'm not sure. — Earwig ^talk 03:25, 11 April 2016 (UTC)
In this edit, the bot removes a broken archive leaving the original link, which is some ad-spam nonsense. I guess that's because it appears to be up? I'm not sure if there's much we can do here. — Earwig ^talk 03:36, 11 April 2016 (UTC)

That's all I've reviewed so far (stopping at Henry Fox Talbot). Other than those things, it looks good. — Earwig ^talk 03:38, 11 April 2016 (UTC)

The original cite had an invisible LF character at the start (maybe the text was uploaded from a Windows text file). I've added a strip().
Edit is correct. If archive.org has no available snapshots it will query Memento which is an index to a dozen or so other archives (WebCite, Library of Congress, national archives). This is an unusual condition, though, 98% or more will be from Wayback.
If there is nothing available at Wayback (or other archives) it restores back to the original non-working link.

-- GreenC 04:59, 11 April 2016 (UTC)

But for the second one, the WebCite link doesn't appear to be valid: the date given is from 2005 but the news story is from 2008, and the link leads to an image download. Is it really archiving the right page? — Earwig ^talk 16:15, 11 April 2016 (UTC)

You're right. Unfortunately this is bad data from the Memento API. Here is the API request:

http://timetravel.mementoweb.org/api/json/20090101075242/http://www.themusic.com.au:80/imm_display.php?s%3Dchristie%26id%3D556%26d%3D2008-08-12

Returns the following JSON output (Pastebin). WaybackMedic tries to find the nearest match to 20090101075242 that isn't archive.org or archive.is .. in this case "first" dated 2005-11 at WebCite. Not sure what can be done here other than report it to Memento. In total there are only about 3-400 links to alternative archives in the whole set (I've already run it to completetion offline). After WM has completed I'll go through and check the WebCites that have this unusual truncated URL, fix any articles and send the data to Memento. Other spot checks things looked ok. -- GreenC 19:59, 11 April 2016 (UTC)

Ok I found 20 out of the 61 WebCite URLs don't work. The non-working all take the form of a 3-character (or less) URL path, the working have a 9-character path (http://www.webcitation.org/5lZ39OFsi), so it is easy to fix and is now fixed. -- GreenC 21:13, 11 April 2016 (UTC)

{{BAGAssistanceNeeded}} - I understand that in the 30+ days of this bot's trial, a single editor Earwig found two problems. Those problems are edge cases that, had the bot run to completion, would have impacted an estimated 25 of 25,000 edits or a bot accuracy rate of 0.999 though there might other unknown edge cases that bring it up to .99 or something. No other editor has raised concerns. Meanwhile the problems that MediaWiki is trying to fix are becoming worse -- editors attempt to fix them manually, and by doing so break things making it impossible for WaybackMedic to actually make the fixes it is designed for (eg. they see a link doesn't work, remove it and add {{cbignore}} making it impossible for WaybackMedic to replace with a working link). Each day that goes by WM's edit ability to fix problems is degraded. -- GreenC

Just letting you know: you probably shouldn't {{tl}} the assistance template if you want it to show up on the main status page. Anyway, I'm at work now. I wanted to finish going through the trial, and I haven't had time... Anyone else? — Earwig ^talk 16:13, 13 April 2016 (UTC)

Ok. Thank you for the assistance. -- GreenC 16:39, 13 April 2016 (UTC)

I've listed the remaining 22 edits in the second trial below. -- GreenC 18:03, 14 April 2016 (UTC)

Trial results

Trial results are at User:Green_Cardamom/WaybackMedic/trial2.

There's still enough red X's to justify more trial, edge cases keep showing up. I'd like to run in batches of 25 which seems manageable, using the same method above above. Hopefully it won't need more than another 50-75 edits, but however long it takes. I'll log the results on sub-pages to avoid making this page too long. -- GreenC 21:15, 15 April 2016 (UTC)

Would it be possible to dry run some of these instead of making live edits in production? e.g., log what would have changed to a sub-page in the bot's userspace or just manually review `diff` output, for example. We shouldn't have to post-mortem numerous trial runs. By the third trial, this should be at production readiness. There should also be clear evidence that there's been large amounts of self-testing without disruption to the production environment. --slakr^\ talk / 05:39, 16 April 2016 (UTC)

If you only knew how much dry run testing as been done! And of course I will continue to do so. In the trial's 47 edits, 99 changes were made of which 4 had fixable bugs that were difficult to spot (edit 1, 6, 14 & 38), or about a 4% error rate. That's not good enough, but it's close. The bot by its nature will always contain 'mistakes' (edits 5 and 17) that can't be helped, it's the nature of a constantly changing Internet. -- GreenC 13:48, 16 April 2016 (UTC)

I had mostly skipped the first trial due to your comment that the bot had been reworked, so as far as I'm concerned, there's only been one trial. I don't see any problem with a "second" one closely-monitored in small batches; Linus's Law comes into play, and the error rate is small enough to avoid damage.

Approved for trial (100 edits max, in 25-edit batches). — Earwig ^talk 22:42, 20 April 2016 (UTC)

Earwig, my problem has been lack of graphic in-line diffs so bugs were hard to spot and I was depending on the live trials to pick out the remaining bugs. I took User:Slakr's advice and setup some dry runs in User space and the last one ran mostly clean. Also, I ported the bot to a new language, Nim, which compiles to C with assembly optimizations and it's now running about 400% faster and half the memory. The port wasn't difficult as Nim can be made to look like other languages such as Awk or Python, the Nim code appears close to the Awk code. I uncovered some deep bugs along the way, and added some new features and optimizations, so it's a much improved version. -- GreenC 15:21, 21 April 2016 (UTC)

(edit: and I'll run some live trials next) -- GreenC 16:03, 21 April 2016 (UTC)

Trial 3 results

Trial complete. - results:

The trial articles were hand-picked to stress test the software's feature set. There was one bug in 51-75 that in production would have impacted few articles (required two rare conditions to occur simultaneous). -- GreenC 14:32, 23 April 2016 (UTC)

BU RoBOT 10

Operator: BU Rob13 (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 23:29, Saturday, April 2, 2016 (UTC)

Automatic, Supervised, or Manual: Automatic

Programming language(s): AWB

Source code available: AWB

Function overview: Tag articles identified by WikiProject Women as being under their umbrella with {{WikiProject Women}}, assuming more specific project templates aren't already on the page.

Links to relevant discussions (where appropriate):

Edit period(s): Multiple runs based on lists generated and approved by the WikiProject

Estimated number of pages affected: First list currently has ~~20,191~~ 12,174 articles. See User:Edgars2007/Women_tag/Women.

Exclusion compliant (Yes/No): Yes

Already has a bot flag (Yes/No): Yes

Function details: WikiProject Women is looking to automatically tag articles that fall under their umbrella. They're currently considering different criteria that could be automatically tagged, and I'd like this task to be approved for use on whatever lists they come up with in the future as well. The first such criteria they've agreed upon was "all articles which have an identically named article on the German Wikipedia that has been placed in the category 'Frau' (i.e. Women)," and I think the specificity of this criteria shows that the project is exercising sufficient caution in what is automatically tagged. Articles that already have banners of more specific projects related to women will not be tagged (list found at bot request, linked above). The original request and discussion occurred a while back, so the list of articles will be regenerated before the full task is run. A trial should be fine on the existing list.

By the way, this runs with genfixes to handle the placement of project templates appropriately. ~ Rob^Talk 21:07, 3 April 2016 (UTC)

Discussion

Under the assumption that the bot checks for redirects to the banners that it skips: Approved for trial (50 edits). — Earwig ^talk 20:55, 3 April 2016 (UTC)

Yes, it checks for redirects. Thanks, will run this later today. ~ Rob^Talk 21:04, 3 April 2016 (UTC)

Please note that I updated the number of pages above to a substantially higher number. I miscounted before due to an apparent error/limit in AWB. If you use "Links on page (only bluelinks)", it will only populate the list with up to 5,000 links at once. ~ Rob^Talk 21:27, 3 April 2016 (UTC)

@The Earwig: Stopped the trial after a single edit because the bot choked on something it shouldn't have. This is being handled with the "prepend text" option in AWB with genfixes used to place the template where it really belongs on the page. Unfortunately, for some reason, AWB is not throwing this template within {{WikiProjectBannerShell}} when it exists on the page. I double checked that genfixes are applied after "prepend text", and they are. See [3]. Magioladitis, could you comment on why AWB isn't treating this like a WikiProject template? ~ Rob^Talk 21:42, 3 April 2016 (UTC)

Hmm... Could you do the auto-assessment, too? — Earwig ^talk 21:48, 3 April 2016 (UTC)

Not sure what "auto-assessment" is? I've done some more testing and the "prepend text" definitely isn't the problem. When I fed the page through AWB a second time with just genfixes enabled, it returned no changes. I appear to have encountered an unfortunate bug. ~ Rob^Talk 21:52, 3 April 2016 (UTC)

Auto-assessment would typically be copying the class specified in other project banners if they're all identical, as most WikiProjects use the same classification system based on WP 1.0's guidelines. In [4], it could be identified as a stub. — Earwig ^talk 21:58, 3 April 2016 (UTC)

@The Earwig: I'm 99% sure that AWB genfixes will handle the auto-assessment once we sort out why this template isn't being detected as a WikiProject template. ~ Rob^Talk 22:07, 3 April 2016 (UTC)

@Rob: According to Wikipedia:AutoWikiBrowser/Order of procedures, talk page general fixes come before prepending text, which is consistent with your edit. Also, AWB genfixes do not include auto assessment. GoingBatty (talk) 03:04, 5 April 2016 (UTC)

┌────────────────────────────────────────────────────────────────────────────────────────────────────┘ Alright, thanks GoingBatty. Let me test some more stuff in my userspace and then I'll do the trial. I'm going to have to separate this out into cases, I think, and I'll include at least a few edits of each case in the trial. ~ Rob^Talk 08:55, 5 April 2016 (UTC)

I've split this into four cases. The first three cases (WikiProjectBannerShell on page, WikiProject template on page without BannerShell, and no WikiProject templates on page) are handled by regex and include auto-assessment if the WikiProject templates are of the form "WikiProject foo". It will not auto-assess if a redirect such as "WP foo" is used. It also will not auto-assess if an unexpected value is found in the class parameter. The last case is when no talk page exists. That will be handled with the prepend text separately. It appears a very small number of pages will fall under that case (i.e. this bot will not mass-create talk pages). ~ Rob^Talk 09:39, 5 April 2016 (UTC)

Is this request still active? I am confused. I recall that I have started it at some point but then I stopped for some reason. -- Magioladitis (talk) 10:04, 5 April 2016 (UTC)

It is still active, yes. If I recall correctly, you mentioned in the request that you never started because there wasn't consensus for the women's history aspect. This request omits the history stuff and just completes the part with consensus. ~ Rob^Talk 10:51, 5 April 2016 (UTC)

I updated the WikiProjects module and bypassed all redirects in question. -- Magioladitis (talk) 13:48, 5 April 2016 (UTC)

Trial complete. Here's the 50 edits. I was under the impression genfixes placed WP banners at the tops of talk pages whether or not a BannerShell was there, but it only does that as part of the BannerShell stuff. I've corrected the regex to take this into account. By default, the banners will now be placed at the top of the page instead of the bottom (which you'll see a couple times in the edits - lazy regex where I thought genfixes would pick up the slack). There's one really weird edit, where multiple WP banners were pulled out of a BannerShell: [5]. This seems like a genfix oddity rather than anything I've done. I think a small extended trial would be good. ~ Rob^Talk 18:13, 5 April 2016 (UTC)

@Rob: AWB's WikiProjectBannerShell fixes only work with templates that begin with "WikiProject" and not with their redirects that start with "WP". You might want to load User:Magioladitis/WikiProjects as a custom module to replace the redirects and make sure you're using the latest version of the AWB SVN. GoingBatty (talk) 02:52, 6 April 2016 (UTC)

@GoingBatty: I've added the module. What's the benefit of the latest SVN? I've always gone with the release version as the latest stable version. Stability is extremely important for a bot run, as an error could potentially impact thousands of talk pages. ~ Rob^Talk 11:03, 6 April 2016 (UTC)

There were some changes in talk page general fixes since the latest stable release that could be crucial. -- Magioladitis (talk) 11:22, 6 April 2016 (UTC)

@Magioladitis: You would consider the latest SVN to be stable enough for a bot run? If so, I'll go with that. ~ Rob^Talk 11:36, 6 April 2016 (UTC)

@Rob: {{WikiProjectBannerShell}} was recently changed to {{WikiProject banner shell}} - the module and SVN account for that, but there hasn't been a release version for that yet. GoingBatty (talk) 01:40, 7 April 2016 (UTC)

{{BAGAssistanceNeeded}} Need an extended trial on this one using the latest SVN and module and with some slightly tweaked regex. ~ Rob^Talk 21:26, 14 April 2016 (UTC)

Approved for extended trial (50 edits). I'm assuming the mistake was [6], which it looks like you cleaned up. Though, it does look like another bot might have already done this task (or possibly portions of it). Be sure to take this into account. --slakr^\ talk / 03:53, 16 April 2016 (UTC)

{{BotWithdrawn}} As per the above. I'm beyond befuddled that another bot operator said he'd take up the task, dropped it without comment for over a month, then completed the task out of nowhere without communicating this to me after seeing that I had filed this BRFA. This was a colossal waste of everyone's time. ~ Rob^Talk 04:22, 16 April 2016 (UTC)

Rob, Slakr I did at some point a part of this task because of the WPBS issue that works correctly only in latest SVN of AWB and not in the stable release (A new release is up this week though), that the list of redirects was not correct. I stopped after realising the list of pages was not entire clean and I sensed they will be troubles again. -- Magioladitis (talk) 07:06, 18 April 2016 (UTC)

{{BotWithdrawn}} Magioladitis (talk) 13:47, 18 April 2016 (UTC)

If there's still work to be done, this can be kept open. I checked a good 50 pages and found zero without the project template, plus the bot's contributions showed thousands done. @Magioladitis: Can you clarify what you mean by "not entirely clean" before I proceed with the extended trial on the latest SVN? ~ Rob^Talk 15:28, 18 April 2016 (UTC)

Rob Well, not anymore :( After you said you withdraw I went forward and tagged the 12,000 pages with Yobot. Next time we 'll cooperate better since we need more tagging bots. This is for sure. -- Magioladitis (talk) 20:36, 18 April 2016 (UTC)

You think, there isn't anything to tag anymore (for WP:Women)? You're wrong (I assume) ;) The updated list (I assume, Marios didn't used it). And we have enwiki categories also to tag, so ... --Edgars2007 (talk/contribs) 20:57, 18 April 2016 (UTC)

I'll see about running a trial on that list in a bit. The problem here wasn't really our coordination; it was the bot approvals process. If this hadn't sat here for weeks waiting on approval, the task would have been done ages ago. There should absolutely be a gateway for people to operate bots on the project, but as it stands, bot approvals are a net negative for the project. Simple tasks done by competent bot operators are being held up for weeks or even outright declined (see the direction of my 11th BRFA...). Meanwhile, those who don't bother going through the process can get a lot of productive work done while the BAG turns a blind eye. See Wikipedia:Bots/Requests for approval/wargo32.exe, where I brought up the fact that the bot op was operating the bot unapproved and was summarily ignored. At this point, I'd rather sit at my computer clicking away at the Save button in AWB than go through the ordeal of the approvals process unless the bot is making north of 5,000 edits, because it's faster than trying to get the fully automated version through a BRFA. I have no idea when the BAG moved away from being a safety check that made sure bots weren't destroying anything or doing things against consensus into being an online incarnation of the DMV. The whole process fails WP:NOTBUREAUCRACY at this point. ~ Rob^Talk 21:46, 18 April 2016 (UTC)

Is it OK if I go and merge all undone pages in one single page by removing also all pages taht are tagged wiht a more specific banner? -- Magioladitis (talk) 22:29, 21 April 2016 (UTC)

After some additional discussion at WP:WikiProject Women, the project has decided to go with a more restricted list to address concerns that the current one is a bit noisy (I think due to redirects, but I didn't compile the original list myself, so I don't know exactly how). I'm holding off on a trial for a few days until they have time to look over the new list, and then I'll post it here and run the trial. The new list already removes ones with more specific templates on it, I believe (although I still will keep the code to skip articles with more specific banners just in case). ~ Rob^Talk 22:39, 21 April 2016 (UTC)

This is the problem with this bot request from the beginning. Bot requests on tagging should be very specific otherwise there will be complains. Thanks. When you are ready please show the final list and the number of pages ot be tagged. -- Magioladitis (talk) 06:23, 22 April 2016 (UTC)

The process was followed correctly here. There was a category of pages discussed for tagging at the WikiProject, an editor requested that specific list to be tagged at Bot Requests, and I filed the request with that list as the basis. Since BRFAs are meant to address technical issues (not consensus-gathering) and the WikiProject indicated they had future tagging processes coming down the pipeline, I requested approval for tagging of WikiProject Women pages more generally (similar to how your own BRFA for Yobot allows blanket tagging rather than a specific list, although I constrained my BRFA to a single WikiProject so the code will not change whatsoever from run to run). The issue here has been that the WikiProject's members have changed their mind about the list several times. I was given a list with 20k articles, then one with nearly 50k articles, and now one with 3k articles (although a large part of the reason for the latest drop was filtering out the 20k you did and any that have more specific templates, which would have been skipped anyway). Unfortunately, that's well outside of my control. It's up to the WikiProject to determine what they want tagged, not the bot op, so I'm kind of at their whims there. I will post the list here when it's confirmed, probably tomorrow. ~ Rob^Talk 06:37, 22 April 2016 (UTC)

At User:Yobot you can find the rues I have formulated about WikiProject tagging. -- Magioladitis (talk)

Yes, those were seen prior to filing the BRFA and followed. Step 4 took long enough that the project had revisited step 3/5 before the task was completed. The list is User:Edgars2007/Women tag, and I'll be running the trial later today. There are 3,399 pages on the list, although a few will be skipped (redlinks, for instance). The WikiProject is currently discussing categories on en-wiki for tagging with auto-assessment, but this is all they've decided on for now. ~ Rob^Talk 15:27, 22 April 2016 (UTC)

Trial complete. Edits are here. One extremely small spacing issue at Talk:Alix Le Clerc which is fixed. No other errors that I noticed. ~ Rob^Talk 20:11, 22 April 2016 (UTC)

Spacing
No tag -- Magioladitis (talk) 22:25, 22 April 2016 (UTC)
- No spacing is fixed as mentioned above (and not really a problem in that situation; it caused no problem). The no tag was skipped due to one of the skip criteria to ensure nothing is mistagged. As for why the purely cosmetic edit occurred, I double checked that I had cosmetic-only and genfix only changes set to skip, and they are. Seems like an issue with AWB. ~ Rob^Talk 23:10, 22 April 2016 (UTC)

Yes. Known issue. See T132286 -- Magioladitis (talk) 23:29, 22 April 2016 (UTC)

@Magioladitis: Ah. Thanks for the information. Is there any known work-around at this time? ~ Rob^Talk 02:03, 23 April 2016 (UTC)

Josvebot 12

Operator: Josve05a (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 23:34, Sunday, April 17, 2016 (UTC)

Automatic, Supervised, or Manual: automatic

Programming language(s): AutoWikiBrwoser

Source code available: AWB

Function overview: Will tag orphaned talk pages (talk pages without existing corresponding article page) for speedy deleteion.

Links to relevant discussions (where appropriate):

Edit period(s): Continuous

Estimated number of pages affected: 100-3000 per run

Exclusion compliant (Yes/No): Yes

Already has a bot flag (Yes/No): Yes

Function details: Will run this SQL and prepend {{db-g8}} to all talkpages using AutoWIkiBrowser. If you are a sysops, see this edit I did with my account.

Discussion

Question: Will this bot operate only in article space, i.e. main space? There are many talk pages outside of article space that do not have corresponding non-talk pages, e.g. Help talk:Citation Style 1/Archive 7. Thanks – Jonesey95 (talk) 03:24, 18 April 2016 (UTC)

The quarry results are showing pages in all sorts of namespaces - I think there are too many false-positives if you were to do this. — xaosflux ^Talk 03:40, 18 April 2016 (UTC)

I disagree with this strategy, as it would flood CAT:CSD - would be better to make it in to an on-wiki report for processing. — xaosflux ^Talk 03:40, 18 April 2016 (UTC)

I can limit this to namespace-related talk pages if so seems more appropriate. (And I agree, it does. See quesion raised by me below) And I can space out the edits with one tag per minute, in order to not flood CSD (60 tags per hour doesn't seem to be flooding imo), since this will be continuous it seems better to tag them directly with a bot, than list them and wait for a human to tag them, then most likely en masse and really flood stuff. (t) Josve05a (c) 05:56, 18 April 2016 (UTC)

@Xaosflux: You say the query results in non-mainspace-results; can you give an example? I can only find namespace-matches. (t) Josve05a (c) 07:48, 18 April 2016 (UTC)

@Jonesey95: The query should exclude results with %/% (titles with slashes, sush as subpages). (t) Josve05a (c) 07:50, 18 April 2016 (UTC)

@Jonesey95: - in you own link above, look at the namespace results; notice 2600:Topic; 711:Timed Text Talk; 447:Education Program talk; 119:Draft Talk, etc.... — xaosflux ^Talk 11:48, 18 April 2016 (UTC)

@Xaosflux: Oh right, sorry. I haven't coded in exceptions for those mainsspaces yet, and I was looking at a downloaded query result which had filtered out all !=1 titles. I will only operate for namespace talk pages. (t) Josve05a (c) 12:27, 18 April 2016 (UTC)

Thank you. — xaosflux ^Talk 13:00, 18 April 2016 (UTC)

I'm still concerned with your false positive rate, can you run your query, and produce a wiki-page of all the pages that your bot would tag. This needs to be reviewed by humans to find out your error rate before this can even begin live-trials. Your bot may run these reports in its own userspace.

{{BotTrial}} — xaosflux ^Talk 13:00, 18 April 2016 (UTC)

{{BotTrialComplete}} @Xaosflux: Ignoring all the already deleted pages, which the bot will skip, here is the list: User:Josvebot/Orphaned talk pages/2016-04-18. (t) Josve05a (c) 14:54, 18 April 2016 (UTC)

Thank you - a very quick check of some random pages looks clean - would like to give anyone else some time to go through these to see if there is a legitimate reason they have been left behind. I'll post at AN for feedback. — xaosflux ^Talk 15:18, 18 April 2016 (UTC)

This message is being sent to inform you that there is currently a discussion at Wikipedia:Administrators' noticeboard regarding an issue with which you may have been involved. Thank you. — xaosflux ^Talk 15:22, 18 April 2016 (UTC)

Note: removed "Trial" tags - the report looks good for review - after comments an actual trial of the bot's actions will be needed. — xaosflux ^Talk 15:24, 18 April 2016 (UTC)

It seems to mistakenly list talk pages which are redirects and are talk pages of articles which are also redirects. Talk:2016-17 LIU Brooklyn Blackbirds men's basketball team is the talk page of the existing 2016-17 LIU Brooklyn Blackbirds men's basketball team and should not be deleted as G8 -- and that's the first link I spot-checked! ☺ · Salvidrim! · ✉ 15:27, 18 April 2016 (UTC)

@Salvidrim: Hm, that's actually an interesting thing. The "redirect" article was created today (merly hour(s) beofre the quey, so there were some server lag...I'll try and work on a fix for that. (t) Josve05a (c) 15:55, 18 April 2016 (UTC)

Actually both the article and its talk page were auto-renamed by AnomieBOT (for dashes reasons), but for some reason the talk page was renamed approx. 9 hours before the article, and your list happened to be generated within the time gap between the first move and the second one (server-wise). ☺ · Salvidrim! · ✉ 16:01, 18 April 2016 (UTC)

Hmm..I didn't see that the article was created after the talk page (for whatever strange reason). This should not happen to often and should be well below the acceptable false possitive ratio. (But I'll try and see if I can do something about this.) (t) Josve05a (c) 16:03, 18 April 2016 (UTC)

If the list were smaller, I could effectively do a dry run before each tagging session and remove all occurrence of existing article page with AWB. But it can't be checked while tagging. (t) Josve05a (c) 16:19, 18 April 2016 (UTC)

What about subpages? Please set the bot to ignore pages with slashes in the title, unless both (1) the slash-free title doesn't exist [e.g. Talk:AC/DC would be ignored as long as Talk:AC exists], and (2) the corresponding article doesn't exist. I'm just concerned that the bot might start tagging archive pages (after all, United States/Archive 22 isn't an article), making extra work and perhaps causing a few pages to be deleted that shouldn't be. Also, how does the bot handle G8 exceptions? WP:G8 reminds us that we shouldn't delete useful-but-orphaned pages, and that we should tag such pages with {{G8-exempt}}; does the bot know that it shouldn't touch those pages? These are the only exceptions that come to mind; otherwise I think this bot a great idea. Nyttend (talk) 17:49, 18 April 2016 (UTC)

@Nyttend: Both these things are caught and excluded in the SQL query. See discussion at the top for slashes. The sql should not list articles tagged with that template, and if you see the example I listed in the description at the top (if you have sysops) then you'll see the edit summary asks that you replace the db-tag with the exempt tag if tagged in error. (t) Josve05a (c) 18:51, 18 April 2016 (UTC)

I also have a custom 'skip-RegEx', for such as the exempt-template, active MfD and CSD-templates etc. when editing. (t) Josve05a (c)

I'm sorry; I missed the part about slashes when I was reading quickly through this discussion, and I didn't pay attention to the edit summary when I looked at the deleted test. No remaining concerns on this issue. New issue: Sphilbrick's comment about articles in talkspace. What if the bot checked to see if each orphaned talk page has a deletion-log entry for its corresponding article, and if there's no such entry (i.e. the article never existed), the bot adds a cleanup category to the talk page? I'm thinking something like "Possibly orphaned talk pages", which of course would be tagged with {{Empty category}}; we admins could always run through it at random, deleting or draftifying them as appropriate. Right now I'm sleepy, so I have to acknowledge that perhaps I missed a spot in which you addressed it. Nyttend (talk) 05:47, 19 April 2016 (UTC)

Currently I have no way to check this with my current set-up. i do however believe the amount of "real" articles in talk-space that's worth keeping is so low that is miniscule. However, I could make the bot add something like {{Orphaned talk}} instead of tagging with {{db-g8}}, but I do not see that this would be such a thing that would warrant a complete re-coding of everything. (t) Josve05a (c) 11:21, 19 April 2016 (UTC)

@Josve05a: I see here that the bot will ignore pages tagged with {{G8-exempt}}, but will it also ignore pages in Category:Wikipedia orphaned talk pages that should not be speedily deleted? Asking since all pages tagged with the template will be in that category, but not necessarily all pages in that category will be tagged with that template. Also, {{G8-exempt}} has incoming redirects. Steel1943 (talk) 15:51, 22 April 2016 (UTC)

It should skip those as well, but~they do still show up in the list of article that the bot will list and "try and edit", but it will not be able to edit or process those. (t) Josve05a (c) 15:57, 22 April 2016 (UTC)

Why does Talk:Ocean Beach (Bluff Harbour) appear in your query output? It appears to have a corresponding article page, and both pages were not created recently. I must be missing something. – Jonesey95 (talk) 19:55, 18 April 2016 (UTC)

Talk:Ocean Beach (Bluff Harbour) does not appear. Talk:Ocean Beach (Bluff Habour) does, because Ocean Beach (Bluff Habour) does not exist. ☺ · Salvidrim! · ✉ 20:01, 18 April 2016 (UTC)

I knew I was missing something. Thanks. – Jonesey95 (talk) 20:08, 18 April 2016 (UTC)

A few comments:

I'd like to hear from @Aleenf1: If that name doesn't register, that editor proposes orphaned talk pages at CSD every Monday morning. I don't know that editors process, and there is, of course, no requirement that any such editor has to be exhaustive, but I'm puzzled by the observation that we have a dedicated editor working on this and apparently some items that may have been missed
I'd like to see a little more discussion of false positives. The first few items in this list require some careful review. Per MOS, articles about sports seasons should have an en-dash in the date. Per the very reasonable assumption that some people searching for such a page might do a query with an ordinary dash rather than an en-dash, it is common to create a redirect with the ordinary dash. We ought to be clear on whether that redirect ought to have a talk page. I think a good argument is that it should not, but it isn't clear to me that it is an orphaned talk page.
One of my concerns is that people mistakenly create an article in talk space rather than article space. This used to be more common before the draft space was created, but may still occur. I would prefer that such mistakes be moved to draft space, even though they technically are orphaned talk pages. I've made this proposal before without much success, but if a bot simply checks to make sure that the talk page exist in the article does not, it is likely to miss that the page was intended as an article.--S Philbrick (Talk) 23:31, 18 April 2016 (UTC)

Regarding point two, see above, this was caused by the "article" was created after the talk page, but before the query run which made the list, so there were no redirect on the article page then.

Regarding point two, I've yet to see an article in talk page which was worth keeping (neither in draft or in ns space). It is much more likely someone creates a talk page asking a question such as "why doens't this article exists" than someone creating a complete article in talk. Also see my respons to Nyttend above. (t) Josve05a (c) 11:21, 19 April 2016 (UTC)

Adding some of my own comments/questions:

Way back when, there was Wikipedia:Database reports/Orphaned talk pages which I used to work on. It broke when the Toolserver went away. I think I prefer this list based method as it was easier to slam through the list instead of dealing with the bulleted format of CAT:CSD. Would it be easier to fix this report instead of writing a new bot? @Athaenara: used to work on that list a lot with me, so I would like to see what their opinion is.
Does the bot honor {{G8-exempt}}?
I'd like to see a throttle on the bot (assuming it doesn't go to the single page list) so that it doesn't tag more than 50 orphans at one time and then wait until the tagged count goes below a certain number before tagging the next set. Similar to how HasteurBot was going through the G13 backlog. That way CAT:CSD doesn't get slammed.
Not a comment or question, just a pat on the back for @Aleenf1: who has been doing a great job for a long time and I just wanted to take the opportunity to thank them again. -- Gogo Dodo (talk) 03:20, 19 April 2016 (UTC)

Regarding point one, not everyone likes every "metod". However I do believe tagging for deletion is better than listing on a "list page" and wait for someone else (non-admin) to tag them. Especially if we get the list down to below 100. (t) Josve05a (c) 11:21, 19 April 2016 (UTC)

Regarding point two, see multiple responses above. (Yes it does)

Regarding point three, this could be done. Or set a time throttle between each edit.

Regarding point four, I agree! Thanks for all your work Aleenf1. (t) Josve05a (c) 11:21, 19 April 2016 (UTC)

When the report was running, there really was not much CSD tagging going on. What usually happened is that the report would run and some interested admin (usually myself or Athaenara) would slam through the deleting all the appropriate ones or doing whatever else was necessary. That is the main disadvantage of a list in that there are fewer admins potentially working on it from lack of visibility. I just find it easier to deal with the list instead of CAT:CSD, but then I did a lot (IMHO) of the G13 deletions when HasteurBot was working through the backlog.

Regarding the throttle, a throttle of the number of orphans it tags at once a la HasteurBot is better than a time throttle per edit. The point of throttling the total number is to not overwhelm CAT:CSD. If the bot is set to limit say one orphan per 5 seconds, you could still overwhelm CAT:CSD if no admin deletes the tagged orphans. If there is a limit of 50 open tags, then CAT:CSD will never grow too large. -- Gogo Dodo (talk) 03:18, 20 April 2016 (UTC)

I'm all for compromises. I could keep posting these updates, and tag X number of files for csd per day. How does that sound? (t) Josve05a (c) 05:00, 20 April 2016 (UTC)

It isn't tag X per day. It is on given interval N, tag X number of orphans unless X is already tagged either by the bot or somebody else and not deleted (i.e., count of Category:Candidates for speedy deletion as dependent on a non-existent page exceeds X). -- Gogo Dodo (talk) 06:20, 21 April 2016 (UTC)

FYI: Thparkth resurrected Wikipedia:Database reports/Orphaned talk pages with Community Tech Bot. I just wiped out the entire listing there (I only made a few mistakes). Of course, now every time I see a fill in box, I want to paste in "G8: Talk page of a nonexistent or deleted page". I think my deletion count is over 100k now. Woohoo! -- Gogo Dodo (talk) 06:29, 21 April 2016 (UTC)

Determining automated G8 tagging strategy

From the discussion above, it appears that the current back log has been removed, and a list-based method at Wikipedia:Database reports/Orphaned talk pages has been reactivated with another bot. That being said, in order for this tagging bot to go forward we need to know if there is a consensus for automated G8 tagging at all. — xaosflux ^Talk 15:18, 22 April 2016 (UTC)

Discuss:: Community comments for this proposal have been solicited at the talk pages for CAT:CSD, WP:CSD. — xaosflux ^Talk 15:28, 22 April 2016 (UTC)

Support:

Support, but only if the bot also adds instructions to there tags to inform the reviewing administrator to remove the tag and replace it with {{G8-exempt}} in the event of false positives, as well as ignores pages in Category:Wikipedia orphaned talk pages that should not be speedily deleted. Steel1943 (talk) 15:47, 22 April 2016 (UTC)

Oppose:

JJMC89 bot 4

Operator: JJMC89 (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 05:54, Wednesday, March 23, 2016 (UTC)

Automatic, Supervised, or Manual: Automatic

Programming language(s): AutoWikiBrowser

Source code available: AWB (User:Magioladitis/WikiProjects)

Function overview: Complete discontinuation of comments subpages.

Links to relevant discussions (where appropriate):

Wikipedia:Discontinuation of comments subpages ( | talk | history | links | watch | logs)
BOTREQ (permalink)

Edit period(s): One time run initially, upon request afterwards if needed

Estimated number of pages affected: 35,000 per part initially

Exclusion compliant (Yes/No): Yes

Already has a bot flag (Yes/No): Yes

Function details: Complete discontinuation of comments subpages.

Part A: subst /Comments subpages (via {{subst:substituted comment/subst}}). List compiled from Category:All articles with assessment comments (and the categories here if needed depending on the job queue).
Part B: redirect /Comments subpages to the main talk page. List compiled from Category:Pages whose comments subpage can be redirected.

Part A will be run with all automatic changes and User:Magioladitis/WikiProjects.

Discussion

Please see my comments at Wikipedia:Bot requests#Substitute and redirect /Comments subpages. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 21:51, 25 March 2016 (UTC)
I also object; just treat them as archive pages with non-standard names, per my comments at same locale as Andy's. — SMcCandlish ☺ ☏ ¢ ≽^ʌⱷ҅_ᴥⱷ^ʌ≼ 19:09, 26 March 2016 (UTC)
More complex logic, while extra work to implement, might be useful. For example, if the comments page is quite new (newer than the most recent talk edit), I'd subst it to the bottom of the talk page. If it's older than anything on the current talk page, I'd put it in the archive box. If the page doesn't have an archives, you could add it to the top of the talk page. But it might not be worth adding all that functionality. — Earwig ^talk 02:44, 29 March 2016 (UTC)
Thanks for the comments. Interesting ideas from everyone, although quite complicated as Earwig acknowledges and probably unnecessary in my opinion. These pages typically consist of a single sentence of questionable value - I invite you to sample a few at Category:All articles with assessment comments. I don't believe it is worth retaining a whole archive page for them; the vast majority of these talk pages do not have any archives yet anyway. Some may be gibberish or even vandalism that has never been spotted; bringing it to the talk page will increase the likelihood that it will get attention. A disadvantage of keeping a separate archive page is the danger of it getting separated when a page move occurs - much neater to have all comments on one talk page. It might be worth asking JJMC89 to comment on how feasible the above suggestions actually are. My own opinion favours the simple method of substituting at the bottom of the talk page - I don't believe this will disrupt current discussions, and any long comments will be collapsed. Regards — Martin (MSGJ · talk) 23:53, 29 March 2016 (UTC)
@Pigsonthewing, SMcCandlish, The Earwig, and MSGJ: As far as I know, AWB cannot use the state of another page to make changes to the page it is editing, so placing the comments in different places based on time is out unless someone can write a template to parse everthing. Instead of substituting /Comments, {{Archives}} could be added or |1=[[/Comments|Article assessment comments]] added to an existing {{Archives}} (Unless someone can come up with a working regex, this would override any |1= already present.). /Comments could then could be wrapped with {{Archive top}} and {{Archive bottom}} with {{Soft redirect}} added. Starting point for an AWB module to add {{Archives}}:

Code
public string ProcessArticle(string ArticleText, string ArticleTitle, int wikiNamespace, out string Summary, out bool Skip) { Skip = false; Summary = ""; string zerothSection = WikiRegexes.ZerothSection.Match(ArticleText).Value; string restOfArticle = ArticleText.Replace(zerothSection, ""); string archives = @"\{\{\s(archives\|archive[ _]box)(\s\\|([^{]\|\{[^{]\|\{\{[^{}]+\}\})+)\}\}"; if ( Regex.IsMatch(zerothSection, archives, RegexOptions.IgnoreCase) ) { ArticleText = Regex.Replace(zerothSection, archives, "{{$1\|\n [[/Comments\|Article assessment comments]]\n$2}}", RegexOptions.IgnoreCase) + restOfArticle; } else { ArticleText = zerothSection + "{{Archives\|auto=yes\|search=yes\|\n* [[/Comments\|Article assessment comments]]\n}}\n\n" + restOfArticle; } return ArticleText; }

Code

public string ProcessArticle(string ArticleText, string ArticleTitle, int wikiNamespace, out string Summary, out bool Skip)
{
  Skip = false;
  Summary = "";
  string zerothSection = WikiRegexes.ZerothSection.Match(ArticleText).Value;
  string restOfArticle = ArticleText.Replace(zerothSection, "");
  string archives = @"\{\{\s*(archives|archive[ _]*box)(\s*\|([^{]|\{[^{]|\{\{[^{}]+\}\})+)\}\}";
  
  if ( Regex.IsMatch(zerothSection, archives, RegexOptions.IgnoreCase) ) {
    ArticleText = Regex.Replace(zerothSection, archives, "{{$1|\n* [[/Comments|Article assessment comments]]\n$2}}", RegexOptions.IgnoreCase) + restOfArticle;
  } else {
    ArticleText = zerothSection + "{{Archives|auto=yes|search=yes|\n* [[/Comments|Article assessment comments]]\n}}\n\n" + restOfArticle;
  }
  
  return ArticleText;
}

I don't think its worth the trouble. If talk page participants want to archive the substituted text or place it in chronological order, they are free to do so. — JJMC89 (T·C) 04:29, 30 March 2016 (UTC)

If we add the last revision date as a timestamp then I believe the archiving bots will make sure it goes into the relevant archive based on this date. The only reason I didn't do this earlier is that the comment may be archived immediately without anyone seeing it. But perhaps this is not so much of a concern — Martin (MSGJ · talk) 10:29, 31 March 2016 (UTC)
I'm not seeing any real gain here and substantial potential to confuse editors by barfing old comments onto the talk page. At the very least, it seems obvious that the discussion on whether this task is appropriate should be taken to a venue with larger traffic. ~ Rob^Talk 02:05, 31 March 2016 (UTC)

For your information, we have had consensus for this task since 2009. If you would care to look through WT:DCS you will see that most people who commented there supported the idea of substituting these comments on the talk page. The exact implementation was never agreed on, which may be why this still hasn't been completed 7 years later. (Outright deletion was also discussed but there were some concerns over loss of attribution.) Substitution and redirection gained the most support. We can take this back to WP:VPR if you insist, but I very much doubt that consensus has changed on this issue. Regards — Martin (MSGJ · talk) 08:33, 31 March 2016 (UTC)

I've always interpreted deprecation to mean "stop using this" while deletion means "actively get rid of this". There's a bit of a difference. I don't see any consensus on that page for deletion, and especially none for throwing everything on the talk page in this manner. ~ Rob^Talk 13:02, 31 March 2016 (UTC)

I didn't claim there was consensus for deletion. If you read the comments on that page to see what people meant by "deprecation" you'll see that the overwhelming majority mention substitution and redirection. What are the alternative methods? For example, simply {{hat}}ting the comments was not proposed by a single person in those discussions. Leaving things the way they are is not deprecation because there is still the occasional edit to these pages (see here) and those with problematic content are not being dealt with. — Martin (MSGJ · talk) 14:51, 31 March 2016 (UTC)

Concur with Rob. As a WP:GNOME myself, I have a lot of patience for fiddly cleanup efforts, but this appears to be a worm-can best left unopened. No harm is caused by old /Comments pages. It's just historical stuff that few care about at all, and fewer and fewer care, less and less, the more the material ages. I'm skeptical we care much if someone vandalizes a /Comments page to be string of expletives, since people don't look at these pages; it would be rather like screaming obscenities into one's pillow, alone, at midnight. — SMcCandlish ☺ ☏ ¢ ≽^ʌⱷ҅_ᴥⱷ^ʌ≼ 09:24, 1 April 2016 (UTC)

LOL to the pillow anology. Okay I do see where you are coming from. There are similar conversations occurring elsewhere on the wiki regarding abandoned userspace drafts and the option of cleaning them up versus ignoring them. (Personally I feel we should take some effort to keep our article talk space reasonably tidy ...) Anyway, seeing as there was a very strong consensus from editors to take some action regarding these subpages and as we have a bot operator ready and willing to do this job, do you think that perhaps your ambivalence should not stand in the way of getting the job done? I do recognise that there is a sort of consensus developing here though as well, so I will make a couple of proposals below. — Martin (MSGJ · talk) 20:35, 2 April 2016 (UTC)
As alluded to above, this might be at a point where we need to consider sending these directly to archives or otherwise just integrating them with {{Archives}} directly instead of sending them to current discussions on the article's talk page. On low-and-zero-traffic articles, sending them to the talk page might not be as bad, admittedly, but all the others will end up with a bot injecting stale, most-likely-irrelevant comments into a stream of current discussions (plus disrupting watchlists for anyone who doesn't ignore bots). A bot might not even need to do anything if we could use a module to test the existence of a page (which is predictably a specific subpage of the corresponding Talk). --slakr^\ talk / 02:32, 2 April 2016 (UTC)

Actually it would be fairly easy to detect which pages have archives, and it would be a small fraction of the total. I would be willing to take a look at these manually and see if the comments can be incorporated into the relevant archive to prevent any disruption to the flow of the talk page. — Martin (MSGJ · talk) 20:35, 2 April 2016 (UTC)
I recognise the concerns of editors on this page, and considering also the strong consensus to clean these subpages up, I have a couple of proposals below. Hopefully we can agree on one of these. — Martin (MSGJ · talk) 20:41, 2 April 2016 (UTC)
- Option 1: the bot will only substitute the contents of the /Comments subpage on talk pages which do not have any archives. The remainder (which will be a tiny proportion) will be dealt with manually and discretion exercised.
- Option 2: the bot will not substitute any comments, but will simply redirect the subpage to the article's talk page. I feel this is preferable to just leaving the subpage untouched as it actively discourages any further use of these pages.
I'd support Option 1, but the discussion of which option has consensus really should happen at a better venue than BRFA. ~ Rob^Talk 20:52, 2 April 2016 (UTC)
Agree, consensus should be established before BRFA, and indeed, ideally before BOTREQ. All the best: Rich Farmbrough, 00:23, 3 April 2016 (UTC).
- As noted above there is broad consensus for the original requested task evident at WT:DCS (originally at WP:VPR) and reaffirmed by several editors recently. However that doesn't seem to stop people opposing it here. (Can't have it both ways!) Tell you what I'll do - I'll post a short notice on WP:VPR asking anyone interested to post here and I'll ping those editors who posted recently at DCS to comment on their preferred option above. — Martin (MSGJ · talk) 09:12, 4 April 2016 (UTC)
- Okay it is now advertised on WP:VPR and I will ping the four editors who commented at WT:DCS one final time to ask them to express their preference either on the original proposal or the two compromise options above: User:Dinoguy1000, User:Hiding, User:PC78, User:Titoxd — Martin (MSGJ · talk) 09:09, 5 April 2016 (UTC)
{{BAGAssistanceNeeded}} It's been a while without further comment. Could we get a decision on this please? — Martin (MSGJ · talk) 21:50, 10 April 2016 (UTC)
- Not many comments from anyone, unfortunately. Which option are we going with? — Earwig ^talk 20:50, 11 April 2016 (UTC)
  - Seems that option 1 may satisfy just about everyone's concerns — Martin (MSGJ · talk) 21:38, 11 April 2016 (UTC)
    - VPR post archived with no responses. Yeah, I guess we'll go forward with option 1. Approved for trial (50 edits). — Earwig ^talk 03:27, 13 April 2016 (UTC)
Option 1 does not satisfy my concerns. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:12, 14 April 2016 (UTC)
Option 1 works for me. Hiding T 21:04, 14 April 2016 (UTC)

Well there are 0 that need tackling manually if you'd care to give a hand! — Martin (MSGJ · talk) 21:26, 14 April 2016 (UTC)

Break

Trial complete. Edits: part A, part B, and one incorrect redirect due to loading the wrong page list. — JJMC89 (T·C) 01:45, 15 April 2016 (UTC)

I don't see any issues, apart from the one identified. Perhaps the edit summaries could be a bit more descriptive, but the links to WP:DCS should suffice. — Martin (MSGJ · talk) 08:59, 15 April 2016 (UTC)

Perhaps it is worth pointing out to other editors on this page that the bot is only dealing with pages which don't have any archive pages (or at least not one of the form /Archive 1 which seems overwhelmingly the favoured naming scheme). Those with archive pages are being dealt with manually for now. — Martin (MSGJ · talk) 09:00, 15 April 2016 (UTC)

While a minority, there are a significant number of archive pages not using that naming scheme. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 08:57, 16 April 2016 (UTC)

If you could give me some examples, I will be happy to add them to the logic that detects the archive pages. Thanks — Martin (MSGJ · talk) 07:57, 18 April 2016 (UTC)

There are 891 talk pages which have precisely one archive (Archive 1). In addition to option 1, could we please get approval to substitute the comments onto /Archive 1 for these pages? Thanks — Martin (MSGJ · talk) 19:48, 20 April 2016 (UTC)

DarafshBot 5

Operator: Darafsh (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 18:36, Saturday, April 2, 2016 (UTC)

Automatic, Supervised, or Manual: Supervised

Programming language(s): Python

Source code available: pagefromfile.py

Function overview: add Infobox to Districts of Iran via User:DarafshBot/bakhsh2

Links to relevant discussions (where appropriate):

Edit period(s): once

Estimated number of pages affected: c. 980

Exclusion compliant (Yes/No): Yes

Already has a bot flag (Yes/No): No

Function details: Districts of Iran in en.wiki have not Infobox settlement, my bot add them via User:DarafshBot/bakhsh2. these edits are for example: 1, 2, 3 and 4.

Discussion

--Darafsh (Talk) 18:36, 2 April 2016 (UTC)

Approved for trial (50 edits). — Earwig ^talk 20:11, 3 April 2016 (UTC)

@The Earwig: Trial edits

Done Darafsh (Talk) 23:05, 3 April 2016 (UTC)

Trial complete.

@Darafsh: So it's hard for me to sanity-check these edits since I'm not familiar with Iranian geography, but why does Abgarm District get classified in this edit as being part of Buin Zahra County, while the article says it's in Avaj County? — Earwig ^talk 22:21, 9 April 2016 (UTC)

@The Earwig: Abgarm District was a part of Avaj County until the 2006 census, but it's a district in Buin Zahra County now. my database is based on "Census of the Islamic Republic of Iran, 1390 (2011)", while the article's information based on "Census of the Islamic Republic of Iran, 1385 (2006)". Darafsh (Talk) 22:42, 13 April 2016 (UTC)

Minor comment e.g., this — you should try to ensure there's a newline after the template if normal text (the lead) follows it, as it improves readability. --slakr^\ talk / 04:04, 16 April 2016 (UTC)

I would like @Ladsgroup: to take e a quick look on that. -- Magioladitis (talk) 06:41, 19 April 2016 (UTC)

Hey, I will check edits and let you know about it very soon. Made some cosmetic changes to the subpage :)Ladsgroup^overleg 08:15, 19 April 2016 (UTC)

LGTM :)Ladsgroup^overleg 09:30, 19 April 2016 (UTC)

rezabot 3

Operator: Yamaha5 (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 10:25, Friday, February 19, 2016 (UTC)

Automatic, Supervised, or Manual: Automatic

Programming language(s): Python

Source code available: interwikidata.py

Function overview: I want to move interwikis to wikidata

Links to relevant discussions (where appropriate):

Edit period(s): daily

Estimated number of pages affected: unknown

Exclusion compliant (Yes/No): Yes

Already has a bot flag (Yes/No): No

Function details:

Discussion

Some years ago for old interwiki.py code, user:Rezabot had flag at many local wikis (such as en.wikipedia) also it was global after starting wikidata all of interwiki bots were stopped. now I want to run it with new interwiki code (interwikidata.py) Yamaha5 (talk) 10:25, 19 February 2016 (UTC)

I thought all interwikis had already been migrated to Wikidata; is there really a need for this? — Earwig ^talk 18:50, 20 February 2016 (UTC)

Can you make a few edits with your user account, and show the diffs below to more fully explain what you are trying to do? — xaosflux ^Talk 16:23, 21 February 2016 (UTC)

Couldn't the old interwiki migrate bot be reactivated, if that were the case, instead of having to go through this process? →Σ σς. (Sigma) 02:16, 23 February 2016 (UTC)

This request will soon expire from lack of participation, please review the questions above. — xaosflux ^Talk 02:31, 28 February 2016 (UTC)

Note: This bot appears to have edited since this BRFA was filed. Bots may not edit outside their own or their operator's userspace unless approved or approved for trial. AnomieBOT ⚡ 05:19, 28 February 2016 (UTC)

@Xaosflux, Σ, and The Earwig:There are many article and pages which have old interwiki like these or these or these and other 250 langs :). also newbies add oldinterwiki to articles like these which was cleaned two week ago!

for bot edits please check this

Yamaha5 (talk) 20:03, 28 February 2016 (UTC)

Why was this task run without approval? — JJMC89 (T·C) 16:35, 4 March 2016 (UTC)

user:Xaosflux asked "Can you make a few edits with your user account, and show the diffs below to more fully explain what you are trying to do?" so I run for few editsYamaha5 (talk) 18:21, 7 March 2016 (UTC)

That means on Yamaha5, not rezabot. — JJMC89 (T·C) 20:20, 7 March 2016 (UTC)

As you know my bot doesn't have flag so it is the same as my user at this time and all of it's edit will be visible, any ways for next edits I will do it by Yamaha5. Yamaha5 (talk) 21:08, 7 March 2016 (UTC)

please some one take a look on this request!Yamaha5 (talk) 15:14, 4 March 2016 (UTC)

Special:Contributions/rezabot

Yamaha5 (talk) 15:15, 4 March 2016 (UTC)

{{OperatorAssistanceNeeded}} See below: — xaosflux ^Talk 02:45, 6 March 2016 (UTC)

This trial was never approved, and should not be running. — xaosflux ^Talk 02:45, 6 March 2016 (UTC)
These edits appear to be doing harm, by removing what appear to be VALID links such as on Surface_weather_analysis. Please explain why these links SHOULD NOT be present, and "because they are interwiki links" is not an acceptable answer here. — xaosflux ^Talk 02:45, 6 March 2016 (UTC)

1-why this trial edits are not valid? what should i do?

2-Interwiki links on Surface_weather_analysis was incorrect and should be remove because they had cofilict on wikidata please check d:Q11157129 and d:Q189796 these items have article on enwiki so frwiki, cswiki ,... shouldn't link to both of them.

this code is standard code and it is tested on many wikis. Yamaha5 (talk) 18:16, 7 March 2016 (UTC)

1) You have to be approved for a trial, before making trial edits. — xaosflux ^Talk 04:19, 8 March 2016 (UTC)
2) Unfortunately, I don't read all these languages. For example on this edit: Special:Diff/708263000 you removed the links to many other languages - and these links are not coming in from Wikidata, making this article have less links. Are you saying these other links are not about this subject and that is why they do not belong? — xaosflux ^Talk 04:19, 8 March 2016 (UTC)

This article completely messed and had incorrect interwikis for example these are links which are removed by bot:

[[cs:Meteorologická mapa]] existed at > d:Q865144

[[de:Wetterkarte]] existed at > d:Q865144

[[es:Frente (meteorología)]] existed at > d:Q865144

[[fr:Front (météorologie)]] existed at > d:Q189796

[[ko:일기도]] existed at > d:Q865144

[[nl:Weerkaart]] existed at > d:Q865144

[[pl:Mapa synoptyczna]] existed at > d:Q865144

[[zh:天氣圖]] existed at > d:Q865144

and Surface weather analysis existed at > d:Q11157129 so because of interwiki conflict these links at that article should be removed and bot's edit was correct. these lang-links connected to Surface weather analysis (d:Q11157129) and Weather front (d:Q189796) and Weather map (d:Q865144)

Yamaha5 (talk) 21:50, 9 March 2016 (UTC)

Hm, so how does the bot actually work? It seems like a process that would require human review, perhaps by merging Wikidata items if there is overlap or figuring out if some articles are misclassified. I know I've manually dealt with this sort of thing in the past. — Earwig ^talk 17:58, 14 March 2016 (UTC)

((BAGAssistanceNeeded))

Bot checks if there is any conflict on the page's itewiki leave it except case which all items have enwiki link.

for this example d:Q11157129 , d:Q189796, d:Q865144 had enwiki link so we can't merge them on wikidata. bot only works on this kind of conflicts and leave the rest.

For mentioned conflicts it will check if all interwiki links exist on wikidata it will clean locally if one of them doesn't exist it will leave that page.

for this example bot checked wikidata items of cs,de,es,fr,ko,nl,pl,zh if all of them have link to their own items so we can clean that local wiki's page.

Note I can deactivate conflict solver part and only import interwiki from without-conflict pages to wikidata and clean it locally Yamaha5 (talk) 07:03, 16 March 2016 (UTC)

Are you using the pywikibot-core version of interwikidata.py, or is it custom code? Legoktm (talk) 07:40, 13 April 2016 (UTC)

@Legoktm: It is custom code based on pywikibot-core version Yamaha5 (talk) 08:02, 14 April 2016 (UTC)

@Ladsgroup: already has a bot doing similar things. I would like to hear from them. -- Magioladitis (talk) 06:39, 19 April 2016 (UTC)

Hey, I wrote that interwikidata.py, the current version in master is useless. The modified version actually just deletes everything (causing this). I have another modified version that I will upload it somewhere or try to get that through code review. I ran it yesterday and it is being ran on weekly basis :)Ladsgroup^overleg 08:07, 19 April 2016 (UTC)

OK I would prefer if @Ladsgroup: does this task since they have written the code and they are directly related to Wikidata. -- Magioladitis (talk) 13:47, 20 April 2016 (UTC)

Approved requests

Bots that have been approved for operations after a successful BRFA will be listed here for informational purposes. No other approval action is required for these bots. Recently approved requests can be found here (), while old requests can be found in the archives.

APersonBot (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 7) Approved 06:48, 19 April 2016 (UTC) (bot has flag)
AnomieBOT III (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 2) Approved 03:58, 13 April 2016 (UTC) (bot has flag)
SporkBot (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 7) Approved 16:14, 11 April 2016 (UTC) (bot has flag)
JarBot (BRFA · contribs · actions log · block log · flag log · user rights) Approved 20:27, 10 April 2016 (UTC) (bot has flag)
CheckBot (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 3) Approved 02:06, 10 April 2016 (UTC) (bot has flag)
BU RoBOT (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 8) Approved 23:25, 9 April 2016 (UTC) (bot has flag)
FastilyBot (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 7) Approved 12:12, 8 April 2016 (UTC) (bot has flag)
FastilyBot (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 6) Approved 14:09, 4 April 2016 (UTC) (bot has flag)
Femto Bot (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 7) Approved 20:57, 3 April 2016 (UTC) (bot has flag)
BU RoBOT (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 9) Approved 20:46, 3 April 2016 (UTC) (bot has flag)
BU RoBOT (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 7) Approved 03:19, 29 March 2016 (UTC) (bot has flag)
JJMC89 bot (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 3) Approved 18:35, 14 March 2016 (UTC) (bot has flag)
FastilyBot (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 4) Approved 17:58, 14 March 2016 (UTC) (bot has flag)
Monkbot (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 10) Approved 03:26, 9 March 2016 (UTC) (bot has flag)
AnomieBOT (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 74) Approved 07:01, 8 March 2016 (UTC) (bot has flag)
Hazard-Bot (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 34) Approved 03:41, 28 February 2016 (UTC) (bot has flag)
APersonBot (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 5) Approved 03:47, 27 February 2016 (UTC) (bot has flag)
MoohanBOT (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 9) Approved 00:15, 27 February 2016 (UTC) (bot has flag)
BG19bot (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 9) Approved 00:09, 27 February 2016 (UTC) (bot has flag)
JJMC89 bot (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 2) Approved 05:04, 19 February 2016 (UTC) (bot has flag)
FastilyBot (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 2) Approved 04:16, 19 February 2016 (UTC) (bot has flag)
FastilyBot (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 3) Approved 20:29, 18 February 2016 (UTC) (bot has flag)
FastilyBot (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 3) Approved 20:29, 18 February 2016 (UTC) (bot has flag)
EsquivalienceBot (BRFA · contribs · actions log · block log · flag log · user rights) Approved 05:17, 13 February 2016 (UTC) (bot does not require a flag)
RileyBot (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 17) Approved 05:37, 11 February 2016 (UTC) (bot has flag)
Lonjers french region rename bot (BRFA · contribs · actions log · block log · flag log · user rights) Approved 15:19, 8 February 2016 (UTC) (bot has flag)
BattyBot (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 52) Approved 02:10, 1 February 2016 (UTC) (bot has flag)
RileyBot (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 16) Approved 08:18, 31 January 2016 (UTC) (bot has flag)
FastilyBot (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 1) Approved 06:33, 31 January 2016 (UTC) (bot has flag)
Hazard-Bot (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 33) Approved 04:32, 31 January 2016 (UTC) (bot has flag)

Denied requests

Bots that have been denied for operations will be listed here for informational purposes for at least 7 days before being archived. No other action is required for these bots. Older requests can be found in the Archive.

wargo32.exe (BRFA · contribs · actions log · block log · flag log · user rights) Bot denied 03:30, 16 April 2016 (UTC)

Hot Riley Bot (BRFA · contribs · actions log · block log · flag log · user rights) Bot denied 23:31, 10 April 2016 (UTC)

TamizhBOT (BRFA · contribs · actions log · block log · flag log · user rights) Bot denied 07:24, 10 March 2016 (UTC)

sanskritnlpbot (BRFA · contribs · actions log · block log · flag log · user rights) Bot denied 22:40, 3 March 2016 (UTC)

KoehlBot (BRFA · contribs · actions log · block log · flag log · user rights) Bot denied 15:50, 17 January 2016 (UTC)

Bottastic 6 (BRFA · contribs · actions log · block log · flag log · user rights) Bot denied 15:40, 12 December 2015 (UTC)

Helperbot (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 3) Bot denied 03:48, 4 December 2015 (UTC)

Redirectbot (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 2) Bot denied 01:21, 2 November 2015 (UTC)

AWB - mass spelling fix (BRFA · contribs · actions log · block log · flag log · user rights) Bot denied 21:11, 1 November 2015 (UTC)

Redirectbot (BRFA · contribs · actions log · block log · flag log · user rights) Bot denied 23:03, 22 October 2015 (UTC)

Tulsibot (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 2) Bot denied 11:41, 15 October 2015 (UTC)

Expired/withdrawn requests

These requests have either expired, as information required by the operator was not provided, or been withdrawn. These tasks are not authorized to run, but such lack of authorization does not necessarily follow from a finding as to merit. A bot that, having been approved for testing, was not tested by an editor, or one for which the results of testing were not posted, for example, would appear here. Bot requests should not be placed here if there is an active discussion ongoing above. Operators whose requests have expired may reactivate their requests at anytime. The following list shows recent requests (if any) that have expired, listed here for informational purposes for at least 7 days before being archived. Older requests can be found in the respective archives: Expired, Withdrawn.

CheckBot (BRFA · contribs · actions log · block log · flag log · user rights) Withdrawn by operator 00:05, 27 March 2016 (UTC)

JJMC89 bot (BRFA · contribs · actions log · block log · flag log · user rights) Withdrawn by operator 23:19, 13 February 2016 (UTC)

TyAbot (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 2) Withdrawn by operator 21:25, 22 January 2016 (UTC)

JackieBot (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 1) Expired 16:06, 17 January 2016 (UTC)

Community Tech bot (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 2) Withdrawn by operator 22:27, 6 January 2016 (UTC)

ASammourBot (BRFA · contribs · actions log · block log · flag log · user rights) Withdrawn by operator 19:27, 7 December 2015 (UTC)

Luke081515Bot (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 2) Expired 03:37, 2 December 2015 (UTC)

BU RoBOT (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 6) Withdrawn by operator 04:50, 1 December 2015 (UTC)

RelentlessBot (BRFA · contribs · actions log · block log · flag log · user rights) Withdrawn by operator 08:03, 25 October 2015 (UTC)

BU RoBOT (BRFA · contribs · actions log · block log · flag log · user rights) (Task: 5) Withdrawn by operator 01:27, 7 October 2015 (UTC)

Wikipedia technical help

Get personal technical help at the Teahouse, Help desk, Village pump (technical), talk pages or IRC.

General technical help	Multilingual support Special Characters Entering Browser notes Troubleshooting Bypass cache Mobile access Printing Keyboard shortcuts Editing Edit toolbar CharInsert Edit conflict VisualEditor User guide Create a page Page history Reverting Page name Introduction User access levels Editnotice IRC Tutorial

Special page related	Special page help Searching Advanced search Logging in Reset passwords Notifications/Echo FAQ Moving a page Fix cut-and-paste moves Watching pages User contributions Emailing users Random pages Logs What links here Related changes Recent changes Pending changes Page Curation Linksearch Page import Edit filter

Wiki markup	Wiki markup main page Cheatsheet Colours use Columns Line-break handling Lists Magic words For beginners Conditional expressions Switch parser function Time function Redirects Sections and TOCs Tables Introduction Conditional tables Sorting Collapsing Advanced table formatting

Links and diffs	URLs Links Permanent link Interwikimedia links Interlanguage links Link color Pipe trick Colon trick Magic links Diffs Simplest diff guide Simple diff and link guide Complete diff and link guide

Media files: images, videos and sounds	Media help Options to hide an image Uploading images Introduction Files Creation and usage Moving to the Commons Visual file markup Images Preparing images for upload Picture tutorial Extended image syntax Gallery tag Graphics tutorials Basic bitmap image editing How to improve image quality Graphics Lab resources Sound file markup SVG help

Other graphics	Family trees Graphs and charts How to create To scale charts Barcharts Math formula Math symbols Rendering math LaTeX symbols Musical scores Musical symbols Timeline EasyTimeline syntax WikiHiero syntax

Templates and Lua modules	Templates Template messages Citation templates Transclusion Labeled section Substitution Advanced template coding Template limits Template sandbox and test cases Template documentation Purge Lua project Lua help To do Requests Resources Guide to Scribbling

Namespaces	Main/Article Talk namespaces Archiving User User page design Project/Wikipedia File MediaWiki Bug reports and feature requests Template Help Category Portal Book Draft Education Program TimedText Module/Lua Topic/Flow Special Media

HTML and CSS	HTML in wikitext Markup validation Span tags Cascading Style Sheets Catalogue of CSS classes Useful styles Classes used in microformats Ambox classes Common.js and common.css

Customisation and tools	Preferences Skins Customizing watchlists Hide pages Gadgets Beta Features User scripts Guide Techniques IRC Scripts User styles Tools Navigation shortcuts Browser tools Alternative browsing Editing tools Optimum tool set Cleaning up vandalism tools Citation tools Wikimedia Labs Toolserver

Automating editing	Bots Creating Twinkle FurMe NPWatcher HotCat WPCleaner igloo AutoWikiBrowser Navigation popups STiki AfC helper script Huggle

source: https://web.archive.org/web/20160430080944/https://en.wikipedia.org/wiki/Wikipedia:Bots/Requests_for_approval

Contents

Current requests for approval

Discussion

Discussion

Bots in a trial period

Discussion

DB Results

Discussion

Discussion

Discussion

Discussion

Discussion

Discussion

Discussion

Bots that have completed the trial period

Discussion

Trial results

Trial 3 results

Discussion

Discussion

Determining automated G8 tagging strategy

Discussion

Break

Discussion

Discussion

Approved requests

Denied requests

Expired/withdrawn requests