This is a page for requesting tasks to be done by bots per the bot policy. This is an appropriate place to put ideas for uncontroversial bot tasks, to get early feedback on ideas for bot tasks (controversial or not), and to seek bot operators for bot tasks. Consensus-building discussions requiring large community input (such as request for comments) should normally be held at WP:VPPROP or other relevant pages (such as a WikiProject's talk page).
You can check the "Commonly Requested Bots" box above to see if a suitable bot already exists for the task you have in mind. If you have a question about a particular bot, contact the bot operator directly via their talk page or the bot's talk page. If a bot is acting improperly, follow the guidance outlined in WP:BOTISSUE. For broader issues and general discussion about bots, see the bot noticeboard.
Before making a request, please see the list of frequently denied bots; these are usually denied either because they are too complicated to program or because they do not have consensus from the Wikipedia community. If you are requesting that a template (such as a WikiProject banner) be added to all pages in a particular category, please check the category tree carefully for any unwanted subcategories. It is best to give a complete list of categories that should be worked through individually, rather than one category to be analyzed recursively (see example difference).
Note to bot operators: The {{BOTREQ}} template can be used to give common responses, and make it easier to keep track of the task's current status. If you complete a request, note that you did with {{BOTREQ|done}}, and archive the request after a few days (WP:1CA is useful here).
Bot to improve names of media sources in references
Many references on Wikipedia point to large media organizations such as the New York Times. However, the names are often abbreviated, not italicized, and/or missing wikilinks to the media organization. I'd like to propose a bot that could go to an article like this one and automatically replace "NY Times" with "New York Times". Other large media organizations (e.g. BBC, Washington Post, and so on) could fairly easily be added, I imagine. - Sdkb (talk) 04:43, 19 November 2018 (UTC)
- I would be wary of WP:CONTEXTBOT. For instance, NYT can refer to a supplement of the Helsingin Sanomat (see Helsingin Sanomat#Format) in addition to the New York Times, and that may be the main use on Finland-related pages. Tigraan Click here to contact me 13:40, 20 November 2018 (UTC)
- @Tigraan: That's a good point. I think it'd be fairly easy to work around that sort of issue, though: before having any bot make any change to a reference, have it check that the URL goes to the expected website. So in the case of the New York Times, if a reference with "NYT" didn't also contain the URL nytimes.com, it wouldn't make the replacement. There might still be some limitations, but given that the bot is already operating only within the limited domain of a specific field of the citation template, I think there's a fairly low risk that it'd make errors. - Sdkb (talk) 10:52, 25 November 2018 (UTC)
- I should add that part of the reason I think this is important is that, in addition to just standardizing content, it'd allow people to more easily check whether a source used in a reference is likely to be reliable. - Sdkb (talk) 22:01, 25 November 2018 (UTC)
- @Sdkb: This is significantly harder than it seems, as most bots are. Wikipedia is one giant exception - the long tail of unexpected gotchas is very long, particularly on formatting issues. Another problem is agencies (AP, UPI, Reuters). Oftentimes the NYT is running an agency story. The cite should use NYT in the |work= and the agency in the |agency= but often the agency ends up in the |work= field, so the bot couldn't blindly make changes without some considerable room for error. I have a sense of what needs to be done: extract every cite on Enwiki with a |url= containing nytimes.com, extract every |work= from those and create a unique list, manually remove from the list anything that shouldn't belong like Reuters etc., then the bot keys off that list before making live changes, so it knows what is safe to change (anything in the list). It's just a hell of a job in terms of time and resources considering all the sites to be processed and manual checks involved. See also Wikipedia:Bots/Dictionary#Cosmetic_edit "the term cosmetic edit is often used to encompass all edits of such little value that the community deems them to not be worth making in bulk" .. this is probably a borderline case; I have no opinion on which side of the border it falls, though other people might during the BRFA. -- GreenC 16:53, 26 November 2018 (UTC)
- @GreenC: Thanks for the thought you're putting into considering this idea; I appreciate it. One way the bot could work to avoid that issue is to not key off of URLs, but rather off of the abbreviations. As in, it'd be triggered by the "NYT" in either the work or agency field, and then use the URL just as a confirmation to double check. That way, errors users have made in the citation fields would remain, but at least the format would be improved and no new errors would be introduced. - Sdkb (talk) 08:17, 27 November 2018 (UTC)
- Right, that's basically what I was saying also. But to get all the possible abbreviations requires scanning the system because the variety of abbreviations is unknowable ahead of time. Unless we pick a few that might be common, but it would miss a lot. -- GreenC 14:54, 27 November 2018 (UTC)
- Well, for NYT at the least, citations with a |url=https://www.nytimes.com/... could be safely assumed to be referring to the New York Times. Headbomb {t · c · p · b} 01:20, 8 December 2018 (UTC)
- Yeah, I'm not too worried about comprehensiveness for now; I'd mainly just like to see the bot get off the ground and able to handle the two or three most common abbreviations for maybe half a dozen really big newspapers. From there, I imagine, a framework will be in place that'd then allow the bot to expand to other papers or abbreviations over time. - Sdkb (talk) 07:01, 12 December 2018 (UTC)
- I am not against this idea totally but the bot would have to be a very good one for this to be a net positive and not end up creating more work. Emir of Wikipedia (talk) 22:18, 14 January 2019 (UTC)
- @Sdkb: you could build a list of unambiguous cases. E.g. |work/journal/magazine/newspaper/website=NYT combined with |url=https://www.nytimes.com/... Short of that, it's too much of a WP:CONTEXTBOT. I'll also point out that NY Times isn't exactly obscure/ambiguous either. Headbomb {t · c · p · b} 17:47, 27 January 2019 (UTC)
- Okay, here's an initial list:
|work/journal/magazine/newspaper/website=NYT combined with |url=https://www.nytimes.com/...
|work/journal/magazine/newspaper/website=NY Times combined with |url=https://www.nytimes.com/...
|work/journal/magazine/newspaper/website=NYTimes combined with |url=https://www.nytimes.com/...
|work/journal/magazine/newspaper/website=New York Times combined with |url=https://www.nytimes.com/...
|work/journal/magazine/newspaper/website=The New York Times combined with |url=https://www.nytimes.com/...
|work/journal/magazine/newspaper/website=LA Times combined with |url=https://www.latimes.com/...
|work/journal/magazine/newspaper/website=L.A. Times combined with |url=https://www.latimes.com/...
|work/journal/magazine/newspaper/website=Los Angeles Times combined with |url=https://www.latimes.com/...
|work/journal/magazine/newspaper/website=WaPo combined with |url=https://www.washingtonpost.com/...
|work/journal/magazine/newspaper/website=Wa Po combined with |url=https://www.washingtonpost.com/...
|work/journal/magazine/newspaper/website=Washington Post combined with |url=https://www.washingtonpost.com/...
|work/journal/magazine/newspaper/website=The Washington Post combined with |url=https://www.washingtonpost.com/...
|work/journal/magazine/newspaper/website=WSJ combined with |url=https://www.wsj.com/...
|work/journal/magazine/newspaper/website=Wall St. Journal combined with |url=https://www.wsj.com/...
|work/journal/magazine/newspaper/website=Wall Street Journal combined with |url=https://www.wsj.com/...
|work/journal/magazine/newspaper/website=The Wall Street Journal combined with |url=https://www.wsj.com/...
Sdkb (talk) 03:54, 1 February 2019 (UTC)
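The approach discussed in this thread (only touch a citation when a known abbreviation in the work-type parameter is confirmed by the URL's domain) could be sketched roughly as follows. This is a minimal illustration, not an approved bot: the `CANONICAL` table, the regular expressions, and the function name `expand_source_name` are all hypothetical, and the table deliberately covers only abbreviations, since adding or dropping a leading "The" is disputed above.

```python
import re
from urllib.parse import urlparse

# Hypothetical table: domain -> (full name, abbreviations considered safe to expand).
CANONICAL = {
    "nytimes.com": ("The New York Times", {"NYT", "NY Times", "NYTimes"}),
    "latimes.com": ("Los Angeles Times", {"LA Times", "L.A. Times"}),
    "washingtonpost.com": ("The Washington Post", {"WaPo", "Wa Po"}),
    "wsj.com": ("The Wall Street Journal", {"WSJ", "Wall St. Journal"}),
}

# Captures the parameter prefix, its value, and trailing whitespace separately.
WORK_RE = re.compile(
    r"(\|\s*(?:work|journal|magazine|newspaper|website)\s*=\s*)([^|{}]*?)(\s*)(?=[|}])"
)
URL_RE = re.compile(r"\|\s*url\s*=\s*([^|}\s]+)")


def expand_source_name(cite: str) -> str:
    """Expand an abbreviated source name only if the |url= domain confirms it."""
    url_match = URL_RE.search(cite)
    if url_match is None:
        return cite  # no URL to confirm against, so leave the citation alone
    host = (urlparse(url_match.group(1)).hostname or "").lower()
    for domain, (full_name, abbreviations) in CANONICAL.items():
        if host == domain or host.endswith("." + domain):
            return WORK_RE.sub(
                lambda m: m.group(1) + full_name + m.group(3)
                if m.group(2) in abbreviations
                else m.group(0),
                cite,
            )
    return cite
```

A citation whose work field holds an agency name (for example Reuters) or whose URL points elsewhere (for example the Helsingin Sanomat supplement) is returned unchanged, which matches the "no new errors introduced" goal stated above.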
Changing New York Times to The New York Times would be great. I have seen people going through AWB runs doing it, but it seems like a waste of human time. Kees08 (Talk) 23:32, 2 February 2019 (UTC)
- Not really sure changing Foobar to The Foobar is desired in many cases. WP:CITEVAR will certainly apply to a few of those. For NYT/NY Times, WaPo/Wa Po, WSJ, LA Times/L.A. Times, are those guaranteed to refer to a version of these journals that was actually called by the full name? Meaning, was there at some point in the LA Times's history where "LA Times" or some such was featured on the masthead of the publication, in either print or web form? If so, that's a bad bot task. If not, then there's likely no issue with it. Headbomb {t · c · p · b} 01:54, 3 February 2019 (UTC)
- For the "the" publications, it's part of their name, so referring to just "Foobar" is incorrect usage. (It's admittedly a nitpicky correction, but one we may as well make while we're in the process of making what I'd consider more important improvements, namely adding the wikilinks to help readers more easily verify the reliability of a source.) Regarding the question of whether any of those publications ever used the abbreviated name as a formal name for something, I'd doubt it, as it'd be very confusing, but I'm not fully sure how to check that by Googling. - Sdkb (talk) 21:04, 3 February 2019 (UTC)
- The omission of 'the' is a legitimate stylistic variation. And even if 'N.Y. Times' never appeared on the masthead, the expansion of abbreviations (e.g. N.Y. Times / L.A. Times) could also be a legitimate stylistic variation. The acronyms (e.g. NYT/WSJ) are much safer to expand though. Headbomb {t · c · p · b} 21:41, 3 February 2019 (UTC)
- It is a change I have had to do many times since it is brought up in reviews (FAC usually I think). It would be nice if we could find parameters to make it possible. Going by the article, since December 1, 1896, it has been referred to as The New York Times. The ranges are:
- September 18, 1851–September 13, 1857 New-York Daily Times
- September 14, 1857–November 30, 1896 The New-York Times
- December 1, 1896–current The New York Times
- New York Times has never been the title of the newspaper, and we could use date ranges to verify we do not hit the edge cases of pre-December 1, 1896 The New York Times articles. There is The New York Times International Edition, but it seems like it has a different base-URL than nytimes.com. I can go through the effort to verify the names of the other publications throughout the years, but do you agree with my assessment of The New York Times? Kees08 (Talk) 01:51, 4 February 2019 (UTC)
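The date ranges listed above could be turned into a guard that a bot consults before changing a title. The following is only a sketch under that assumption; `NYT_TITLES` and `nyt_title_on` are hypothetical names, and the dates are copied from the list in this thread rather than independently verified.

```python
from datetime import date
from typing import Optional

# Start date (inclusive) of each title, as listed in the thread.
NYT_TITLES = [
    (date(1851, 9, 18), "New-York Daily Times"),
    (date(1857, 9, 14), "The New-York Times"),
    (date(1896, 12, 1), "The New York Times"),
]


def nyt_title_on(published: date) -> Optional[str]:
    """Return the newspaper's title on the given date, or None if before 1851."""
    title = None
    for start, name in NYT_TITLES:
        if published >= start:
            title = name  # later ranges override earlier ones
    return title
```

A bot could skip any citation whose |date= falls before December 1, 1896, which avoids the edge cases mentioned above.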
Is anyone interested in this? I still think it would save me a lot of editing time. Headbomb did you have further thoughts? Kees08 (Talk) 16:21, 15 March 2019 (UTC)
- @Kees08: I definitely still am, but I'm not sure how to move the proposal forward from here. - Sdkb (talk) 21:45, 21 March 2019 (UTC)
MOS:ACCESS#Text / MOS:FONTSIZE compliance
Hi. MOS:ACCESS#Text / MOS:FONTSIZE are clear. We are to "avoid using smaller font sizes in elements that already use a smaller font size, such as infoboxes, navboxes and reference sections." However, many infoboxes use {{small}} or the html code, especially around degrees earned (here's one example I corrected yesterday). I used AWB to remove small font from many U.S. politician infoboxes of presidents, senators, and governors, but there are so many more articles that have them. Here's an example for a TV station. I've noticed many movies and TV shows have small text in the infobox as well. Since I cannot calculate how many articles violate this particular rule of MOS, I would like someone to automate a bot to remove small text from infoboxes of all kinds. – Muboshgu (talk) 22:04, 20 December 2018 (UTC)
At least on my screen, your edit had no effect, because as far as I know, there is some sort of CSS style that limits infobox font size to a minimum of 85%. I am pretty sure I just saw that described the other day, but my searches for it have turned up nothing. Maybe someone like TheDJ would know.
If I am correct, that means that edits to remove small templates and tags from infoboxes would be cosmetic edits, which are generally frowned upon. However, there are a heck of a lot of unclosed <small>...</small> tags within infoboxes, along with small tags wrapping multiple lines, both of which cause Linter errors, so it may be possible to get a bot approved to remove tags as long as fixing Linter errors is in the bot's scope. I welcome corrections on the four things I got wrong in these four sentences. – Jonesey95 (talk) 23:58, 20 December 2018 (UTC)
- It's not "cosmetic". It's an accessibility issue. In this version, the BS, MS, and JD in the infobox are smaller than 85%. – Muboshgu (talk) 05:47, 21 December 2018 (UTC)
- FWIW, Firefox's Inspector tells me that "BS" in that version is exactly 85%. – Jonesey95 (talk) 10:29, 21 December 2018 (UTC)
- Odd. That was not the assessment of User:Dreamy Jazz. [1] – Muboshgu (talk) 20:42, 22 December 2018 (UTC)
- Fascinating. I just looked at the two revisions of Brian Bosma in Chrome while not logged in, and I definitely see a size difference in the "BS" and "JD" characters. So these would not be cosmetic edits after all, at least for some viewers using some browsers. (I have struck some of my previous comments.) – Jonesey95 (talk) 21:59, 22 December 2018 (UTC)
- P.S. I found the reference to the small template sizing text at 85% at Template:Small. It looks like I may have misinterpreted that note. – Jonesey95 (talk) 01:42, 23 December 2018 (UTC)
@Jonesey95 and Muboshgu: Hello. Although the 85% font-size is defined, the computed value of the font-size is below 11.9px (it is 10.4667px). This is because font-size percentages are computed relative to the parent container, not the document (see 1 under percentages). In this case the infobox has already decreased the font-size to 88% of the document's, so the font-size computed from the {{small}} tag will be 74.8% of the size of the rest of the document (0.88 * 0.85 = 0.748). This is the case in Firefox, Chrome, Edge (10.4px), Opera and Internet Explorer. This behaviour is the standard and so will be experienced in all browsers. Dreamy Jazz 🎷 talk to me | my contributions 10:46, 23 December 2018 (UTC)
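The compounding described above is plain multiplication of each nested percentage. A small sketch, with `effective_scale` as a hypothetical helper name:

```python
def effective_scale(*percentages: float) -> float:
    """Multiply nested font-size percentages, outermost first."""
    scale = 1.0
    for pct in percentages:
        scale *= pct / 100.0
    return scale


# An infobox at 88% containing {{small}} at 85%.
infobox_small = effective_scale(88, 85)
print(f"{infobox_small:.1%}")  # prints 74.8%
```

Since 74.8% is below the 85% floor given in MOS:FONTSIZE, the nested text falls short of the guideline even though each individual percentage looks acceptable.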
- Yes, here's a demo of what happens when percentages get enclosed by other percentages: Text Text Text Text Text . That goes to five levels, each being 95% of the enclosing element. --Redrose64 🌹 (talk) 12:42, 23 December 2018 (UTC)
- That is helpful. I discovered that I have set my Firefox preferences to prevent the font size from going below 11 pt, which enforces MOS for me. But in Chrome, which I have left unconfigured, that text gets smaller. By all means, let's remove instances of <small>...</small> and {{small}} (and its size-reducing siblings) from infoboxes, both in Template space and in article space. – Jonesey95 (talk) 14:31, 23 December 2018 (UTC)
- I have been using AWB to help with this issue too. <small> and </small> can be removed with a simple find and replace but the template is better dealt with using Regex. --Emir of Wikipedia (talk) 21:08, 3 February 2019 (UTC)
- Is there a category and/or method of easily listing these questionable pages? Primefac (talk) 15:44, 10 February 2019 (UTC)
- I think that Special:WhatLinksHere/Template:Small hiding links and redirects but showing transclusions might find what you want but not in a convenient list or category. When I was doing it in AWB I was just loading from the birth year categories. Emir of Wikipedia (talk) 15:58, 17 February 2019 (UTC)
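The find-and-replace and regex approach mentioned in this thread might look something like the sketch below. It assumes the caller passes in only the wikitext of an infobox, and it deliberately leaves alone any {{small}} call whose content contains a nested template or a pipe, since a simple regex cannot handle those safely. The names `strip_small`, `SMALL_TAG_RE`, and `SMALL_TEMPLATE_RE` are hypothetical.

```python
import re

# Opening and closing <small> tags, in any letter case.
SMALL_TAG_RE = re.compile(r"</?small\s*>", re.IGNORECASE)
# {{small|text}} or {{small|1=text}} where text has no braces or pipes.
SMALL_TEMPLATE_RE = re.compile(
    r"\{\{\s*small\s*\|(?:1\s*=\s*)?([^{}|]*)\}\}", re.IGNORECASE
)


def strip_small(infobox_wikitext: str) -> str:
    """Remove <small> tags and simple {{small}} wrappers, keeping the inner text."""
    text = SMALL_TAG_RE.sub("", infobox_wikitext)
    return SMALL_TEMPLATE_RE.sub(r"\1", text)
```

Removing unpaired tags this way also clears the unclosed <small> cases that were noted above as Linter errors, though a real run would still need manual review of the skipped nested cases.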
Auto-archive IP warnings
I imagine it's fairly confusing for IP users to have to scroll through lots of old warnings from previous users of their IP before getting to their actual message. We have Template:Old IP warnings top (and its partner), but it's rarely used—thoughts on writing a bot to automatically apply it to everything more than a yearish ago? Gaelan 💬✏️ 16:21, 10 January 2019 (UTC)
- Technically feasible and is a good idea, IMO. Needs wider community input beyond BOTREQ. -- GreenC 17:09, 10 January 2019 (UTC)
It seems like there is community support to implement this from the discussions. Should we open another discussion to iron out the implementation details? If there is consensus to do this task with a bot, I am willing to do it. Kadane (talk) 05:45, 15 March 2019 (UTC)
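One implementation detail such a bot would need is deciding which talk-page sections are old enough to collapse. A rough sketch, assuming standard signature timestamps in the form "16:21, 10 January 2019 (UTC)" and a one-year cutoff as suggested above; the helper names are hypothetical, and the actual wrapping with Template:Old IP warnings top is left out.

```python
import re
from datetime import datetime, timedelta
from typing import Optional

# Matches signature timestamps such as "16:21, 10 January 2019 (UTC)".
TIMESTAMP_RE = re.compile(r"(\d{2}:\d{2}, \d{1,2} [A-Z][a-z]+ \d{4}) \(UTC\)")


def latest_timestamp(section: str) -> Optional[datetime]:
    """Return the most recent signature timestamp in a section, if any."""
    stamps = [
        datetime.strptime(raw, "%H:%M, %d %B %Y")
        for raw in TIMESTAMP_RE.findall(section)
    ]
    return max(stamps) if stamps else None


def is_stale(section: str, now: datetime, max_age: timedelta = timedelta(days=365)) -> bool:
    """True if every signed comment in the section is older than max_age."""
    latest = latest_timestamp(section)
    return latest is not None and now - latest > max_age
```

Sections without any recognisable timestamp are treated as not stale, so unsigned or unusual messages would be left for a human to handle.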
Taxa
Bot to create entry in the (english) Wikipedia Category: Plants described in (year)
Data to be taken from Wikidata to give the year of publication of a taxon and create "Category:Taxa described in ()" within the (English) Wikipedia taxon entry, if a Wikipedia entry has been created. MargaretRDonald (talk) 22:55, 22 January 2019 (UTC)
- @MargaretRDonald: why? is there any support for such mass-creation? --DannyS712 (talk) 06:04, 1 March 2019 (UTC)
- @DannyS712: Currently we have "Category:Taxa named by x". When a user links to the category, he/she gets a ridiculously uninformative list, which fails to include many of the plants authored by x for which there are wikipedia articles. If there were some automatic creation of the category for a plant article, then the only reason that a plant would be missing from the list of taxa authored would be that there was no wikipedia article. As it stands, the category:Taxa named by x is ludicrously unhelpful. See for example, Category:Taxa named by Ferdinand von Mueller. (I put this up here in the hope that others might consider the issue and perhaps do something about it.) MargaretRDonald (talk) 06:13, 1 March 2019 (UTC)
- @MargaretRDonald: Is this part of the request below? --DannyS712 (talk) 06:16, 1 March 2019 (UTC)
- Hi @DannyS712: They are related, but slightly different. It is always clear who named the taxon (the final author). It is somewhat less clear in which year it was described: some wikipedia editors choose the year of the first publication, while others consider that the person(s) who gave the current name should get the year of publication too, in that they have perfected (refined) the description. Thus, in Decaisnina hollrungii (K.Schum.) Barlow, the year in which the plant is described has been given as that of the publication by K.Schum., but there is no doubt that the taxon was named by Barlow. (I am not sure what the wikipedia consensus is on this!!) MargaretRDonald (talk) 06:38, 1 March 2019 (UTC)
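The distinction drawn above between the basionym author (in parentheses) and the author of the current name can be illustrated by splitting an authority string. This sketch assumes a plain-text authority such as "(K.Schum.) Barlow" with no wikilinks; `parse_authority` and the `Authority` tuple are hypothetical names, and mapping an abbreviation to a category title would still need a separate lookup.

```python
import re
from typing import NamedTuple, Optional


class Authority(NamedTuple):
    basionym_author: Optional[str]  # author of the original description, if given
    naming_author: str              # author of the current name


# Optional parenthesised basionym author, then the naming author.
AUTHORITY_RE = re.compile(r"^(?:\((?P<basionym>[^()]+)\)\s*)?(?P<naming>.+)$")


def parse_authority(text: str) -> Authority:
    """Split a plain-text botanical authority into its two parts."""
    cleaned = text.strip()
    match = AUTHORITY_RE.match(cleaned)
    if not cleaned or match is None:
        raise ValueError(f"unrecognised authority: {text!r}")
    return Authority(match.group("basionym"), match.group("naming"))
```

For the example in this thread, the naming author is Barlow, which is the value a "Taxa named by" category would key on, while the year question depends on which of the two authors' publications is counted.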
Bot to create category "Category:Taxa described by ()"
The bot would use the wikidata taxon entry to find the author of a taxon, and then use it again to find the corresponding author article to find the appropriate author category. (This will not always work - but will work in a large number of cases.) Thus, the English article for "Edward Rudge" corresponds to the category "Category:Taxa named by Edward Rudge", and the simple strategy outlined here would work for Edward Rudge, Stephen Hopper and .... The category created would be an entry in the article. MargaretRDonald (talk) 23:08, 22 January 2019 (UTC)
- @MargaretRDonald: why? is there any support for such mass-creation? Also, what do you mean by "the category created would be an entry in the article", and do you want "described by" or "named by"? --DannyS712 (talk) 06:05, 1 March 2019 (UTC)
- @DannyS712: 1. See my answer to your preceding question. 2. There are two categories related to authorship and publication: (i) Category:Plants described in (year), and (ii) Category:Taxa named by (author). You can see how they are used in (for example) Velleia paradoxa. For my money I am not sure that I would really want to know what plants were described in 1810, but I would certainly like, when clicking on Category:Taxa named by Robert Brown, to be getting a complete list of wikipedia articles for which this is true. (Hope this explains why I think it important) MargaretRDonald (talk) 06:26, 1 March 2019 (UTC)
- @MargaretRDonald: So basically, add "Plants described in ___" and "Taxa named by ___" to all currently existing taxa pages if they are missing? --DannyS712 (talk) 06:35, 1 March 2019 (UTC)
- Yes. That would be great. That is, "Category:Plants described in ___" and "Category:Taxa named by ___" to the end of the taxon page.. MargaretRDonald (talk) 06:39, 1 March 2019 (UTC)
- @MargaretRDonald: is this at all related to the |authority parameter in {{Speciesbox}} and its ilk? That would make this a lot simpler... --DannyS712 (talk) 06:51, 1 March 2019 (UTC)
- @DannyS712: For the author, yes. It is the parameter |authority in {{Speciesbox}}. The year is not. It is found associated with the basionym in the Wikidata entry (an entry which is often missing from wikidata, but if it exists that would be the safest place to take it from). Most articles show the author of the basionym (the name in the brackets), but have no taxonomy section and even when they do it is unstructured text... So probably the year of the description is in the too-hard basket. (But as I indicated, I find the year category somewhat less important..) MargaretRDonald (talk) 07:07, 1 March 2019 (UTC)
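The "add the category if it is missing" step agreed on in this exchange could be sketched as below. This is an illustration only: `add_category` is a hypothetical helper, it treats the page as plain wikitext, and it does not resolve how the author's full name is obtained from the |authority abbreviation.

```python
import re


def add_category(wikitext: str, category: str) -> str:
    """Append [[Category:...]] to the end of the page unless it is already there."""
    existing = re.compile(
        r"\[\[\s*Category\s*:\s*" + re.escape(category) + r"\s*(\|[^\]]*)?\]\]",
        re.IGNORECASE,
    )
    if existing.search(wikitext):
        return wikitext  # already categorised, so make no edit
    return wikitext.rstrip("\n") + f"\n[[Category:{category}]]\n"
```

Because the function returns the text unchanged when the category is present, running it twice on the same page produces no second edit.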
And if we were to do this the result would be that we would get, e.g., a list of accepted taxa named by John Lindley, and not a whole ragtag list of plants where the assigning of the initial genus is now considered incorrect. In achieving that we could be a far better resource than IPNI. MargaretRDonald (talk) 06:57, 1 March 2019 (UTC)
- @MargaretRDonald: I don't think that wikipedia is going to be a better resource than IPNI for this field - ~maybe~ wikispecies? In any event, this task is beyond my abilities, but hopefully my questions have made it clearer to others what you are requesting. --DannyS712 (talk) 07:06, 1 March 2019 (UTC)
- @DannyS712: Probably not a better resource, but an extremely useful resource should it list only accepted names. (IPNI lists everything for an author and it then can require checking every name in say tropicos or Plants of the world to find which of them are accepted, a considerable task.) MargaretRDonald (talk) 07:14, 1 March 2019 (UTC)
- @MargaretRDonald: yeah, but a bot shouldn't tag those without the same or comparable sources... --DannyS712 (talk) 07:35, 1 March 2019 (UTC)
- (Not sure what you are trying to say here..) MargaretRDonald (talk) 07:42, 1 March 2019 (UTC) (In any case my comment on lists of accepted species was little more than a throw-away comment.) I just find it frustrating that "Category:Taxa named by Robert Brown" is not remotely within cooee of being so. And if the parameter authority in the species box were to be used it might just come within cooee of being so. MargaretRDonald (talk) 07:42, 1 March 2019 (UTC)
- I'm saying that unless the reliable sources are there, we shouldn't be adding the category, especially not with a bot --DannyS712 (talk) 07:45, 1 March 2019 (UTC)
- There are many reliable sources, which usually agree, but like all things requiring man-power, they can be out of sync and people in disagreement. I think it would be better if we used wikidata (with whatever its errors) to populate these categories. The result would be better than the entirely misleading stuff we have now where almost none of the taxa named by a person show up because of the failure by humans to populate the categories. MargaretRDonald (talk) 15:36, 15 March 2019 (UTC)
Detect Hijacked journals
Stop Predatory Journals maintains a list of hijacked journals. Could someone search wikipedia for the presence of hijacked URLs and produce a daily/weekly/whateverly report? Maybe have a WP:WCW task for it too? Headbomb {t · c · p · b} 00:09, 4 February 2019 (UTC)
- This is a good idea. Made a script to scrape the site and search WP, it found three domains in 11 articles. -- GreenC 16:50, 4 February 2019 (UTC)
Extended content:
https://scholarlyoa.com/other-pages/hijacked-journals/u
http://www.bnas.org/
http://acjournal.in/journal-of-renewable-natural-resources-bhutan
@Headbomb: can post the report on a regular basis if there is a page. Script takes less than 20 seconds to complete so not expensive on resources. -- GreenC 17:02, 4 February 2019 (UTC)
- They were found with CirrusSearch (Elasticsearch) it got some close matches. -- GreenC 17:27, 7 March 2019 (UTC)
- @GreenC: [2] is a better link than the above one for hijacked journals. It's pretty much the same as the old link, but this one is updated. In particular, there's an additional journal (Arctic, at the very bottom of the page).
- There are a few places a report like that could be generated. Category talk:Hijacked journals seems as good a place as any. I'd suggest creating a section and just overwriting it every day (if there's a change). Headbomb {t · c · p · b} 17:46, 7 March 2019 (UTC)
- How about WP:Hijacked journals / WP:HIJACKJOURNAL (an essay or how-to) that can define the meaning, describe the problem for Wikipedia, link to external sites, and link to the bot-generated list as a sub-page. Nothing complicated, but a central place for discussion and info that can be linked to from other pages. -- GreenC 18:08, 7 March 2019 (UTC)
- If we have a dedicated page, Wikipedia:Reliable sources/Hijacked journals seems to be the natural place to me. Headbomb {t · c · p · b} 18:13, 7 March 2019 (UTC)
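The kind of check discussed above could be sketched roughly as follows. This is not GreenC's script, only a minimal illustration: the two hijacked URLs are the ones listed in the thread, while `find_hijacked_links`, `host_of` and the `articles` data are hypothetical names and sample input. A real run would pull each article's external links from the wiki.

```python
from urllib.parse import urlparse

def host_of(url):
    """Return the lower-cased host of a URL without a leading 'www.'."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host

def find_hijacked_links(hijacked_urls, article_links):
    """Map each article title to the links whose host is on the hijacked list.

    article_links is a hypothetical dict of the form {title: [external URLs]}.
    """
    bad_hosts = {host_of(u) for u in hijacked_urls}
    bad_hosts.discard("")
    report = {}
    for title, links in article_links.items():
        hits = [u for u in links if host_of(u) in bad_hosts]
        if hits:
            report[title] = hits
    return report

# URLs taken from the list in the thread; the articles are made-up sample data.
hijacked = [
    "http://www.bnas.org/",
    "http://acjournal.in/journal-of-renewable-natural-resources-bhutan",
]
articles = {
    "Example article A": ["http://bnas.org/issue/1", "https://example.org/paper"],
    "Example article B": ["https://example.com/"],
}
report = find_hijacked_links(hijacked, articles)
```

Matching on the host alone is coarser than matching on the full URL, so a site that hosts several journals would be reported for all of them; each hit still needs a human look.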
List of values used for Template:Tooltip
Could someone generate a list of values used for Template:Tooltip (the redirect, not Template:Abbr) in a table form, so it would be easier to see what needs to be converted to {{abbr}} per the result of this discussion? --Gonnym (talk) 14:20, 15 February 2019 (UTC)
Doing... Dat GuyTalkContribs 15:54, 15 February 2019 (UTC)
- Gonnym I can't find what Kuznetsov-class aircraft carrier has that transcludes the template. Could you help me figure it out? Dat GuyTalkContribs 16:33, 15 February 2019 (UTC)
- In addition, would you like me to also look if the articles that transclude the tooltip template also match the [Abbr/Abrrv/What is] templates? It seems like they're redirects. Also pinging @Amorymeltzer: fyi. Dat GuyTalkContribs 16:47, 15 February 2019 (UTC)
- Regarding your second question, no need. Only {{Tooltip}} was discussed as deprecated in that discussion. Regarding the first issue though. Wow. I've looked over that article multiple times and inside the templates used on that page and I can't seem to figure out where tooltip is used. Nothing seems to be using it. --Gonnym (talk) 17:06, 15 February 2019 (UTC)
- It was Template:Ukrainian ships, I've removed the use. ~ Amory (u • t • c) 17:13, 15 February 2019 (UTC)
- @Amorymeltzer and Gonnym: Before I finish it, does User:DatGuy/sandbox look good? Dat GuyTalkContribs 18:01, 15 February 2019 (UTC)
- I can't speak for Gonnym, but one thing I think would be helpful (at least, how I was planning on thinking about it) is to know which of these are within the same template or table. I imagine that'd be harder to handle, but ideally many of the uses in mainspace could be replaced by a wrapper template for Module:Sports table, so knowing what the common pairings would be helpful. ~ Amory (u • t • c) 18:07, 15 February 2019 (UTC)
- That looks good. Do you think it is possible to list only unique pairings and the number of times it appears? So for example, list only once the "Ref."/"Reference". If this is possible, it will help in deciding if this is something that can be done with AWB or a bot. It will also make reading the table easier. If it can't, the current table still helps a lot though. --Gonnym (talk) 21:31, 15 February 2019 (UTC)
- I don't believe that checking if it's inside a template/table is simple/worth the time, but I've made User:DatGuy/sandbox and User talk:DatGuy/sandbox. They will be updated with sandbox1 accordingly when article size exceeds limits. Dat GuyTalkContribs 23:45, 15 February 2019 (UTC)
- Since I see (at least) two different "GD" entries in User talk:DatGuy/sandbox, I'm assuming you managed to get unique pairings right? If that is true, could you also add the 2nd argument column to this table? This is very helpful btw. I've already identified a few thousand easy replacements. --Gonnym (talk) 23:56, 15 February 2019 (UTC)
- I'm not sure what the duplicate entries are actually. I've attempted a fix. Dat GuyTalkContribs 23:59, 15 February 2019 (UTC)
- @Gonnym and Amorymeltzer: Well, seems like it's being a bit of a pain in the ass due to article size limits. It has calculated 24000 uses. The pages are User talk:DatGuy/sandbox and User:DatGuy/sandbox(0-22). Dat GuyTalkContribs 09:44, 16 February 2019 (UTC)
- Yeah, it still has a lot of uses, but for example, just "Pts" alone has 10705 uses. Just reconfirming with you, are all "Pts" uses using the same second argument value? --Gonnym (talk) 09:47, 16 February 2019 (UTC)
- Haha indeed! I was surprised you were so confident. Still, it's helpful to have. I've got a lot on my plate at the moment, but if I get a chance in the next month or so, I'll try and work on finding the uses that are the same (e.g. all the headers with W/D/L, those with W/D/L/Pts, etc.). ~ Amory (u • t • c) 11:34, 17 February 2019 (UTC)
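The "unique pairings and the number of times each appears" idea can be illustrated with a short sketch. It is not the code Dat Guy ran; `count_pairings` and the sample `pages` are hypothetical, and the regular expression only handles the simple two-argument form with no nested templates.

```python
import re
from collections import Counter

# Matches {{Tooltip|first|second}} with two plain positional arguments.
TOOLTIP_RE = re.compile(r"\{\{\s*[Tt]ooltip\s*\|([^{}|]*)\|([^{}|]*)\}\}")

def count_pairings(wikitexts):
    """Count how often each (first argument, second argument) pair occurs."""
    counts = Counter()
    for text in wikitexts:
        for first, second in TOOLTIP_RE.findall(text):
            counts[(first.strip(), second.strip())] += 1
    return counts

# Made-up table headers standing in for real article wikitext.
pages = [
    "! {{Tooltip|Pts|Points}} !! {{Tooltip|GD|Goal difference}}",
    "! {{tooltip| Pts | Points }} !! {{Tooltip|GD|Goal differential}}",
]
pairings = count_pairings(pages)
```

Keying on the pair rather than on the first argument alone is what would let two different "GD" entries show up separately, and it answers the question of whether every "Pts" use shares the same second argument.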
Credits adapted from
Thousands of articles about music artists, albums and songs reference the source in the body text (example: OnePointFive). Such references belong in a <ref> block at the end of the page and not in the body text. Most of these references follow a common pattern, so I hope this kind of edit can be made by a bot.
I suggest making a bulk replacement from
==Track listing==
Credits adapted from [[Tidal (service)|Tidal]].<ref name="Tidal">{{cite web|url=https://listen.tidal.com/album/93301143|title=ONEPOINTFIVE / Aminé on TIDAL|publisher=Tidal|accessdate=August 15, 2018}}</ref>
to
==Track listing<ref name="Tidal">{{cite web|url=https://listen.tidal.com/album/93301143|title=ONEPOINTFIVE / Aminé on TIDAL|publisher=Tidal|accessdate=August 15, 2018}}</ref>==
Different sources: Tidal (service), “the album notes”, “the album sleeve”, “the album notes”, “the liner notes of XXX”. Different heading names, including “Track listing”, “Personnel”, “Credits and personnel”. Variants: “Credits adapted from XXX”, “All credits adapted from XXX”, “All personnel credits adapted XXX”.
Does this sound feasible/sensible? --C960657 (talk) 17:14, 28 February 2019 (UTC)
- References should not be located in section titles. Pretty sure there is a guideline about it, and not good for a couple reasons. The correct way is current, create a line that says "Source: [1]" or something. -- GreenC 17:43, 28 February 2019 (UTC)
- "Citations should not be placed within, or on the same line as, section headings." (WP:CITEFOOT) — JJMC89 (T·C) 03:38, 1 March 2019 (UTC)
- Also (from MOS:HEADINGS): "Section headings should: ... Not contain links, especially where only part of a heading is linked." Unless you use pure plain-text parenthetical referencing, refs always generate a link. --Redrose64 🌹 (talk) 12:41, 1 March 2019 (UTC)
Fix 'background' in sortable tables
See background (pardon the pun).
The idea is to change the css element background to background-color (and other similar attributes) in sortable tables (example). Headbomb {t · c · p · b} 19:14, 5 March 2019 (UTC)
- @Magioladitis: The checkwiki team could also get in on this. Headbomb {t · c · p · b} 19:18, 5 March 2019 (UTC)
- I would be sensitive here to whether there is another background style declared, as background is shorthand for a number of attributes. Otherwise seems like a good idea. --Izno (talk) 22:44, 5 March 2019 (UTC)
- Oppose as written. Changing background to background-style would break all existing uses, because background-style is not a defined property. See CSS Backgrounds and Borders Module Level 3 for examples of valid property names. --Redrose64 🌹 (talk) 13:03, 7 March 2019 (UTC)
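A cautious version of the background to background-color change, taking the shorthand caveat into account, might look like the sketch below. It is an illustration under stated assumptions, not an agreed bot design: `fix_background` is a hypothetical helper that receives the contents of one style attribute, and it only rewrites a declaration whose value is a lone hex colour or one of a small, deliberately incomplete list of colour keywords.

```python
import re

# Deliberately short allowlist (an assumption of this sketch, not a full CSS list).
NAMED_COLOURS = ("transparent", "white", "black", "red", "yellow", "silver")
PLAIN_COLOUR = r"#(?:[0-9A-Fa-f]{3}){1,2}|" + "|".join(NAMED_COLOURS)

# 'background' followed directly by ':' and a lone colour, then ';' or end of string.
BACKGROUND_RE = re.compile(
    r"(?<![-\w])background(\s*:\s*)(" + PLAIN_COLOUR + r")(?=\s*(?:;|$))",
    re.IGNORECASE,
)

def fix_background(style_value):
    """Rewrite 'background' to 'background-color' only for a lone colour value."""
    # Leave the style alone if more than one background* property is declared.
    if len(re.findall(r"(?<![-\w])background", style_value, re.IGNORECASE)) != 1:
        return style_value
    return BACKGROUND_RE.sub(r"background-color\1\2", style_value)

fixed = fix_background("background:#ffc; color:black")
```

Values that use the shorthand for anything else, such as `background: #ffc url(x.png) no-repeat` or `background: none`, are returned unchanged, since rewriting them would change or break the rendering.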
Bot to generate list of editor's creations which have been tagged for improvements
This would be useful for New Page Patrol: it would save us sending multiple messages about an editor's creations (which can cause upset) and show clearly what the problem is and what articles have been identified as needing improvements. This has been requested more than once of me by an editor and I've had to find and list them manually. It would also benefit other editors - I would love to look over which of my creations have tags and improve them. This would give creators (if they want to) the chance to make improvements and bring down the backlogs. Is it feasible? Thanks for looking into this, Boleyn (talk) 08:43, 9 March 2019 (UTC)
- @Boleyn: Maybe ask at User talk:Community Tech bot? Currently, Wikipedia:Database reports/Editors eligible for Autopatrol privilege already tracks if a user's pages have been tagged, so they might be able to help (though the code is on github). That specific report is overseen by User:MusikAnimal (WMF), so pinging @MusikAnimal if they want to chime in. --DannyS712 (talk) 08:48, 9 March 2019 (UTC)
- Thanks for the suggestions, DannyS712. Boleyn (talk) 08:57, 9 March 2019 (UTC)
- I think this is better fit for an external tool, rather than a bot. I have debated for some time adding this functionality to XTools. The problem is the relevant maintenance categories are different on every wiki. I suppose we can just make them configurable. I'll look into it!
- In the meantime, quarry:query/34173 is an example query you could use to find such articles. Note that this does not encompass all maintenance categories, just the major ones. You can fork the query and tweak it as desired. Best, — MusikAnimal talk 18:42, 9 March 2019 (UTC)
- Thanks, MusikAnimal, that's really helpful. Adamtt9, you may want to check this out, and thanks for raising the idea. Boleyn (talk) 08:28, 10 March 2019 (UTC)
Make Articles in Compliance with MOS:SURNAME
I've noticed that a lot of articles are not in compliance with MOS:SURNAME, especially in Category:Living people. I've manually changed a few pages, but as a programmer, I think this could be greatly automated. Any repeats of the full name, or the first name, beyond the title, first sentence, and infobox should not be allowed and replaced with the last name. I can help out in creating a bot that can accomplish this. InnovativeInventor (talk) 01:21, 21 March 2019 (UTC)
Just bumped into this: Wikipedia_talk:Manual_of_Style/Biography#Second_mention_of_forenames, so there should be detection of other people with the same last name. Additionally, this bot should intend to provide support for humans, not to automate the whole thing (as context is important). InnovativeInventor (talk) 03:57, 21 March 2019 (UTC)
- @InnovativeInventor: Is this about the ordering of names in a category page, or about the use of names in the article prose? --Redrose64 🌹 (talk) 17:07, 21 March 2019 (UTC)
- @Redrose64: This is about the reuse of names in the article prose and ensuring that the full name is only mentioned once (excluding ambiguous cases where the full name is necessary to clarify the subject of the sentence). InnovativeInventor (talk) 19:40, 21 March 2019 (UTC)
- I don't like this, and I'm calling WP:CONTEXTBOT on it. Consider somebody from Iceland, such as Katrín Jakobsdóttir - the top of the article has
- Or somebody from a family with several notable members - have a look at Johann Ambrosius Bach (which is quite short) and consider how it would look if we used only surnames: After Bach's death, his two children, Bach and Bach, moved in with his eldest son, Bach. --Redrose64 🌹 (talk) 21:05, 21 March 2019 (UTC)
- @Redrose64: The idea is that this will be a human-assisted bot, not a completely automated bot. Just something that can speed up the process. I agree that it depends on the context. But, it would be nice to assist efforts to regularize articles that are not in compliance with MOS:SURNAME.InnovativeInventor (talk) 03:23, 22 March 2019 (UTC)
- InnovativeInventor - Considering it will be human assisted, wouldn't it be better to include the functionality inside AWB or create a user script? Kadane (talk) 21:35, 22 March 2019 (UTC)
- Kadane I think something that can crawl all of Wikipedia's bio pages would be better. Not sure though. I'm not familiar with the best way to help regularize all the bio pages. InnovativeInventor (talk) 23:46, 22 March 2019 (UTC)
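Since the thread settles on a human-assisted tool rather than automatic edits, the detection half could be sketched as below. This is only an illustration: `flag_name_repeats` is a hypothetical helper, it works on plain prose rather than wikitext, and it assumes the first full-name mention is the lead sentence. It changes nothing and knows nothing about relatives who share the surname, which is exactly the case raised above.

```python
import re

def flag_name_repeats(prose, first_name, surname):
    """Return (offset, matched text) for repeat uses of the full or first name.

    The first occurrence of the full name is assumed to be the lead sentence
    and is not reported. Nothing is edited; a human decides what to change.
    """
    full = re.escape(first_name) + r"\s+" + re.escape(surname)
    # Full name comes first in the alternation so it wins over the bare first name.
    pattern = re.compile(r"\b(?:" + full + "|" + re.escape(first_name) + r")\b")
    hits = [(m.start(), m.group(0)) for m in pattern.finditer(prose)]
    # Drop the first full-name mention from the report.
    for i, (_, text) in enumerate(hits):
        if text != first_name:
            del hits[i]
            break
    return hits

sample = ("Jane Doe is a writer. Jane Doe was born in 1970. "
          "Jane moved to Paris. Doe wrote a book.")
flags = flag_name_repeats(sample, "Jane", "Doe")
```

The output is a list of positions to review, which fits either a user script or an AWB-style workflow. Patronymic names and articles about several family members would still have to be skipped by hand.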
A heads up for AfD closers re: PROD eligibility when approaching NOQUORUM
When an AfD discussion ends with no discussion, WP:NOQUORUM indicates that the closing admin should treat the article as one would treat an expired PROD. One mundane part of this process is specifically checking whether the article is eligible for PROD ("the page is not a redirect, never previously proposed for deletion, never undeleted, and never subject to a deletion discussion"). It would be really nice, when an AfD listing is reaching full term (seven days) with no discussion, if a bot could check the subject's page history and leave a comment on, say, the beginning of the listing's seventh day as to whether the article is eligible for PROD (a simple yes/no). If impossible to check each aspect of PROD eligibility, it would at least be helpful to know whether the article has been proposed for deletion before, rather than having to scour the page history. A bot here could help the closing admin more easily determine whether to relist or soft delete. More discussion here. czar 21:12, 23 March 2019 (UTC)
- @Czar: preliminary thoughts:
- Does that sound about right in terms of automatically verifying prod eligibility? --DannyS712 (talk) 21:37, 23 March 2019 (UTC)
- @DannyS712, I would add to #4: check the talk page for history templates indicating prior deletion listings. E.g., it's possible that the previous AfD was under a different article title altogether. (Since those instances would get complicated, would also be helpful for the AfD comment to note if the article was previously live under another title so the closer can manually investigate.) re: #2, I would consider searching edit summaries for either added or removed PRODs or mentions of deletion (as PRODs not added via script may have bad edit summaries). Otherwise this sounds great to me! czar 21:54, 23 March 2019 (UTC)
- @Czar: okay, this seems like something I could do, but it would be a while before a bot was up and running. As far as I can tell, the hardest part will be parsing the AfD itself - how to detect if other users have cast a !vote, rather than just commenting, sorting the AfD, etc. Furthermore, since I'm not very original and implement most of my bot tasks via either AWB (not very usable in this case) or JavaScript, the JavaScript bot tasks are generally just automatically running a user script on multiple pages. So first, I will be able to have a script that alerts the user if an AfD could be subject to PROD, and then post such notices automatically. The first part is just a normal user script, so it (I think) doesn't need a BRFA, and I'll let you know when I have a working alpha and am ready to start integrating the second part. This will be a while though, so if anyone else wants to tackle this bot request I won't be offended :). Thanks, --DannyS712 (talk) 22:07, 23 March 2019 (UTC)
- This seems vaguely related to this discussion on VPT. --Izno (talk) 00:41, 24 March 2019 (UTC)
- Yes Izno, you are correct. I will make a note there that a bot request is the manner being pursued. I think your idea of an edit filter might also be useful. That would ensure the presence of a specific string of text in the edit summary which the bot could search for IAW #2. I agree that simply adding a message to the effect that the subject being discussed either is or is not eligible for soft deletion without relisting would be good for the initial iteration and suggest that it might be best to maintain that as the functional standard indefinitely. I do want to thank the many editors who have stepped up to assist in this effort. I am proud of my affiliation with such a fine lot. Sincerely.--John Cline (talk) 01:52, 24 March 2019 (UTC)
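The four PROD conditions quoted in the request (not a redirect, never previously proposed for deletion, never undeleted, never subject to a deletion discussion) can be expressed as a small check. The sketch below is an assumption-heavy illustration rather than the planned bot: `PageHistory` is a hypothetical container that some other code would fill from the page history and logs, the "restore" log action name is assumed, and the edit-summary pattern is a loose guess in the spirit of searching summaries for added or removed PRODs.

```python
import re
from dataclasses import dataclass, field

@dataclass
class PageHistory:
    """Hypothetical summary of a page, gathered elsewhere (e.g. from the API)."""
    is_redirect: bool
    edit_summaries: list = field(default_factory=list)
    log_actions: list = field(default_factory=list)  # assumed names, e.g. "delete", "restore"
    prior_afd_count: int = 0  # deletion discussions other than the current one

# Loose pattern for summaries that mention adding or removing a PROD.
PROD_SUMMARY_RE = re.compile(
    r"\b(?:de)?prod(?:ded)?\b|proposed (?:for )?deletion|proposing article for deletion",
    re.IGNORECASE,
)

def prod_ineligibility_reasons(page):
    """Return the reasons a page looks ineligible for PROD (empty list = looks eligible)."""
    reasons = []
    if page.is_redirect:
        reasons.append("page is a redirect")
    if any(PROD_SUMMARY_RE.search(s) for s in page.edit_summaries):
        reasons.append("edit summary suggests an earlier PROD")
    if "restore" in page.log_actions:
        reasons.append("page was previously undeleted")
    if page.prior_afd_count > 0:
        reasons.append("page had an earlier deletion discussion")
    return reasons

clean = PageHistory(is_redirect=False, edit_summaries=["created page", "fix typo"])
prodded = PageHistory(False, ["Proposing article for deletion per WP:PROD"],
                      ["delete", "restore"], 1)
```

Returning the reasons instead of a bare yes/no lets the AfD comment say why a page looks ineligible. Cases the check cannot see, such as a previous AfD under a different title, still need the closer to investigate manually.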
Indian settlements: updating census data
Most articles on settlements in India (eg. Bambolim) still use 2001 census data. They need to be updated to use the 2011 census data. SD0001 (talk) 18:10, 29 March 2019 (UTC)
- Is 2011 Census data available on WikiData? Template:Austria metadata Wikidata provides an example template and User:GreenC bot/Job 12 was a recent BRFA to add the template to Austria settlement articles: Example. -- GreenC 19:16, 29 March 2019 (UTC)
- I don't think they're there on wikidata. This site does provide the data in what could be considered machine-readable format, though. SD0001 (talk) 16:08, 30 March 2019 (UTC)
- Another site is https://www.census2011.co.in If these sites were scraped and converted to CSV, the data could be uploaded to Wikidata via Wikipedia:Uploading metadata to Wikidata. Although this is a big job given the size of India, and the next census is in 2021, when it would be done over again. The number of potential locations must be immense, I went to http://www.censusindia.gov.in/pca/Searchdata.aspx and entered "Hyderabad" and it brought up a list of villages one having a population of 40 people, although which village of "Haiderpur" it is who knows as there are many listed. -- GreenC 17:28, 30 March 2019 (UTC)
- The link I've given above already has the data in Excel format. Ignore the columns part-A ebook and part-B ebook, what we need are the ones under "Town amenities" and "Village amenities". That's two Excel sheets for each of the 35 Indian states and union territories. Some of these files are phenomenally large as you said - Andhra Pradesh contains 27800 villages, for instance. SD0001 (talk) 20:54, 30 March 2019 (UTC)
- Ah I see better. Checking Assam "Town Amenities" spreadsheet, for "Goalpara" (line #17), it has a population of 11,617 but our Goalpara says 48,911. If we assume this is for the Goalpara district it is 1,008,959, but in the spreadsheet it only adds up to about 20,000 (line #15-#18). Since most people there speak Goalpariya it seems unlikely there was a sudden population loss due to emigration. Are the spreadsheet numbers in some other counting system, or decimal offset? -- GreenC 22:34, 30 March 2019 (UTC)
- GreenC, 11617 is the number of households. Population is 53430, which is reasonable. To get total population of Goalpara district, you need to add up populations in line #15-#25 plus line #2161-#2989 in 'Village amenities' sheet, which roughly gives a figure close to 1,008,959. SD0001 (talk) 23:22, 30 March 2019 (UTC)
- Ah thanks again, SD0001! A program to extract and collate the data looks like the next step. I can't do it immediately as I am backlogged with programming projects. Extracting the data and uploading to Wikipedia per Wikipedia:Uploading metadata to Wikidata would be more than half the battle. Also ping User:Underlying lk who made the Wikidata instructions. -- GreenC 00:19, 31 March 2019 (UTC)
- It seems like we have 2011 population figures for over 70,000 Wikidata entities, though once we only consider entities with an en.wiki article, it drops to less than 4,000.--eh bien mon prince (talk) 05:15, 31 March 2019 (UTC)
- Interesting queries, thanks. Notice some Wikidata entries are referenced some not. Probably the data was loaded by different processes with variable levels of reliability and completeness. I would not be comfortable loading into encyclopedia until it has been checked against a known source and the source field updated. Found Administrative divisions of India helpful to understand the census divisions though the more I look the bigger and more complex it becomes. -- GreenC 14:14, 31 March 2019 (UTC)
- @Magnus Manske: This might be Gameable. --Izno (talk) 15:57, 31 March 2019 (UTC)
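The district total described in the Goalpara exchange (a block of rows from the "Town amenities" sheet plus a block from the "Village amenities" sheet) is a plain sum over line ranges. The sketch below assumes the sheets have already been read into lists of dicts, that line 1 corresponds to the first list element, and that the column is called `population`. All three are assumptions, and the data is toy data rather than census figures.

```python
def sum_population(rows, first_line, last_line, column="population"):
    """Sum one column over spreadsheet lines first_line..last_line (1-based, inclusive)."""
    selected = rows[first_line - 1:last_line]
    # Values may arrive as ints or as strings with thousands separators.
    return sum(int(str(r[column]).replace(",", "")) for r in selected)

def district_population(town_rows, town_range, village_rows, village_range):
    """Add the town block and the village block that make up one district."""
    return (sum_population(town_rows, *town_range)
            + sum_population(village_rows, *village_range))

# Toy rows; a real run would load the two Excel sheets for a state.
town_rows = [{"name": "T%d" % i, "households": 10 * i, "population": 50 * i}
             for i in range(1, 6)]
village_rows = [{"name": "V%d" % i, "households": i, "population": "1,000"}
                for i in range(1, 4)]
total = district_population(town_rows, (2, 4), village_rows, (1, 3))
```

Naming the column explicitly guards against the mix-up seen above, where the households figure was read as the population.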
CFDS tagging and listing for "eSports" categories
There was a request to move categories with "eSports" to "esports" per WP:C2D at WT:VG, but that list is sizable. Is there someone here who can take care of the listing and tagging? (Avoid the WikiProject assessment categories.) --Izno (talk) 18:04, 31 March 2019 (UTC)
- @Izno: sure, I've added it to my current BRFA (WP:Bots/Requests for approval/DannyS712 bot 13) as a request for a trial. --DannyS712 (talk) 19:50, 1 April 2019 (UTC)
WikiProject Civil Rights Movement
I'm trying to set-up a bot to perform assessment and tagging work for Wikipedia:WikiProject Civil Rights Movement. The bot would need to rely only on keywords present in pages. The bot would provide a list of prospective pages that appear to satisfy rules given it. An example of what the project is seeking is something similar to User:InceptionBot. WikiProject Civil Rights Movement uses that bot to generate report Wikipedia:WikiProject Civil Rights Movement/New articles. Whereas that bot generates a report of new pages, the desired bot would assess old pages. Mitchumch (talk) 16:27, 1 April 2019 (UTC)
- At Wikipedia:Village pump (technical)#Assessment and tagging bot I didn't intend that you should try to set up your own bot. There are plenty of bots already authorised to carry out WikiProject tagging runs. Just describe the selection criteria, and we'll see who picks it up. --Redrose64 🌹 (talk) 19:46, 1 April 2019 (UTC)
- The selection criteria are keywords on pages:
- civil rights movement
- civil rights activist
- black panther party
- black power
- martin luther king
- student nonviolent coordinating committee
- congress of racial equality
- national association for the advancement of colored people
- naacp
- urban league
- southern christian leadership conference
- Mitchumch (talk) 22:02, 1 April 2019 (UTC)
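Keyword selection of this kind could be prototyped as below. This is a minimal sketch, not an existing tagging bot: `matching_keywords` is a hypothetical helper, the keyword list is the one given above (with "association" spelled out correctly), and matching is case-insensitive on whole phrases.

```python
import re

KEYWORDS = [
    "civil rights movement",
    "civil rights activist",
    "black panther party",
    "black power",
    "martin luther king",
    "student nonviolent coordinating committee",
    "congress of racial equality",
    "national association for the advancement of colored people",
    "naacp",
    "urban league",
    "southern christian leadership conference",
]

def matching_keywords(page_text, keywords=KEYWORDS):
    """Return the keywords that occur in page_text as whole phrases, ignoring case."""
    found = []
    for kw in keywords:
        # Allow any run of whitespace (including line breaks) between the words.
        pattern = r"\b" + r"\s+".join(map(re.escape, kw.split())) + r"\b"
        if re.search(pattern, page_text, re.IGNORECASE):
            found.append(kw)
    return found

text = "She joined the NAACP and later worked with the Congress of\nRacial Equality."
found = matching_keywords(text)
```

Listing which keywords matched, rather than just yes or no, makes a report of prospective pages easier to review, because a single passing mention is weaker evidence than several hits.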
Deal with links to split article (Batting average)
About 6 months ago Batting average was split into a short parent article about the concept of batting average across sports and 2 child articles Batting average (cricket) and Batting average (baseball) dealing with the specifics of the metric in the individual sports. Articles related to each sport still point to the parent article but should generally point to the sport specific one. After some searches using AWB, I found just over 15k links to Batting average. Using a recursive category search, I found that Category:Cricketers, Category:Seasons in cricket and Category:Years in cricket account for about 3k links and Category:Baseball players, Category:Seasons in baseball, Category:Years in baseball about 12k. There are about 300 remaining links in none of these categories, I am working through those manually with AWB. As an aside, a lot of the baseball players have a link in both an infobox and in article text. I had the cricketer infobox changed already, as that had a hardcoded link to the parent article.
The plan would be to replace
[[Batting average]] with [[Batting average (cricket)|]]
[[Batting average|foo]] with [[Batting average (cricket)|foo]]
in the first set of categories and
[[Batting average]] with [[Batting average (baseball)|]]
[[Batting average|foo]] with [[Batting average (baseball)|foo]]
in the second set. A lot of the non-piped links use lower-case, so don't know if that needs another set of rules. I'm also assuming that the pipe trick works in bot edits, otherwise the replacement text will need to be slightly expanded. I can provide the lists I created of the links to the article, of the categories and then intersections if this helps. Spike 'em (talk) 20:27, 1 April 2019 (UTC)
- "pipe trick works in bot edits": It does outside of references and other tags. --Izno (talk) 20:42, 1 April 2019 (UTC)
- Why not: [[Batting average]] with [[Batting average (cricket)|Batting average]]
- It would be standard, and less error prone for other bots/tools. -- GreenC 14:26, 2 April 2019 (UTC)
- Sure, no problem with that. As I said, the above relies on the pipe trick and it should be no different for a bot to replace the string with a slightly longer one. Spike 'em (talk) 14:46, 2 April 2019 (UTC)
- Another idea is a bot could word check for "cricket" in baseball articles and "baseball" in the cricket articles and log those aside. To help avoid cases where a cricket article might be talking about baseball (rare for sure). -- GreenC 15:03, 2 April 2019 (UTC)
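The replacement rules discussed above, using an explicit label instead of the pipe trick and preserving lower-case links, could be sketched as follows. `retarget` is a hypothetical helper for illustration; it ignores links with section anchors or underscores, and deciding which sport applies to a page (from the category lists) is left to the caller.

```python
import re

# [[Batting average]] or [[Batting average|label]], first letter in either case.
LINK_RE = re.compile(r"\[\[\s*([Bb])atting average\s*(?:\|([^\[\]]*))?\]\]")

def retarget(wikitext, sport):
    """Point links to 'Batting average' at 'Batting average (<sport>)'."""
    def repl(match):
        first_letter, label = match.group(1), match.group(2)
        if not label:
            # Unpiped link: keep the visible text, including its original case.
            label = first_letter + "atting average"
        return "[[Batting average (%s)|%s]]" % (sport, label)
    return LINK_RE.sub(repl, wikitext)

converted = retarget("He had a [[batting average]] of .300.", "baseball")
```

Writing the label out in full means the result does not depend on the pipe trick, so it behaves the same inside references and other tags. Links that already point at a sport-specific article are not matched and are left as they are.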
Population for Spanish municipalities
Adequately sourced population figures for all Spanish municipalities can be deployed by using {{Spain metadata Wikidata}}, as was recently done for Austria. See this diff for an example of the change.--eh bien mon prince (talk) 11:35, 11 April 2019 (UTC)
BRFA filed Well, since my bot for Austria is already written and completed, I might as well do this too. -- GreenC 13:47, 11 April 2019 (UTC)
Russia district maps
Replace image_map with {{Russia district OSM map}} for all the articles on this list, as in this diff. The maps are already displayed in the articles, but currently this is achieved through a long switch function on {{Infobox Russian district}}; transcluding the template directly would be more efficient.--eh bien mon prince (talk) 11:58, 11 April 2019 (UTC)
- @Underlying lk: should be pretty similar to the German maps, right? --DannyS712 (talk) 22:31, 11 April 2019 (UTC)
- Yes pretty much. In fact, the German template is based on this one.--eh bien mon prince (talk) 13:26, 12 April 2019 (UTC)
- @Underlying lk: I can do this. I have a few BRFAs currently open, but once some finish I'll file one for this task --DannyS712 (talk) 04:20, 14 April 2019 (UTC)
Category:Pages using deprecated image syntax
Category:Pages using deprecated image syntax has over 89k pages listed, which makes fixing them manually impractical. Could a bot be created to handle this? --Gonnym (talk) 06:18, 12 April 2019 (UTC)
- @Gonnym: I might be able to help, but can you give some examples of the specific edits that would need to be made (ideally with diffs) and how to screen for those? Thanks, --DannyS712 (talk) 06:26, 12 April 2019 (UTC)
- Pages in this category use a template that uses Module:InfoboxImage in a {{#invoke:InfoboxImage|InfoboxImage|image={{{image|}}}|size={{{image_size|}}}|sizedefault=frameless|upright={{{image_upright|1}}}|alt={{{alt|}}}}} style and pass to the |image= field an image syntax in the format |image=File:Example.jpg. However, as usual when dealing with templates, the exact parameters used and their names differ between templates. So for example:
- {{Infobox television}} has
{{#invoke:InfoboxImage|InfoboxImage|image={{{image|}}}|size={{{image_size|}}}|sizedefault=frameless|upright={{{image_upright|1.13}}}<!-- 1.13 is the most common size used in TV articles. -->|alt={{{image_alt|{{{alt|}}}}}}}}
- {{Infobox television season}} has
{{#invoke:InfoboxImage|InfoboxImage|image={{{image|}}}|size={{{image_size|{{{imagesize|}}}}}}|sizedefault=frameless|upright={{{image_upright|1}}}|alt={{{image_alt|{{{alt|}}}}}}}}
- {{Infobox television episode}} has
{{#invoke:InfoboxImage|InfoboxImage|image={{{image|}}}|size={{{image_size|}}}|sizedefault=frameless|alt={{{alt|}}}}}
Also, an image isn't the only value that can be passed in |image=File:Example.jpg; it is sometimes combined with an image size and caption, which will need to be extracted and passed through the correct parameters. --Gonnym (talk) 06:37, 12 April 2019 (UTC)
- @Gonnym: okay, now it looks way more complicated. Maybe 1 infobox at a time. Can you provide some diffs for a few different types of cases with an infobox of your choice? Thanks, --DannyS712 (talk) 06:41, 12 April 2019 (UTC)
- The West Wing (season 3) ({{Infobox television season}}) has image=[[File:West Wing S3 DVD.jpg|250px]]. Instead it should be |image=West Wing S3 DVD.jpg and |image_size=250px (it can also be without "px" as the module does that automatically).
- Red Dwarf X has image=[[File:Red Dwarf X logo.jpg|alt=Logo for the tenth series of ''Red Dwarf''|250px]]. Instead it should be |image=Red Dwarf X logo.jpg, |image_size=250px and |image_alt=Logo for the tenth series of Red Dwarf.
- For a better systematic approach though, maybe it would be better finding out what the top faulty templates are, and create a mapping of what parameters the templates use and their names. If the bot can check the template name and know what parameters to use, this should speed things up.--Gonnym (talk) 07:00, 12 April 2019 (UTC)
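A minimal sketch of the mapping idea, assuming Python and covering only the three templates quoted above. The names `TEMPLATE_PARAMS` and `split_image_value` are hypothetical, and stripping the '' italic markup from the alt text is an assumption drawn from the Red Dwarf X example. Anything the sketch does not recognise (captions, link=, thumb and so on) comes back as None so that a human can look at it.

```python
import re

# Size and alt parameter names as they appear in the three invocations quoted
# in this thread; any other template would need its own entry.
TEMPLATE_PARAMS = {
    "Infobox television": {"size": "image_size", "alt": "image_alt"},
    "Infobox television season": {"size": "image_size", "alt": "image_alt"},
    "Infobox television episode": {"size": "image_size", "alt": "alt"},
}

# [[File:Name.jpg|option|option]] with the options captured as one string.
FILE_LINK = re.compile(
    r"^\[\[\s*(?:File|Image)\s*:\s*([^|\]]+?)\s*((?:\|[^\]]*)?)\]\]$",
    re.IGNORECASE,
)
# 250px, x150px or 150x150px
SIZE = re.compile(r"^(?:\d+|x\d+|\d+x\d+)px$")


def split_image_value(template, value):
    """Split a bracketed image value into bare-filename parameters.

    Returns a dict of parameter name -> value, or None if the value is not
    something this sketch understands.
    """
    match = FILE_LINK.match(value.strip())
    if match is None:
        return None
    names = TEMPLATE_PARAMS[template]
    result = {"image": match.group(1)}
    for option in match.group(2).split("|")[1:]:
        option = option.strip()
        if SIZE.match(option):
            result[names["size"]] = option
        elif option.startswith("alt="):
            # Assumption: alt text is plain, so italic markup is dropped.
            result[names["alt"]] = option[len("alt="):].replace("''", "")
        else:
            return None  # unknown option: leave the page for manual review
    return result
```

Keeping the per-template names in one table means a new infobox only needs a new entry, not new parsing code.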
- @Gonnym: And now I'm completely lost. I don't think I'm the right bot op to help with this, sorry. --DannyS712 (talk) 07:02, 12 April 2019 (UTC)
- I think someone could start with {{Infobox election}}, which appears to have roughly 11,000 articles in the error category. Here's a sample edit. Basically, for this template, you need to remove the initial brackets and the "File:" part of the image parameter value, then move the pixel specification (which may come in a variety of forms, like "x150px" or "150x150px") to the next line to a new |image#_size= parameter. The number "#" needs to match the image# parameter, e.g. |image2= gets |image2_size=. Drop me a line if this is confusing; I feel like it's a lot to explain in a short paragraph.
- This may be a good mini-project to discuss at length at Category talk:Pages using deprecated image syntax. – Jonesey95 (talk) 07:59, 12 April 2019 (UTC)
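The {{Infobox election}} steps described above might look roughly like this in Python. It is a sketch under stated assumptions, not a tested bot: the function name `fix_election_images` is hypothetical, it assumes one parameter per line, and it only touches values that consist of a file link plus a pixel size and nothing else.

```python
import re

# Matches a line such as:  | image1 = [[File:Name.jpg|150x150px]]
PARAM = re.compile(
    r"^(?P<indent>\s*\|\s*)image(?P<n>\d+)(?P<eq>\s*=\s*)"
    r"\[\[\s*(?:File|Image)\s*:\s*(?P<name>[^|\]]+?)\s*"
    r"\|\s*(?P<size>(?:\d+|x\d+|\d+x\d+)px)\s*\]\]\s*$",
    re.IGNORECASE,
)


def fix_election_images(wikitext):
    """Rewrite image# lines to a bare filename plus an image#_size line."""
    out = []
    for line in wikitext.split("\n"):
        m = PARAM.match(line)
        if m:
            # Bare filename stays in image#, the size moves to the next line.
            out.append(f"{m['indent']}image{m['n']}{m['eq']}{m['name']}")
            out.append(f"{m['indent']}image{m['n']}_size{m['eq']}{m['size']}")
        else:
            out.append(line)  # anything unrecognised is left untouched
    return "\n".join(out)
```

Because the size is preserved in the new parameter, this does not depend on the infobox having a suitable default size.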
- In many cases, the |image_size=250px (or equivalent) may simply be omitted, because most infoboxes are set up to use a default size where none has been set (example). In my opinion, falling back to the default is preferable since it gives a consistent look between articles. --Redrose64 🌹 (talk) 12:46, 12 April 2019 (UTC)
- Mostly true, but unfortunately, that is not the case at {{Infobox election}}, as you can see in this before-and-after comparison. – Jonesey95 (talk) 13:15, 12 April 2019 (UTC)
- It appears that Number 57 (talk · contribs) is against the proposal. --Redrose64 🌹 (talk) 13:44, 12 April 2019 (UTC)
- I guess I was pinged because of this edit? I don't really understand what is being discussed here, but removing the image size parameters like this edit means that the images in the infobox are different sizes – is this because there is no default size for this infobox, or the default size is for a single dimension (and not all photos have the same aspect ratio)? Can the default size be set to 150x150 (which is the most commonly used size)? Cheers, Number 57 13:52, 12 April 2019 (UTC)
- {{Infobox election}} has a default size of 50px for |flag_image=, 300px for |map_image#= and no default for |image#=, which then defaults to frameless (which I'm not sure what it is). If there is a correct size that the template should use, then the template should probably be edited to handle it. --Gonnym (talk) 14:02, 12 April 2019 (UTC)
- (edit conflict) @Number 57: If you use the |image1=[[File:Soleiman Eskandari.jpg|150x150px]] format it puts the page into Category:Pages using deprecated image syntax, because the parameter is intended for a bare filename and nothing else, as in |image1=Soleiman Eskandari.jpg. --Redrose64 🌹 (talk) 14:05, 12 April 2019 (UTC)
- OK. I have no problem with using some other way to get matching image sizes, but if it is added as a default, it needs to be two-dimensional, otherwise it ends up in a bit of a mess where images have different aspect ratios. Number 57 14:07, 12 April 2019 (UTC)
- Redrose64: your edit, like my edit that I linked above (and self-reverted) resulted in image sizes that look bad. Either the template needs to be modified, or the image sizes need to be preserved in template parameter values within the article, but removing them changes the image rendering in a negative way in that article (and presumably others). – Jonesey95 (talk) 17:00, 12 April 2019 (UTC)
Wikipedia:Good articles/mismatches
The Wikipedia:Good articles/mismatches page details some conflicts with good articles and usually indicates a mistake of some sort that needs to be sorted out. Category:Good articles means that an article has the green spot that indicates it is classified as good, while Category:Wikipedia good articles are articles which have undergone a review. So the "In Category:Good articles but not Category:Wikipedia good articles" heading indicates that a good article symbol may be present on an article that has not actually undergone a review. Wikipedia:Good articles/all is a list of all good articles and is manually updated. The last two headings usually indicate articles that have not been added after passing a review or removed after being delisted.
This page was originally created by JJMC89 a year ago using AWB after I requested it. At the time it contained thousands of mismatches[5]. We have just resolved all those, mainly through the efforts of DepressedPer. I was hoping there could be a bot that would update the page periodically so we can keep on top of any further mismatches. I have tried running it myself through AWB, but the number of articles is too large to do in one hit. There was also an issue that articles that had been moved would show up as a mismatch if the name was different at the Wikipedia:Good articles/all page. Maybe there is a better workaround for this; the last time I just renamed the articles at the GA list, but that was quite time consuming. Regards AIRcorn (talk) 04:47, 13 April 2019 (UTC)
- @Aircorn: I can run it with AWB, my computer seems to be able to handle it. Do you know what AWB conditions they used? --DannyS712 (talk) 05:02, 13 April 2019 (UTC)
- Thanks for the offer. I believe it is more a numbers thing than a power one (although my computer is certainly lacking the last one). From my understanding you need to be an administrator to run AWB above a certain number of entries. If I run it, it misses a whole lot. As far as I can tell it was made by going to tools and using list comparer. You add each category as a source (or links in the case of Wikipedia:Good articles/all) and compare. It should show you which ones were unique in each list and which were common. The unique ones can then be saved and copied under the correct header in Wikipedia:Good articles/mismatches. It is not too laborious, but if I run it, it only pulls 25000 from Category:Good articles when there are nearly 30000 entries. Also I don't know how to account for redirects. If you can figure out a better way to make it work it would be interesting to see how many new mismatches have occurred in the last year, but I was ultimately hoping for a more automated update process. AIRcorn (talk) 06:01, 13 April 2019 (UTC)
- I would be happy to set up an automated cron-based bot on Toolforge to do this. There would be no manual processes involved; it would run at a set time and be totally hands off. It will pull the data via the API. This is something my tool wikiget does. I even have an example of how to do list compare in the documentation. -- GreenC 06:22, 13 April 2019 (UTC)
- Petscan is your friend. In GA but not WGA; In WGA but not GA; in GA but not linked from Wikipedia:GA/All; linked from Wikipedia:GA/All but not in GA. You can get different output on the output tab. --Izno (talk) 13:34, 13 April 2019 (UTC)
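The list comparison discussed in this thread comes down to four set differences. The sketch below is in Python and makes several assumptions: the function name `find_mismatches` and the optional `redirects` mapping (old title to current title, as one possible way to handle moved articles) are hypothetical, and fetching the three title lists from the API, wikiget or Petscan is deliberately left out.

```python
def find_mismatches(ga_category, reviewed_category, ga_all_links, redirects=None):
    """Compare three lists of article titles and report the mismatches.

    ga_category       -- titles in Category:Good articles
    reviewed_category -- titles in Category:Wikipedia good articles
    ga_all_links      -- titles linked from Wikipedia:Good articles/all
    redirects         -- optional mapping of old title -> current title
    """
    redirects = redirects or {}

    def resolve(titles):
        # Map moved pages to their current title before comparing.
        return {redirects.get(title, title) for title in titles}

    ga = resolve(ga_category)
    wga = resolve(reviewed_category)
    listed = resolve(ga_all_links)
    return {
        "in GA but not WGA": sorted(ga - wga),
        "in WGA but not GA": sorted(wga - ga),
        "in GA but not listed": sorted(ga - listed),
        "listed but not in GA": sorted(listed - ga),
    }
```

Each key corresponds to one of the four comparisons named in the Petscan comment above; the sorted lists could then be written under the matching headers on the mismatches page.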
Request to add "List of Medal of Honor in non-combat incidents" in 185 articles of recipients that received them.
Request to add "List of Medal of Honor recipients in non-combat incidents" to the 185 recipient articles that still carry the old title of the main article. — Preceding unsigned comment added by XXzoonamiXX (talk • contribs) 04:02, 14 April 2019 (UTC)
- @XXzoonamiXX: can you explain? Do you just want the redirects to be bypassed? (Replace the old title with the new title in links?) --DannyS712 (talk) 04:12, 14 April 2019 (UTC)
- There are 185 persons with the old main article's title that I just recently changed so yes replace the old title of each recipient with the new title I changed in the "See also" sections. XXzoonamiXX (talk) 04:22, 14 April 2019 (UTC)
- @XXzoonamiXX: the old title is now a redirect to the new page, so that's not needed. See Wikipedia:Redirect#Do not "fix" links to redirects that are not broken for more. --DannyS712 (talk) 04:30, 14 April 2019 (UTC)
- I'm not talking about that, I'm talking about editing and changing the old title into the new one in many recipients' "See also" sections. Otherwise, it'll give people the impression that the list is what the old title implied, rather than having them click on it for a deeper subject. — Preceding unsigned comment added by XXzoonamiXX (talk • contribs) 04:51, 14 April 2019 (UTC)