Wikipedia:Bots/Noticeboard

Bots noticeboard

This is a message board for coordinating and discussing bot-related issues on Wikipedia (also including other programs interacting with the MediaWiki software). Although this page is frequented mainly by bot owners, any user is welcome to leave a message or join the discussion here.

If you want to report an issue or bug with a specific bot, follow the steps outlined in WP:BOTISSUE first. This is not the place for requests for bot approvals or for requesting that tasks be done by a bot. General questions about the MediaWiki software (such as the use of templates, etc.) should be asked at Wikipedia:Village pump (technical).


BAG nomination requirement tweaks

Just a notice that I removed the requirement to post a notice at WT:BOTS (diff) in favour of WT:BOTPOL. WT:BOTS hasn't been a hub of discussion in years now that WP:BOTN is the central place, and WT:BOTPOL is a lot more relevant since that is the policy page most people concerned with the interpretation of bot policy would actually watch.

Feel free to revert if you object. Headbomb {t · c · p · b} 23:18, 1 December 2020 (UTC)

Maybe VPM should be changed to VPT as well? VPT has more active watchers, and the people watching it are probably more likely to be interested than those watching/regular at VPM. ProcrastinatingReader (talk) 23:21, 1 December 2020 (UTC)
If we change things, and I'm not saying we should, I'd go WP:VPP over WP:VPT personally. VPT, while technical, is mostly about technical issues with templates, modules, the software/html of the site, etc. VPP is at least policy-related. Headbomb {t · c · p · b} 23:48, 1 December 2020 (UTC)
VPT is more than that. It's the one well-watched page for all things technical, and bots are of course technical. My own BAG nomination received zero comments in the first two days after I notified all prescribed venues. Then I posted another notification at VPT following which there were 5 comments within hours. – SD0001 (talk) 05:59, 2 December 2020 (UTC)
What about VPP? That page has 3500+ watchers, but I don't know how many of them are 'recent'. Headbomb {t · c · p · b} 16:56, 2 December 2020 (UTC)
Since we're talking about it, the notification to AN seems useless (besides for admin bots, maybe). --Izno (talk) 14:08, 2 December 2020 (UTC)
Useless maybe, but it is a highly watched board. 913 recent talk page watchers, compared to 611 of VPR, 561 of VPT, 301 of VPM, or 128 of this page. Aside from the Main Page or advertising on watchlists I'm not sure there are better venues to get attention from likely interested individuals. Or well, there is always ANI (1,253 watchers) ProcrastinatingReader (talk) 14:18, 2 December 2020 (UTC)
I believe the idea was to let a variety of forums know about the nomination, so that BAG didn't become a clique of self-selected people. This mostly satisfies people who watch AN as a preventative measure against the abuse of admin powers, or something, the logic being that this lets them monitor BAG for similarly appointing grossly incompetent people. If we want more advertising, we could always add BAG nominations to the current admin/crat nomination templates, although that might have some unintended consequences of increasing drama. Headbomb {t · c · p · b} 16:56, 2 December 2020 (UTC)
Regarding that last point... yeah, pretty much no. BAG technically gives no user rights, so while it is a good position to give a rubber stamp for a bot task, I don't think it merits the scrutiny of an RfX (and the drama that comes with it). Primefac (talk) 18:25, 2 December 2020 (UTC)
That's also my opinion. And even if somehow we had the power to grant the bot flag, it's still not something that would warrant RFA-levels of scrutiny/drama either. Headbomb {t · c · p · b} 18:52, 2 December 2020 (UTC)
  • Before this gets archived, final thoughts on changing WP:VPM to either WP:VPT or WP:VPP? My preference is the former, since it seems generally only technically-involved people comment on BAG noms, so there's a better chance of participation from a VPT notification than from a VPP one, I feel. ProcrastinatingReader (talk) 20:01, 31 December 2020 (UTC)

An upgrade to Cluebot NG

I've been working on a new vandalism detection system for some time now. While the system that I have created appears not to be as good as Cluebot NG overall, it does have some strengths where Cluebot is weak. For example, it uses a grammar check rating to assign each edit a score between -1 and 1 as a measure of whether an edit makes an article's grammar better or worse. This is a pretty good predictor of whether or not an edit is vandalism.

My idea is to install my new system as a supplement to Cluebot NG. In order to do that, I will need to use the confidence score that Cluebot outputs as an input to my new vandalism detector. The new bot will use this information in combination with some other predictors I have derived (like grammar check) to catch some vandalism that Cluebot misses. We should be able to do this without increasing the overall ratio of false positive to true positives. I am training my bot on two datasets called the PAN-WVC-10 and PAN-WVC-11. To finish my project, all I need is Cluebot's confidence scores. Can someone here help me run Cluebot NG on these datasets? Sam at Megaputer (talk) 14:09, 6 December 2020 (UTC)
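A minimal sketch of how a grammar-delta feature and Cluebot NG's confidence score could be combined in a meta-classifier, roughly as described above. The grammar_error_count() helper and the training rows are placeholders for illustration only; this is not the actual Megaputer pipeline or the PAN-WVC data.

import re
import numpy as np
from sklearn.linear_model import LogisticRegression

def grammar_error_count(text):
    # Placeholder for a real grammar checker: counts a couple of crude cues.
    return len(re.findall(r"\s[,.;]", text)) + len(re.findall(r"\bi\b", text))

def grammar_delta(old_text, new_text):
    # Score in [-1, 1]: negative when the edit makes the grammar worse.
    words = max(len(new_text.split()), 1)
    delta = (grammar_error_count(old_text) - grammar_error_count(new_text)) / words
    return max(-1.0, min(1.0, delta))

# One row per edit: [Cluebot NG confidence, grammar delta]; 1 = vandalism, 0 = constructive.
X = np.array([[0.92, -0.40], [0.05, 0.10], [0.70, -0.20], [0.02, 0.00]])
y = np.array([1, 0, 1, 0])
meta = LogisticRegression().fit(X, y)
print(meta.predict_proba([[0.60, -0.30]])[0][1])  # estimated P(vandalism) for a new edit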

One of the operators may be able to help: @Cobi, Rich Smith, and DamianZaremba: ProcrastinatingReader (talk) 15:29, 6 December 2020 (UTC)
Thanks for the ping PR, @Sam at Megaputer: that is pretty good. I would suggest talking to @Cobi: about this, maybe use 'Email This User' to pop him an email as he is not always on Wikipedia - RichT|C|E-Mail 16:15, 6 December 2020 (UTC)
Thanks to both of you! I have sent the email. Hopefully he will reply soon. Sam at Megaputer (talk) 16:30, 6 December 2020 (UTC)
@Sam at Megaputer: if you haven't already, you may also want to check out mw:ORES and mw:ORES review tool. If you have come up with a new way to help score changes, you may be able to integrate to that system instead or as well. The benefit of ORES scoring is that it runs server side and can inject scoring points in to the feed that the existing other secondary checks (such as bots) use. — xaosflux Talk 19:55, 7 December 2020 (UTC)
@Xaosflux: Thanks! My project may potentially have something to add to ORES also, so this is worth looking into. In my brief assessment of the tool, I noticed that the ORES system for evaluating the quality of an edit appears to be far less sophisticated than Cluebot's. For example, it uses a list of words commonly found in damaging/undamaging edits to predict whether an edit is good or not, while Cluebot uses a Naive Bayes classifier. It may be possible to implement a similar system for ORES if I can talk to the right people. The challenge for me here is that I am not actually a programmer so much as I am a data analyst. The software I use makes it easy for me to perform the analysis, but I may need some help installing the result. Sam at Megaputer (talk) 21:03, 7 December 2020 (UTC)
  • @Rich Smith: I'm still waiting to hear back from Cobi, but I have something that may interest you in the meantime. It is my understanding that Cluebot whitelists registered users with more than 50 edits and IPs with more than 250 edits, since most vandalism comes from new users. I think that these numbers were chosen based on qualitative observation and without looking at any hard data? My research shows that IPs continue to vandalize at a rate of around 15% even up to their 500th edit, while the vandalism rate for logged in users falls below that number by the time they reach their third edit. For a registered user making their 50th edit, the probability that this edit is vandalism is only about 3%. So in conclusion, I think that the threshold for whitelisting should be much higher for IPs and much lower for registered users. We may be able to significantly increase the performance of Cluebot just by moving these thresholds around. Sam at Megaputer (talk) 21:21, 8 December 2020 (UTC)
    Now that's super interesting! Thank you very much for making this. Enterprisey (talk!) 10:15, 2 January 2021 (UTC)
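A sketch of the threshold change Sam describes above, purely for illustration; the 3/500 cut-offs are read off the quoted rates and are not Cluebot NG's actual configuration.

def should_score_edit(is_registered, edit_count):
    # Return True if the edit should still be run through the vandalism classifier
    # (i.e. the editor is not yet "whitelisted").
    if is_registered:
        # Logged-in users' vandalism rate reportedly drops below the IP rate by edit 3.
        return edit_count < 3
    # IPs reportedly keep vandalising at ~15% even up to their 500th edit.
    return edit_count < 500

print(should_score_edit(True, 50))    # False: a registered user at 50 edits is skipped
print(should_score_edit(False, 250))  # True: an IP at 250 edits is still checked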

Monkbot 18

Requesting re-examination of Monkbot 18, specifically its edits to hyphenate parameters. Concerns were raised about this particular aspect of this task in its BRFA, and subsequently on the bot maintainer's talk page. The maintainer was asked to provide links to discussions demonstrating a consensus either to deprecate all unhyphenated variants, or specifically to deprecate |accessdate=, a particularly widely used example; he was unable to do so. The maintainer subsequently started a discussion to seek confirmation after the fact that this deprecation was acceptable, and although the discussion had limited community representation, it revealed further concerns with both the bot task and the intent to remove support for this parameter. Given the scale of this task, impacting over a third of the site's articles, the maintainer was asked to open an RfC to seek wider community consensus on this issue; thus far no such RfC has taken place, but the task has resumed regardless. Nikkimaria (talk) 13:58, 9 December 2020 (UTC)

I might consider an RfC were it obvious that a goodly portion of the community were up in arms about this bot task. Monkbot task 18 has now made 225k+ edits. All of those edits appear on a large number of watch lists yet there has been no uprising. Yes, there are a few who object. That there have been so few objections suggests to me that the community as a whole are either indifferent to, or approve of, the bot's edits.
Trappist the monk (talk) 14:36, 9 December 2020 (UTC)
"There hasn't been an uprising" isn't a go-ahead to disable functionality and change over two million more articles. Please pause this edit until you have obtained consensus for it. Nikkimaria (talk) 14:57, 9 December 2020 (UTC)
You claim that this bot task will disable functionality. I think that you need to demonstrate where that is happening. Yes, the bot task will change over two million ... articles. That was stated in the initial creation of the bot's WP:BRFA; see here.
Trappist the monk (talk) 15:06, 9 December 2020 (UTC)
The BRFA stated that the task would "replace all of the to-be-deprecated parameter names". It did not demonstrate a consensus to deprecate specific parameters, nor point to any global consensus on deprecation. Further you have stated here that "At the end of the deprecation period, support for the parameter name will be withdrawn and the parameter name will no longer work". No consensus for removing this functionality has been demonstrated. And yet your bot continues even now to remove these parameters at speed, without establishing a consensus that the community actually deliberately wants them to be removed. Find consensus for that first, and then restart your bot, rather than perpetually restarting it and claiming a lack of pitchforks as a substitute for real consensus. Nikkimaria (talk) 15:17, 9 December 2020 (UTC)
Yes, that is what I wrote in the WP:BRFA. Except that it is converting parameter names that will be deprecated to parameter names that will not be deprecated; decisions with regard to actual deprecation are not taken in a BRFA nor by the bot. Yes, when a deprecation period ends, support for the deprecated parameter will be withdrawn, so yes, the once-deprecated parameter will not work and cs1|2 templates will emit error messages to notify editors of that fact. You continue to claim that task 18 is removing functionality but have not shown where functionality is being removed.
Trappist the monk (talk) 16:13, 9 December 2020 (UTC)
decisions with regard to actual deprecation are not taken in a BRFA nor by the bot. The bot is removing parameters where no consensus to deprecate currently exists. That is not appropriate. Nikkimaria (talk) 16:41, 9 December 2020 (UTC)
@Nikkimaria: can you provide a diff to a specific edit that you think this bot has made that you think is harmful, and describe why you think it is harmful to readers and/or editors? This is just to see if there should be some sort of urgent injunction while the discussion continues. — xaosflux Talk 16:57, 9 December 2020 (UTC)
Here is a sample edit. The problem occurs because the bot run creates a fait-accompli situation with regards to parameter deprecation - see response to GreenC below. Nikkimaria (talk) 17:46, 9 December 2020 (UTC)
@Nikkimaria: thanks for the note, I'm not seeing a reason to do something drastic like block this bot while this is being further discussed - on initial review: (a) this has no impact on readers (b) this is following the formatting for parameters as documented for {{Cite web}}. It could be argued that while there is an alias this is a completely useless or purely cosmetic edit (in that it does not change the reader-facing rendered version of the page). This statement should not be taken as an endorsement or condemnation of the edits - just that it doesn't appear to be rising to level of needing immediate intervention. — xaosflux Talk 17:57, 9 December 2020 (UTC)
Umm, Monkbot task 18 is a cosmetic bot; every edit summary that the bot writes states that; the WP:BRFA states that; the bot's documentation states that.
Trappist the monk (talk) 18:25, 9 December 2020 (UTC)
@Trappist the monk: I didn't review the BRFA on this, was only jumping in to quickly see if something was running amok and needing emergency action, which I don't think is necessary. — xaosflux Talk 19:19, 9 December 2020 (UTC)
Nothing wrong with that edit. The bot did what it is designed to do. It deleted the one empty |accessdate=, renamed |accessdate= to |access-date= (8×), |archivedate= to |archive-date= (4×), and |archivurl= to |archive-url= (4×). There were no cs1|2 errors before the edit and no cs1|2 errors after the edit. Nothing was broken.
Trappist the monk (talk) 18:25, 9 December 2020 (UTC)
  • Support Nikkimaria's approach on this, for the given rationale. --Francis Schonken (talk) 15:32, 9 December 2020 (UTC)
  • Support established consensus. The bot had consensus in the BRFA, where |access-date= was specifically discussed. It had consensus at CS1|2 talk. It has de facto consensus due to the high visibility and very low complaint rate (most complainers were asking questions, not demanding it stop). The bot is being copied; other people are already doing the same thing manually and with AWB. This is now the third forum discussion. Nikkimaria is pushing for a fourth place because the previous discussions and adoptions by the community are not "real". It's time to move on and put the recriminations behind. -- GreenC 16:51, 9 December 2020 (UTC)
    • As above, a lack of pitchforks is not a substitute for actual discussion with the wider community who will be impacted by the proposed deprecation being implemented by the bot. This also isn't the place for that discussion - the only reason we're here is because the bot began removing parameters without an established consensus for deprecation, and then people point to those removals as establishing a consensus. If you (using the generic "you" here) think that all unhyphenated parameters should be disabled and got rid of, then have an RfC that puts that outcome specifically to the community. Running the bot first is putting the cart before the horse. Nikkimaria (talk) 17:46, 9 December 2020 (UTC)
  • I don't have strong opinions on this, but there was a period when the bot was working on linguistics articles and it was mildly annoying. Not enough for me to make a fuss since the bot would be gone soon enough, but I considered it. I would support having an RfC on this just to get some clarity and hear concerns from those outside the bot and CS1|2 communities. Wug·a·po·des 23:48, 9 December 2020 (UTC)
    That's interesting. Monkbot doesn't work on groups of related articles (at least not intentionally). For every day since task 18 began, I have grabbed a list of typically somewhere between 15 and 18 thousand articles from various cirrus searches (usually three or more searches). That list gets filtered for duplicates (because an article might match multiple search criteria). I then give the filtered list to Module:Sandbox/trappist the monk/random sort which scrambles the list. There is, of course, no guarantee that the scrambling won't put related articles together because that is the nature of random. Nor is there any guarantee that the searches won't turn up related articles. Were these linguistics articles edited one right after the other or were they spaced-out in time?
    Trappist the monk (talk) 00:29, 10 December 2020 (UTC)
    Oh I'm almost sure it's a sampling bias; I do watchlist a lot of language articles after all! For the most part, they were spaced out in time, but given my watchlist habits, I only ever really saw changes to linguistics articles and never really anything else. Like I said, it's not a big deal for me but I think that even if the sample is random, people will still perceive it as working on particular article sets because they only watch particular article sets. Part of why I think a wider RfC would be helpful is that it also has the effect of advertising what the bot's going to be doing, how it's doing it, and ways that editors can opt out if they want. This way people don't feel like it's singling out "their" pages or whatever.
    And while I like the idea of kindly robots roaming articles and fixing stuff, the concept of a robot just passing through as it goes about its business is still kinda strange for most people. Our bot tasks are generally like cell phones--people have to interact with them or trigger them in some specific way. MonkBot's task 18 is more like a roomba--it just goes about its day cleaning stuff. Where phones are essentially really fancy calculators, roombas are in that uncanny valley of kinda having a mind of their own. (If you ever watched Arrested Development, there's a scene where the character "Buster" feeds the roomba trash because it seemed hungry, which is funny because they are more like pets than tools.)
    So, yeah, I'm not trying to say task 18 should be stopped, but I think there's a human-machine interaction problem that we need to ease into better. Thinking on this more, you might want to make MonkBot more predictable. Going off the recent cosmetic bot day RfC, I think we might want to limit it to once a week and just call that Cosmetic Bot Day (like scheduling your roomba). You can have it work through category trees rather than work randomly, so that it can post a notice on WikiProject talk pages in advance letting editors opt out (roombas work room by room rather than vacuuming about randomly, after all). Maybe maintain some category blacklist so that particular "rooms" don't get "vacuumed". Maybe this is a bad analogy, but I think personifying the bot helps us to understand that initial uneasy feeling people get and to come up with changes that help people feel like the bot is working with them, not just on its own. Wug·a·po·des 01:46, 18 December 2020 (UTC)
  • Support Nikkimaria's objection because such a wide-ranging cosmetic change should have had wide discussion. Further discussion at Help talk:Citation Style 1#deprecation and removal of nonhyphenated multiword parameter names.
··gracefool 💬 09:51, 15 December 2020 (UTC)
  • Having these purely cosmetic changes hit my watchlist over and over was disruptive; when bots are constantly hitting your watchlist to install personal cosmetic preferences, it makes it harder to monitor real edits and catch vandalism. SandyGeorgia (Talk) 00:50, 18 December 2020 (UTC)
    Curious, how many articles are on your watchlist and how many MonkBot task 18 edits do you see per day, on average? ProcrastinatingReader (talk) 01:16, 18 December 2020 (UTC)
    If we're talking numbers, I have 2492 pages on my watchlist of which 1691 are articles. And in the last 3 days (including today), I saw MonkBot 18 anywhere from 5 to 12 times per day. Or about 0.5% of the article part of my watchlist per day. Headbomb {t · c · p · b} 02:53, 18 December 2020 (UTC)
    Which are all very easily hidden, since they are all flagged as bot edits. Headbomb {t · c · p · b} 02:55, 18 December 2020 (UTC)
    My watchlist yesterday had 370 entries, 33 by Monkbot. I understand that I can suppress bot entries, but that is generally unwise. Just now, I had to fix unintended, but easily foreseeable, effects of a bot removing navboxes. BTW, I don't understand when aliases became expensive or why they are undesirable. -- Michael Bednarek (talk) 03:11, 18 December 2020 (UTC)
    Trappist, shouldn’t these numbers be lower? Depends on whether Cirrus is truly random too, I guess. Maybe some kind of cat check can help? If an article has a category seen in the last 150 articles, skip it. ProcrastinatingReader (talk) 10:21, 18 December 2020 (UTC)
    That said, this kind of spam may be normal. MonkBot's current rate is about 22,000 edits per day (15 per minute), it seems. In other words, it will process 3 million articles in 4.5 months (~140 days). So if you have 1691 articles on your watchlist, assuming uniform distribution you should see 12 MonkBot edits per day. If the rate is moved to taking 6 months to complete, the average drops to 9 edits per day for Headbomb, or from 36 to 28 for Euryalus. But I'm not sure this makes the bot run any less problematic for people, to be honest, as that remains quite numerous and extending the run to take, say, a year becomes a bit unproductive and doesn't help much here either. Some kind of cat check may help, but it may also cause certain cats to get clogged together for more spam towards the end of this run. So for this I don't really have any ideas other than WP:HIDEBOTS. ProcrastinatingReader (talk) 13:28, 18 December 2020 (UTC)
    [Shouldn't] these numbers be lower? I don't have an answer for that. If the coin toss creates an edit list that matches your watchlist, your watchlist will be flooded on that day. The coin toss can also land the other way creating an edit list where none of the articles on your watchlist get edited. The odds, as I understand it, are the same for both extremes and all points in between. I suppose that you could give me the content of your watchlist (mainspace only) and a date. With those, I can run the bot against your watchlist on that date. Thereafter you should see only the occasional watchlist hit. Of course, anyone with similar interests will get flooded on the same day so I suspect that editors should probably not ask to have their watchlists processed...
    Trappist the monk (talk) 14:40, 18 December 2020 (UTC)
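A back-of-the-envelope check of the uniform-distribution estimates in this thread (the 22,000 edits/day and 3 million article figures are the ones quoted above; the sketch assumes edits are spread evenly, which the random scrambling only approximates):

BOT_EDITS_PER_DAY = 22_000
ARTICLES_IN_SCOPE = 3_000_000

def expected_hits_per_day(watchlisted_articles, days_to_finish=None):
    rate = BOT_EDITS_PER_DAY if days_to_finish is None else ARTICLES_IN_SCOPE / days_to_finish
    return watchlisted_articles * rate / ARTICLES_IN_SCOPE

print(expected_hits_per_day(1691))        # ~12 per day (Headbomb's watchlist)
print(expected_hits_per_day(5000))        # ~36-37 per day (a 5000-article watchlist)
print(expected_hits_per_day(1691, 180))   # ~9 per day if the run is stretched to ~6 months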
  • Doesn't rise to the level of pitchforks but am also finding the current Monkbot watchlist spam a little wearing. I have about 5000 pages watchlisted including many naval vessels, and the Monkbot accessdate changes seem very numerous. Appreciate that watchlist spam is already a regular consideration in bot approval discussions, and thank you to the BRFA regulars for that. Thanks also to Trappist for taking the time to aim for improved and standard template formatting in the first place. However, urge everyone involved to do please keep spam impacts in mind when balancing benefits vs drawbacks for additional bot tasks. -- Euryalus (talk) 13:03, 18 December 2020 (UTC)
    Genuinely out of curiosity, how comparable is that to the number of edits by Citation bot? Primefac (talk) 16:51, 18 December 2020 (UTC)
    • Citation bot made 3096 edits on December 17th. Headbomb {t · c · p · b} 17:15, 18 December 2020 (UTC)
@Primefac: And using the totally unscientific analysis of my current watchlist for today (last 17 hours) Monkbot appears 31 times as most recent editor of an article, Citationbot is at 5 and Cluebot and Lowercasesigmabot have 2 each. A bunch of others get a single hit. As above it's certainly not pitchforks at midnight stuff, just a mild request to keep keeping spam in mind with cosmetic approvals. -- Euryalus (talk) 17:20, 18 December 2020 (UTC)
Well, 31 makes sense. See my calculation above - assuming a uniform distribution you'd see 36 per day. I think one solution (for everyone) is turning User:UncleDouggie/smart_watchlist into a gadget, enabling it by default and hiding the cosmetic bots by default (and people can toggle it off, if they want). xaosflux may know more of how bad of an idea this is? A solution for your end only is to install that script yourself (see WP:HIDEBOTS). Even if Trappist makes the bot 4x slower, you'd still see a 'substantial' number of edits, depending on what substantial is for you. ProcrastinatingReader (talk) 20:01, 18 December 2020 (UTC)
That's too niche to be a gadget. But we could update the instructions to mention the one-click script installation via user preferences. Headbomb {t · c · p · b} 20:07, 18 December 2020 (UTC)
@ProcrastinatingReader: "hiding" someone on watchlist with a client script doesn't fix the problem that it obscures any edit before it - the hiding would make it look like there was no change at all. — xaosflux Talk 23:50, 18 December 2020 (UTC)
If phab:T250856 ever got resolved, it wouldn't be an issue. Primefac (talk) 01:21, 19 December 2020 (UTC)
@Xaosflux: idea: what if "Latest revision only" is disabled in the server-side call, and instead that functionality is implemented client-side (if it's not already)? As in, the results show multiple revisions for this page, then the client trims them down to only one. Then it could also delete any bot revisions too, safely? (also pinging in Enterprisey for this thought, given experience from your section watchlists script). Is this feasible & a good idea? ProcrastinatingReader (talk) 21:59, 30 December 2020 (UTC)

So this is just going to continue even though there isn't consensus on whether it's even achieving something good — let alone the spam issue? ··gracefool 💬 08:33, 23 December 2020 (UTC)

@ProcrastinatingReader: forcing people to use "Expand watchlist to show all changes, not just the most recent" isn't a great solution, and think it will impact recent changes in a different way. — xaosflux Talk 23:32, 30 December 2020 (UTC)
My thought was this is done behind the scenes using JS. Users wouldn't do anything differently compared to now, or see results differently (minus the bot), but I figured what I described would be a technical way to get around the obscuring issue. ProcrastinatingReader (talk) 23:35, 30 December 2020 (UTC)
Yeah, I see what you're saying and it would probably work, but we might as well fix this in PHP (as we'll run into the same issue that's being argued about on Phab about precisely which edits to hide). Enterprisey (talk!) 03:17, 16 January 2021 (UTC)

A user has requested the attention of a member of the Bot Approvals Group. Once assistance has been rendered, please deactivate this tag by replacing it with {{tl|BAG assistance needed}}. So here we are a month later, and it's pretty clear that there is no consensus either here or at Help_talk:Citation_Style_1#deprecation_and_removal_of_nonhyphenated_multiword_parameter_names for what the bot is doing. Despite this the bot continues. Nikkimaria (talk) 16:53, 16 January 2021 (UTC)

  • Support - This had very limited discussion and goes against MOS:STYLERET and WP:COSMETICBOT. I take no issue with the hyphenating, I take issue with the removal of spacing and blank lines of wikitext that are used for formatting and readability. Frankly, I'll be undoing about 250 edits by this bot. - Floydian τ ¢ 00:53, 17 January 2021 (UTC)

Tangential discussion on the effects of the bot on templated references

  • Support Nikkimaria's objection. The cosmetic change doesn't seem to provide any benefit--the aesthetic benefit is subjective and negligible. On the other hand, there is a tangible problem that is more nefarious than watchlist spam. I think the bot should be stopped from this task because the task does not include a test to see if the bot is damaging articles. It has damaged dozens; here are the ones I know about so far and have recently fixed:


The issue is that MonkBot will edit a template and change a reference definition in that template, but not check the effects of that edit in any of the places where the template is used. I think this is careless and negligent -- if the bot is expected to make hundreds of thousands of edits, shouldn't it have some quality checks to make sure it's not adding errors to articles?
The problem is that many reference definitions are duplicated. An article might define a reference named <ref name="math">, and it might include one (or more!) templates that do so, too. As long as the reference definitions are exactly the same (character for character, including whitespace and casing) the rendering code eats the duplication and considers them the same. MonkBot will come along and change one definition from accessdate to access-date. It won't check the results of that change; it just saves the change and moves along as rapidly as possible.
Of course, now that the references with the same name are different, articles referencing the template might end up with a message like "Cite error: The named reference "math" was defined multiple times with different content (see the help page)" in the references section. A human has to come along and find this error, trace through the template inclusion, and find a way to recover from MonkBot's "cosmetic" edit. I don't think the value of the change justifies the risk and work needed to find and recover from these problems.
Referencing and inclusion in Wikipedia are surprisingly fragile, despite being core features (and requirements!) of the encyclopedia. While I don't think it's up to MonkBot to fix them, I don't think it should be allowed to make the problem worse -- or to be allowed to make any edits without first checking to see whether it is introducing new errors. To trust those features to a robot which is too lazy to check its own work for errors doesn't seem like the right way to make progress. -- Mikeblas (talk) 14:46, 23 December 2020 (UTC)
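To make the failure mode concrete, here is a deliberately simplified sketch (in Python, not Monkbot's actual code) of the kind of pre-save consistency check being asked for: gather every <ref name="...">...</ref> definition in an article and in the wikitext of its transcluded templates, and flag names whose definitions are not byte-identical. The regex ignores group= attributes and other markup variants.

import re
from collections import defaultdict

REF_DEF = re.compile(r'<ref\s+name\s*=\s*"?([^">/]+)"?\s*>(.*?)</ref>', re.DOTALL | re.IGNORECASE)

def named_ref_definitions(wikitext):
    defs = defaultdict(set)
    for name, body in REF_DEF.findall(wikitext):
        defs[name.strip()].add(body.strip())
    return defs

def divergent_refs(article_text, template_texts):
    # Names defined more than once with non-identical content -> cite error on render.
    merged = defaultdict(set)
    for text in [article_text] + list(template_texts):
        for name, bodies in named_ref_definitions(text).items():
            merged[name] |= bodies
    return {name for name, bodies in merged.items() if len(bodies) > 1}

article  = '<ref name="math">{{cite web |url=http://example.org |access-date=20 July 2017}}</ref>'
template = '<ref name="math">{{cite web |url=http://example.org |accessdate=20 July 2017}}</ref>'
print(divergent_refs(article, [template]))  # {'math'}: the hyphenation made the twin definitions differ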
I clicked on 5 of those links and MonkBot did not edit any of them? ProcrastinatingReader (talk) 14:59, 23 December 2020 (UTC)
Of the 19 listed references only three were edited. In neither Special:Diff/991972924 nor Special:Diff/991111878 did the bot introduce errors, and the errors seen following Special:Diff/992002956 were already present in the previous revision. This is a nonsense objection due to misunderstanding how the bot works (in addition, the "fix one template but not the other" argument is silly, if it's going to fix one parameter use it will fix them all). GIGO is not a reason to stop a bot. Primefac (talk) 15:22, 23 December 2020 (UTC)
The bot edited templates used by these articles. For example, 1890 Calgary municipal election invokes {{Calgary municipal election, 1890/Position/Councillor}} and {{Calgary municipal election, 1890/Position/Mayor}}, which were both edited by MonkBot. The edits the bot made to the templates changed named references they have in common, made them different, and created new "Cite error" error messages in the articles I list. Here are some more articles disrupted which were fine before MonkBot edited referenced templates, but were broken afterwards:
-- Mikeblas (talk) 15:33, 23 December 2020 (UTC)
Why are articles using (heck, re-defining) citations which are within a template? In any case, should named references be defined within a template of this kind, in the first place? ProcrastinatingReader (talk) 15:40, 23 December 2020 (UTC)
This is exactly my point - it's a one-off GIGO situation. Primefac (talk) 15:43, 23 December 2020 (UTC)
Many (like nearly all, right?) articles in Wikipedia include templates. An article can be damaged without editing the article itself. In many (most?) of the cases I list, MonkBot's damage comes from editing a template used by the article. All of the "Leap Year Starting" articles reference {{List of calendars}}, which was edited by MonkBot for this "Task 18" activity. The articles rendered just fine before the edit to the template. After the edit to the template, the articles rendered with errors. Note that, when you look at an old version of an article, it renders with the *current* version of the template. That can be deceiving, so it takes some effort to understand and diagnose these issues. I don't think there's a way to deny they were caused by MonkBot's edits to the templates.
To understand the errors in {{2011 League of Ireland Premier Division table}} and {{2013 League of Ireland Premier Division table}}, please consider the changes that MonkBot made to the 2011 League of Ireland Premier Division table template and to the 2013 League of Ireland Premier Division table template. Note that the 2011 table template was manually fixed by User:Trappist the monk after a previous bot edit damaged some of the same including articles earlier this month. -- Mikeblas (talk) 15:55, 23 December 2020 (UTC)
You are missing my point entirely. If an article is going to define a named reference, then it should be named something other than the named reference that a template is using. Honestly, the template itself probably shouldn't be using a named reference either. This is not the bot's fault, but a "garbage-in, garbage-out" situation created by a series of bad edits between the article writers and the template editors. Primefac (talk) 15:59, 23 December 2020 (UTC)
Basically, my point is that if a bot exposes bad practice on the part of other users, we should fix that bad practice, not shut down the bot for exposing it. Primefac (talk) 16:01, 23 December 2020 (UTC)
And I'm afraid you're choosing to ignore my point entirely. At one moment, the articles rendered fine. (Therefore, they're not garbage.) Then, the bot came along and made a change. The bot didn't check to see if its change causes a new and visible error in the article, and instead it just saves the change and moves along. Now, after the bot's change, the article is objectively worse.
Good software (these bots are made of software) checks to see if it is getting garbage input. If it is, it either ignores that input -- even if it means it can't do its intended work -- or makes a solid assumption based on the apparent intent of that input.
If we give a bot garbage input (Go to the grocery and get a bottle of pickle milk!) we shouldn't accept garbage output (The bot returned with a brand new Mercedes Benz.) with a shrug of the shoulders. Instead, the bot should stop: There is no such thing as pickle milk, and we know that. What did you really mean? (Knowing my luck, the almond milk people have given up and moved along to pickle milk, but ...)
I'm not proposing that we shut down the bot. I propose that we stop it from doing this task until it can be corrected to not do damage when it gets this arguably bad input. It's pretty easy to detect errors after any changes made--just preview the page and look for errors that weren't there before. Why does the bot not do this? We know that the structure of wikicode that builds the corpus is very much irregular and fragile; it's hard for people to write, and difficult for machines to accurately parse. If we take a step back and recognize that, I think we pretty naturally realize that automata that don't check for errors as they work are bad.
The bot could also be enhanced to build a work list that traverses the included texts and identifies the actual location of the duplicate references that you consider garbage. Referencing is difficult, and that difficulty is multiplied by templates and transclusion. Imagine a bot that was helpful in this task rather than regressive.
-- Mikeblas (talk) 16:29, 23 December 2020 (UTC)
Check through the archives of User talk:AnomieBOT. There you will see dozens of complaints that the bot has "broken" something, when all it is doing is exactly what it's supposed to be doing. A bot cannot check for every possible way that a page will be broken, or (in this case) how editing one page affects a completely different article entirely! I do not deny that this is an issue, but it's an issue with the editors, not the bot. Primefac (talk) 17:47, 24 December 2020 (UTC)
Most articles include templates. But I do not think most articles include a content template which defines a named reference, and on top of that the article not only reuses the named reference, it redefines it, with the same name. Off the top of my head, I cannot think of why a template would use named references, unless it is part of a collection with other templates that should be used on the same article (and in such a case, it should use an obscure ref name that is not reused outside of that collection). ProcrastinatingReader (talk) 15:59, 23 December 2020 (UTC)
Many of these articles are using single-transclusion templates to hold article content, which seems like a nice list of TFD "subst and delete" candidates to me. But I digress. If the bot is going to edit templates that contain named references, it probably needs to immediately perform the same task on articles that transclude that template. That would bring the named references back into sync. – Jonesey95 (talk) 16:03, 23 December 2020 (UTC)
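A rough pywikibot-style sketch of that suggestion, under stated assumptions: hyphenate_parameters() is a stand-in for Monkbot task 18's actual renaming logic (which is not shown anywhere in this thread), and the entry-point call is left commented out.

import re
import pywikibot

def hyphenate_parameters(text):
    # Stand-in for the real task-18 logic; only handles three common cases.
    for old, new in (('accessdate', 'access-date'),
                     ('archivedate', 'archive-date'),
                     ('archiveurl', 'archive-url')):
        text = re.sub(r'\|\s*' + old + r'\s*=', '|' + new + '=', text)
    return text

def fix_template_and_transclusions(template_title):
    site = pywikibot.Site('en', 'wikipedia')
    template = pywikibot.Page(site, template_title)
    pages = [template]
    if '<ref' in template.text:  # named refs in the template may be mirrored in articles
        pages += list(template.embeddedin(namespaces=[0]))
    for page in pages:  # edit the template and its transcluding articles in one pass
        new_text = hyphenate_parameters(page.text)
        if new_text != page.text:
            page.text = new_text
            page.save(summary='Cosmetic: hyphenate cite template parameters', minor=True)

# fix_template_and_transclusions('Template:List of calendars')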
Some of them do (the Calgary election ones, specifically) but the rest are multiple-use templates. I think the approach you give would be one way to consider; there are probably many other ways, and lots of trade-offs to consider. And that's what I insist: that we recognize how irregular and fragile the corpus is before letting overly simple robots run around and edit it. Really, it makes matters worse. -- Mikeblas (talk) 16:24, 23 December 2020 (UTC)
Category:Templates that generate named references catalogs more than 600 templates that generate named references. Note that this category is manually built -- I figure the actual count is an order of magnitude higher. -- Mikeblas (talk) 16:24, 23 December 2020 (UTC)
4720 templates at the time of this writing, fwiw. Also, if you're going to reply to someone after someone's already replied with a different comment, please put your comments after theirs so the threading is correct. Primefac (talk) 17:54, 24 December 2020 (UTC)
Your search only finds ref name=. There are many other ways to write a template that generates a named reference -- invoking another template, most notably. Also, not always one space between "ref" and "name", might have a "group" parameter, spacing around equals, and so on. And of course, templates are only one facet of this problem; transclusion is a whole 'notha level. Also, some false alarms for templates that generate syntax for named references, but don't generate named references. My point was, though, that this isn't a small problem; and there are plenty of motivations for creation of a template that generates a named reference.
Sorry -- Like the majority of editors, I'm not familiar with the rules for participating in these complicated discussions, and wish that Wikipedia had any amount of tooling for doing so. My vision is quite poor, and counting colons is tedious and difficult for me. In fact, I usually refrain from participating for these reasons (and a few others), but feel strongly enough about disruptive bot behaviour that I tried to join in. I regret that you found my style distracting from my intended messages. -- Mikeblas (talk) 19:14, 24 December 2020 (UTC)
No worries, the indenting comment was not meant as any sort of black mark, just a comment about how things are "usually" done. My prior comments (and edit summaries) came from a place of not knowing where you were coming from, so any perceived annoyance or anger were entirely my fault and I apologize for that. Primefac (talk) 21:16, 24 December 2020 (UTC)
Because they can, I guess. Repeated definitions (that don't differ) are absorbed and rendered correctly. Editors make template inclusion structures that range from trivial to Byzantine, and it can be quite difficult to track down which wiki code defines which references. Many templates rely on the absorption of repeated references in order to simplify their use and implementation. Maybe another case is that editors don't know the reference is redefined; they don't exhaustively unroll the possibly nested template inclusions, read LUA code that implements templates, unwind the three or four forms of partial transclusion, and ...
Point is, though, that this robot comes along and disrupts what was working. Maybe it's not ideal to have repeated references spread across different inclusions, but it works -- and doesn't generate an error. Until this robot changes it, doesn't check for newly introduced problems, saves its change and leaves the article in worse shape (now with a big red error message). If the robot doesn't check that it's doing damage, it shouldn't be trusted to edit autonomously or automatically. -- Mikeblas (talk) 16:04, 23 December 2020 (UTC)
Referencing and inclusion in Wikipedia are surprisingly fragile, despite being core features (and requirements!) of the encyclopedia. Yes. And references in an article that are mirrored in templates that are transcluded into that article are some of the most fragile of the fragile. I suspect that this fragility is why editors create specific-source templates. Specific-source templates can be transcluded into an article and also into other templates that the article transcludes. This mechanism is as robust as it gets because there is only one 'source' for the citation.
Alternately, editors can use a self-closed <ref name="math" /> tag in the body of the template and do this (list defined referencing):
<noinclude>
{{reflist|refs=
<ref name="math">{{cite web|url=http://www.staff.science.uu.nl/~gent0113/calendar/isocalendar.htm |author=Robert van Gent |title=The Mathematics of the ISO 8601 Calendar |publisher=Utrecht University, Department of Mathematics |date=2017 |access-date=20 July 2017}}</ref>
}}
[[Category:Calendar templates]]
</noinclude>
Of course this is still problematic because the article and template references are independent so they can easily fall out of sync. Mirroring of citation templates in articles and the templates that the articles transclude is poor practice and should be discouraged.
Trappist the monk (talk) 16:12, 23 December 2020 (UTC)
It's also problematic, @Trappist the monk: because it's not the status quo. Your proposal would require manual edits to a large (and unknown) number of templates and articles, thoughtful consideration of layout and declaration order, and so on. At this time, there are countless articles written that render just fine and are only broken as User:Monkbot visits them and makes changes without checking for problems. It's pretty clear that "detecting" referencing problems in this bot action wasn't an intended (or known) consequence of the proposed action for the robot, and so there was never consensus or acknowledgement that the robot should be doing it. Really, this is starting to seem a lot like WP:POINT. Please consider pausing this task until the community can agree upon a way forward. -- Mikeblas (talk) 17:51, 24 December 2020 (UTC)
Specific-source templates are fragile with link rot. When an external site changes [servers, software, ownership, domains, url layout, etc], they never in my years of experience change all URLs equally. This creates problems with custom source templates which treat all URLs equally, which might have been true at the time the template was created. The only solution is to go through every instance of the template, check the new URL is working and if not delete the template and replace it with a CS1|2. This work is laborious and often does not get done, the end result is link rot on Wikipedia. This can be avoided by using standard templates like CS1|2 for which standardized tools are available. -- GreenC 16:28, 23 December 2020 (UTC)
Perhaps we are talking about different things? For my meaning of 'specific-source', I mean a template that holds a single complete citation (perhaps single-source is a better term) so a template {{The Mathematics of the ISO 8601 Calendar}} would hold only:
{{cite web|url=http://www.staff.science.uu.nl/~gent0113/calendar/isocalendar.htm |author=Robert van Gent |title=The Mathematics of the ISO 8601 Calendar |publisher=Utrecht University, Department of Mathematics |date=2017 |access-date=20 July 2017}}
{{The Mathematics of the ISO 8601 Calendar}} could be transcluded into articles and into templates and when those transclusions are wrapped in <ref name="...">...</ref> (where the name="..." attribute is the same) will be seamlessly merged together.
I think that you are describing the case where a template takes an argument parameter that is used to modify a base url?
Trappist the monk (talk) 16:58, 23 December 2020 (UTC)
Sorry I misinterpreted. With a static URL it would be no problem with link rot, actually an improvement. -- GreenC 17:14, 23 December 2020 (UTC)
If that's done, I feel like {{The Mathematics of the ISO 8601 Calendar}} would be better off as a subpage of something (eg "{{Citation dictionary/The Mathematics of the ISO 8601 Calendar}}"), so we don't have citation templates littered around everywhere, most probably with no categories to link them. ProcrastinatingReader (talk) 17:35, 23 December 2020 (UTC)
Just as a note, we tried that with {{cite doi}}, which was eventually determined to be not a great idea (i.e. creating thousands of subpages with one-value citations). Primefac (talk) 17:47, 24 December 2020 (UTC)

Mikeblas, using DiscussionTools will help. It adds a “reply” button next to messages. Go to Special:MyPage/common.js and add:

// Load the DiscussionTools reply tool only on pages that have an "Add topic" tab (i.e. talk-style pages)
if ( $( '#ca-addsection' ).length ) mw.loader.using( 'ext.discussionTools.init' );

ProcrastinatingReader (talk) 20:31, 24 December 2020 (UTC)

Thanks, I'll see if I can give that a try. Meanwhile, why was my input excluded from the Monkbot discussion above? Are only certain people allowed to give input about these topics? -- Mikeblas (talk) 20:48, 24 December 2020 (UTC)
Your input was not excluded, but it started a discussion that is now longer than everyone else's opinions listed in the main section, which is why I split it off into its own section. Primefac (talk) 21:16, 24 December 2020 (UTC)
Thanks for the clarification. It seems like there's no consensus around this change, and the way the change is implemented is causing damage. How do we stop this bot from continuing this task? -- Mikeblas (talk) 20:04, 26 December 2020 (UTC)

Further discussion on potential bot disruption

Note that MonkBot is damaging references that use any transclusion, not just templates. I fixed these this morning:
How do we make the robot stop making breaking changes? -- Mikeblas (talk) 20:04, 26 December 2020 (UTC)
As before, it’d help if you also linked the damaged transclusion. But generally yes, the namespace of the transclusion (whether it be template, article or Wikipedia) is not relevant. ProcrastinatingReader (talk) 21:46, 26 December 2020 (UTC)
Sorry, I didn't see a previous request for links to the transclusions. Here is a wider enumeration:

The bot continuously makes these errors since it doesn't check its own work. If more examples are needed, please let me know -- they are easier to find than they are to write up. -- Mikeblas (talk) 16:52, 27 December 2020 (UTC)

I randomly clicked on a few of those (e.g. [2], [3]), and I've yet to find anything broken. So what exactly is broken, and what exactly caused the break, if anything? Headbomb {t · c · p · b} 17:58, 27 December 2020 (UTC)
Also, really, if there's a bug, WP:BOTISSUE should be followed, and the bot operator contacted about the bug, rather than coming to BOTN first. Headbomb {t · c · p · b} 18:30, 27 December 2020 (UTC)
The author has been contacted at least twice. See my talk page and theirs. -- Mikeblas (talk) 18:32, 27 December 2020 (UTC)
I've already fixed most of those in this set. Usually, the problem is a red error message in the references section of the article that says Cite error: The named reference "licenca" was defined multiple times with different content (see the help page). The message indicates that Wikipedia, when rendering the page, has found two <ref name="licenca"> tags in the total source of the article (including transclusions and template expansions, and ...). Those tags surround references that have different definitions; "different" means any binary change in the string, so space and casing are both sensitive.
This example is from 2016–17 Slovenian PrvaLiga at this moment -- of course, if someone edits anything, it can change or go away. The article itself directly defines <ref name="licenca">. But it also includes {{2016–17 Slovenian PrvaLiga table}}, which had an exact duplicate of that reference definition ... until Monkbot changed it. This bot makes changes to articles (including templates) and does not preview or revisit the changed article to see if new errors have appeared. It also doesn't scan the text of the article to test for problems like duplicate reference definitions.
Sure, we shouldn't have references of the same name in the wiki code source stream for an article. Some go as far as to say we shouldn't have named references in templates at all. But in reality, the corpus is full of such constructs and we must assume that any article contains any number of errors (or warnings or regrettable constructs or ... severe, or not) before we edit them. These duplicates are normally handled fine -- as long as the definition text is identical.
Then, concisely: the problem is that Monkbot makes edits without testing for the new rendering errors it might be causing. In this case, it's a common enough problem that I think the bot should be stopped from doing this task. Maybe it can come back if it can be shown that it has been more carefully coded and doesn't so frequently create problems. -- Mikeblas (talk) 18:32, 27 December 2020 (UTC)
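For what it's worth, the "preview and look for new errors" check can be sketched against the MediaWiki API. This is an illustration, not Monkbot's code: error detection here is a simple string match on the rendered HTML, and for an edit to a template it is the transcluding article, not the template itself, that would need re-parsing.

import requests

API = 'https://en.wikipedia.org/w/api.php'

def cite_error_count(wikitext, title):
    # Render the wikitext in the context of `title` and count Cite extension errors.
    resp = requests.post(API, data={
        'action': 'parse', 'format': 'json', 'title': title,
        'text': wikitext, 'contentmodel': 'wikitext', 'prop': 'text',
    }).json()
    return resp['parse']['text']['*'].count('mw-ext-cite-error')

def edit_is_safe(old_text, new_text, title):
    # Reject the edit if it would introduce additional cite errors.
    return cite_error_count(new_text, title) <= cite_error_count(old_text, title)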
Or, the bot's logic could simply be tweaked to edit both the transcluded template and the article one after the other. Headbomb {t · c · p · b} 18:37, 27 December 2020 (UTC)
It might help if Monkbot ran through Category:Pages with duplicate reference names and applied Task 18 periodically (once per day for articles added to the category?) in order to catch these transclusion-related problems. All of the articles listed above, and a half-dozen more from the category that I picked at semi-random, were easily fixed by hyphenating parameters in the articles. Trappist the monk, would that be possible? – Jonesey95 (talk) 00:10, 28 December 2020 (UTC)
Did that yesterday and again this morning. I don't see that it made an appreciable difference. Did it?
Trappist the monk (talk) 12:52, 28 December 2020 (UTC)
Worth noting that, per the examples Mikeblas gives above, they're all a particular category of articles (eg lots of "YYYY Calgary municipal election", lots of "Leap year starting on ..."). Since the issue is redefined references, I imagine it's a case of certain editors doing this over articles they worked on. So the issue probably crops up in sets of related articles they touched.
That said, just out of curiosity, why does MonkBot end up editing similar sets of articles all within days, often minutes, of each other? See the edit links Mikeblas gives above. Surely randomness working properly would make this and this (+ half a dozen more in series) basically impossible? ProcrastinatingReader (talk) 14:04, 28 December 2020 (UTC)
Why is it possible to toss a coin ten times and get heads ten times? Any list [randomizer] working properly can return a randomized list that is exactly the same as the source list. I recall starting task 18a the first time without having randomized the list. When template names begin with digits, they alpha-sort to the head of the list so these examples may have been part of that group before I recognized my error and scrambled the list of remaining templates.
Trappist the monk (talk) 14:37, 28 December 2020 (UTC)
Sure, but when you flip a coin 10 times, get heads each time, and then do it again 5 times and reproduce the same results, one should ask whether they've ended up with a biased coin. Plus, since the randomised list is (I presume) a subset of all articles in each run iteration, I'm not sure they should even so often end up on the same processing list? Just making sure the randomisation isn't broken. ProcrastinatingReader (talk) 14:44, 28 December 2020 (UTC)
I was not thinking of a double-headed coin. The scrambler is at Module:Sandbox/trappist the monk/random sort. The scrambled list of articles that the bot is currently working on is at Module talk:Sandbox/trappist the monk/random sort. If you edit that talk page you can see the alpha-sorted source-list. The source list is (of course) a subset of all articles but is also the result of some number of cirrus searches that, for awb, will return at most 5000 article names per search. The current list was a few different searches plus the content of Category:CS1 errors: empty unknown parameters‎ filtered to remove duplicates and anything not in article namespace. I don't know how cirrus search decides which articles it will return from the many articles that it finds.
I think that the scrambler code is sufficiently documented that you should be able to see how it works. If you can see a way to improve it, let me know.
Trappist the monk (talk) 15:15, 28 December 2020 (UTC)
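For comparison only: the standard unbiased Fisher-Yates shuffle that list scramblers of this kind are normally built on, shown here in Python; the actual module is Lua and is not reproduced here.

import random

def fisher_yates(items):
    shuffled = list(items)
    for i in range(len(shuffled) - 1, 0, -1):
        j = random.randint(0, i)  # j is uniform over 0..i inclusive
        shuffled[i], shuffled[j] = shuffled[j], shuffled[i]
    return shuffled

print(fisher_yates(['A', 'B', 'C', 'D', 'E']))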
Re "an appreciable difference", the article count in the duplicate reference name category was about 850 fifteen hours ago and is 770 now, so I think that it did make a difference. – Jonesey95 (talk) 14:20, 28 December 2020 (UTC)
I've made between 20 and 25 manual fixes. That counts fix actions; it doesn't count how the fixes might fan out to affect the number of articles that fall out of the dupe names error category. It might be difficult to isolate causes and effects, then ... and would be best if the bot itself was checking its own work. -- Mikeblas (talk) 04:26, 30 December 2020 (UTC)
It could -- I'm sure there are several ways to "fix" the issue. I think your proposal has to involve traversing all the templates and transclusions in the article, recursively, to find different definitions that might have been produced. Some articles have simple structures; maybe not much to traverse. Others (like the tropical storms and hurricanes articles; or many of the soccer/football articles; or ...) have very involved structures. This doesn't seem exactly trivial (parameters, Lua templates/modules, redirects, ...) and so I think it's best to stop the bot until it's fixed. That's driven by the belief that there's a priority on not regressing content, particularly for low-value cosmetic changes. -- Mikeblas (talk) 04:26, 30 December 2020 (UTC)
If the bot can run through the error category once a day, that should take care of many of the problems that creep in as it runs. Another option would be for the bot to run on groups of articles in alphabetical order or within categories instead of its current quasi-random selection, but that would probably bother editors with watchlists containing groups of articles. I think if I were an editor with groups of articles on my watchlist, I would rather get slammed on a single day and get it over than deal with a steady drip, but to each their own. – Jonesey95 (talk) 05:22, 30 December 2020 (UTC)
If this issue relates to transcluded templates that have cite templates, then could the bot create an article list of each template needing to be edited, plus the articles transcluding the templates, then process those? That way disruption due to this issue should be even less. Rjwilmsi 15:26, 30 December 2020 (UTC)
It is mostly that, and the bot should probably work in that way, as I suggested above: If the bot is going to edit templates that contain named references, it probably needs to immediately perform the same task on articles that transclude that template. The trickier situation is articles that transclude all or part of another article; those may require working through categories instead of following the bot's current quasi-random approach. – Jonesey95 (talk) 16:12, 30 December 2020 (UTC)

Please share your experience with bots on Wikipedia!

Hi there!

We are researchers interested in understanding how bots are created and managed on Wikipedia. We admire the culture of collaboration that you have built here, and are interested in it as a model that follows the ideals of the “old” Internet and collective governance structures. So we would love to talk to editors, BAG group members, bot operators and enthusiasts and hear about your experience. Please respond to us here, leave us a message in our talk pages or email us if you are interested in sharing your experience! We’ll reach out and set up a 30-45 minute interview over the platform of your choice.

Stay safe and happy holidays!

Bei Yan (talk page) Assistant Professor, Stevens Institute of Technology

Virginia Leavell (talk page) PhD Candidate, University of California, Santa Barbara — Preceding unsigned comment added by Momobay (talkcontribs) 22:58, 17 December 2020 (UTC)

Just letting everyone know this is a legit research group; they interviewed me on September 21st. They're very nice people, and I would encourage anyone who reads this to bring their own viewpoints and perspective to the table. Headbomb {t · c · p · b} 00:22, 18 December 2020 (UTC)
@MBisanz, Xaosflux, Anomie, Maxim, MaxSem, and SQL: pinging you specifically because you're all dinosaurs with a lot more knowledge about pre-bot and early-bot policy days than me. My knowledge of bots starts around July 2008, and I hadn't paid much attention to the events surrounding bots in those days. Headbomb {t · c · p · b} 20:16, 18 December 2020 (UTC)
Nice initiative. -- Magioladitis (talk) 21:30, 31 December 2020 (UTC)

Running bot from AWS[edit]

The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

Resolved – ip-exempt needed to use the API from AWS. — xaosflux Talk 15:03, 19 January 2021 (UTC)

I'm working on running Rick Bot from AWS rather than my own computer - with the eventual goal of running a generalized meta-bot capable of fetching the source for a list of bots from Wikipedia and running them (addressing the problem of bot disappearance). I'm using AWS serverless infrastructure for this, which means the bot runs as a Lambda function within AWS. I'm at the point where the bot is trying to edit a page, and it turns out the entire AWS IP address range is blocked as an open proxy. I expect I can control the apparent external IP address coming from AWS, but before getting too deep into this I thought it might be worth asking whether anyone else has run into this and found a solution. Thanks! -- Rick Block (talk) 01:07, 9 January 2021 (UTC)

This happens whenever you host a bot on a hosting provider; it happened to ProcBot as well. The IPs are blocked globally and locally. The bot flag grants local "ipblock-exempt", so if you're using your bot account this shouldn't happen. ProcrastinatingReader (talk) 01:19, 9 January 2021 (UTC)
(edit conflict) Aren't bot accounts automatically exempt from IP blocks? * Pppery * it has begun... 01:19, 9 January 2021 (UTC)
@Rick Block: are you trying to edit here on enwiki, or another project? The bot account should be able to bypass ip restriction. Are you using WebUI or API? For API are you using OAUTH or BotPassword? — xaosflux Talk 11:04, 9 January 2021 (UTC)
I was trying it out with a user account on enwiki. Sounds like it should work with the bot account. I'm using pywikibot. Thanks everyone! -- Rick Block (talk) 01:41, 10 January 2021 (UTC)
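For anyone hitting the same range block, the key point is that the edit must be made while logged in as the bot account (whose bot flag bundles ipblock-exempt). A minimal pywikibot sketch, purely illustrative (the target page and account configuration are placeholders; it assumes user-config.py points at the bot account with a BotPasswords or OAuth credential):

```python
# Illustrative sketch only, not Rick Bot's actual code. Assumes pywikibot's
# user-config.py is set up to log in as the bot account (BotPasswords/OAuth);
# editing as a plain user account from AWS will hit the open-proxy range block.
import pywikibot

site = pywikibot.Site("en", "wikipedia")
site.login()  # authenticate as the bot account, not a personal account

page = pywikibot.Page(site, "Wikipedia:Sandbox")  # placeholder target page
page.text += "\nTest edit from AWS."
page.save(summary="Test edit (illustrative example)")
```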
@Rick Block: if you need to do some temporary testing, you may grant an account you control IP-block exemption for the test (set an expiration on the grant). — xaosflux Talk 14:45, 12 January 2021 (UTC)

The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

Should Cewbot remove interlanguage link templates once local articles exist?[edit]

There is currently no consensus that {{ill}} should be preserved in linking articles after the appropriate English Wikipedia article has been created, so there is nothing for BAG or anyone else to do here. In fact, if anything consensus currently seems to exist in the opposite direction, that {{ill}} should be removed. If you want to try to force the bot to preserve {{ill}} in some form even after the target article exists, you'll first need to seek community consensus that it should leave what some will see as "clutter" in the source. Your best bet for that is probably WP:Village pump (policy) or WP:Village pump (proposals). Anomie 13:29, 19 January 2021 (UTC)
The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

This task ([[4]]) destroys the information about which other wikis have relevant articles about the subject. For instance, if the article is created, and then one week later is deleted again, the {{ill}} link is gone and the work of the editor who originally placed it there has been binned.

Over at User talk:Kanashimi/Archive 1#Task 1 Convert interlanguage link templates with local article to wikilinks I was told this place would be a better venue for this discussion. Please note in particular the suggestion by davidwr: "Have the bot add the recently-enabled |display=force to {{Interlanguage link}} if the English page exists. You eliminate the expensive parser function call, you don't have the 'what happens if the en-wiki article disappears' problem."

As far as I can see, neither this specific proposal nor the greater issue got a proper resolution. Cheers, CapnZapp (talk) 14:20, 12 January 2021 (UTC)

Extended content
I have opened a related/followup discussion at Template talk:Interlanguage link#Suppressing Crewbot conversion offering other alternatives. I have put notifications on the relevant WikiProject talk pages. I figure after a week we will have a sense of what action is desired by those who use the template. davidwr/(talk)/(contribs) 14:38, 12 January 2021 (UTC)
Turns out this isn't related. CapnZapp (talk) 14:50, 13 January 2021 (UTC)

I have created a talk section over at the {{ill}} talk page if y'all would rather discuss there than here: Template talk:Interlanguage link#Task 1 Convert interlanguage link templates with local article to wikilinks (again). Best, CapnZapp (talk) 14:56, 13 January 2021 (UTC)

I don't see why the bot should eliminate any ill links at all. As someone who frequently edits articles with interlanguage links and adds them where needed, I'm opposed to anything that reduces the links between sister projects (which is a stated goal of Wikipedia). In addition, regarding the case described at the ill talk page of an ill link to a page later deleted, where the foreign links had already been removed by cewbot: such decisions should not be made by bot at all, but by people, who can better evaluate the likelihood of such an article being deleted. Failing that, if consensus supports bot removal, then I would urge an automatic delay of some interval to be discussed, perhaps a year, to make the "deleted target" case less likely to occur, but I think that would be second best. The template already hides foreign links when the article on en-wiki is created; this should be sufficient. The rest should be left to the editors concerned, who can consider what's best for the article and its future and discuss amongst themselves to find consensus, and not to a bot. Mathglot (talk) 18:05, 13 January 2021 (UTC)
I too regularly edit articles with {{ill}} links and create them often. They can sometimes be elaborate in order to deal with foreign scripts or naming conventions (Hungarian), so the resulting Wikitext often looks frightening. Therefore, removing those constructs once local articles exist is a welcome cleanup task. Can anyone provide an example of an {{ill}}-linked article where the link was converted to a local link and the article was then deleted? -- Michael Bednarek (talk) 01:21, 14 January 2021 (UTC)
Mathglot, as was mentioned (and directly asked for at least once) in the archived discussion linked by the OP, and to follow along with Michael's comment: your line of thinking has not been substantiated, and as far as I can tell is little more than FUD and/or concern about hypothetical "what ifs". The bot already has a delay built in (a week if I remember correctly) and I see no reason to extend that or cease its operation. Primefac (talk) 01:49, 14 January 2021 (UTC)
In addition, {{ill|Alexandra-Therese Keining|sv}} displays as Alexandra-Therese Keining with a link to the currently existing enwiki article and that does not help "links between sister projects". That single template adds 2 to the expensive parser function count—that quickly adds up and makes editing the page slower. An example of Cewbot removing ill is diff. Perhaps the bot's edit summary could show the wikitext that is removed so searching for "{{ill" in history would find it. Please do not pad out the wikitext with a hidden comment. Johnuniq (talk) 02:35, 14 January 2021 (UTC)

The question here is: why remove the template completely? It was recently given the ability to skip the computationally intensive part, with |display=force. So far I see these arguments:

A) "the resulting Wikitext often looks frightening": I sincerely do not see this as a good argument. One template isn't worse than another, and editing a page with loads of complex references, finding the actual display text among all the citation data, can certainly be just as hard if not harder. Basically, let us assume WP:COMPETENCE. I do acknowledge there might well be exceptional cases where massive amounts of {{ill}} templates turn the page into alphabet soup, but that's no good excuse for wiki-wide bot intervention. Such pages should be fixed manually, or at the very least by a bot doing reversible work, or one where all human-added data are retained.

B) "the bot delays one week": one week is obviously not enough. Even the quick PROD procedure takes one week, and an AfD definitely takes longer; waiting one week is enough only for speedy-delete cases. But this is really irrelevant: if the bot stopped using its current scorched-earth approach, there would be no reason to delay at all. See the proposal at the end.

C) "removing those constructs once local articles exist is a welcome cleanup task": no, it isn't. A template is not a "construct"; it's a template, and Wikipedia uses loads of templates that aren't removed just to clean up the page. Please do not claim consensus for this conclusion when participating in a discussion directly contesting such a consensus! I started the original talk discussion (link at top of section) because I see it as problematic that we have a bot that undoes the contributions of humans.

D) "as far as I can tell is little more than FUD and/or concern about hypothetical 'what ifs'": stop belittling and dismissing other users, Primefac. Language of this sort is inflammatory and not taken in good faith. This is not the first time I have asked you to remain polite, and I will ignore it except to say that you are merely trying to shift the burden of proof. Since the bot has gotten new capabilities, YOU need to argue why the bot should keep destroying the work of human editors even though it can easily be modified not to, while still avoiding any computational load.

Let us change the bot's behavior to use this new |display= parameter. Let us furthermore change the bot to change this parameter back if and when it detects the link has gone red again. This way, humans are freed from needless busywork, and we have one less bot that actively undoes human contributions. CapnZapp (talk) 10:18, 14 January 2021 (UTC)

  • Short of an RFC deciding that interwikilinks are preferable to enwiki links, which I do not see passing for one second, Cewbot is functioning as intended and is supported by consensus. This is the English Wikipedia, and we link to our own articles in the mainspace. {{ill}} is specifically for pages with no current enwiki pages. Once those pages exist, {{ill}} serves no purpose and should be removed. Headbomb {t · c · p · b} 20:30, 17 January 2021 (UTC)
Thank you for your opinion. A few remarks, though: First, nobody is arguing to make "interwikilinks preferable to enwiki links". Second, "Cewbot is functioning as intended" is irrelevant for this discussion; that's a truism that doesn't support either viewpoint, meaning that if we do arrive at a consensus to save {{ill}} templates then of course the bot is no longer working as intended, and so it will be tweaked. Finally, if I may: you aren't directly meeting the argument made here, but can I assume that, in your opinion, 1) {{ill}} templates aren't worth saving even after their computational load is eliminated, and 2) the work of editors adding {{ill}} templates isn't worth saving, User:Headbomb? Thanks for any clarification you can provide. CapnZapp (talk) 22:04, 17 January 2021 (UTC)
"The work of editors adding {{ill}} templates isn't work saving." It is saved, in the form of a link to an enwiki article with interwikilinks to other languages available in their standard location. When there's an enwiki article, {{ill}} no longer serves any useful purpose endorsed by the community. And unless you can point to an RFC where the community has decided that {{ill}} should be preserved after an enwiki article exists, you will not find much support to alter Cewbot's behaviour. Headbomb {t · c · p · b} 00:28, 18 January 2021 (UTC)
@Headbomb: If I hear you correctly, you are saying that we are "putting the cart before the horse" and should run a full RFC on the question of whether keeping ill templates around, or preserving the ability to automatically restore them if the en-wiki page is later deleted, is desirable. Am I hearing you correctly? davidwr/(talk)/(contribs) 00:38, 18 January 2021 (UTC)
I assume no one wants to convert every plain [[Example]] link to use {{ill}}. Therefore Cewbot is working well, and the only discussion to have is whether one week is sufficient delay. This noticeboard is not the place where that should be decided. Johnuniq (talk) 02:06, 18 January 2021 (UTC)
Just to note - User:Headbomb appears to ignore the entire point of this discussion: saying "It is saved, in the form of a link to an enwiki article with interwikilinks to other languages available in their standard location" means they haven't read the actual complaint. Regards, CapnZapp (talk) 07:48, 19 January 2021 (UTC)
User:Johnuniq I don't follow your line of reasoning. Instead of me trying to interpret what appears to be quite absurd, could I ask you to explain? CapnZapp (talk) 07:53, 19 January 2021 (UTC)
And finally, about the "this isn't the place" comment. I would like to make everybody aware that in my attempts to raise awareness of this issue, I was redirected here by User:Primefac - twice. So please User:Johnuniq: if you want to argue this isn't the place to have this discussion, do suggest which place is better. CapnZapp (talk) 07:59, 19 January 2021 (UTC)

The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

Edit confirmed protection[edit]

The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

Resolved – Operator adjusted the bot grant. — xaosflux Talk 22:16, 17 January 2021 (UTC)

The FACBot reported an error when it tried to update Belarus, an article with WP:ECP. The FACBot account is more than 30 days old and has more than 500 edits, so I presume that ECP locks out bots, but I cannot find any documentation to this effect. Does anyone know anything about this? Hawkeye7 (discuss) 02:51, 17 January 2021 (UTC)

Wikipedia:User access levels#Extendedconfirmed states "This access is included and bundled in the bot and sysop (administrator) user groups", and FACBot is indeed a bot, so this is surprising. Are you positive the error was related to the EC protection? — The Earwig talk 05:28, 17 January 2021 (UTC)
Yeah, per Special:ListGroupRights the "bot" flag has "extendedconfirmed" bundled into it. ProcrastinatingReader (talk) 11:42, 17 January 2021 (UTC)
Definitely would help us debug if we knew the specific error, unless it was just a generic "could not edit page" (in which case we'll just have to guess). Primefac (talk) 12:22, 17 January 2021 (UTC)
The Bot reported this error: unable to edit 'Belarus' (3) : protectedpage: This page has been protected to prevent editing or other actions. I confirmed that the protection was ECP, and was able to edit it myself to make the change the Bot was attempting. Hawkeye7 (discuss) 19:41, 17 January 2021 (UTC)
@Hawkeye7, The Earwig, ProcrastinatingReader, and Primefac: this is likely because the bot operator is using the API and has not included the 'edit protected pages' grant for the bot - I was able to duplicate this with a BotPasswords-style grant and an API edit. — xaosflux Talk 14:57, 17 January 2021 (UTC)
@Xaosflux: How is the 'edit protected pages' grant included? Hawkeye7 (discuss) 19:41, 17 January 2021 (UTC)
@Hawkeye7: the operator picks what they want to be allowed in their grant, if using BotPasswords it is in Special:BotPasswords and if using OAUTH it is on meta:Special:OAuthManageMyGrants - log in to the webui as the bot and then view/update the grants as needed. — xaosflux Talk 21:24, 17 January 2021 (UTC)
Note to perform an action it must be both allowed in the grant and the account must have the actual permission to perform the action. — xaosflux Talk 21:26, 17 January 2021 (UTC)
Thank you! I have updated the "edit protected pages" grant for the bot, and confirmed that the account does have the actual permission. I guess when I set the Bot up I was not thinking of ECP. Hawkeye7 (discuss) 21:44, 17 January 2021 (UTC)
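One way to see this kind of mismatch before it bites is to ask the API what the logged-in session can actually do. A sketch (assuming pywikibot, and assuming the rights reported for the session reflect the BotPasswords/OAuth grant restrictions, which is how the missing grant showed up here):

```python
# Illustrative check only. Run while logged in as the bot; if the grant omits
# "Edit protected pages", rights such as 'editprotected' should not be listed
# for the session even though the bot account itself holds them.
import pywikibot

site = pywikibot.Site("en", "wikipedia")
site.login()
for right in ("edit", "editprotected", "editsemiprotected", "extendedconfirmed"):
    print(right, site.has_right(right))
```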

The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

Cyberbot II reported at AIV[edit]

Moved from WP:AIV – ~ ToBeFree (talk) 19:15, 17 January 2021 (UTC)

Cyberbot II (talk • contribs • deleted contribs • nuke contribs • logs • filter log • block user • block log) – Not vandalism but malfunctioning [5][6]. The operator was notified days ago but is inactive. (CC) Tbhotch 19:06, 17 January 2021 (UTC)

Looks like the typical database lag error. The bot isn't malfunctioning, but there should be a built-in protection against page-blanking. Headbomb {t · c · p · b} 19:19, 17 January 2021 (UTC)
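That kind of built-in protection is typically just a sanity check on the size of the change before saving; a sketch of the idea (not Cyberbot II's actual code; the 50% threshold is an arbitrary illustration):

```python
# Illustrative guard, not Cyberbot II's actual code; the threshold is arbitrary.
def safe_to_save(old_text: str, new_text: str) -> bool:
    """Refuse edits that blank the page or shrink it drastically, which on a
    task like this usually means the bot worked from lagged/stale data."""
    if not new_text.strip():
        return False  # never save a blank page
    if len(new_text) < 0.5 * len(old_text):
        return False  # suspiciously large removal; skip and log instead
    return True
```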
  • I've gone ahead and blocked the account. Whatever the issue is, it definitely shouldn't be blanking pages. Per User:Cyberbot II/Run/PC, it seems like the task shouldn't be running at all? Mz7 (talk) 19:20, 17 January 2021 (UTC)
  • (non-admin) I have changed the status of the bot on User:Cyberbot II/Status to say "disable" - to reflect the bot's block by Mz7. P,TO 19104 (talk) (contribs) 23:57, 17 January 2021 (UTC)