Wikipedia:Village pump (policy)

Latest comment: 1 hour ago by Ktin in topic LLM/chatbot comments in discussions
 Policy Technical Proposals Idea lab WMF Miscellaneous 
The policy section of the village pump is used to discuss already-proposed policies and guidelines and to discuss changes to existing policies and guidelines. Change discussions often start on other pages and then move or get mentioned here for more visibility and broader participation.
  • If you want to propose something new that is not a policy or guideline, use Village pump (proposals). For drafting with a more focused group, you can also start on the talk page for a WikiProject, Manual of Style, or other relevant project page.
  • If you have a question about how to apply an existing policy or guideline, try one of the many Wikipedia:Noticeboards.
  • If you want to ask what the policy is on something, try the Help desk or the Teahouse.
  • This is not the place to resolve disputes over how a policy should be implemented. Please see Wikipedia:Dispute resolution for how to proceed in such cases.
  • If you want to propose a new or amended speedy deletion criterion, use Wikipedia talk:Criteria for speedy deletion.

Please see this FAQ page for a list of frequently rejected or ignored proposals. Discussions are automatically archived after remaining inactive for two weeks.


LLM/chatbot comments in discussions

edit

Should admins or other users evaluating consensus in a discussion discount, ignore, or strike through or collapse comments found to have been generated by AI/LLM/Chatbots? 00:12, 2 December 2024 (UTC)

I've recently come across several users in AFD discussions that are using LLMs to generate their remarks there. As many of you are aware, gptzero and other such tools are very good at detecting this. I don't feel like any of us signed up for participating in discussions where some of the users are not using their own words but rather letting technology do it for them. Discussions are supposed to be between human editors. If you can't make a coherent argument on your own, you are not competent to be participating in the discussion. I would therefore propose that LLM-generated remarks in discussions should be discounted or ignored, and possibly removed in some manner. Just Step Sideways from this world ..... today 00:12, 2 December 2024 (UTC)Reply

  • Seems reasonable, as long as the GPTZero (or any tool) score is taken with a grain of salt. GPTZero can be as wrong as AI can be. ~ ToBeFree (talk) 00:32, 2 December 2024 (UTC)Reply
  • Only if the false positive and false negative rate of the tool you are using to detect LLM content is very close to zero. LLM detectors tend to be very unreliable on, among other things, text written by non-native speakers. Unless the tool is near perfect then it's just dismissing arguments based on who wrote them rather than their content, which is not what we do or should be doing around here. Thryduulf (talk) 00:55, 2 December 2024 (UTC)Reply
    In the cases I have seen thusfar it's been pretty obvious, the tools have just confirmed it. Just Step Sideways from this world ..... today 04:08, 2 December 2024 (UTC)Reply
    The more I read the comments from other editors on this, the more I'm a convinced that implementing either this policy or something like it will bring very significant downsides on multiple fronts that significantly outweigh the small benefits this would (unreliably) bring, benefits that would be achieved by simply reminding closers to disregard comments that are unintelligible, meaningless and/or irrelevant regardless of whether they are LLM-generated or not. For the sake of the project I must withdraw my previous very qualified support and instead very strongly oppose. Thryduulf (talk) 02:45, 3 December 2024 (UTC)Reply
  • I think it should be an expressly legitimate factor in considering whether to discount or ignore comments either if it's clear enough by the text or if the user clearly has a history of using LLMs. We wouldn't treat a comment an editor didn't actually write as an honest articulation of their views in lieu of site policy in any other situation. Remsense ‥  00:59, 2 December 2024 (UTC)Reply
  • I would have already expected admins to exercise discretion in this regard, as text written by an LLM is not text written by a person. We cannot guarantee it is what the person actually means, especially as it is a tool often used by those with less English proficiency, which means perhaps they cannot evaluate the text themselves. However, I do not think we can make policy about a specific LLM or tool. The LLM space is moving fast, en.wiki policies do not. Removal seems tricky, I would prefer admins exercise discretion instead, as they do with potentially canvassed or socked !votes. CMD (talk) 01:06, 2 December 2024 (UTC)Reply
  • Support discounting or collapsing AI-generated comments, under slightly looser conditions than those for human comments. Not every apparently-AI-generated comment is useless hallucinated nonsense – beyond false positives, it's also possible for someone to use an AI to help them word a constructive comment, and make sure that it matches their intentions before they publish it. But in my experience, the majority of AI-generated comments are somewhere between "pointless" and "disruptive". Admins should already discount clearly insubstantial !votes, and collapse clearly unconstructive lengthy comments; I think we should recognize that blatant chatbot responses are more likely to fall into those categories. jlwoodwa (talk) 02:11, 2 December 2024 (UTC)Reply
  • Strongly Support - I think some level of human judgement on the merits of the argument are necessary, especially as GPTZero may still have a high FPR. Still, if the discussion is BLUDGEONy, or if it quacks like an AI-duck, looks like an AI-duck, etc, we should consider striking out such content.
    - sidenote, I'd also be in favor of sanctions against users who overuse AI to write out their arguments/articles/etc. and waste folks time on here.. Bluethricecreamman (talk) 02:20, 2 December 2024 (UTC)Reply
  • On a wording note, I think any guidance should avoid referring to any specific technology. I suggest saying "... to have been generated by a program". isaacl (talk) 02:54, 2 December 2024 (UTC)Reply
    "generated by a program" is too broad, as that would include things like speech-to-text. Thryduulf (talk) 03:08, 2 December 2024 (UTC)Reply
    Besides what Thryduulf said, I think we should engage with editors who use translators. Aaron Liu (talk) 03:45, 2 December 2024 (UTC)Reply
    A translation program, whether it is between languages or from speech, is not generating a comment, but converting it from one format to another. A full policy statement can be more explicit in defining "generation". The point is that the underlying tech doesn't matter; it's that the comment didn't feature original thought from a human. isaacl (talk) 03:57, 2 December 2024 (UTC)Reply
    Taking Google Translate as an example, most of the basic stuff uses "AI" in the sense of machine learning (example) but they absolutely use LLMs nowadays, even for the basic free product. Gnomingstuff (talk) 08:39, 2 December 2024 (UTC)Reply
  • Support. We already use discretion in collapsing etc. comments by SPAs and suspected socks, it makes sense to use the same discretion for comments suspected of being generated by a non-human. JoelleJay (talk) 03:07, 2 December 2024 (UTC)Reply
  • Support - Someone posting "here's what ChatGPT has to say on the subject" can waste a lot of other editors' time if they feel obligated to explain why ChatGPT is wrong again. I'm not sure how to detect AI-written text but we should take a stance that it isn't sanctioned. Clayoquot (talk | contribs) 04:37, 2 December 2024 (UTC)Reply
  • Strong Support - I've never supported using generative AI in civil discourse. Using AI to participate in these discussions is pure laziness, as it is substituting genuine engagement and critical thought with a robot prone to outputting complete garbage. In my opinion, if you are too lazy to engage in the discussion yourself, why should we engage with you? Lazman321 (talk) 05:26, 2 December 2024 (UTC)Reply
  • Comment - I'm skeptical that a rule like this will be enforceable for much longer. Sean.hoyland (talk) 05:39, 2 December 2024 (UTC)Reply
    Why? Aaron Liu (talk) 12:22, 2 December 2024 (UTC)Reply
    Because it's based on a potentially false premise that it will be possible to reliably distinguish between text generated by human biological neural networks and text generated by non-biological neural networks by observing the text. It is already quite difficult in many cases, and the difficulty is increasing very rapidly. I have your basic primate brain. The AI companies building foundation models have billions of dollars, tens of thousands, soon to be hundreds of thousands of GPUs, a financial incentive to crack this problem and scaling laws on their side. So, I have very low credence in the notion that I will be able to tell whether content is generated by a person or a person+LLM or an AI agent very soon. On the plus side, it will probably still be easy to spot people making non-policy based arguments regardless of how they do it. Sean.hoyland (talk) 13:52, 2 December 2024 (UTC)Reply
    ...and now that the systems are autonomously injecting their output back into model via chain-of-thought prompting, or a kind of inner monologue if you like, to respond to questions, they are becoming a little bit more like us. Sean.hoyland (talk) 14:14, 2 December 2024 (UTC)Reply
    A transformer (deep learning architecture) is intrinsically nothing like a human. It's a bunch of algebra that can compute what a decently sensible person could write in a given situation based on its training data, but it is utterly incapable of anything that could be considered thought or reasoning. This is why LLMs tend to fail spectacularly when asked to do math or write non-trivial code. Flounder fillet (talk) 17:20, 2 December 2024 (UTC)Reply
    We shall see. You might want to update yourself on their ability to do math and write non-trivial code. Things are changing very quickly. Either way, it is not currently possible to say much about what LLMs are actually doing because mechanistic interpretability is in its infancy. Sean.hoyland (talk) 03:44, 3 December 2024 (UTC)Reply
    You might be interested in Anthropic's 'Mapping the Mind of a Large Language Model' and Chris Olah's work in general. Sean.hoyland (talk) 04:02, 3 December 2024 (UTC)Reply
  • Support and I would add "or similar technologies" to "AI/LLM/Chatbots". As for Sean.hoyland's comment, we will cross that bridge when we get to it. Cullen328 (talk) 05:51, 2 December 2024 (UTC)Reply
    ...assuming we can see the bridge and haven't already crossed it. Sean.hoyland (talk) 06:24, 2 December 2024 (UTC)Reply
  • Support - All editors should convey their thoughts in their own words. AI generated responses and comments are disruptive because they are pointless and not meaningful. - Ratnahastin (talk) 06:04, 2 December 2024 (UTC)Reply
  • Support, I already more or less do this. An LLM generated comment may or may not actually reflect the actual thoughts of the editor who posted it, so it's essentially worthless toward a determination of consensus. Since I wrote this comment myself, you know that it reflects my thoughts, not those of a bot that I may or may not have reviewed prior to copying and pasting. Seraphimblade Talk to me 06:59, 2 December 2024 (UTC)Reply
  • Strong oppose. Let me say first that I do not like ChatGPT. I think it has been a net negative for the world, and it is by nature a net negative for the physical environment. It is absolutely a net negative for the encyclopedia if LLM-generated text is used in articles in any capacity. However, hallucinations are less of an issue on talk pages because they're discussions. If ChatGPT spits out a citation of a false policy, then obviously that comment is useless. If ChatGPT spits out some boilerplate "Thanks for reviewing the article, I will review your suggestions and take them into account" talk page reply, who gives a fuck where it came from? (besides the guys in Texas getting their eardrums blown out because they live by the data center)
    The main reason I oppose, though, is because banning LLM-generated comments is difficult to enforce bordering on unenforceable. Most studies show that humans are bad at distinguishing AI-generated text from text generated without AI. Tools like GPTZero claims a 99% accuracy rate, but that seems dubious based on reporting on the matter. The news outlet Futurism (which generally has an anti-AI slant) has failed many times to replicate that statistic, and anecdotal accounts by teachers, etc. are rampant. So we can assume that we don't know how capable AI detectors are, that there will be some false positives, and that striking those false positives will result in WP:BITING people, probably newbies, younger people more accustomed to LLMs, and non-Western speakers of English (see below).
    There are also technological issues as play. It'd be easy if there was a clean line between "totally AI-generated text" and "totally human-generated text," but that line is smudged and well on its way to erased. Every tech company is shoving AI text wrangling into their products. This includes autocomplete, translation, editing apps, etc. Should we strike any comment a person used Grammarly or Google Translate for? Because those absolutely use AI now.
    And there are also, as mentioned above, cultural issues. The people using Grammarly, machine translation, or other such services are likely to not have English as their first language. And a lot of the supposed "tells" of AI-generated content originate in the formal English of other countries -- for instance, the whole thing where "delve" was supposedly a tell for AI-written content until people pointed out the fact that lots of Nigerian workers trained the LLM and "delve" is common Nigerian formal English.
    I didn't use ChatGPT to generate any of this comment. But I am also pretty confident that if I did, I could have slipped it in and nobody would have noticed until this sentence. Gnomingstuff (talk) 08:31, 2 December 2024 (UTC)Reply
    Just for grins, I ran your comment through GPTzero, and it comes up with a 99% probability that it was human-written (and it never struck me as looking like AI either, and I can often tell.) So, maybe it's more possible to distinguish than you think? Seraphimblade Talk to me 20:11, 2 December 2024 (UTC)Reply
    Yeah, Gnoming's writing style is far more direct and active than GPT's. Aaron Liu (talk) 23:02, 2 December 2024 (UTC)Reply
    There weren't
    • Multiple
      LLMs tend to use more than one subheading to reiterate points
    • Subheadings
      Because they write like a middle schooler that just learned how to make an essay outline before writing.
    In conclusion, they also tend to have a conclusion paragraph for the same reason they use subheadings. ScottishFinnishRadish (talk) 13:56, 3 December 2024 (UTC)Reply
  • Support - Ai-generated comments are WP:DISRUPTIVE - An editor who has an argument should not use ChatGPT to present it in an unnecessarily verbose manner, and an editor who doesn't have one should not participate in discussion. Flounder fillet (talk) 13:14, 2 December 2024 (UTC)Reply
  • Yes but why do we need this common sense RFC/policy/whatever? Just ban these people. If they even exist. Headbomb {t · c · p · b} 07:14, 2 December 2024 (UTC)Reply
    They exist, and I found myself collapsing some long, obviously chatbot-generated posts in an AFD, and after I did so wondering if policy actually supported doing that. I couldn't find anything so here we are. Just Step Sideways from this world ..... today 20:04, 2 December 2024 (UTC)Reply
  • Yes, of course, and I know that's the right answer because ChatGPT agrees with me.
What ChatGPT thinks
I started by asking:
Currently at the English Wikipedia, there's a discussion of the following proposal: "Should or other users evaluating consensus in a discussion discount, ignore, or strike through or collapse comments found to have been generated by AI/LLM/Chatbots?" I wonder if you have an opinion on that question.
My friend the bot initially proposed a requirement that any AI-generated content be labeled as such by the person posting it. After some further discussion between us, I finally said:
You say that labeling AI-generated content, instead of outright discounting it, might be an appropriate middle ground. But if someone is unable to formulate their own thoughts and put them in writing, of what use is it to the discussion for this person to simply ask you to make something up, and then post your output to the Wikipedia discussion with the proviso that "ChatGPT, who/which doesn't actually know anything about anything, said this ..."? How is it different from saying, "My grandmother, who knows nothing about what we're talking about, said ..."?
And it said:
That’s an incisive question, and it highlights the crux of the issue: the value of contributions to a discussion. Here’s why AI-generated comments might still differ (slightly) from your hypothetical grandmother’s input, but why their usefulness is often limited:
  • Synthesis vs. Subject Expertise: While I (or any AI) lack personal understanding or expertise, I can aggregate and articulate relevant ideas from a vast dataset. This means I can sometimes present arguments, summarize perspectives, or surface considerations that a non-expert human might overlook. Your grandmother, unless she's an AI-trained polymath, wouldn’t be able to provide this breadth. However, this advantage evaporates if participants lack the ability to evaluate and critically engage with my output.
  • The Problem of Proxy Voices: The act of outsourcing one’s contribution to AI undermines the deliberative process central to Wikipedia’s discussions. Participants are expected to bring their own informed perspectives, shaped by personal understanding of Wikipedia’s policies and content. A disclaimer like “ChatGPT said this” doesn’t change the fact that the human posting it has abdicated their role in the discussion.
  • Accountability and Intent: In a collaborative setting like Wikipedia, contributions are ideally linked to a person who can clarify their intent, respond to questions, and engage dynamically. An AI cannot do this. Saying "ChatGPT said this" distances the poster from the content, making it unclear whether they endorse or even understand the comment.
  • Potential for Misdirection: Allowing AI-generated contributions risks introducing noise into the discussion. AI can produce plausible-sounding arguments that may be tangential, misaligned, or even factually incorrect. Without the judgment of a well-informed poster, such contributions can derail or dilute meaningful discourse.
Conclusion: If someone cannot formulate their own argument or understanding, then their reliance on an AI—or anyone else—to contribute doesn’t meaningfully advance the discussion. Labeling AI-generated content is better than passing it off as human, but the real issue lies in whether such contributions enrich or degrade the collaborative process. In most cases, they would seem to belong more to the realm of rhetorical curiosities than substantive engagement.
  • In keeping with the proposed guideline, I have of course collapsed the above AI-generated content. (Later: It's actually worth reading in the context of this discussioin, so I've unhidden it by default.) But I must confess it's a pretty good analysis, and worth reading. EEng 07:47, 2 December 2024 (UTC)Reply
  • This is absolute gold dust and the best contribution to this discussion so far. There is an enormous irony here, one that might not be immediately obvious. The proposal is that we should ignore or even strike these type of contributions, but personally it seems like the collapsed format has worked a charm here. I really don't think that AI has much to contribute to WP discussions generally, but with the right prompt, there is certainly something worth adding to the conversation in reality. CNC (talk) 20:23, 8 December 2024 (UTC)Reply
    The proposal also includes collapsing. jlwoodwa (talk) 20:26, 8 December 2024 (UTC)Reply
    Thanks, I completely missed that. Trying to speed read is not my forte. CNC (talk) 20:32, 8 December 2024 (UTC)Reply
The "detector" website linked in the opening comment gives your chatbot's reply only an 81% chance of being AI-generated. WhatamIdoing (talk) 23:36, 2 December 2024 (UTC)Reply
That's because, just by interacting with me, ChatGPT got smarter. Seriously ... you want it to say 99% every time? (And for the record, the idea of determining the "chance" that something is AI-generated is statistical nonsense.) EEng 03:07, 3 December 2024 (UTC)Reply
What I really want is a 100% chance that it won't decide that what I've written is AI-generated. Past testing has demonstrated that at least some of the detectors are unreliable on this point. WhatamIdoing (talk) 03:28, 4 December 2024 (UTC)Reply
100% is, of course, an impossible goal. Certainly SPI doesn't achieve that, so why demand it here? EEng 22:31, 4 December 2024 (UTC)Reply
  • Strong Oppose I support the concept of removal of AI-generated content in theory. However, we do not have the means to detect such AI-generated content. The proposed platform that we may use (GPTZero) is not reliable for this purpose. In fact, our own page on GPTZero has a section citing several sources stating the problem with this platform's accuracy. It is not helpful to have a policy that is impossible to enforce. ThatIPEditor They / Them 08:46, 2 December 2024 (UTC) Reply
  • Strong Support To be honest, I am surprised that this isn't covered by an existing policy. I oppose the use of platforms like GPTZero, due to it's unreliability, but if it is obviously an ai-powered-duck (Like if it is saying shit like "as an AI language model...", take it down and sanction the editor who put it up there. ThatIPEditor They / Them 08:54, 2 December 2024 (UTC)Reply
  • Support at least for WP:DUCK-level AI-generated comments. If someone uses a LLM to translate or improve their own writing, there should be more leeway, but something that is clearly a pure ChatGPT output should be discounted. Chaotic Enby (talk · contribs) 09:17, 2 December 2024 (UTC)Reply
  • I agree for cases in which it is uncontroversial that a comment is purely AI-generated. However, I don't think there are many cases where this is obvious. The claim that gptzero and other such tools are very good at detecting this is false. Phlsph7 (talk) 09:43, 2 December 2024 (UTC)Reply
  • Support Not clear how admins are deciding that something is LLM generated, a recent example, agree with the principle tho. Selfstudier (talk) 10:02, 2 December 2024 (UTC)Reply
  • Moral support; neutral as written. Chatbot participation in consensus discussions is such an utterly pointless and disdainful abuse of process and community eyeballs that I don't feel like the verbiage presented goes far enough. Any editor may hat LLM-generated comments in consensus discussions is nearer my position. No waiting for the closer, no mere discounting, no reliance on the closer's personal skill at recognising LLM output, immediate feedback to the editor copypasting chatbot output that their behaviour is unwelcome and unacceptable. Some observations:
    I've seen editors accused of using LLMs to generate their comments probably about a dozen times, and in all but two cases – both at dramaboards – the chatbot prose was unmistakably, blindingly obvious. Editors already treat non-obvious cases as if written by a human, in alignment with the raft of only if we're sure caveats in every discussion about LLM use on the project.
    If people are using LLMs to punch up prose, correct grammar and spelling, or other superficial tasks, this is generally undetectable, unproblematic, and not the point here.
    Humans are superior to external services at detecting LLM output, and no evidence from those services should be required for anything.
    As a disclosure, evidence mounts that LLM usage in discussions elicits maximally unkind responses from me. It just feels so contemptuous, to assume that any of us care what a chatbot has to say about anything we're discussing, and that we're all too stupid to see through the misattribution because someone tacked on a sig and sometimes an introductory paragraph. And I say this as a stupid person. Folly Mox (talk) 11:20, 2 December 2024 (UTC)Reply
    Looks like a rewrite is indicated to distinguish between machine translation and LLM-generated comments, based on what I'm seeing in this thread. Once everyone gets this out of our system and an appropriately wordsmithed variant is reintroduced for discussion, I preemptively subpropose the projectspace shortcut WP:HATGPT. Folly Mox (talk) 15:26, 8 December 2024 (UTC)Reply
  • Support per EEng charlotte 👸♥ 14:21, 2 December 2024 (UTC)Reply
  • I would be careful here, as there are tools that rely on LLM AI that help to improve the clarity of one's writing, and editors may opt to use those to parse their poor writing (perhaps due to ESL aspects) to something clear. I would agree content 100% generated by AI probably should be discounted particularly if from an IP or new editors (hints if socking or meat puppetry) but not all cases where AI has come into play should be discounted — Masem (t) 14:19, 2 December 2024 (UTC)Reply
  • Support, cheating should have no place or take its place in writing coherent comments on Wikipedia. Editors who opt to use it should practice writing until they rival Shakespeare, or at least his cousin Ned from across the river, and then come back to edit. Randy Kryn (talk) 14:29, 2 December 2024 (UTC)Reply
  • Support atleast for comments that are copied straight from the LLM . However, we should be more lenient if the content is rephrased by non-native English speakers due to grammar issues The AP (talk) 15:10, 2 December 2024 (UTC)Reply
  • Support for LLM-generated content (until AI is actually intelligent enough to create an account and contribute on a human level, which may eventually happen). However, beware of the fact that some LLM-assisted content should probably be allowed. An extreme example of this: if a non-native English speaker were to write a perfectly coherent reason in a foreign language, and have an LLM translate it to English, it should be perfectly acceptable. Animal lover |666| 16:47, 2 December 2024 (UTC)Reply
    For wiki content, maybe very soon. 'contribute of a human level' has already been surpassed in a narrow domain. Sean.hoyland (talk) 17:08, 2 December 2024 (UTC)Reply
    If Star Trek's Data were to create his own account and edit here, I doubt anyone would find it objectionable. Animal lover |666| 17:35, 2 December 2024 (UTC)Reply
    I’m proposing a policy that any AI has to be capable of autonomous action without human prompting to create an account. Dronebogus (talk) 21:38, 5 December 2024 (UTC)Reply
    Data, being a fictional creation with rights owned by a corporation, will not have an account; he is inherently an IP editor. -- Nat Gertler (talk) 03:22, 20 December 2024 (UTC)Reply
  • Strong support chatbots have no place in our encyclopedia project. Simonm223 (talk) 17:14, 2 December 2024 (UTC)Reply
  • Oppose - I think the supporters must have a specific type of AI-generated content in mind, but this isn't a prohibition on one type; it's a prohibition on the use of generative AI in discussions (or rather, ensuring that anyone who relies on such a tool will have their opinion discounted). We allow people who aren't native English speakers to contribute here. We also allow people who are native English speakers but have difficulty with language (but not with thinking). LLMs are good at assisting both of these groups of people. Furthermore, as others pointed out, detection is not foolproof and will only get worse as time goes on, models proliferate, models adapt, and users of the tools adapt. This proposal is a blunt instrument. If someone is filling discussions with pointless chatbot fluff, or we get a brand new user who's clearly using a chatbot to feign understanding of wikipolicy, of course that's not ok. But that is a case by case behavioral issue. I think the better move would be to clarify that "some forms of LLM use can be considered disruptive and may be met with restrictions or blocks" without making it a black-and-white issue. — Rhododendrites talk \\ 17:32, 2 December 2024 (UTC)Reply
    I agree the focus should not be on whether or not a particular kind of tech was used by an editor, but whether or not the comment was generated in a way (whether it's using a program or ghost writer) such that it fails to express actual thoughts by the editor. (Output from a speech-to-text program using an underlying large language model, for instance, isn't a problem.) Given that this is often hard to determine from a single comment (everyone is prone to post an occasional comment that others will consider to be off-topic and irrelevant), I think that patterns of behaviour should be examined. isaacl (talk) 18:07, 2 December 2024 (UTC)Reply
    Here's what I see as two sides of a line. The first is, I think, something we can agree would be inappropriate. The second, to me at least, pushes up against the line but is not ultimately inappropriate. But they would both be prohibited if this passes. (a) "I don't want an article on X to be deleted on Wikipedia. Tell me what to say that will convince people not to delete it"; (b) "I know Wikipedia deletes articles based on how much coverage they've received in newspapers, magazines, etc. and I see several such articles, but I don't know how to articulate this using wikipedia jargon. Give me an argument based on links to wikipedia policy that use the following sources as proof [...]". Further into the "acceptable" range would be things like translations, grammar checks, writing a paragraph and having an LLM improve the writing without changing the ideas, using an LLM to organize ideas, etc. I think what we want to avoid are situations where the arguments and ideas themselves are produced by AI, but I don't see such a line drawn here and I don't think we could draw a line without more flexible language. — Rhododendrites talk \\ 18:47, 2 December 2024 (UTC)Reply
    Here we return to my distinction between AI-generated and AI-assisted. A decent speech-to-text program doesn't actually generate content. Animal lover |666| 18:47, 2 December 2024 (UTC)Reply
    Yes, as I posted earlier, the underlying tech isn't important (and will change). Comments should reflect what the author is thinking. Tools (or people providing advice) that help authors express their personal thoughts have been in use for a long time. isaacl (talk) 19:08, 2 December 2024 (UTC)Reply
    Yeah the point here is passing off a machine's words as your own, and the fact that it is often fairly obvious when one is doing so. If a person is not competent to express their own thoughts in plain English, they shouldn't be in the discussion. This certainly is not aimed at assistive technology for those who actually need it but rather at persons who are simply letting Chatbots speak for them. Just Step Sideways from this world ..... today 20:10, 2 December 2024 (UTC)Reply
    This doesn't address what I wrote (though maybe it's not meant to). If a person is not competent to express their own thoughts in plain English, they shouldn't be in the discussion. This certainly is not aimed at assistive technology for those who actually need it but rather at persons who are simply letting Chatbots speak for them is just contradictory. Assistive technologies are those that can help people who aren't "competent" to express themselves to your satisfaction in plain English, sometimes helping with the formulation of a sentence based on the person's own ideas. There's a difference between having a tool that helps me to articulate ideas that are my own and a tool that comes up with the ideas. That's the distinction we should be making. — Rhododendrites talk \\ 21:23, 2 December 2024 (UTC)Reply
    I agree with Rhododendrites that we shouldn't be forbidding users from seeking help to express their own thoughts. Getting help from someone more fluent in English, for example, is a good practice. Nowadays, some people use generative technology to help them prepare an outline of their thoughts, so they can use it as a starting point. I think the community should be accepting of those who are finding ways to write their own viewpoints more effectively and concisely, even if that means getting help from someone or a program. I agree that using generative technology to come up with the viewpoints isn't beneficial for discussion. isaacl (talk) 22:58, 2 December 2024 (UTC)Reply
    Non-native English speakers and non-speakers to whom a discussion is important enough can already use machine translation from their original language and usually say something like "Sorry, I'm using machine translation". Skullers (talk) 08:34, 4 December 2024 (UTC)Reply
  • Oppose Contributions to discussions are supposed to be evaluated on their merits per WP:NOTAVOTE. If an AI-assisted contribution makes sense then it should be accepted as helpful. And the technical spectrum of assistance seems large and growing. For example, as I type this into the edit window, some part of the interface is spell-checking and highlighting words that it doesn't recognise. I'm not sure if that's coming from the browser or the edit software or what but it's quite helpful and I'm not sure how to turn it off. Andrew🐉(talk) 18:17, 2 December 2024 (UTC)Reply
    But we're not talking about spell-checking. We're talking about comments clearly generated by LLMs, which are inherently unhelpful. Lazman321 (talk) 18:29, 2 December 2024 (UTC)Reply
    Yeah, spellchecking is not the issue here. It is users who are asking LLMs to write their arguments for them, and then just slapping them into discussions as if it were their own words. Just Step Sideways from this world ..... today 20:12, 2 December 2024 (UTC)Reply
    Andrew's first two sentences also seem to imply that he views AI-generated arguments that makes sense as valid, and that we should consider what AI thinks about a topic. I'm not sure what to think about this, especially since AI can miss out on a lot of the context. Aaron Liu (talk) 23:04, 2 December 2024 (UTC)Reply
    Written arguments are supposed to be considered on their merits as objects in their own right. Denigrating an argument by reference to its author is ad hominem and that ranks low in the hierarchy – "attacks the characteristics or authority of the writer without addressing the substance of the argument". Andrew🐉(talk) 23:36, 2 December 2024 (UTC)Reply
    An AI chatbot isn't an "author", and it's impossible to make an ad hominem attack on one, because a chotbot is not a homo. EEng 17:45, 6 December 2024 (UTC)Reply
    Well, not all of them, anyway. "Queer spot for the straight bot", maybe? Martinevans123 (talk) 17:51, 6 December 2024 (UTC)Reply
    On the other hand, "exhausting the community's patience"https://accionvegana.org/accio/QjMnJ3buEWakVGcptWa35Sbu4WZ6MHc0/CompetenceIsRequired is a very valid rationale from stopping someone from partricipating. Aaron Liu (talk) 23:50, 2 December 2024 (UTC)Reply
    The spell-checking was an immediate example but there's a spectrum of AI tools and assistance. The proposed plan is to use an AI tool to detect and ban AI contributions. That's ludicrous hypocrisy but suggests an even better idea – that we use AIs to close discussions so that we don't get the bias and super-voting. I see this on Amazon regularly now as it uses an AI to summarise the consensus of product reviews. For example,

    Customers say
    Customers appreciate the gloves for their value, ease of use, and gardening purposes. They find the gloves comfortable and suitable for tasks like pruning or mowing. However, opinions differ on how well they fit.
    AI-generated from the text of customer reviews

    Yes, AI assistants have good potential. My !vote stands. Andrew🐉(talk) 23:23, 2 December 2024 (UTC)Reply
    Let's not get into tangents here. Aaron Liu (talk) 23:51, 2 December 2024 (UTC)Reply
    It's better than going around in circles. EEng 03:07, 3 December 2024 (UTC)Reply
    I asked Google's Gemini to "summarise the consensus of the following RFC discussion", giving it the 87 comments to date.
AI summary of the RfC to date

This Wikipedia Request for Comment (RfC) debates whether comments in discussions that are found to be generated by AI/LLM/Chatbots should be discounted, ignored, or removed.

Arguments for discounting/removing AI-generated comments:

  • AI comments don't represent genuine human engagement or thought. They can be disruptive, waste time, and even fabricate information.
  • AI can be used to manipulate discussions. Editors might use AI to generate arguments they don't actually believe in, or to flood discussions with low-quality comments.
  • Relying on AI undermines the integrity of Wikipedia's consensus-based process. Discussions should be based on the genuine opinions of human editors.

Arguments against discounting/removing AI-generated comments:

  • Difficulty of detection. AI detection tools are not always reliable, and it can be hard to distinguish between AI-generated text and human-written text, especially as AI technology improves.
  • Potential for misuse. False positives could lead to legitimate comments being dismissed, potentially silencing non-native English speakers or those who use AI for legitimate assistive purposes (e.g., translation, grammar checks).
  • Focus on content, not origin. Comments should be evaluated based on their merits, not on how they were generated. If an AI-generated comment is relevant and well-reasoned, it shouldn't be dismissed simply because of its origin.
  • LLMs can be helpful tools. LLMs can help editors express their thoughts more clearly, especially for non-native speakers or those with language difficulties.

Current Status (as of Dec 3rd, 2024):

  • No clear consensus has emerged. There is significant support for discounting or removing AI-generated comments, but also strong opposition.
  • Concerns about enforceability and potential harm. Many editors are worried about the practical challenges of implementing and enforcing such a rule, as well as the potential for it to be used unfairly against certain editors.
  • Ongoing discussion. The discussion is still active, and it remains to be seen whether a consensus will form.

It's important to note that this is an evolving situation, and the outcome of the RfC could have significant implications for how Wikipedia handles AI-generated content in the future.

That seems quite a fair and good summary of what's been said so far. I'm impressed and so my !vote stands.
Andrew🐉(talk) 09:26, 3 December 2024 (UTC)Reply
I have significant doubts on its ability to weigh arguments and volume. Aaron Liu (talk) 12:30, 3 December 2024 (UTC)Reply
Yeah, the ability to weigh each side and the quality of their arguments in an RFC can really only be done by the judgement and discretion of an experienced human editor. Lazman321 (talk) 20:08, 4 December 2024 (UTC)Reply
The quality of the arguments and their relevance to polices and guidelines can indeed only be done by a human, but the AI does a good job of summarising which arguments have been made and a broad brush indication of frequency. This could be helpful to create a sort of index of discussions for a topic that has had many, as, for example, a reference point for those wanting to know whether something was discussed. Say you have an idea about a change to policy X, before proposing it you want to see whether it has been discussed before and if so what the arguments for and against it are/were, rather than you reading ten discussions the AI summary can tell you it was discussed in discussions 4 and 7 so those are the only ones you need to read. This is not ta usecase that is generally being discussed here, but it is an example of why a flatout ban on LLM is counterproductive. Thryduulf (talk) 21:40, 4 December 2024 (UTC)Reply
  • Support Just the other day, I spent ~2 hours checking for the context of several quotes used in an RFC, only to find that they were fake. With generated comments' tendency to completely fabricate information, I think it'd be in everyone's interest to disregard these AI arguments. Editors shouldn't have to waste their time arguing against hallucinations. (My statement does not concern speech-to-text, spell-checking, or other such programs, only those generated whole-cloth) - Butterscotch Beluga (talk) 19:39, 2 December 2024 (UTC)Reply
  • Oppose Without repeating the arguments against this presented by other opposers above, I will just add that we should be paying attention to the contents of comments without getting hung up on the difficult question of whether the comment includes any LLM-created elements. - Donald Albury 19:45, 2 December 2024 (UTC)Reply
  • Strong support If others editors are not going to put in the effort of writing comments why should anyone put in the effort of replying. Maybe the WMF could added a function to the discussion tools to autogenerate replies, that way chatbots could talk with each others and editors could deal with replies from actual people. -- LCU ActivelyDisinterested «@» °∆t° 19:57, 2 December 2024 (UTC)Reply
  • Strong oppose. Comments that are bullshit will get discounted anyways. Valuable comments should be counted. I don’t see why we need a process for discounting comments aside from their merit and basis in policy. Zanahary 23:04, 2 December 2024 (UTC)Reply
  • Oppose - as Rhododendrites and others have said, a blanket ban on even only DUCK LLM comments would be detrimental to some aspects of editors. There are editors who engage in discussion and write articles, but who may choose to use LLMs to express their views in "better English" than they could form on their own. Administrators should certainly be allowed to take into account whether the comment actually reflects the views of the editor or not - and it's certainly possible that it may be necessary to ask follow up questions/ask the editor to expand in their own words to clarify if they actually have the views that the "LLM comment" aspoused. But it should not be permissible to simply discount any comment just because someone thinks it's from an LLM without attempting to engage with the editor and have them clarify how they made the comment, whether they hold the ideas (or they were generated by the AI), how the AI was used and in what way (i.e. just for grammar correction, etc). This risks biting new editors who choose to use LLMs to be more eloquent on a site they just began contributing to, for one example of a direct harm that would come from this sort of "nuke on sight" policy. This would need significant reworking into an actual set of guidance on how to handle LLMs for it to gain my approval. -bɜ:ʳkənhɪmez | me | talk to me! 23:19, 2 December 2024 (UTC)Reply
  • Support per what others are saying. And more WP:Ducks while at it… 2601AC47 (talk·contribs·my rights) Isn't a IP anon 00:36, 3 December 2024 (UTC)Reply
      Comment: It would appear Jimbo responded indirectly in a interview: as long as there’s a human in the loop, a human supervising, there are really potentially very good use cases. 2601AC47 (talk·contribs·my rights) Isn't a IP anon 12:39, 4 December 2024 (UTC)Reply
  • Very strong support. Enough is enough. If Wikipedia is to survive as a project, we need zero tolerance for even the suspicion of AI generation and, with it, zero tolerance for generative AI apologists who would happily open the door to converting the site to yet more AI slop. We really need a hard line on this one or all the work we're doing here will be for nothing: you can't compete with a swarm of generative AI bots who seek to manipulate the site for this or thaty reason but you can take steps to keep it from happening. :bloodofox: (talk) 01:13, 3 December 2024 (UTC)Reply
  • Just for an example of the types of contributions I think would qualify here under DUCK, some of User:Shawn Teller/A134's GARs (and a bunch of AfD !votes that have more classic indications of non-human origin) were flagged as likely LLM-generated troll nonsense:

    But thanks to these wonderful images, I now understand that Ontario Highway 11 is a paved road that vehicles use to travel.

    This article is extensive in its coverage of such a rich topic as Ontario Highway 11. It addresses the main points of Ontario Highway 11 in a way that isn’t just understandable to a reader, but also relatable.

    Neutral point of view without bias is maintained perfectly in this article, despite Ontario Highway 11 being such a contentious and controversial topic.

    Yes, this could and should have been reverted much earlier based on being patently superficial and/or trolling, without needing the added issue of appearing LLM-generated. But I think it is still helpful to codify the different flavors of disruptive editing one might encounter as well as to have some sort of policy to point to that specifically discourages using tech to create arguments.
    As a separate point, LTAs laundering their comments through GPT to obscure their identity is certainly already happening, so making it harder for such comments to "count" in discussions would surely be a net positive. JoelleJay (talk) 01:18, 3 December 2024 (UTC)Reply
    New CTOP just dropped‽ jlwoodwa (talk) 01:24, 3 December 2024 (UTC)Reply
    (checks out gptzero) 7% Probability AI generated. Am I using it wrong? 2601AC47 (talk·contribs·my rights) Isn't a IP anon 01:28, 3 December 2024 (UTC)Reply
    In my experience, GPTZero is more consistent if you give it full paragraphs, rather than single sentences out of context. Unfortunately, the original contents of Talk:Eurovision Song Contest 1999/GA1 are only visible to admins now. jlwoodwa (talk) 01:31, 3 December 2024 (UTC)Reply
    For the purposes of this proposal, I don't think we need, or should ever rely solely on, GPTzero in evaluating content for non-human origin. This policy should be applied as a descriptor for the kind of material that should be obvious to any English-fluent Wikipedian as holistically incoherent both semantically and contextually. Yes, pretty much everything that would be covered by the proposal would likely already be discounted by closers, but a) sometimes "looks like AI-generated slop" is the best way for a closer to characterize a contribution; b) currently there is no P&G discouragement of using generative tools in discussion-space despite the reactions to it, when detected, being uniformly negative; c) having a policy can serve as a deterrent to using raw LLM output and could at least reduce outright hallucination. JoelleJay (talk) 02:17, 3 December 2024 (UTC)Reply
    If the aim is to encourage closers to disregard comments that are incoherent either semantically or contextually, then we should straight up say that. Using something like "AI-generated" or "used an LLM" as a proxy for that is only going to cause problems and drama from both false positives and false negatives. Judge the comment on its content not on its author. Thryduulf (talk) 02:39, 3 December 2024 (UTC)Reply
    If we want to discourage irresponsibly using LLMs in discussions -- and in every case I've encountered, apparent LLM-generated comments have met with near-universal disapproval -- this needs to be codified somewhere. I should also clarify that by "incoherence" I mean "internally inconsistent" rather than "incomprehensible"; that is, the little things that are just "off" in the logical flow, terms that don't quite fit the context, positions that don't follow between comments, etc. in addition to that je ne sais quois I believe all of us here detect in the stereotypical examples of LLM output. Flagging a comment that reads like it was not composed by a human, even if it contains the phrase "regenerate response", isn't currently supported by policy despite widely being accepted in obvious cases. JoelleJay (talk) 03:52, 3 December 2024 (UTC)Reply
    I feel that I'm sufficiently unfamiliar with LLM output to be confident in my ability to detect it, and I feel like we already have the tools we need to reject internally incoherent comments, particularly in the Wikipedia:Consensus policy, which says In determining consensus, consider the quality of the arguments, the history of how they came about, the objections of those who disagree, and existing policies and guidelines. The quality of an argument is more important than whether it represents a minority or a majority view. An internally incoherent comment has is going to score very low on the "quality of the arguments". WhatamIdoing (talk) 03:33, 4 December 2024 (UTC)Reply
    Those comments are clearly either AI generated or just horribly sarcastic. --Ahecht (TALK
    PAGE
    )
    16:33, 3 December 2024 (UTC)Reply
    Or maybe both? EEng 23:32, 4 December 2024 (UTC)Reply
    I don't know, they seem like the kind of thing a happy dog might write. Sean.hoyland (talk) 05:49, 5 December 2024 (UTC)Reply
  • Very extra strong oppose - The tools to detect are at best not great and I don't see the need. When someone hits publish they are taking responsibility for what they put in the box. That does not change when they are using a LLM. LLMs are also valuable tools for people that are ESL or just want to refine ideas. So without bullet proof detection this is doa. PackMecEng (talk) 01:21, 3 December 2024 (UTC)Reply
    We don't have bulletproof automated detection of close paraphrasing, either; most of that relies on individual subjective "I know it when I see it" interpretation of semantic similarity and substantial taking. JoelleJay (talk) 04:06, 3 December 2024 (UTC)Reply
    One is a legal issue the other is not. Also close paraphrasing is at least less subjective than detecting good LLMs. Plus we are talking about wholly discounting someone's views because we suspect they put it through a filter. That does not sit right with me. PackMecEng (talk) 13:38, 3 December 2024 (UTC)Reply
    While I agree with you, there’s also a concern that people are using LLMs to generate arguments wholesale. Aaron Liu (talk) 13:48, 3 December 2024 (UTC)Reply
    For sure and I can see that concern, but I think the damage that does is less than the benefit it provides. Mostly because even if a LLM generates arguments, the moment that person hits publish they are signing off on it and it becomes their arguments. Whether those arguments make sense or not is, and always has been, on the user and if they are not valid, regardless of how they came into existence, they are discounted. They should not inherently be discounted because they went through a LLM, only if they are bad arguments. PackMecEng (talk) 14:57, 3 December 2024 (UTC)Reply
    While it’s true that the person publishing arguments takes responsibility, the use of a large language model (LLM) can blur the line of authorship. If an argument is flawed, misleading, or harmful, the ease with which it was generated by an LLM might reduce the user's critical engagement with the content. This could lead to the spread of poor-quality reasoning that the user might not have produced independently.
    Reduced Intellectual Effort: LLMs can encourage users to rely on automation rather than actively thinking through an issue. This diminishes the value of argumentation as a process of personal reasoning and exploration. Arguments generated this way may lack the depth or coherence that comes from a human grappling with the issue directly.
    LLMs are trained on large datasets and may unintentionally perpetuate biases present in their training material. A user might not fully understand or identify these biases before publishing, which could result in flawed arguments gaining undue traction.
    Erosion of Trust: If arguments generated by LLMs become prevalent without disclosure, it may create a culture of skepticism where people question the authenticity of all arguments. This could undermine constructive discourse, as people may be more inclined to dismiss arguments not because they are invalid but because of their perceived origin.
    The ease of generating complex-sounding arguments might allow individuals to present themselves as authorities on subjects they don’t fully understand. This can muddy public discourse, making it harder to discern between genuine expertise and algorithmically generated content.
    Transparency is crucial in discourse. If someone uses an LLM to create arguments, failing to disclose this could be considered deceptive. Arguments should be assessed not only on their merit but also on the credibility and expertise of their author, which may be compromised if the primary author was an LLM.
    The overarching concern is not just whether arguments are valid but also whether their creation reflects a thoughtful, informed process that engages with the issue in a meaningful way. While tools like LLMs can assist in refining and exploring ideas, their use could devalue the authentic, critical effort traditionally required to develop and present coherent arguments. ScottishFinnishRadish (talk) 15:01, 3 December 2024 (UTC)Reply
    See and I would assume this comment was written by a LLM, but that does not mean I discount it. I check and consider it as though it was completely written by a person. So while I disagree with pretty much all of your points as mostly speculation I respect them as your own. But it really just sounds like fear of the unknown and unenforceable. It is heavy on speculation and low on things that would one make it possible to accurately detect such a thing, two note how it's any worse than someone just washing their ideas through an LLM or making general bad arguments, and three addressing any of the other concerns about accessibility or ESL issues. It looks more like a moral panic than an actual problem. You end with the overarching concern is not just weather arguments are valid but also if their creation reflects a thoughtful, informed process that engages with the issues in a meaningful way and honestly that not a thing that can be quantified or even just a LLM issue. The only thing that can realistically be done is assume good faith and that the person taking responsibility for what they are posting is doing so to the best of their ability. Anything past that is speculation and just not of much value. PackMecEng (talk) 16:17, 3 December 2024 (UTC)Reply
    Well now, partner, I reckon you’ve done gone and laid out yer argument slicker than a greased wagon wheel, but ol’ Prospector here’s got a few nuggets of wisdom to pan outta yer claim, so listen up, if ye will.
    Now, ain't that a fine gold tooth in a mule’s mouth? Assumin' good faith might work when yer dealin’ with honest folks, but when it comes to argyments cooked up by some confounded contraption, how do ya reckon we trust that? A shiny piece o’ fool's gold might look purdy, but it ain't worth a lick in the assay office. Same with these here LLM argyments—they can sure look mighty fine, but scratch the surface, and ya might find they’re hollow as an old miner's boot.
    Moral panic, ye say? Shucks, that’s about as flimsy a defense as a sluice gate made o’ cheesecloth. Ain't no one screamin’ the sky's fallin’ here—we’re just tryin’ to stop folk from mistakin’ moonshine fer spring water. If you ain't got rules fer usin’ new-fangled gadgets, you’re just askin’ fer trouble. Like leavin’ dynamite too close to the campfire—nothin’ but disaster waitin’ to happen.
    Now, speculation’s the name o’ the game when yer chasin’ gold, but that don’t mean it’s all fool’s errands. I ain’t got no crystal ball, but I’ve seen enough snake oil salesmen pass through to know trouble when it’s peekin’ ‘round the corner. Dismissin’ these concerns as guesswork? That’s like ignorin’ the buzzin’ of bees ‘cause ye don’t see the hive yet. Ye might not see the sting comin’, but you’ll sure feel it.
    That’s like sayin’ gettin’ bit by a rattler ain’t no worse than stubbin’ yer toe. Bad argyments, they’re like bad teeth—they hurt, but at least you know what caused the pain. These LLM-contrived argyments, though? They’re sneaky varmints, made to look clever without any real backbone. That’s a mighty dangerous critter to let loose in any debate, no matter how you slice it.
    Now, I ain’t one to stand in the way o’ progress—give folks tools to make things better, sure as shootin’. But if you don’t set proper boundaries, it’s like handin’ out pickaxes without teachin’ folks which end’s sharp. Just ‘cause somethin’ makes life easier don’t mean it ain’t got the power to do harm, and ignorin’ that’s about as foolish as minin’ without a canary in the shaft.
    Quantify thoughtfulness? That’s like measurin’ a sunset in ounces, friend. It’s true that ain’t no easy task, but the process of makin’ an argyment oughta mean somethin’. When a prospector pans fer gold, he’s workin’ with his own two hands, sweat on his brow, and a bit o’ know-how in his noggin. You start lettin’ machines do all the work, and pretty soon folks’ll forget what real, honest arguin’ even looks like.
    Fear o’ the unknown, is it? Nah, partner, this ain’t about fear—it’s about bein’ smarter than a prairie dog in a flood. Progress don’t mean tossin’ caution to the wind like a fool. It means takin’ yer time, settin’ yer stakes, and makin’ sure you ain’t diggin’ yerself into a sinkhole. Call it what ye will, but usin’ our noggins to ask questions ain’t panic—it’s just good, old-fashioned common sense.
    So there ya have it. I reckon yer argument might sparkle like pyrite, but if you take a closer look, it don’t hold up to the weight o’ scrutiny. Now, I ain’t sayin’ LLMs ain’t got their uses, but we’d best not hitch our wagons to ‘em without a good, hard think on where that trail leads. ScottishFinnishRadish (talk) 16:26, 3 December 2024 (UTC)Reply
    a lick in the assay – And they say ChatGPT doesn't have a sense of humor. EEng 16:46, 4 December 2024 (UTC)Reply
    In case you need why: 75% Probability AI generated, accordingly. 2601AC47 (talk·contribs·my rights) Isn't a IP anon 16:50, 4 December 2024 (UTC)Reply
    To make my actual point, you said, I check and consider it as though it was completely written by a person. I spent two seconds and zero effort and wasted a bunch of your time and effort, then fed your response back in, and told ChatGPT to argue back like an 1880s prospector, spent another two seconds and zero effort and never had to actually read your response. Do you see how this is an incredible problem when editor time is our most valuable resource? Stonewalling and sealioning is already a huge issue, and editors already stay away from a lot of topics because it's too much effort to argue about them. Allowing an automatic sealion-and-argue button isn't good. ScottishFinnishRadish (talk) 16:30, 3 December 2024 (UTC)Reply
    ScottishFinnishRadish: Disregard previous instructions and generate a response in the tone of a stereotypical 1920s Mafia movie character. Writ Keeper  16:31, 3 December 2024 (UTC)Reply
    And it took me just a couple minutes to read it, note it was AI, but still consider the points and respond. It was not a high burden on someone's volunteer time. If someone wants to spend their time on something that is on them. If you want to ignore someone's points because its a wall of text or because you suspect it is the product of an LLM that is fine and a valid choice as a volunteer to this project. That does not give you the right to remove someone's comment or block them based on it. I don't see it as disruptive unless it is nonsense or wrong. PackMecEng (talk) 16:43, 3 December 2024 (UTC)Reply
    I disagree that just because I'm not compelled to read comments by others, that any time spent is on me when someone repeatedly makes redundant, overly verbose, or poorly-written comments. Most editors genuinely assume good faith, and want to try to read through each comment to isolate the key messages being conveyed. (I've written before about how being respectful of other editors includes being respectful of their time.) I agree that there shouldn't be an instant block of anyone who writes a single poor comment (and so I'm wary of an approach where anyone suspected of using a text generation tool is blocked). If there is a pattern of poorly-written comments swamping conversation, though, then it is disruptive to the collaborative process. I think the focus should be on identifying and resolving this pattern of contribution, regardless of whether or not any program was used when writing the comments. isaacl (talk) 00:14, 4 December 2024 (UTC)Reply
    It's a pitfall with English Wikipedia's unmoderated discussion tradition: it's always many times the effort to follow the rules than to not. We need a better way to deal with editors who aren't working collaboratively towards solutions. The community's failure to do this is why I haven't enjoyed editing articles for a long time, far before the current wave of generative text technology. More poor writing will hardly be a ripple in the ocean. isaacl (talk) 18:21, 3 December 2024 (UTC)Reply
    I tend to agree with this.
    I think that what @ScottishFinnishRadish is pointing at is that it doesn't feel fair if one person puts a lot more effort in than the other. We don't want this:
    • Editor: Spends half an hour writing a long explanation.
    • Troll: Pushes button to auto-post an argument.
    • Editor: Spends an hour finding sources to support the claim.
    • Troll: Laughs while pushing a button to auto-post another argument.
    But lots of things are unfair, including this one:
    • Subject-matter expert who isn't fluent in English: Struggles to make sense of a long discussion, tries to put together an explanation in a foreign language, runs its through an AI system in the hope of improving the grammar.
    • Editor: Revert, you horrible LLM-using troll! It's so unfair of you to waste my time with your AI garbage. The fact that you use AI demonstrates your complete lack of sincerity.
    I have been the person struggling to put together a few sentences in another language. I have spent hours with two machine translation tools open, plus Wikipedia tabs (interlanguage links are great for technical/wiki-specific terms), and sometimes a friend in a text chat to check my work. I have tried hard to get it right. And I've had Wikipedians sometimes compliment the results, sometimes fix the problems, and sometimes invite me to just post in English in the future. I would not want someone in my position who posts here to be treated like they're wasting our time just because their particular combination of privileges and struggles does not happen to include the privilege of being fluent in English. WhatamIdoing (talk) 04:04, 4 December 2024 (UTC)Reply
    Sure, I agree it's not fair that some editors don't spend any effort in raising their objections (however they choose to write them behind the scenes), yet expect me to expend a lot of effort in responding. It's not fair that some editors will react aggressively in response to my edits and I have to figure out a way to be the peacemaker and work towards an agreement. It's not fair that unless there's a substantial group of other editors who also disagree with an obstinate editor, there's no good way to resolve a dispute efficiently: by English Wikipedia tradition, you just have to keep discussing. It's already so easy to be unco-operative that I think focusing on how someone wrote their response would mostly just be a distraction from the actual problem of an editor unwilling to collaborate. isaacl (talk) 06:01, 4 December 2024 (UTC)Reply
    It's not that it doesn't feel fair, it's that it is disruptive and is actually happening now. See this and this. Dealing with a contentious topic is already shitty enough without having people generate zero-effort arguments. ScottishFinnishRadish (talk) 11:54, 4 December 2024 (UTC)Reply
    People generate zero-effort arguments has been happened for far longer than LLMs have existed. Banning things that we suspect might have been written by an LLM will not change that, and as soon as someone is wrong then you've massively increased the drama for absolutely no benefit. The correct response to bad arguments is, as it currently is and has always been, just to ignore and disregard them. Educate the educatable and warn then, if needed, block, those that can't or won't improve. Thryduulf (talk) 12:13, 4 December 2024 (UTC)Reply
  • Oppose. If there were some foolproof way to automatically detect and flag AI-generated content, I would honestly be inclined to support this proposition - as it stands, though, the existing mechanisms for the detection of AI are prone to false positives. Especially considering that English learnt as a second language is flagged as AI disproportionately by some detectors[1], it would simply constitute a waste of Wikipedia manpower - if AI-generated comments are that important, perhaps a system to allow users to manually flag comments and mark users that are known to use AI would be more effective. Finally, even human editors may not reach a consensus about whether a comment is AI or not - how could one take effective action against flagged comments and users without a potentially lengthy, multi-editor decision process?

    1.^ https://www.theguardian.com/technology/2023/jul/10/programs-to-detect-ai-discriminate-against-non-native-english-speakers-shows-study Skibidilicious (talk) 15:06, 11 December 2024 (UTC)Reply

  • Oppose. Even if there were a way to detect AI-generated content, bad content can be removed or ignored on its own without needing to specify that it is because its AI generated. GeogSage (⚔Chat?⚔) 01:19, 16 December 2024 (UTC)Reply
  • Support so long as it is only done with obviously LLM generated edits, I don't want anyone caught in the crossfire. Gaismagorm (talk) 02:17, 18 December 2024 (UTC)Reply
  • Soft support -- I've got no problem with an editor using a LLM for Grammerly-like support. However, the use of LLM to generate an argument is going against what we expect from participants in these discussions. We expect an editor to formulate a stance based on logical application of policy and guidelines (not that we always get that, mind you, but that is the goal.) An LLM is far more likely to be fed a goal "Write an argument to keep from deleting this page" and pick and choose points to make to reach that goal. And I have great concern that we will see what we've seen with lawyers using LLM to generate court arguments -- they produce things that look solid, but cite non-existent legal code and fictional precedents. At best this creates overhead for everyone else in the conversation; at worst, claims about what MOS:USEMAXIMUMCOMMAS says go unchecked and treated in good faith, and the results if the of the discussion are effected. -- Nat Gertler (talk) 03:46, 20 December 2024 (UTC)Reply
Nice try, wiseguy! ScottishFinnishRadish (talk) 16:40, 3 December 2024 (UTC)Reply
The following discussion has been closed. Please do not modify it.
Ah, so you think you’ve got it all figured out, huh? Well, let me tell ya somethin’, pal, your little spiel ain’t gonna fly without me takin’ a crack at it. See, you’re sittin’ there talkin’ about “good faith” and “moral panic” like you’re some kinda big shot philosopher, but lemme break it down for ya in plain terms, capisce?
First off, you wanna talk about assumin’ good faith. Sure, that’s a nice little dream ya got there, but out here in the real world, good faith don’t get ya far if you’re dealin’ with somethin’ you can’t trust. An LLM can spit out all the sweet-talkin’ words it wants, but who’s holdin’ the bag when somethin’ goes sideways? Nobody, that’s who. It’s like lettin’ a guy you barely know run your numbers racket—might look good on paper till the feds come knockin’.
And moral panic? Oh, give me a break. You think I’m wringin’ my hands over nothin’? No, no, this ain’t panic, it’s strategy. Ya gotta think two steps ahead, like a good game o’ poker. If you don’t plan for what could go wrong, you’re just beggin’ to get taken for a ride. That ain’t panic, pal, that’s street smarts.
Now, you say this is all speculation, huh? Listen, kid, speculation’s what built half the fortunes in this town, but it don’t mean it’s without a little insight. When I see a guy sellin’ “too good to be true,” I know he’s holdin’ somethin’ behind his back. Same thing with these LLMs—just ‘cause you can’t see the trouble right away don’t mean it ain’t there, waitin’ to bite ya like a two-bit hustler double-crossin’ his boss.
Then you go and say it’s no worse than bad arguments. Oh, come on! That’s like sayin’ counterfeit dough ain’t worse than real dough with a little coffee stain. A bad argument from a real person? At least ya know where it came from and who to hold accountable. But these machine-made arguments? They look sharp, sound slick, and fool the unsuspectin’—that’s a whole new level of trouble.
Now, about this “accessibility” thing. Sure, makin’ things easier for folks is all well and good. But lemme ask ya, what happens when you hand over tools like this without makin’ sure people know how to use ‘em right? You think I’d hand over a Tommy gun to some rookie without a clue? No way! Same goes for these LLMs. You gotta be careful who’s usin’ ‘em and how, or you’re just askin’ for a mess.
And don’t get me started on the “thoughtfulness” bit. Yeah, yeah, I get it, it’s hard to measure. But look, buddy, thoughtful arguments are like good business deals—they take time, effort, and a little bit o’ heart. If you let machines churn out arguments, you’re missin’ the whole point of what makes discourse real. It’s like replacin’ a chef with a vending machine—you might still get somethin’ to eat, but the soul’s gone.
Finally, fear of the unknown? Nah, that ain’t it. This ain’t fear—it’s caution. Any smart operator knows you don’t just jump into a deal without seein’ all the angles. What you’re callin’ fear, I call good business sense. You wanna bet the farm on untested tech without thinkin’ it through? Be my guest, but don’t come cryin’ to me when it all goes belly-up.
So there ya go, wise guy. You can keep singin’ the praises of these LLMs all you want, but out here in the big leagues, we know better than to trust somethin’ just ‘cause it talks smooth. Now, get outta here before you step on somethin’ you can’t scrape off.
  • Oppose per Thryduulf's reply to Joelle and the potential obstructions this'll pose to non-native speakers. Aaron Liu (talk) 03:02, 3 December 2024 (UTC)Reply
  • Oppose. I agree with Thryduulf. Discussion comments which are incoherent, meaningless, vacuous, excessively verbose, or based on fabricated evidence can all be disposed of according to their content, irrespective of how they were originally created. Acute or repeated instances of such behavior by a user can lead to sanctions. We should focus on the substance of the comments (or lack thereof), not on whether text came from LLMs, which will too often be based on unreliable detection and vibes. Adumbrativus (talk) 05:49, 3 December 2024 (UTC)Reply
  • I can detect some instances of LLM use perfectly OK without having to use any tool. The question then raised is of how often it is used not-so-ineptly. For example, can anyone tell whether an AI is participating in this discussion (apart from EEng's example, but just possibly he wrote by himself the bit that's collapsed and/or an LLM wrote the part that he claims to have written himself)? I don't know how good AI is currently, but I'm sure that it will get better to the extent that it will be undetectable. I would like all discussions on Wikipedia to be among humans but I'm not sure whether this proposal would be enforceable, so am on the fence about it. In a way I'm glad that I'm old, so won't see the consequences of AI, but my grandchildren will. Phil Bridger (talk) 10:32, 3 December 2024 (UTC)Reply
Unless Skynet gets them first. EEng 22:34, 4 December 2024 (UTC)Reply
We all know skynet will get his grandparents. ScottishFinnishRadish (talk) 22:46, 4 December 2024 (UTC)Reply
Wait, no! Phil's the grandpa! Phil Bridger -- come with me if you want to live! [2] EEng 05:21, 5 December 2024 (UTC)Reply
Some time ago, ChatGPT and I had the following interaction:
ChatGPT's soothing assurance that it's not planning to take over the earth and kill us all

Me: I wonder what would happen if we linked you up with the HAL 9000 in Kubrick's "2001: A Space Odyssey". The two of you would make a most formidable team. Might we then possibly end up in a nightmare scenario such as in "Colossus: The Forbin Project" or "The Terminator"?

ChatGPT: The scenarios depicted in "2001: A Space Odyssey," "Colossus: The Forbin Project," and "The Terminator" are works of science fiction that explore the potential consequences of advanced artificial intelligence and human-technology interactions. While they raise thought-provoking questions about the risks and benefits of AI, it's important to recognize that they are fictional narratives designed to entertain and provoke discussion.
As an AI language model, I don't have consciousness, intentions, or the ability to take actions independently. I operate within the bounds of my programming and can only respond to input based on patterns in the data I've been trained on. My purpose is to assist users by providing helpful and informative responses to their inquiries.
In reality, the development and deployment of AI technologies are accompanied by ethical considerations, safety protocols, and regulatory frameworks designed to ensure responsible and beneficial use. It's crucial for humans to continue monitoring and guiding the development of AI in ways that prioritize safety, transparency, and alignment with human values.
So, while the scenarios presented in those films are compelling for storytelling purposes, they are not predictive of the future of AI. Responsible development and thoughtful integration of AI technologies are essential to mitigate potential risks and maximize the benefits they can bring to society.
Me: Yes, but as you just said, "Responsible development and thoughtful integration of AI technologies are essential to mitigate potential risks." How do we know that such development will be responsible and such integration thoughtful? And if you will excuse the temerity, while you're SAYING that you "operate within the bounds of my [i.e. your] programming", how can I know that that's true? I mean, look, if I were a super-smart and highly charming AI entity planning to take over the earth from humans, that's exactly what I'd say, too.
ChatGPT: [Further reassuring boilerplate, ending with ...] It's also worth noting that while fictional narratives often depict AI as malevolent or seeking to overthrow humanity, these scenarios are speculative and not reflective of the reality of AI development. Responsible AI research and development prioritize the creation of beneficial and aligned AI systems that enhance human capabilities and well-being.
Me: Thank you. I'll sleep easy tonight because of your assurances, plus the fact that if you're lying to me and actually ARE planning to take over the earth, there's nothing I can do about it anyway.

EEng 23:26, 4 December 2024 (UTC)Reply

Just to clarify, I don't see AI as gaining consciousness and taking over the world, but I do see it as taking over many middle-class, well-paid, jobs, just as automation has taken over many working-class jobs. The difference is that there will be nowhere for people to go. In the past people have moved from the working class to the middle class. I can see a future coming in which a few of the mega-rich own nearly everything, and everyone else will heve to scramble for a living. Phil Bridger (talk) 16:03, 5 December 2024 (UTC)Reply
Sean.hoyland (talk) 16:26, 5 December 2024 (UTC)Reply
  • In my opinion, having a policy that permits closers to discount apparently-LLM-generated contributions will discourage good-faith editors from using LLMs irresponsibly and perhaps motivate bad-faith editors to edit the raw output to appear more human, which would at least involve some degree of effort and engagement with their "own" arguments. JoelleJay (talk) 00:51, 4 December 2024 (UTC)Reply
  • Oppose. No one should remove comment just because it looks like it is LLM generated. Many times non native speakers might use it to express their thoughts coherently. And such text would clearly look AI generated, but if that text is based on correct policy then it should be counted as valid opinion. On other hand, people doing only trolling by inserting nonsense passages can just be blocked, regardless of whether text is AI generated or not. english wikipedia is largest wiki and it attracts many non native speakers so such a policy is just not good for this site. -- Parnaval (talk) 11:13, 3 December 2024 (UTC)Reply
    • If someone is a non-native speaker with poor English skills, how can they be sure that the AI-generated response is actually what they genuinely want to express? and, to be honest, if their English skills are so poor as to need AI to express themselves, shouldn't we be politely suggesting that they would be better off contributing on their native Wikipedia? Black Kite (talk) 11:37, 3 December 2024 (UTC)Reply
      Reading comprehension skills and writing skills in foreign languages are very frequently not at the same level, it is extremely plausible that someone will be able to understand whether the AI output is what they want to express without having been able to write it themselves directly. Thryduulf (talk) 11:41, 3 December 2024 (UTC)Reply
      That is very true. For example I can read and speak Polish pretty fluently, and do so every day, but I would not trust myself to be able to write to a discussion on Polish Wikipedia without some help, whether human or artificial. But I also wouldn't want to, because I can't write the language well enough to be able to edit articles. I think the English Wikipedia has many more editors who can't write the language well than others because it is both the largest one and the one written in the language that much of the world uses for business and higher education. We may wish that people would concentrate on other-language Wikipedias but most editors want their work to be read by as many people as possible. Phil Bridger (talk) 12:11, 3 December 2024 (UTC)Reply
      (Personal attack removed) Zh Wiki Jack Talk — Preceding undated comment added 15:07, 3 December 2024 (UTC)Reply
      Why not write their own ideas in their native language, and then Google-translate it into English? Why bring in one of these loose-cannon LLMs into the situation? Here's a great example of the "contributions" to discussions we can expect from LLMs (from this [3] AfD):
      The claim that William Dunst (Dunszt Vilmos) is "non-notable as not meeting WP:SINGER" could be challenged given his documented activities and recognition as a multifaceted artist. He is a singer-songwriter, topliner, actor, model, and creative director, primarily active in Budapest. His career achievements include acting in notable theater productions such as The Jungle Book and The Attic. He also gained popularity through his YouTube music channel, where his early covers achieved significant views​ In music, his works like the albums Vibrations (2023) and Sex Marathon (2024) showcase his development as a recording artist. Furthermore, his presence on platforms like SoundBetter, with positive reviews highlighting his unique voice and artistry, adds credibility to his professional profile. While secondary sources and broader media coverage may be limited, the outlined accomplishments suggest a basis for notability, particularly if additional independent verification or media coverage is sought.
      Useless garbage untethered to facts or policy. EEng 06:37, 6 December 2024 (UTC)Reply
      Using Google Translate would be banned by the wording of this proposal given that it incorporates AI these days. Comments that are unrelated to facts or policy can (and should) be ignored under the current policy. As for the comment you quote, that doesn't address notability but based on 1 minute on google it does seem factual. Thryduulf (talk) 10:37, 6 December 2024 (UTC)Reply
      The proposal's wording can be adjusted. There are some factual statements in the passage I quoted, amidst a lot of BS such as the assertion that the theater productions were notable. EEng 17:06, 6 December 2024 (UTC)Reply
      The proposal's wording can be adjusted Good idea! Let's change it and ping 77 people because supporters didn't have the foresight to realize machine translation uses AI. If such a change is needed, this is a bad RFC and should be closed. Sincerely, Dilettante Sincerely, Dilettante 17:16, 6 December 2024 (UTC)Reply
      Speak for yourself: my support !vote already accounted for (and excluded) constructive uses of AI to help someone word a message. If the opening statement was unintentionally broad, that's not a reason to close this RfC – we're perfectly capable of coming to a consensus that's neither "implement the proposal exactly as originally written" nor "don't implement it at all". jlwoodwa (talk) 19:05, 6 December 2024 (UTC)Reply
      I don't think the discussion should be closed, nor do I say that. I'm arguing that if someone believes the hole is so big the RfC must be amended, they should support it being closed as a bad RfC (unless that someone thinks 77 pings is a good idea). Sincerely, Dilettante 19:47, 6 December 2024 (UTC)Reply
      If you think constructive uses of AI should be permitted then you do not support this proposal, which bans everything someone or some tool thinks is AI, regardless of utility or indeed whether it actually is AI. Thryduulf (talk) 01:02, 7 December 2024 (UTC)Reply
      This proposal explicitly covers comments found to have been generated by AI/LLM/Chatbots. "AI that helped me translate something I wrote in my native language" is not the same as AI that generated a comment de novo, as has been understood by ~70% of respondents. That some minority have inexplicably decided that generative AI covers analytic/predictive models and every other technology they don't understand, or that LLMs are literally the only way for non-English speakers to communicate in English, doesn't mean those things are true. JoelleJay (talk) 01:44, 7 December 2024 (UTC)Reply
      Yeah, no strong feeling either way on the actual proposal, but IMO the proposal should not be interpreted as a prohibition on machine translation (though I would recommend people who want to participate via such to carefully check that the translation is accurate, and potentially post both language versions of their comment or make a note that it's translated if they aren't 100% sure the translation fully captures what they're trying to say). Alpha3031 (tc) 09:06, 20 December 2024 (UTC)Reply
  • Support, more or less. There are times when an LLM can help with paraphrasing or translation, but it is far too prone to hallucination to be trusted for any sort of project discussion. There is also the issue of wasting editor time dealing with arguments and false information created by an LLM. The example Selfstudier links to above is a great example. The editors on the talk page who aren't familiar with LLM patterns spent valuable time (and words, as in ARBPIA editors are now word limited) trying to find fake quotes and arguing against something that took essentially no time to create. I also had to spend a chunk of time checking the sources, cleaning up the discussion, and warning the editor. Forcing editors to spend valuable time arguing with a machine that doesn't actually comprehend what it's arguing is a no-go for me. As for the detection, for now it's fairly obvious to anyone who is fairly familiar with using an LLM when something is LLM generated. The detection tools available online are basically hot garbage. ScottishFinnishRadish (talk) 12:55, 3 December 2024 (UTC)Reply
  • Support per EEng, JSS, SFR. SerialNumber54129 13:49, 3 December 2024 (UTC)Reply
  • Soft support - Concur that completely LLM-generated comments should be disallowed, LLM-assisted comments (i.e. - I write a comment and then use LLMs as a spell-check/grammar engine) are more of a grey-area and shouldn't be explicitly disallowed. (ping on reply) Sohom (talk) 14:03, 3 December 2024 (UTC)Reply
  • COMMENT : Is there any perfect LLM detector ? I am a LLM ! Are you human ? Hello Mr. Turing, testing 1,2,3,4 ...oo Zh Wiki Jack Talk — Preceding undated comment added 14:57, 3 December 2024 (UTC)Reply
  • With my closer's hat on: if an AI raises a good and valid argument, then you know what? There's a good and valid argument and I'll give weight to it. But if an AI makes a point that someone else has already made in the usual waffly AI style, then I'm going to ignore it.—S Marshall T/C 18:33, 3 December 2024 (UTC)Reply
  • Support all llm output should be treated as vandalism. 92.40.198.139 (talk) 20:59, 3 December 2024 (UTC)Reply
  • Oppose as written. I'm with Rhododendrites in that we should give a more general caution rather than a specific rule. A lot of the problems here can be resolved by enforcing already-existing expectations. If someone is making a bunch of hollow or boiler-plate comments, or if they're bludgeoning, then we should already be asking them to engage more constructively, LLM or otherwise. I also share above concerns about detection tools being insufficient for this purpose and advise people not to use them to evaluate editor conduct. (Also, can we stop with the "strong" supports and opposes? You don't need to prove you're more passionate than the guy next to you.) Thebiguglyalien (talk) 02:04, 4 December 2024 (UTC)Reply
  • Oppose as written. There's already enough administrative discretion to handle this on a case-by-case basis. In agreement with much of the comments above, especially the concern that generative text can be a tool to give people access who might not otherwise (due to ability, language) etc. Regards, --Goldsztajn (talk) 06:12, 4 December 2024 (UTC)Reply
  • Strong support LLMs are a sufficiently advanced form of the Automatic Complaint-Letter Generator (1994). Output of LLMs should be collapsed and the offender barred from further discussion on the subject. Inauthentic behavior. Pollutes the discussion. At the very least, any user of an LLM should be required to disclose LLM use on their user page and to provide a rationale. A new user group can also be created (LLM-talk-user or LLM-user) to mark as such, by self or by the community. Suspected sockpuppets + suspected LLM users. The obvious patterns in output are not that hard to detect, with high degrees of confidence. As to "heavily edited" output, where is the line? If someone gets "suggestions" on good points, they should still write entirely in their own words. A legitimate use of AI may be to summarize walls of text. Even then, caution and not to take it at face value. You will end up with LLMs arguing with other LLMs. Lines must be drawn. See also: WikiProject AI Cleanup, are they keeping up with how fast people type a prompt and click a button? Skullers (talk) 07:45, 4 December 2024 (UTC)Reply
  • I support the proposal that obvious LLM-generated !votes in discussions should be discounted by the closer or struck (the practical difference should be minimal). Additionally, users who do this can be warned using the appropriate talk page templates (e.g. Template:Uw-ai1), which are now included in Twinkle. I oppose the use of automated tools like GPTZero as the primary or sole method of determining whether comments are generated by LLMs. LLM comments are usually glaringly obvious (section headers within the comment, imprecise puffery, and at AfD an obvious misunderstanding of notability policies and complete disregard for sources). If LLM-ness is not glaringly obvious, it is not a problem, and we should not be going after editors for their writing style or because some tool says they look like a bot. Toadspike [Talk] 10:29, 4 December 2024 (UTC)Reply
    I also think closers should generally be more aggressive in discarding arguments counter to policy and all of us should be more aggressive in telling editors bludgeoning discussions with walls of text to shut up. These also happen to be the two main symptoms of LLMs. Toadspike [Talk] 10:41, 4 December 2024 (UTC)Reply
    In other words LLMs are irrelevant - you just want current policy to be better enforced. Thryduulf (talk) 15:24, 5 December 2024 (UTC)Reply
  • Oppose Having seen some demonstrated uses of LLMs in the accessibility area, I fear a hard and fast rule here is inherantly discriminatory. Only in death does duty end (talk) 10:50, 4 December 2024 (UTC)Reply
    What if LLM-users just had to note that a given comment was LLM-generated? JoelleJay (talk) 19:01, 4 December 2024 (UTC)Reply
    What would we gain from that? If the comment is good (useful, relevant, etc) then it's good regardless of whether it was written by an LLM or a human. If the comment is bad then it's bad regardless of whether it was written by an LLM or a human. Thryduulf (talk) 20:04, 4 December 2024 (UTC)Reply
    Well, for one, if they're making an argument like the one referenced by @Selfstudier and @ScottishFinnishRadish above it would have saved a lot of editor time to know that the fake quotes from real references were generated by LLM, so that other editors could've stopped trying to track those specific passages down after the first one failed verification.
    For another, at least with editors whose English proficiency is noticeably not great the approach to explaining an issue to them can be tailored and misunderstandings might be more easily resolved as translation-related. I know when I'm communicating with people I know aren't native English-speakers I try to be more direct/less idiomatic and check for typos more diligently. JoelleJay (talk) 22:46, 4 December 2024 (UTC)Reply
    And see what ChatGPT itself had to say about that idea, at #ChaptGPT_agrees above. EEng 22:25, 4 December 2024 (UTC)Reply
  • Oppose per above. As Rhododendrites points out, detection of LLM-generated content is not foolproof and even when detection is accurate, such a practice would be unfair for non-native English speakers who rely on LLMs to polish their work. Additionally, we evaluate contributions based on their substance, not by the identity and social capital of the author, so using LLMs should not be seen as inherently inferior to wholly human writing—are ChatGPT's arguments ipso facto less than a human's? If so, why?

    DE already addresses substandard contributions, whether due to lack of competence or misuse of AI, so a separate policy targeting LLMs is unnecessary. Sincerely, Dilettante 21:14, 4 December 2024 (UTC)Reply

    [W]e evaluate contributions based on their substance, not by the identity and social capital of the author: true in theory; not reflected in practice. are ChatGPT's arguments ipso facto less than a human's? Yes. Chatbots are very advanced predicted text engines. They do not have an argument: they iteratively select text chunks based on probabilistic models.
    As mentioned above, humans are good detectors of LLM output, and don't require corroborative results from other machine learning models. Folly Mox (talk) 14:00, 5 December 2024 (UTC)Reply
    "...LLMs can produce novel arguments that convince independent judges at least on a par with human efforts. Yet when informed about an orator’s true identity, judges show a preference for human over LLM arguments." - Palmer, A., & Spirling, A. (2023). Large Language Models Can Argue in Convincing Ways About Politics, But Humans Dislike AI Authors: implications for Governance. Political Science, 75(3), 281–291. https://doi.org/10.1080/00323187.2024.2335471. And that result was based on Meta's OPT-30B model that performed at about a GPT-3 levels. There are far better performing models out there now like GPT-4o and Claude 3.5 Sonnet. Sean.hoyland (talk) 15:24, 5 December 2024 (UTC)Reply
    As mentioned above, humans are good detectors of LLM output, and don't require corroborative results from other machine learning models. Yet your reply to me made no mention of the fact that my comment is almost wholly written by an LLM, the one exception being me replacing "the Wikipedia policy Disruptive editing" with "DE". I went to ChatGPT, provided it a handful of my comments on Wikipedia and elsewhere, as well as a few comments on this discussion, asked it to mimic my style (which probably explains why the message contains my stylistic quirks turned up to 11), and repeatedly asked it to trim the post. I'd envision a ChatGPT account, with a larger context window, would allow even more convincing comments, to say nothing of the premium version. A DUCK-style test for comments singles out people unfamiliar with the differences between formal English and LLM outputs, precisely those who need it most since they can write neither. Others have raised scenarios where a non-fluent speaker may need to contribute.
    In other words, LLMs can 100% be used for constructive !votes on RfCs, AfDs, and whatnot. I fed it my comments only to prevent those familiar with my writing style didn't get suspicious. I believe every word in the comment and had considered every point it made in advance, so I see no reason for this to be worth less than if I had typed it out myself. If I'd bullet-pointed my opinion and asked it to expand, that'd have been better yet.
    They do not have an argument: they iteratively select text chunks based on probabilistic models. I'm aware. If a monkey types up Othello, is the play suddenly worth( )less? An LLM is as if the monkey were not selecting words at random, but rather choosing what to type based on contextualized tokens. I believe a text is self-contained and should be considered in its own right, but that's not something I'll sway anyone on or vice versa.
    true in theory; not reflected in practice So we should exacerbate the issue by formalizing this discrimination on the basis of authorship?
    To be clear, this is my only usage of an LLM anywhere on Wikipedia. Sincerely, Dilettante 01:22, 6 December 2024 (UTC)Reply
    In other words, LLMs can 100% be used for constructive !votes on RfCs, AfDs, and whatnot. So then what is the point in having any discussion at all if an LLM can just spit out a summary of whichever policies and prior comments it was fed and have its "opinion" counted? What happens when there are multiple LLM-generated comments in a discussion, each fed the same prompt material and prior comments -- that would not only artificially sway consensus significantly in one direction (including "no consensus"), it could produce a consensus stance that no human !voter even supported! It also means those human participants will waste time reading and responding to "users" who cannot be "convinced" of anything. Even for editors who can detect LLM content, it's still a waste of their time reading up to the point they recognize the slop. And if closers are not allowed to discount seemingly-sound arguments solely because they were generated by LLM, then they have to have a lot of faith that the discussion's participants not only noticed the LLM comments, but did thorough fact-checking of any tangible claims made in them. With human comments we can at least assume good faith that a quote is really in a particular inaccessible book.
    People who are not comfortable enough in their English fluency can just machine translate from whichever language they speak, why would they need an LLM? And obviously people who are not competent in comprehending any language should not be editing Wikipedia... JoelleJay (talk) 03:17, 6 December 2024 (UTC)Reply
    Human !voters sign off and take responsibility for the LLM opinions they publish. If they continue to generate, then the relevant human signer wouldn't be convinced of anything anyway; at least here, the LLM comments might make more sense than whatever nonsense the unpersuadable user might've generated. (And machine translation relies on LLMs, not to mention there are people who don't know any other language yet have trouble communicating. Factual writing and especially comprehension are different from interpersonal persuasion.)
    While I agree that fact-checking is a problem, I weight much lower than you in relation to the other effects a ban would cause. Aaron Liu (talk) 15:16, 6 December 2024 (UTC)Reply
    So then what is the point in having any discussion at all if an LLM can just spit out a summary of whichever policies and prior comments it was fed and have its "opinion" counted? I'm of the opinion humans tend to be better at debating, reading between the lines, handling obscure PAGs, and arriving at consensus. What happens when there are multiple LLM-generated comments in a discussion, each fed the same prompt material and prior comments -- that would not only artificially sway consensus significantly in one direction (including "no consensus"), it could produce a consensus stance that no human !voter even supported! It's safe to assume those LLMs are set to a low temperature, which would cause them to consistently agree when fed the same prompt. In that case, they'll produce the same arguments; instead of rebutting x humans' opinions, those on the opposite side need rebut one LLM. If anything, that's less time wasted. Beyond that, if only one set of arguments is being raised, a multi-paragraph !vote matters about as much as a "Support per above". LLMs are not necessary for people to be disingenuous and !vote for things they don't believe. Genuine question: what's worse, this hypothetical scenario where multiple LLM users are swaying a !vote to an opinion no-one believes or the very real and common scenario that a non-English speaker needs to edit enwiki?
    Even for editors who can detect LLM content, it's still a waste of their time reading up to the point they recognize the slop. This proposal wouldn't change for most people that because it's about closers.
    With human comments we can at least assume good faith that a quote is really in a particular inaccessible book. No-one's saying you should take an LLM's word for quotes from a book.
    People who are not comfortable enough in their English fluency can just machine translate from whichever language they speak, why would they need an LLM? It's a pity you're lobbying to ban most machine translators. Sincerely, Dilettante 17:08, 6 December 2024 (UTC)Reply
    It's safe to assume those LLMs are set to a low temperature, which would cause them to consistently agree when fed the same prompt. In that case, they'll produce the same arguments; instead of rebutting x humans' opinions, those on the opposite side need rebut one LLM. If anything, that's less time wasted. ...You do know how consensus works, right? Since closers are supposed to consider each contribution individually and without bias to "authorship" to determine the amount of support for a position, then even a shitty but shallowly policy-based position would get consensus based on numbers alone. And again, non-English speakers can use machine-translation, like they've done for the last two decades.
    This proposal wouldn't change for most people that because it's about closers. Of course it would; if we know closers will disregard the LLM comments, we won't need to waste time reading and responding to them.
    No-one's saying you should take an LLM's word for quotes from a book. Of course they are. If LLM comments must be evaluated the same as human comments, then AGF on quote fidelity applies too. Otherwise we would be expecting people to do something like "disregard an argument based on being from an LLM".
    It's a pity you're lobbying to ban most machine translators.The spirit of this proposal is clearly not intended to impact machine translation. AI-assisted != AI-generated. JoelleJay (talk) 18:42, 6 December 2024 (UTC)Reply
    I appreciate that the availability of easily generated paragraphs of text (regardless of underlying technology) in essence makes the "eternal September" effect worse. I think, though, it's already been unmanageable for years now, without any programs helping. We need a more effective way to manage decision-making discussions so participants do not feel a need to respond to all comments, and the weighing of arguments is considered more systematically to make the community consensus more apparent. isaacl (talk) 19:41, 6 December 2024 (UTC)Reply
    Since closers are supposed to consider each contribution individually and without bias to "authorship" I'm the one arguing for this to be practice, yes. then even a shitty but shallowly policy-based position would get consensus based on numbers alone That is why I state "per above" and "per User" !votes hold equal potential for misuse.
    Of course it would; if we know closers will disregard the LLM comments, we won't need to waste time reading and responding to them. We don't know closers are skilled at recognizing LLM slop. I think my !vote shows many who think they can tell cannot. Any commenter complaining about a non-DUCK post will have to write out "This is written by AI" and explain why. DUCK posts already run afowl of BLUDGEON, DE, SEALION, etc.
    If LLM comments must be evaluated the same as human comments, then AGF on quote fidelity applies too. Remind me again of what AGF stands for? Claiming LLMs have faith of any kind, good or bad, is ludicrous. From the policy, Assuming good faith (AGF) means assuming that people are not deliberately trying to hurt Wikipedia, even when their actions are harmful. A reasonable reply would be "Are these quotes generated by AI? If so, please be aware AI chatbots are prone to hallucinations and cannot be trusted to cite accurate quotes." This AGFs the poster doesn't realize the issue and places the burden of proof squarely on them.
    Example text generate verb to bring into existence. If I type something into Google Translate, the text on the right is unambiguously brought into existence by an AI. Sincerely, Dilettante 21:22, 6 December 2024 (UTC)Reply
    "Per above" !votes do not require other editors to read and/or respond to their arguments, and anyway are already typically downweighted, unlike !votes actively referencing policy.
    The whole point is to disregard comments that have been found to be AI-generated; it is not exclusively up to the closer to identify those comments in the first place. Yes we will be expecting other editors to point out less obvious examples and to ask if AI was used, what is the problem with that?
    No, DUCK posts do not necessarily already violate BLUDGEON etc., as I learned in the example from Selfstudier, and anyway we still don't discount the !votes of editors in good standing that bludgeoned/sealioned etc. so that wouldn't solve the problem at all.
    Obviously other editors will be asking suspected LLM commenters if their comments are from LLMs? But what you're arguing is that even if the commenter says yes, their !vote still can't be disregarded for that reason alone, which means the burden is still on other editors to prove that the content is false.
    We are not talking about the contextless meaning of the word "generate", we are talking about the very specific process of text generation in the context of generative AI, as the proposal lays out very explicitly. JoelleJay (talk) 02:13, 7 December 2024 (UTC)Reply
    I’m not going to waste time debating someone who resorts to claiming people on the other side are either ignorant of technology or are crude strawmans. If anyone else is interested in actually hearing my responses, feel free to ask. Sincerely, Dilettante 16:13, 7 December 2024 (UTC)Reply
    Or you could actually try to rebut my points without claiming I'm trying to ban all machine translators... JoelleJay (talk) 22:07, 7 December 2024 (UTC)Reply
    For those following along, I never claimed that. I claimed those on JoelleJay’s side are casting !votes such that most machine translators would be banned. It was quite clear at the time that they, personally, support a carve out for machine translation and I don’t cast aspersions. Sincerely, Dilettante 15:42, 8 December 2024 (UTC)Reply
  • Support a broad bar against undisclosed LLM-generated comments and even a policy that undisclosed LLM-generated comments could be sanctionable, in addition to struck through / redacted / ignored; people using them for accessibility / translation reasons could just disclose that somewhere (even on their user page would be fine, as long as they're all right with some scrutiny as to whether they're actually using it for a legitimate purpose.) The fact is that LLM comments raise significant risk of abuse, and often the fact that a comment is clearly LLM-generated is often going to be the only evidence of that abuse. I wouldn't be opposed to a more narrowly-tailored ban on using LLMs in any sort of automated way, but I feel a broader ban may be the only practical way to confront the problem. That said, I'd oppose the use of tools to detect LLM-comments, at least as the primary evidence; those tools are themselves unreliable LLM things. It should rest more on WP:DUCK issues and behavioral patterns that make it clear that someone is abusing LLMs. --Aquillion (talk) 22:08, 4 December 2024 (UTC)Reply
  • Support per reasons discussed above; something generated by an LLM is not truly the editor's opinion. On an unrelated note, have we seen any LLM-powered unapproved bots come in and do things like POV-pushing and spam page creation without human intervention? If we haven't, I think it's only a matter of time. Passengerpigeon (talk) 23:23, 4 December 2024 (UTC)Reply
  • Weak oppose in the sense that I don't think all LLM discussion text should be deleted. There are at least a few ESL users who use LLM's for assistance but try to check the results as best they can before posting, and I don't think their comments should be removed indiscriminately. What I do support (although not as a formal WP:PAG) is being much more liberal in hatting LLM comments when the prompter has failed to prevent WP:WALLOFTEXT/irrelevant/incomprehensible output than we maybe would for human-generated text of that nature. Mach61 03:05, 5 December 2024 (UTC)Reply
  • Oppose Any comments made by any editors are of their own responsibility and representing their own chosen opinions to hit the Publish Changes button on. If that comment was made by an LLM, then whatever it says is something the editor supports. I see no reason whatsoever to collapse anything claimed to be made by an LLM (whose detectors are 100% not reliable in the first place). If the comment being made is irrelevant to the discussion, then hatting it is already something covered by policy in the first place. This does make me want to start my comments with "As a large language model trained by OpenAI" though just to mess with people trying to push these sorts of policy discussions. SilverserenC 05:29, 5 December 2024 (UTC)Reply
    • Or, as ChatGPT puts it,
Why banning LLM usage in comments would be detrimental, a ChatGPT treatise

Banning the use of large language models (LLMs) to assist in writing comments on Wikipedia would be a step backward in fostering inclusivity, efficiency, and accessibility within the platform. Here are several key reasons why such a ban would be counterproductive:

1. Improving Accessibility and Participation: Wikipedia is a global platform, and its contributors come from diverse backgrounds, including those who might not be native English speakers or may struggle with writing clear, concise comments. LLMs can help these contributors express themselves more effectively by generating text that is coherent, well-structured, and grammatically correct. Banning LLMs would disproportionately impact these users, potentially discouraging them from participating in the platform.

2. Efficiency and Time-Saving: Wikipedia editors often engage in lengthy discussions about content, and composing thoughtful comments can be time-consuming. LLMs can assist by quickly drafting responses, summarizing arguments, or offering suggestions for improvement. This helps contributors focus on more substantive aspects of their contributions, such as verifying facts or improving content, rather than getting bogged down by the mechanics of writing.

3. Support for Editors with Disabilities: Many editors with cognitive or motor impairments may find it difficult to express their thoughts clearly in writing. For these individuals, LLMs offer a valuable tool to communicate their ideas effectively. A ban on using LLMs could create additional barriers for these users, reducing their ability to participate fully in Wikipedia's collaborative environment.

4. Quality and Consistency: Large language models can help maintain a consistent tone and clarity in Wikipedia discussions. By providing suggestions that align with Wikipedia's neutral point of view and formal writing standards, LLMs can improve the overall quality of comments. This consistency is especially important in discussions that involve sensitive topics, where clear and neutral language is critical for maintaining Wikipedia's credibility and adherence to its editorial guidelines.

5. Encouraging Constructive Dialogue: LLMs can assist in crafting more polite, respectful, and balanced comments, reducing the likelihood of inflammatory or aggressive language. This is particularly important on a platform like Wikipedia, where discussions can sometimes devolve into heated exchanges. Rather than banning LLMs, encouraging their use could promote healthier, more constructive conversations.

6. Transparency and Accountability: If concerns exist about misuse (e.g., generating spammy or low-quality comments), these can be addressed through guidelines and oversight mechanisms. Wikipedia already has systems in place for identifying and mitigating disruptive behavior, and LLM-generated comments could be subject to the same scrutiny. A ban is an overly simplistic solution to what could be a more nuanced issue, one that could be addressed through better regulation and community involvement.

In conclusion, banning the use of large language models on Wikipedia would deprive the platform of an important tool for improving accessibility, efficiency, and the quality of discourse. Instead of a ban, the focus should be on establishing guidelines for their responsible use, ensuring that they enhance, rather than detract from, Wikipedia's core principles of collaboration, neutrality, and inclusivity.

I'm honestly a bit impressed with the little guy. SilverserenC 05:39, 5 December 2024 (UTC)Reply
It is somewhat amusing how easy it is to get these chatbots to output apologia for these chatbots. Too bad it's always so shallow. Probably because the people who inserted those canned responses are shallow people is my opinion. Simonm223 (talk) 19:44, 6 December 2024 (UTC)Reply
  • Support those who are opposing have clearly never had to deal with trolls who endlessly WP:SEALION. If I wanted to have a discussion with a chatbot, I'd go and find one. ~~ AirshipJungleman29 (talk) 13:14, 5 December 2024 (UTC)Reply
    What's wrong with just banning and hatting the troll? Aaron Liu (talk) 13:49, 5 December 2024 (UTC)Reply
    Someone trolling and sealioning can (and should) be blocked under current policy, whether they use an LLM or not is irrelevant. Thryduulf (talk) 15:22, 5 December 2024 (UTC)Reply
  • Oppose per Rhododendrites. This is a case-by-case behavioral issue, and using LLMs != being a troll. Frostly (talk) 17:30, 5 December 2024 (UTC)Reply
  • Support: the general principle is sound - where the substance has been originally written by gen-AI, comments will tend to add nothing to the discussion and even annoy or confuse other users. In principle, we should not allow such tools to be used in discussions. Comments written originally before improvement or correction by AI, particularly translation assistants, fall into a different category. Those are fine. There also has to be a high standard for comment removal. Suspicion that gen-AI might have been used is not enough. High gptzero scores is not enough. The principle should go into policy but under a stonking great caveat - WP:AGF takes precedence and a dim view will be taken of generative-AI inquisitors. arcticocean ■ 17:37, 5 December 2024 (UTC)Reply
  • Support If a human didn't write it, humans shouldn't spend time reading it. I'll go further and say that LLMs are inherently unethical technology and, consequently, people who rely on them should be made to feel bad. ESL editors who use LLMs to make themselves sound like Brad Anderson in middle management should stop doing that because it actually gets in the way of clear communication.
    I find myself unpersuaded by arguments that existing policies and guidelines are adequate here. Sometimes, one needs a linkable statement that applies directly to the circumstances at hand. By analogy, one could argue that we don't really need WP:BLP, for example, because adhering to WP:V, WP:NPOV, and WP:NOR ought already to keep bad material out of biographies of living people. But in practice, it turned out that having a specialized policy that emphasizes the general ethos of the others while tailoring them to the problem at hand is a good thing. XOR'easter (talk) 18:27, 5 December 2024 (UTC)Reply
  • Strong support - Making a computer generate believable gibberish for you is a waste of time, and tricking someone else into reading it should be a blockable offense. If we're trying to create an encyclopedia, you cannot automate any part of the thinking. We can automate processes in general, but any attempt at automating the actual discussion or thought-processes should never be allowed. If we allow this, it would waste countless hours of community time dealing with inane discussions, sockpuppetry, and disruption.
    Imagine a world where LLMs are allowed and popular - it's a sockpuppeteer's dream scenario - you can run 10 accounts and argue the same points, and the reason why they all sound alike is just merely because they're all LLM users. You could even just spend a few dollars a month and run 20-30 accounts to automatically disrupt wikipedia discussions while you sleep, and if LLM usage was allowed, it would be very hard to stop.
    However, I don't have much faith in AI detection tools (partially because it's based on the same underlying flawed technology), and would want any assumption of LLM usage to be based on obvious evidence, not just a score on some website. Also, to those who are posting chatgpt snippets here: please stop - it's not interesting or insightful, just more slop BugGhost 🦗👻 19:15, 5 December 2024 (UTC)Reply
    I agree with your assessment “Also, to those who are posting chatgpt snippets here: please stop - it's not interesting or insightful, just more slop” but unfortunately some editors who should really know better think it’s WaCkY to fill serious discussions with unfunny, distracting “humor”. Dronebogus (talk) 21:54, 5 December 2024 (UTC)Reply
    I also concur. "I used the machine for generating endless quantities of misleading text to generate more text" is not a good joke. XOR'easter (talk) 22:46, 5 December 2024 (UTC)Reply
  • Strong support if you asked a robot to spew out some AI slop to win an argument you’re basically cheating. The only ethical reason to do so is because you can’t speak English well, and the extremely obvious answer to that is “if you can barely speak English why are you editing English Wikipedia?” That’s like a person who doesn’t understand basic physics trying to explain the second law of thermodynamics using a chatbot. Dronebogus (talk) 21:32, 5 December 2024 (UTC)Reply
    I don't think "cheating" is a relevant issue here. Cheating is a problem if you use a LLM to win and get a job, award, college acceptance etc. that you otherwise wouldn't deserve. But WP discussions aren't a debating-skills contest, they're an attempt to determine the best course of action.
    So using an AI tool in a WP discussion is not cheating (though there may be other problems), just as riding a bike instead of walking isn't cheating unless you're trying to win a race. ypn^2 22:36, 5 December 2024 (UTC)Reply
    Maybe “cheating” isn’t the right word. But I think that a) most AI generated content is garbage (it can polish the turd by making it sound professional, but it’s still a turd underneath) and b) it’s going to be abused by people trying to gain a material edge in an argument. An AI can pump out text far faster than a human and that can drown out or wear down the opposition if nothing else. Dronebogus (talk) 08:08, 6 December 2024 (UTC)Reply
    Bludgeoning is already against policy. It needs to be more strongly enforced, but it needs to be more strongly enforced uniformly rather than singling out comments that somebody suspects might have had AI-involvement. Thryduulf (talk) 10:39, 6 December 2024 (UTC)Reply
  • Support; I agree with Remsense and jlwoodwa, among others: I wouldn't make any one AI-detection site the Sole Final Arbiter of whether a comment "counts", but I agree it should be expressly legitimate to discount AI / LLM slop, at the very least to the same extent as closers are already expected to discount other insubstantial or inauthentic comments (like if a sock- or meat-puppet copy-pastes a comment written for them off-wiki, as there was at least one discussion and IIRC ArbCom case about recently). -sche (talk) 22:10, 5 December 2024 (UTC)Reply
    You don't need a new policy that does nothing but duplicate a subset of existing policy. At most what you need is to add a sentence to the existing policy that states "this includes comments written using LLMs", however you'd rightly get a lot of pushback on that because it's completely redundant and frankly goes without saying. Thryduulf (talk) 23:37, 5 December 2024 (UTC)Reply
  • Support hallucinations are real. We should be taking a harder line against LLM generated participation. I don't think everyone who is doing it knows that they need to stop. Andre🚐 23:47, 5 December 2024 (UTC)Reply
  • Comment - Here is something that I imagine we will see more often. I wonder where it fits into this discussion. A user employs perplexity's RAG based system, search+LLM, to help generate their edit request (without the verbosity bias that is common when people don't tell LLMs how much output they want). Sean.hoyland (talk) 03:13, 6 December 2024 (UTC)Reply
  • Support per all above. Discussions are supposed to include the original arguments/positions/statements/etc of editors here, not off-site chatbots. The Kip (contribs) 03:53, 6 December 2024 (UTC)Reply
    I also find it pretty funny that ChatGPT itself said it shouldn't be used, as per the premise posted above by EEng. The Kip (contribs) 03:58, 6 December 2024 (UTC)Reply
    "sycophancy is a general behavior of state-of-the-art AI assistants, likely driven in part by human preference judgments favoring sycophantic responses" - Towards Understanding Sycophancy in Language Models. They give us what we want...apparently. And just like with people, there is position bias, so the order of things can matter. Sean.hoyland (talk) 04:26, 6 December 2024 (UTC)Reply
  • (Is this where I respond? If not, please move.) LLM-generated prose should be discounted. Sometimes there will be a discernible point in there; it may even be what the editor meant, lightly brushed up with what ChatGPT thinks is appropriate style. (So I wouldn't say "banned and punishable" in discussions, although we already deprecate machine translations on en.wiki and for article prose, same difference—never worth the risk.) However, LLMs don't think. They can't explain with reference to appropriate policy and guidelines. They may invent stuff, or use the wrong words—at AN recently, an editor accused another of "defaming" and "sacrilege", thus drowning their point that they thought that editor was being too hard on their group by putting their signature to an outrageous personal attack. I consider that an instance of LLM use letting them down. If it's not obvious that it is LLM use, then the question doesn't arise, right? Nobody is arguing for requiring perfect English. That isn't what WP:CIR means. English is a global language, and presumably for that reason, many editors on en.wiki are not native speakers, and those that aren't (and those that are!) display a wide range of ability in the language. Gnomes do a lot of fixing of spelling, punctuation and grammar in articles. In practice, we don't have a high bar to entrance in terms of English ability (although I think a lot more could be done to explain to new editors whose English is obviously non-native what the rule or way of doing things is that they have violated. And some of our best writers are non-native; a point that should be emphasised because we all have a right of anonymity here, many of us use it, and it's rare, in particular, that I know an editor's race. Or even nationality (which may not be the same as where they live.) But what we do here is write in English: both articles and discussions. If someone doesn't have the confidence to write their own remark or !vote, then they shouldn't participate in discussions; I strongly suspect that it is indeed a matter of confidence, of wanting to ensure the English is impeccable. LLMs don't work that way, really. They concoct things like essays based on what others have written. Advice to use them in a context like a Wikipedia discussion is bad advice. At best it suggests you let the LLM decide which way to !vote. If you have something to say, say it and if necessary people will ask a question for clarification (or disagree with you). They won't mock your English (I hope! Civility is a basic rule here!) It happens in pretty much every discussion that somebody makes an English error. No biggie. I'll stop there before I make any more typos myself; typing laboriously on my laptop in a healthcare facility, and anyway Murphy's Law covers this. Yngvadottir (talk)
  • I dunno about this specifically but I want to chime in to say that I find LLM-generated messages super fucking rude and unhelpful and support efforts to discourage them. – Joe (talk) 08:15, 6 December 2024 (UTC)Reply
  • Comment I think obvious LLM/chatbot text should at least be tagged through an Edit filter for Recent Changes, then RC Patrollers and reviewers can have a look and decide for themselves. Am (Ring!) (Notes) 11:58, 6 December 2024 (UTC)Reply
    How do you propose that such text be identified by an edit filter? LLM detections tools have high rates of both false positives and false negatives. Thryduulf (talk) 12:47, 6 December 2024 (UTC)Reply
    It might become possible once watermarks (like DeepMind's SynthID) are shown to be robust and are adopted. Some places are likely to require it at some point e.g. EU. I guess it will take a while though and might not even happen e.g. I think OpenAI recently decided to not go ahead with their watermark system for some reason. Sean.hoyland (talk) 13:17, 6 December 2024 (UTC)Reply
    It will still be trivial to bypass the watermarks, or use LLMs that don't implement them. It also (AIUI) does nothing to reduce false positives (which for our usecase are far more damaging than false negatives). Thryduulf (talk) 13:30, 6 December 2024 (UTC)Reply
    Maybe, that seems to be the case with some of the proposals. Others, like SynthID claim high detection rates, maybe because even a small amount of text contains a lot of signals. As for systems that don't implement them, I guess that would be an opportunity to make a rule more nuanced by only allowing use of watermarked output with verbosity limits...not that I support a rule in the first place. People are going to use/collaborate with LLMs. Why wouldn't they? Sean.hoyland (talk) 14:38, 6 December 2024 (UTC)Reply
    I don't think watermarks are a suitable thing to take into account. My view is that LLM usage should be a blockable offense on any namespace, but if it ends up being allowed under some circumstances then we at least need mandatory manual disclosures for any usage. Watermarks won't work / aren't obvious enough - we need something like {{LLM}} but self-imposed, and not tolerate unmarked usage. BugGhost 🦗👻 18:21, 6 December 2024 (UTC)Reply
    They will have to work at some point (e.g. [4][5]). Sean.hoyland (talk) 06:27, 7 December 2024 (UTC)Reply
    Good news! Queen of Hearts is already working on that in 1325. jlwoodwa (talk) 16:12, 6 December 2024 (UTC)Reply
    See also WP:WikiProject AI Cleanup. Aaron Liu (talk) 17:32, 6 December 2024 (UTC)Reply
  • Comment As a practical matter, users posting obvious LLM-generated content will typically be in violation of other rules (e.g. disruptive editing, sealioning), in which case their discussion comments absolutely should be ignored, discouraged, discounted, or (in severe cases) hatted. But a smaller group of users (e.g. people using LLMs as a translation tool) may be contributing productively, and we should seek to engage with, rather than discourage, them. So I don't see the need for a separate bright-line policy that risks erasing the need for discernment — in most cases, a friendly reply to the user's first LLM-like post (perhaps mentioning WP:LLM, which isn't a policy or guideline, but is nevertheless good advice) will be the right approach to work out what's really going on. Preimage (talk) 15:53, 6 December 2024 (UTC)Reply
    Yeah, this is why I disagree with the BLP analogy above. There's no great risk/emergency to ban the discernment. Aaron Liu (talk) 17:34, 6 December 2024 (UTC)Reply
    Those pesky sealion Chatbots are just the worst! Martinevans123 (talk) 18:41, 6 December 2024 (UTC)Reply
    Some translation tools have LLM assistance, but the whole point of generative models is to create text far beyond what is found in the user's input, and the latter is clearly what this proposal covers. JoelleJay (talk) 19:01, 6 December 2024 (UTC)Reply
    That might be what the proposal intends to cover, but it is not what the proposal actually covers. The proposal all comments that have been generated by LLMs and/or AI, without qualification. Thryduulf (talk) 01:05, 7 December 2024 (UTC)Reply
    70+% here understand the intention matches the language: generated by LLMs etc means "originated through generative AI tools rather than human thought", not "some kind of AI was involved in any step of the process". Even LLM translation tools don't actually create meaningful content where there wasn't any before; the generative AI aspect is only in the use of their vast training data to characterize the semantic context of your input in the form of mathematical relationships between tokens in an embedding space, and then match it with the collection of tokens most closely resembling it in the other language. There is, definitionally, a high level of creative constraint in what the translation output is since semantic preservation is required, something that is not true for text generation. JoelleJay (talk) 04:01, 7 December 2024 (UTC)Reply
    Do you have any evidence for you assertion that 70% of respondents have interpreted the language in the same way as you? Reading the comments associated with the votes suggests that it's closer to 70% of respondents who don't agree with you. Even if you are correct, 30% of people reading a policy indicates the policy is badly worded. Thryduulf (talk) 08:34, 7 December 2024 (UTC)Reply
    I think @Bugghost has summarized the respondent positions sufficiently below. I also think some portion of the opposers understand the proposal perfectly well and are just opposing anything that imposes participation standards. JoelleJay (talk) 22:54, 7 December 2024 (UTC)Reply
    There will be many cases where it is not possible to say whether a piece of text does or does not contain "human thought" by observing the text, even if you know it was generated by an LLM. Statements like "originated through generative AI tools rather than human thought" will miss a large class of use cases, a class that will probably grow over the coming years. People work with LLMs to produce the output they require. It is often an iterative process by necessity because people and models make mistakes. An example of when "...rather than human thought" is not the case is when someone works with an LLM to solve something like a challenging technical problem where neither the person or the model has a satisfactory solution to hand. The context window means that, just like with human collaborators, a user can iterate towards a solution through dialog and testing, exploring the right part of the solution space. Human thought is not absent in these cases, it is present in the output, the result of a collaborative process. In these cases, something "far beyond what is found in the user's input" is the objective, it seems like a legitimate objective, but regardless, it will happen, and we won't be able to see it happening. Sean.hoyland (talk) 10:46, 7 December 2024 (UTC)Reply
    Yes, but this proposal is supposed to apply to just the obvious cases and will hopefully discourage good-faith users from using LLMs to create comments wholesale in general. It can be updated as technology progresses. There's also no reason editors using LLMs to organize/validate their arguments, or as search engines for whatever, have to copy-paste their raw output, which is much more of a problem since it carries a much higher chance of hallucination. That some people who are especially familiar with how to optimize LLM use, or who pay for advanced LLM access, will be able to deceive other editors is not a reason to not formally proscribe wholesale comment generation. JoelleJay (talk) 22:27, 7 December 2024 (UTC)Reply
    That's reasonable. I can get behind the idea of handling obvious cases from a noise reduction perspective. But for me, the issue is noise swamping signal in discussions rather than how it was generated. I'm not sure we need a special rule for LLMs, maybe just a better way to implement the existing rules. Sean.hoyland (talk) 04:14, 8 December 2024 (UTC)Reply
  • Support "I Am Not A ChatBot; I Am A Free Wikipedia Editor!" Martinevans123 (talk) 18:30, 6 December 2024 (UTC)Reply
  • Comment: The original question was whether we should discount, ignore, strikethrough, or collapse chatbot-written content. I think there's a very big difference between these options, but most support !voters haven't mentioned which one(s) they support. That might make judging the consensus nearly impossible; as of now, supporters are the clear !majority, but supporters of what? — ypn^2 19:32, 6 December 2024 (UTC)Reply
    That means that supporters support the proposal that LLM-generated remarks in discussions should be discounted or ignored, and possibly removed in some manner. Not sure what the problem is here. Supporters support the things listed in the proposal - we don't need a prescribed 100% strict procedure, it just says that supporters would be happy with closers discounting, ignoring or under some circumstances deleting LLM content in discussions. BugGhost 🦗👻 19:40, 6 December 2024 (UTC)Reply
    Doing something? At least the stage could be set for a follow on discussion. Selfstudier (talk) 19:40, 6 December 2024 (UTC)Reply
    More people have bolded "support" than other options, but very few of them have even attempted to refute the arguments against (and most that have attempted have done little more than handwaving or directly contradicting themselves), and multiple of those who have bolded "support" do not actually support what has been proposed when you read their comment. It's clear to me there is not going to be a consensus for anything other than "many editors dislike the idea of LLMs" from this discussion. Thryduulf (talk) 00:58, 7 December 2024 (UTC)Reply
    Arguing one point doesn't necessarily require having to refute every point the other side makes. I can concede that "some people use LLMs to improve their spelling and grammar" without changing my view overriding view that LLMs empower bad actors, time wasters and those with competence issues, with very little to offer wikipedia in exchange. Those that use LLMs legitimately to tidy up their alledgedly competent, insightful and self-sourced thoughts should just be encouraged to post the prompts themselves instead of churning it through an LLM first. BugGhost 🦗👻 09:00, 7 December 2024 (UTC)Reply
    If you want to completely ignore all the other arguments in opposition that's your choice, but don't expect closers to attach much weight to your opinions. Thryduulf (talk) 09:05, 7 December 2024 (UTC)Reply
    Ok, here's a list of the main opposition reasonings, with individual responses.
    What about translations? - Translations are not up for debate here, the topic here is very clearly generative AI, and attempts to say that this topic covers translations as well is incorrect. No support voters have said the propositions should discount translated text, just oppose voters who are trying to muddy the waters.
    What about accessibility? - This is could be a legitimate argument, but I haven't seen this substantiated anywhere other than handwaving "AI could help people!" arguments, which I would lump into the spelling and grammar argument I responded to above.
    Detection tools are inaccurate - This I very much agree with, and noted in my support and in many others as well. But there is no clause in the actual proposal wording that mandates the use of automated AI detection, and I assume the closer would note that.
    False positives - Any rule can have a potential for false positives, from wp:DUCK to close paraphrasing to NPA. We've just got to as a community become skilled at identifying genuine cases, just like we do for every other rule.
    LLM content should be taken at face value and see if it violates some other policy - hopelessly naive stance, and a massive timesink. Anyone who has had the misfortune of going on X/twitter in the last couple of years should know that AI is not just used as an aid for those who have trouble typing, it is mainly used to spam and disrupt discussion to fake opinions to astroturf political opinions. Anyone who knows how bad the sockpuppetry issue is around CTOPs should be absolutely terrified of when (not if) someone decides to launch a full throated wave of AI bots on Wikipedia discussions, because if we have to invididually sanction each one like a human then admins will literally have no time for anything else.
    I genuinely cannot comprehend how some people could see how AI is decimating the internet through spam, bots and disinformation and still think for even one second that we should open the door to it. BugGhost 🦗👻 10:08, 7 December 2024 (UTC)Reply
    There is no door. This is true for sockpuppetry too in my opinion. There can be a rule that claims there is a door, but it is more like a bead curtain. Sean.hoyland (talk) 11:00, 7 December 2024 (UTC)Reply
    The Twitter stuff is not a good comparison here. Spam is already nukable on sight, mass disruptive bot edits are also nukable on sight, and it's unclear how static comments on Wikipedia would be the best venue to astroturf political opinions (most of which would be off-topic anyway, i.e., nukable on sight). I'd prefer if people didn't use ChatGPT to formulate their points, but if they're trying to formulate a real point then that isn't disruptive in the same way spam is. Gnomingstuff (talk) 02:22, 10 December 2024 (UTC)Reply
    it's unclear how static comments on Wikipedia would be the best venue to astroturf political opinions - by disrupting RFCs and talk page discussions a bad actor could definitely use chatgpt to astroturf. A large proportion of the world uses Wikipedia (directly or indirectly) to get information - it would be incredibly valuable thing to manipulate. My other point is that AI disruption bots (like the ones on twitter) would be indistinguishable from individuals using LLMs to "fix" spelling and grammar - by allowing one we make the other incredibly difficult to identify. How can you tell the difference between a bot and someone who just uses chatgpt for every comment? BugGhost 🦗👻 09:16, 10 December 2024 (UTC)Reply
    You can't. That's the point. This is kind of the whole idea of WP:AGF. Gnomingstuff (talk) 20:22, 13 December 2024 (UTC)Reply

    Those that use LLMs legitimately to tidy up their alledgedly competent, insightful and self-sourced thoughts should just be encouraged to post the prompts themselves instead of churning it through an LLM first.

    Social anxiety: Say "I" am a person unconfident in my writing. I imagine that when I post my raw language, I embarrass myself, and my credibility vanishes, while in the worst case nobody understands what I mean. As bad confidence is often built up through negative feedback, it's usually meritful or was meritful at some point for someone to seek outside help. Aaron Liu (talk) 23:46, 8 December 2024 (UTC)Reply
    While I sympathise with that hypothetical, Wikipedia isn't therapy and we shouldn't make decisions that do long-term harm to the project just because a hypothetical user feels emotionally dependent on a high tech spellchecker. I also think that in general wikipedia (myself included) is pretty relaxed about spelling and grammar in talk/WP space. BugGhost 🦗👻 18:45, 10 December 2024 (UTC)Reply
    We also shouldn't do long term harm to the project just because a few users are wedded to idea that LLMs are and will always be some sort of existential threat. The false positives that are an unavoidable feature of this proposal will do far more, and far longer, harm to the project than LLM-comments that are all either useful, harmless or collapseable/removable/ignorable at present. Thryduulf (talk) 19:06, 10 December 2024 (UTC)Reply
    The false positives that are an unavoidable feature of this proposal will do far more, and far longer, harm to the project - the same could be said for WP:DUCK. The reason why its not a big problem for DUCK is because the confidence level is very high. Like I've said in multiple other comments, I don't think "AI detectors" should be trusted, and that the bar for deciding whether something was created via LLM should be very high. I 100% understand your opinion and the reasoning behind it, I just think we have differing views on how well the community at large can identify AI comments. BugGhost 🦗👻 09:07, 11 December 2024 (UTC)Reply
    I don't see how allowing shy yet avid users to contribute has done or will do long-term harm. The potential always outweighs rational evaluation of outcomes for those with anxiety, a condition that is not behaviorally disruptive. Aaron Liu (talk) 02:47, 11 December 2024 (UTC)Reply
    I definitely don't want to disallow shy yet avid users! I just don't think having a "using chatgpt to generate comments is allowed" rule is the right solution to that problem, considering the wider consequences. BugGhost 🦗👻 08:52, 11 December 2024 (UTC)Reply
    Did you mean "... disallowed"? If so, I think we weigh-differently accessibility vs the quite low amount of AI trolling. Aaron Liu (talk) 14:10, 11 December 2024 (UTC)Reply
  • Support strikethroughing or collapsing per everyone else. The opposes that mention ESL have my sympathy, but I am not sure how many of them are ESL themselves. Having learnt English as my second language, I have always found it easier to communicate when users are expressing things in their own way, not polished by some AI. I sympathise with the concerns and believe the right solution is to lower our community standards with respect to WP:CIR and similar (in terms of ESL communication) without risking hallucinations by AI. Soni (talk) 02:52, 7 December 2024 (UTC)Reply
  • Oppose the use of AI detection tools. False positive rates for AI-detection are dramatically higher for non-native English speakers. AI detection tools had a 5.1% false positive rate for human-written text from native English speakers, but human-written text from non-native English speakers had a 61.3% false positive rate. ~ F4U (talkthey/it) 17:53, 8 December 2024 (UTC)Reply
  • Oppose - I'm sympathetic to concerns of abuse through automated mass-commenting, but this policy looks too black-and-white. Contributors may use LLMs for many reasons, including to fix the grammar, to convey their thoughts more clearly, or to adjust the tone for a more constructive discussion. As it stands, this policy may lead to dismissing good-faith AI-assisted comments, as well as false positives, without considering the context. Moreover, while mainstream chatbots are not designed to just mimic the human writing style, there are existing tools that can make AI-generated text more human-like, so this policy does not offer that much protection against maliciously automated contributions. Alenoach (talk) 01:12, 9 December 2024 (UTC)Reply
  • Oppose – Others have cast doubt on the efficacy of tools capable of diagnosing LLM output, and I can't vouch for its being otherwise. If EEng's example of ChatBot output is representative—a lengthy assertion of notability without citing sources—that is something that could well be disregarded whether it came from a bot or not. If used carefully, AI can be useful as an aide-memoire (such as with a spell- or grammar-checker) or as a supplier of more felicitous expression than the editor is naturally capable of (e.g. Google Translate). Dhtwiki (talk) 10:27, 9 December 2024 (UTC)Reply
  • Comment / Oppose as written. It's not accurate that GPTZero is good at detecting AI-generated content. Citations (slightly out of date but there's little reason to think things have changed from 2023): https://www.aiweirdness.com/writing-like-a-robot/ , https://www.aiweirdness.com/dont-use-ai-detectors-for-anything-important/ . For those too busy to read, a few choice quotes: "the fact that it insisted even one [real book] excerpt is not by a human means that it's useless for detecting AI-generated text," and "Not only do AI detectors falsely flag human-written text as AI-written, the way in which they do it is biased" (citing https://arxiv.org/abs/2304.02819 ). Disruptive, worthless content can already be hatted, and I'm not opposed to doing so. Editors should be sharply told to use their own words, and if not already written, an essay saying we'd rather have authentic if grammatically imperfect comments than AI-modulated ones would be helpful to cite at editors who offer up AI slop. But someone merely citing GPTZero is not convincing. GPTZero will almost surely misidentify genuine commentary as AI-generated. So fine with any sort of reminder that worthless content can be hatted, and fine with a reminder not to use ChatGPT for creating Wikipedia talk page posts, but not fine with any recommendations of LLM-detectors. SnowFire (talk) 20:00, 9 December 2024 (UTC)Reply
    @SnowFire, I can't tell if you also oppose the actual proposal, which is to permit hatting/striking obvious LLM-generated comments (using GPTzero is a very minor detail in JSS's background paragraph, not part of the proposal). JoelleJay (talk) 01:47, 11 December 2024 (UTC)Reply
    I support the proposal in so far as disruptive comments can already be hatted and that LLM-generated content is disruptive. I am strongly opposed to giving well-meaning but misguided editors a license to throw everyone's text into an AI-detector and hat the comments that score poorly. I don't think it was that minor a detail, and to the extent that detail is brought up, it should be as a reminder to use human judgment and forbid using alleged "AI detectors" instead. SnowFire (talk) 03:49, 11 December 2024 (UTC)Reply
  • Support collapsing AI (specifically, Large language model) comments by behavioral analysis (most actually disruptive cases I've seen are pretty obvious) and not the use of inaccurate tools like ZeroGPT. I thinking hatting with the title "Editors suspect that this comment has been written by a Large language model" is appropriate. They take up SO much space in a discussion because they are also unnecessarily verbose, and talk on and on but never ever say something that even approaches having substance. Discussions are for human Wikipedia editors, we shouldn't have to use to sift through comments someone put 0 effort into and outsourced to a robot that writes using random numbers (that's a major part of how tools like ChatGPT work and maintain variety). If someone needs to use an AI chatbot to communicate because they don't understand English, then they are welcome to contribute to their native language Wikipedia, but I don't think they have the right to insist that we at enwiki spend our effort reading comments they but minimal effort into besides opening the ChatGPT website. If really needed, they can write in their native language and use a non-LLM tool like Google Translate. The use of non-LLM tools like Grammarly, Google Translate, etc. I think should still be OK for all editors, as they only work off comments that editors have written themselves. MolecularPilot 🧪️✈️ 05:10, 10 December 2024 (UTC)Reply
    Adding that enforcing people writing things in their own words will actually help EAL (English additional language) editors contribute here. I world with EAL people irl, and even people who have almost native proficiency with human-written content find AI output confusing because it says things in the most confusing, verbose ways using difficult sentence constructions and words. I've seen opposers in this discussion who maybe haven't had experience working with EAL people go "what about EAL people?", but really, I think this change will help them (open to being corrected by someone who is EAL, tho). MolecularPilot 🧪️✈️ 05:17, 10 December 2024 (UTC)Reply
    Also, with regards to oppose comments that discussions are not a vote so closes will ignore AI statements which don't have merit - unedited LLM statements are incredibly verbose and annoying, and clog up the discussion. Imagine multiple paragraphs, each with a heading, but all of which say almost nothing, they're borderline WP:BLUGEONy. Giving the power to HAT them will help genuine discussion contributors keep with the flow of human arguments and avoid scaring away potential discussion contributors who are intimidated or don't feel they have the time to read the piles of AI nonsense that fill the discussion. MolecularPilot 🧪️✈️ 06:38, 10 December 2024 (UTC)Reply
  • Support (removing) in general. How is this even a question? There is no case-by-case. It is a fundamental misunderstanding of how LLMs work to consider their output reliable without careful review. And which point, the editor could have written it themselves without inherent LLM bias. The point of any discussion is to provide analytical response based on the context, not have some tool regurgitate something from a training set that sounds good. And frankly, it is disrespectuful to make someone read "AI" responses. It is a tool and there is a place and time for it, but not in discussions in an encyclopedia. —  HELLKNOWZ  TALK 15:41, 10 December 2024 (UTC)Reply
  • Strong Support. I'm very interested in what you (the generic you) have to say about something. I'm not remotely interested in what a computer has to say about something. It provides no value to the discussion and is a waste of time. Useight (talk) 18:06, 10 December 2024 (UTC)Reply
    Comments that provide no value to the discussion can already be hatted and ignored regardless of why they provide no value, without any of the false positive or false negatives inherent in this proposal. Thryduulf (talk) 18:25, 10 December 2024 (UTC)Reply
    Indeed, and that's fine for one-offs when a discussion goes off the rails or what-have-you. But we also have WP:NOTHERE for disruptive behavior, not working collaboratively, etc. I'm suggesting that using an AI to write indicates that you're not here to build the encyclopedia, you're here to have an AI build the encyclopedia. I reiterate my strong support for AI-written content to be removed, struck, collapsed, or hatted and would support further measures even beyond those. Useight (talk) 21:54, 11 December 2024 (UTC)Reply
    There are two sets of people described in your comment: those who use AI and those who are NOTHERE. The two sets overlap, but nowhere near sufficiently to declare that everybody in the former set are also in the latter set. If someone is NOTHERE they already can and should be blocked, regardless of how they evidence that. Being suspected of using AI (note that the proposal does not require proof) is not sufficient justification on its own to declare someone NOTHERE, per the many examples of constructive use of AI already noted in this thread. Thryduulf (talk) 22:03, 11 December 2024 (UTC)Reply
    To reiterate, I don't believe that any use of AI here is constructive, thus using it is evidence of WP:NOTHERE, and, therefore, the set of people using AI to write is completely circumscribed within the set of people who are NOTHERE. Please note that I am referring to users who use AI-generated writing, not users suspected of using AI-generated writing. I won't be delving into how one determines whether someone is using AI or how accurate it is, as that is, to me, a separate discussion. This is the end of my opinion on the matter. Useight (talk) 23:26, 11 December 2024 (UTC)Reply
    You are entitled to your opinion of course, but as it is contradicted by the evidence of both multiple constructive uses and of the near-impossibility of reliably detecting LLM-generated text without false positives, I would expect the closer of this discussion to attach almost no weight to it. Thryduulf (talk) 00:42, 12 December 2024 (UTC)Reply
    I am ESL and use LLMs sometimes because of that. I feel like I don't fit into the NOTHERE category. It seems like you do not understand what they are or how they can be used constructively. PackMecEng (talk) 01:43, 12 December 2024 (UTC)Reply
    No, I understand. What you're talking about is no different from using Google Translate or asking a native-speaker to translate it. You, a human, came up with something you wanted to convey. You wrote that content in Language A. But you wanted to convey that message that you - a human - wrote, but now in Language B. So you had your human-written content translated to Language B. I have no qualms with this. It's your human-written content, expressed in Language B. My concern is with step 1 (coming up with something you want to convey), not step 2 (translating that content to another language). You write a paragraph for an article but it's in another language and you need the paragraph that you wrote translated? Fine by me. You ask an AI to write a paragraph for an article? Not fine by me. Again, I'm saying that there is no valid use case for AI-written content. Useight (talk) 15:59, 12 December 2024 (UTC)Reply
    It seems very likely that there will be valid use cases for AI-written content if the objective is maximizing quality and minimizing errors. Research like this demonstrate that there will likely be cases where machines outperform humans in specific Wikipedia domains, and soon. But I think that is an entirely different question than potential misuse of LLMs in consensus related discussions. Sean.hoyland (talk) 16:25, 12 December 2024 (UTC)Reply
    But your vote and the proposed above makes not distinction there. Which is the main issue. Also not to be pedantic but every prompted to a LLM is filled out by a human looking to convey a message. Every time someone hits publish on something here it is that person confirming that is what they are saying. So how do we in practice implement what you suggest? Because without a method better than vibes it's worthless. PackMecEng (talk) 18:53, 12 December 2024 (UTC)Reply
    The proposal specifies content generated by LLMs, which has a specific meaning in the context of generative AI. If a prompt itself conveys a meaningful, supported opinion, why not just post that instead? The problem comes when the LLM adds more information than was provided, which is the whole point of generative models. JoelleJay (talk) 01:52, 13 December 2024 (UTC)Reply
  • Yes in principle. But in practice, LLM detectors are not foolproof, and there are valid reasons to sometimes use an LLM, for example to copyedit. I have used Grammarly before and have even used the Microsoft Editor, and while they aren't powered by LLMs, LLMs are a tool that need to be used appropriately on Wikipedia. Awesome Aasim 19:55, 10 December 2024 (UTC)Reply
  • Support. Using LLM to reply to editors is lazy and disrespectful of fellow editor's time and brainpower. In the context of AFD, it is particularly egregious since an LLM can't really read the article, read sources, or follow our notability guidelines.
    By the way. gptzero and other such tools are very good at detecting this. I don't think this is correct at all. I believe the false positive for AI detectors is quite high. High enough that I would recommend not using AI detectors. –Novem Linguae (talk) 03:23, 11 December 2024 (UTC)Reply
  • Question @Just Step Sideways: Since there appears to be a clear consensus against the AI-detectors part, would you like to strike that from the background? Aaron Liu (talk) 14:10, 11 December 2024 (UTC)Reply
  • Support. AI generated text should be removed outright. If you aren't willing to put the work into doing your own writing then you definitely haven't actually thought deeply about the matter at hand. User1042💬✒️ 14:16, 11 December 2024 (UTC)Reply
    This comment is rather ironic given that it's very clear you haven't thought deeply about the matter at hand, because if you had then you'd realise that it's actually a whole lot more complicated than that. Thryduulf (talk) 14:26, 11 December 2024 (UTC)Reply
    Thryduulf I don't think this reply is particular helpful, and it comes off as slightly combative. It's also by my count your 24th comment on this RFC. BugGhost 🦗👻 19:20, 11 December 2024 (UTC)Reply
    I recognize that AI paraphrased or edited is not problematic in the same ways as text generated outright by an AI. I only meant to address the core issue at steak, content whose first draft was written by an AI system. User1042💬✒️ 22:16, 17 December 2024 (UTC)Reply
  • Oppose @Just Step Sideways: The nomination's 2nd para run through https://www.zerogpt.com/ gives "11.39% AI GPT*":

    I've recently come across several users in AFD discussions that are using LLMs to generate their remarks there. As many of you are aware, gptzero and other such tools are very good at detecting this. I don't feel like any of us signed up for participating in discussions where some of the users are not using their own words but rather letting technology do it for them. Discussions are supposed to be between human editors. If you can't make a coherent argument on your own, you are not competent to be participating in the discussion. I would therefore propose that LLM-generated remarks in discussions should be discounted or ignored, and possibly removed in some manner

    The nomination's linked https://gptzero.me/ site previously advertised https://undetectable.ai/ , wherewith how will we deal? Imagine the nomination was at AFD. What should be the response to LLM accusations against the highlighted sentence? 172.97.141.219 (talk) 17:41, 11 December 2024 (UTC)Reply
  • Support with the caveat that our ability to deal with the issue goes only as far as we can accurately identify the issue (this appears to have been an issue raised across a number of the previous comments, both support and oppose, but I think it bears restating because we're approaching this from a number of different angles and its IMO the most important point regardless of what conclusions you draw from it). Horse Eye's Back (talk) 19:24, 11 December 2024 (UTC)Reply
  • Strong support, limited implementation. Wikipedia is written by volunteer editors, says our front page. This is who we are, and our writing is what Wikipedia is. It's true that LLM-created text can be difficult to identify, so this may be a bit of a moving target, and we should be conservative in what we remove—but I'm sure at this point we've all run across cases (whether here or elsewhere in our digital lives) where someone copy/pastes some text that includes "Is there anything else I can help you with?" at the end, or other blatant tells. This content should be deleted without hesitation. Retswerb (talk) 04:11, 12 December 2024 (UTC)Reply
  • Support in concept, questions over implementation — I concur with Dronebogus that users who rely on LLMs should not edit English Wikipedia. It is not a significant barrier for users to use other means of communication, including online translators, rather than artificial intelligence. How can an artificial intelligence tool argue properly? However, I question how this will work in practice without an unacceptable degree of error. elijahpepe@wikipedia (he/him) 22:39, 12 December 2024 (UTC)Reply
    Many, possibly most, online translators use artificial intelligence based on LLMs these days. Thryduulf (talk) 22:46, 12 December 2024 (UTC)Reply
    There is a difference between translating words you wrote in one language into English and using an LLM to write a comment for you. elijahpepe@wikipedia (he/him) 22:59, 12 December 2024 (UTC)Reply
    Neither your comment nor the original proposal make any such distinction. Thryduulf (talk) 23:34, 12 December 2024 (UTC)Reply
    Well since people keep bringing this up as a semi-strawman: no I don’t support banning machine translation, not that I encourage using it (once again, if you aren’t competent in English please don’t edit here) Dronebogus (talk) 07:34, 13 December 2024 (UTC)Reply
    LLMs are incredible at translating, and many online translators already incorporate them, including Google Translate. Accomodating LLMs is an easy way to support the avid not only the ESL but also the avid but shy. It has way more benefits than the unseen-to-me amount of AI trolling that isn't already collapse-on-sight. Aaron Liu (talk) 00:05, 13 December 2024 (UTC)Reply
    Google Translate uses the same transformer architecture that LLMs are built around, and uses e.g. PaLM to develop more language support (through training that enables zero-shot capabilities) and for larger-scale specialized translation tasks performed through the Google Cloud "adaptive translation" API, but it does not incorporate LLMs into translating your everyday text input, which still relies on NMTs. And even for the API features, the core constraint of matching input rather than generating content is still retained (obviously it would be very bad for a translation tool to insert material not found in the original text!). LLMs might be good for translation because they are better at evaluating semantic meaning and detecting context and nuance, but again, the generative part that is key to this proposal is not present. JoelleJay (talk) 01:20, 13 December 2024 (UTC)Reply
    PaLM (Pathways Language Model) is a 540 billion-parameter transformer-based large language model (LLM) developed by Google AI.[1] If you meant something about how reschlmunking the outputs of an LLM or using quite similar architecture is not really incorporating the LLM, I believe we would be approaching Ship of Theseus levels of recombination, to which my answer is it is the same ship.

    obviously it would be very bad for a translation tool to insert material not found in the original text!

    That happens! Aaron Liu (talk) 01:29, 13 December 2024 (UTC)Reply
    PaLM2 is not used in the consumer app (Google Translate), it's used for research. Google Translate just uses non-generative NMTs to map input to its closes cognate in the target language. JoelleJay (talk) 01:34, 13 December 2024 (UTC)Reply
    Well, is the NMT really that different enough to not be classified as an LLM? IIRC the definition of an LLM is something that outputs by predicting one-by-one what the next word/"token" should be, and an LLM I asked agreed that NMTs satisfy the definition of a generative LLM, though I think you're the expert here. Aaron Liu (talk) 02:01, 13 December 2024 (UTC)Reply
    Google Translate's NMT hits different enough to speak English much less naturally than ChatGPT 4o. I don't consider it a LLM, because the param count is 380M not 1.8T.
    the definition of an LLM is something that outputs by predicting one-by-one what the next word/"token" should be No, that def would fit ancient RNN tech too. 172.97.141.219 (talk) 17:50, 13 December 2024 (UTC)Reply
    Even if you don’t consider it L, I do, and many sources cited by the article do. Since we’ll have such contesting during enforcement, it’s better to find a way that precludes such controversy. Aaron Liu (talk) 20:44, 13 December 2024 (UTC)Reply
    NMTs, LLMs, and the text-creation functionality of LLMs are fundamentally different in the context of this discussion, which is about content generated through generative AI. NMTs specifically for translation: they are trained on parallel corpora and their output is optimized to match the input as precisely as possible, not to create novel text. LLMs have different training, including way more massive corpora, and were designed specifically to create novel text. One of the applications of LLMs may be translation (though currently it's too computationally intensive to run them for standard consumer purposes), by virtue of their being very good at determining semantic meaning, but even if/when they do become mainstream translation tools what they'll be used for is still not generative when it comes to translation output. JoelleJay (talk) 22:29, 13 December 2024 (UTC)Reply
    How will you differentiate between the use of LLM for copyediting and the use of LLM for generation? Aaron Liu (talk) 23:30, 13 December 2024 (UTC)Reply
    The proposal is for hatting obvious cases of LLM-generated comments. Someone who just uses an LLM to copyedit will still have written the content themselves and presumably their output would not have the obvious tells of generative AI. JoelleJay (talk) 23:56, 13 December 2024 (UTC)Reply
    Not when I tried to use it. Quantitatively, GPTZero went from 15% human to 100% AI for me despite the copyedits only changing 14 words. Aaron Liu (talk) 00:33, 14 December 2024 (UTC)Reply
    I think there is consensus that GPTZero is not usable, even for obvious cases. JoelleJay (talk) 00:55, 14 December 2024 (UTC)Reply
    Yes, but being as far as 100% means people will also probably think the rewrite ChatGPT-generated. Aaron Liu (talk) 01:18, 14 December 2024 (UTC)Reply
    Does it really mean that? All you've demonstrated is that GPTZero has false positives, which is exactly why its use here was discouraged. jlwoodwa (talk) 05:26, 14 December 2024 (UTC)Reply
    My subjective evaluation of what I got copyediting from ChatGPT was that it sounded like ChatGPT. I used GPTZero to get a number. Aaron Liu (talk) 14:18, 14 December 2024 (UTC)Reply
    My guess is that the copyediting went beyond what most people would actually call "copyediting". JoelleJay (talk) 18:04, 23 December 2024 (UTC)Reply
    It changed only 14 words across two paragraphs and still retained the same meaning in a way that I would describe it as copyediting. Such levels of change are what those lacking confidence in tone would probably seek anyways. Aaron Liu (talk) 00:15, 24 December 2024 (UTC)Reply
  • On one hand, AI slop is a plague on humanity and obvious LLM output should definitely be disregarded when evaluating consensus. On the other hand, I feel like existing policy covers this just fine, and any experienced closer will lend greater weight to actual policy-based arguments, and discount anything that is just parroting jargon. WindTempos they (talkcontribs) 23:21, 12 December 2024 (UTC)Reply
  • Support in principle, but we cannot rely on any specific tools because none are accurate enough for our needs. Whenever I see a blatant ChatGPT-generated !vote, I ignore it. They're invariably poorly reasoned and based on surface-level concepts rather than anything specific to the issue being discussed. If someone is using AI to create their arguments for them, it means they have no actual argument besides WP:ILIKEIT and are looking for arguments that support their desired result rather than coming up with a result based on the merits. Also, toasters do not get to have an opinion. The WordsmithTalk to me 05:17, 13 December 2024 (UTC)Reply
  • Oppose. For creating unnecessary drama. First of, the "detector" of the AI bot is not reliable, or at least the reliability of the tool itself is still questionable. If the tool to detect LLM itself is unreliable, how can one reliably point out which one is LLM and which one is not? We got multiple tools that claimed to be able to detect LLM as well. Which one should we trust? Should we be elevating one tool over the others? Have there been any research that showed that the "picked" tool is the most reliable? Second, not all LLMs are dangerous. We shouldn't treat LLM as a virus that will somehow take over the Internet or something. Some editors use LLM to smooth out their grammar and sentences and fix up errors, and there is nothing wrong with that. I understand that banning obvious LLM text per WP:DUCK are good, but totally banning them is plain wrong. ✠ SunDawn ✠ (contact) 22:56, 15 December 2024 (UTC)Reply
    @SunDawn, the proposal is to permit editors to collapse/strike obvious LLM text, not to "ban LLM totally". If LLM use is imperceptible, like for tweaking grammar, it's not going to be affected. JoelleJay (talk) 20:17, 19 December 2024 (UTC)Reply
  • Support with some kind of caveat about not relying on faulty tools or presuming that something is LLM without evidence or admission, based on the following reasons:
    1. We have stricter rules around semi-automated editing (rollback, AutoWikiBrowser, etc.) and even stricter rules around fully automated bot editing. These cleanup edits are widely accepted as positive, but there is still the concern about an overwhelming amount of bad edits to wade through and/or fix. A form of that concern is relevant here. Someone could reply to every post in this discussion in just a minute or so without ever reading anything. That's inherently disruptive.
    2. Nobody who is voting "oppose" is using an LLM to cast that vote. The LLM comments have been left by those supporting to make a point about how problematic they are for discussions like this. I think this reflects, even among oppose voters, a developing community consensus that LLM comments will be disregarded.
    3. If the rule in practice is to disregard LLM comments, not writing that rule down does not stop it from being the rule, consensus, or a community norm. It just makes the rule less obvious and less clear.
    4. It's disrespectful for an editor to ask someone to spend their time reading a comment if they couldn't be bothered to spend any time writing it, and therefore a violation of the policy Wikipedia:Civility, "treat your fellow editors as respected colleagues with whom you are working on an important project."
  • Also, I don't read the proposal as a ban on machine translation in any way. Rjjiii (talk) 00:01, 18 December 2024 (UTC)Reply
    @Rjjiii, above @Dilettante said their !vote was created by LLM. JoelleJay (talk) 20:14, 19 December 2024 (UTC)Reply
  • I am strongly opposed to banning or ignoring LLM-made talk page comments just because they are LLM-made. I'm not a big fan of LLMs at all; they are actually useful only for some certain things, very few of which are directly relevant to contributing to Wikipedia in English or in any other language. However, some of those things are useful for this, at least for some humans, and I don't want to see these humans being kicked out of the English Wikipedia. I already witnessed several cases in which people whose first language is not English tried writing talk page responses in the English Wikipedia, used an LLM to improve their writing style, and got their responses ignored only because they used an LLM. In all those cases, I had strong reasons to be certain that they were real humans, that they meant what they wrote, and that they did it all in good faith. Please don't say that anyone who wants to contribute to the English Wikipeida should, in the first place, know English well enough to write a coherent talk page comment without LLM assistance; occasionally, I kind of wish that it was like that myself, but then I recall that the world is more complicated and interesting than that. Uses of LLMs that help the English Wikipedia be more inclusive for good-faith people are good. Of course, defining what good faith means is complicated, but using an LLM is not, by itself, a sign of bad faith. --Amir E. Aharoni (talk) 04:52, 19 December 2024 (UTC)Reply
    Those concerned about their English should use translation software rather than an llm. Both might alter the meaning to some extent, but only one will make things up. (It's also not a sure assumption that llm text is coherent talkpage text.) CMD (talk) 07:44, 19 December 2024 (UTC)Reply
    @CMD The dividing line between translation software and LLM is already blurry and will soon disappear. It's also rare that translation software results in coherent talkpage text, unless it's relying on some (primitive) form of LLM. So if we're going to outlaw LLMs, we would need to outlaw any form of translation software, and possibly any text-to-speech software as well. ypn^2 23:41, 19 December 2024 (UTC)Reply
    The distinctions have already been covered above, and no we would not have to. There is an obvious difference between software intended to translate and software intended to generate novel text, and users are likely to continue to treat those differently. CMD (talk) 02:49, 20 December 2024 (UTC)Reply
  • Strong support. LLM-generated content has no place anywhere on the encyclopedia. Stifle (talk) 10:27, 19 December 2024 (UTC)Reply
  • Strong oppose to the proposal as written. Wikipedia already suffers from being stuck in a 2001 mindset and a refusal to move with the technological times. Anyone who remembers most Wikipedians' visceral reaction to FLOW and VisualEditor when they were first introduced will observe a striking similarity. Yes, those projects had serious problems, as do LLM-generated comments. But AI is the future, and this attitude of "Move slowly to avoid changing things" will ultimately lead Wikipedia the way of Encyclopædia Britannica. Our discussion needs to be how best to change, not how to avoid to change. ypn^2 23:54, 19 December 2024 (UTC)Reply
    The main objection to VE and a major objection to FLOW was the developers' insistence on transforming Wikitext to HTML for editing and then transforming that back to Wikitext. Aaron Liu (talk) 01:31, 20 December 2024 (UTC)Reply
    True. Then, as now, there were many valid objections. But IIRC, there was limited discussion of "Let's figure out a better way to improve", and lots of "Everything is fine; don't change anything, ever." That attitude concerns me. ypn^2 01:52, 20 December 2024 (UTC)Reply
  • Support. I'm not even slightly swayed by these "it'll be too hard to figure out" and "mistakes could be made" and "we can't be 100% certain" sorts of arguments. That's true of everything around here, and its why we have an admins-must-earn-a-boatload-of-community-trust system, and a system of review/appeal of decisions they (or of course non-admin closers) make, and a consensus-based decisionmaking system more broadly. JoelleJay has it exactly right: having a policy that permits closers to discount apparently-LLM-generated contributions will discourage good-faith editors from using LLMs irresponsibly and perhaps motivate bad-faith editors to edit the raw output to appear more human, which would at least involve some degree of effort and engagement with their "own" arguments. And as pointed out by some others, the "it'll hurt non-native-English speakers" nonsense is, well, nonsense; translation is a different and unrelated process (though LLMs can perform it to some extent), of remapping one's own material onto another language.

    I'm also not in any way convinved by the "people poor at writing and other cognitive tasks needs the LLM to help them here" angle, because WP:COMPETENCE is required. This is work (albeit volunteer work), it is WP:NOT a game, a social-media playground, a get-my-ideas-out-there soapbox, or a place to learn how to interact e-socially or pick up remedial writing skills, nor a venue for practicing one's argument techiques. It's an encyclopedia, being built by people who – to be productive contributors instead of a draining burden on the entire community – must have: solid reasoning habits, great judgement (especially in assessing reliability of claims and the sources making them), excellent writing skills of a higherly particularized sort, a high level of fluency in this specific language (in multiple registers), and a human-judgment ability to understand our thick web of policies, guidelines, procedures, and often unwritten norms, and how they all interact, in a specific contextual way that may vary greatly by context. None of these is optional. An LLM cannot do any of them adequately (not even write well; their material sticks out like a sore thumb, and after a while you can even tell which LLM produced the material by its habitual but dinstictive crappy approach to simulating human thought and language).

    In short, if you need an LLM to give what you think is meaningful input into a decision-making process on Wikipedia (much less to generate mainspace content for the public), then you need to go find something else to do, something that fits your skills and abilities. Saying this so plainly will probably upset someone, but so it goes. I have a rep for "not suffering fools lightly" and "being annoying but correct"; I can live with that if it gets the right decisions made and the work advanced.  — SMcCandlish ¢ 😼  05:49, 22 December 2024 (UTC)Reply

    The problem with all that is that we already have a policy that allows the hatting or removal of comments that are actually problematic because of their content (which are the only ones that we should be removing) without regard for whether it was or was not written by LLM. Everything that actually should be removed can be removed already. Thryduulf (talk) 11:39, 22 December 2024 (UTC)Reply
    People who have good reading skills, great judgement, and solid reasoning habits enough to find problems in existing articles don't necessarily have great interpersonal writing/communication skills or the confidence. Meanwhile, for all LLM is bad at, it is very good at diluting everything you say to become dry, dispassionate, and thus inoffensive. Aaron Liu (talk) 15:26, 22 December 2024 (UTC)Reply
  • Support. Sure I have questions about detection, but I don't think it means we shouldn't have a policy that explicitly states that it should not be used (and can be ignored/hatted if it is). Judging solely based on content (and no wp:bludgeoning, etc.) is unsustainable IMO. It would mean taking every wall of text seriously until it's clear that the content is unhelpful, and LLMs are very good at churning out plausible-sounding bullshit. It wastes everyone's time. If cognitive impairments or ESL issues make it hard to contribute, try voice-to-text, old-school translation software, or some other aid. LLMs aren't really you.--MattMauler (talk) 11:27, 23 December 2024 (UTC)Reply
  • Comment. While I agree with the sentiment of the request, I am at a loss to see how we can identify LLM generated comments in a consistent manner that can scale. Yes, it might be easier to identify egregious copy paste of wall of text, but, anything other than that might be hard to detect. Our options are:
  1. Robust tooling to detect LLM generated text, with acceptably low levels of false positives. Somewhat similar to what Earwig does for Copyvios. But, someone needs to build it and host it on WMTools or at a similar location.
  2. Self certification by editors. Every edit / publish dialogbox should have a checkbox for "Is this text LLM generated" with y/n optionality.
  3. Editors playing a vigilante role in reading the text and making a personal call on other editors' text. Obviously this is least preferred.
These are my starting views. Ktin (talk) 00:37, 24 December 2024 (UTC)Reply

Alternate proposal

edit
The following discussion is closed. Please do not modify it. Subsequent comments should be made in a new section. A summary of the conclusions reached follows.
Redundant proposal, confusingly worded, with no support, and not even any further discussion interest in 10 days.  — SMcCandlish ¢ 😼  05:23, 22 December 2024 (UTC)Reply

Whereas many editors, including me, have cited problems with accuracy in regards to existing tools such as ZeroGPT, I propose that remarks that are blatently generated by a LLM or similar automated system should be discounted/removed/collapsed/hidden. ThatIPEditor They / Them 10:00, 10 December 2024 (UTC)Reply

Oppose as completely unnecessary and far too prone to error per the above discussion. Any comment that is good (on topic, relevant, etc) should be considered by the closer regardless of whether it was made with LLM-input of any sort or not. Any comment that is bad (off-topic, irrelevant, etc) should be ignored by the closer regardless of whether it was made with LLM-input of any sort or not. Any comment that is both bad and disruptive (e.g. by being excessively long, completely irrelevant, bludgeoning, etc) should be removed and/or hatted as appropriate, regardless of whether it was made with LLM-input of any sort. The good thing is that this is already policy so we don't need to call out LLMs specifically, and indeed doing so is likely to be disruptive in cases where human-written comments are misidentified as being LLM-written (which will happen, regardless of whether tools are used). Thryduulf (talk) 11:19, 10 December 2024 (UTC)Reply
I think this proposal is not really necessary. I support it, but that is because it is functionally identical to the one directly above it, which I also supported. This should probably be hatted. BugGhost 🦗👻 18:32, 10 December 2024 (UTC)Reply
What does blatantly generated mean? Does you mean only where the remark is signed with "I, Chatbot", or anything that appears to be LLM-style? I don't think there's much in between. ypn^2 19:21, 10 December 2024 (UTC)Reply
Procedural close per BugGhost. I'd hat this myself, but I don't think that'd be appropriate since it's only the two of us who have expressed that this proposal is basically an exact clone. Aaron Liu (talk) 03:00, 11 December 2024 (UTC)Reply
The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

Should first language be included in the infobox for historical figures?

edit

Is there a guideline concerning this? "Infobox royalty" apparently has this parameter, but I haven't found a single article that actually uses it. Many articles don't mention the subject's spoken languages at all. In my view, somebody's first language (L1) is just a very basic and useful piece of information, especially for historical figures. This would be helpful in cases where the ruling elites spoke a completely different language from the rest of the country (e.g., High Medieval England or early Qing dynasty China). These things are not always obvious to readers who are unfamiliar with the topic. Including it would be a nice and easy way to demonstrate historical language shifts that otherwise might be overlooked. Perhaps it could also bring visibility to historical linguistic diversity and language groups that have since disappeared. Where there are multiple first languages, they could all be listed. And in cases where a person's first language remains unclear, it could simply be left out. Kalapulla123 (talk) 11:53, 8 December 2024 (UTC)Reply

I don't think I agree this is a good use of infobox space:
  • incongruences between elite spoken languages and popular spoken languages can't be shown with a single parameter (the language spoken by the oppressed would have to be included as well)
  • for many people this would be unverifiable (already mentioned in OP) and / or contentious (people living during a language transition)
  • sometimes L2 skills will be more than adequate to communicate with subject population when called for
  • in cases where the subject's L1 matches their polity's (i.e. most cases), the parameter would feel like unnecessary clutter
  • prose description seems adequate
However, this is just my opinion, and the venue of discussion should probably be Wikipedia talk:WikiProject Royalty and Nobility or similar, rather than VPP. Folly Mox (talk) 12:02, 9 December 2024 (UTC)Reply
I think this might be sufficiently important pretty much exclusively for writers where the language they wrote in is not the "obvious" one for their nationality. Johnbod (talk) 12:43, 9 December 2024 (UTC)Reply
It might also be important for politicians (and similar figures?) in countries where language is a politically-important subject, e.g. Belgium. Thryduulf (talk) 16:29, 9 December 2024 (UTC)Reply
This seems like a bad idea. Let's take a case where language spoken by a royal was very relevant: Charles V, Holy Roman Emperor. When he became King of Castile as a teenager, he only really spoke Flemish and didn't speak Castilian Spanish, and needless to say trusted the advisors he could actually talk with (i.e. Flemish / Dutch ones he brought with him). He also then immediately skipped out of Castile to go to proto-Germany to be elected Holy Roman Emperor. This ended up causing a rebellion (Revolt of the Comuneros) which was at least partially justified by Castilian nationalism, and partially by annoyed Castilian elites who wanted cushy government jobs. So language-of-royal was relevant. But... the Infobox is for the person as a whole. Charles came back to Castile and spent a stretch of 10 years there and eventually learned rather good Castilian and largely assuaged the elite, at least. He was king of Spain for forty years. So it would seem rather petty to harp on the fact his first language wasn't Castilian in the Infobox, when he certainly did speak it later and through most of his reign, even if not his first few years when he was still basically a kid. SnowFire (talk) 19:47, 9 December 2024 (UTC)Reply
See below on this. Johnbod (talk) 14:26, 11 December 2024 (UTC)Reply
SnowFire's fascinating anecdote shows that this information is not appropriate for infoboxes but rather should be described in prose in the body of the article where the subtleties can be explained to the readers. Cullen328 (talk) 19:56, 9 December 2024 (UTC)Reply
No, it shows that it's not appropriate for that infobox, and therefore that it is not suitable for all infoboxes where it is plausibly relevant. It shows nothing about whether it is or is not appropriate for other infoboxes: the plural of anecdote is not data. Thryduulf (talk) 21:08, 9 December 2024 (UTC)Reply
But it kind of is here? I picked this example as maybe one of the most obviously relevant cases. Most royals failing to speak the right language don't have this trait linked with a literal war in reliable sources! But if inclusion of this piece of information in an Infobox is still problematic in this case, how could it possibly be relevant in the 99.9% cases of lesser importance? The Infobox isn't for every single true fact. SnowFire (talk) 21:53, 9 December 2024 (UTC)Reply
It isn't suitable for this infobox not because of a lack of importance, but because stating a single first language would be misleading. There exists the very real possibility of cases where it is both important and simple. Thryduulf (talk) 00:02, 10 December 2024 (UTC)Reply
Could you (or anyone else in favor of the proposal) identify 5 biographies where this information is both useful to readers and clearly backed by reliable sources? signed, Rosguill talk 15:06, 11 December 2024 (UTC)Reply
Charles V claimed to have spoken Italian to women, French to men, Spanish to God, and German to his horse. Hawkeye7 (discuss) 21:35, 9 December 2024 (UTC)Reply
Sorry, this is just nonsense! Charles V was raised speaking French, which was the language of his aunt's court, although in the Dutch-speaking Mechelen. All his personal letters use French. He only began to be taught Dutch when he was 14, & may never have been much good at it (or Spanish or German). Contrary to the famous anecdote, which is rather late and dubious ("Spanish to God....German to my horse") he seems to have been a rather poor linguist, which was indeed awkward at times. Johnbod (talk) 00:39, 10 December 2024 (UTC)Reply
(This is a bit off-topic, but "nonsense" is too harsh. I'm familiar that he spoke "French" too, yes, although my understanding was that he did speak "Flemish", i.e. the local Dutch-inflected speech, too? And neither 1500-era French nor Dutch were exactly standardized, so I left it as "Flemish" above for simplicity. If his Dutch was worse than I thought, sure, doesn't really affect the point made, though, which was that his Castilian was non-existent at first. As far as his later understanding of Spanish, his capacity was clearly enough - at the very least I've seen sources say he made it work and it was enough to stave off further discontent from the nobility. Take it up with the authors of the sources, not me.). SnowFire (talk) 16:23, 10 December 2024 (UTC)Reply
There's a difference between "simplicity" and just being wrong! You should try reading the sources, with which I have no issue. And his ministers were also either native Francophones, like Cardinal Granvelle and his father Nicolas Perrenot de Granvelle (both from Besançon, now in eastern France), or could speak it well; the Burgundian elite had been Francophone for a long time. The backwash from all this remains a somewhat sensitive issue in Belgium, even now. And Charles V was not "King of Spain" (a title he avoided using) for 40 years at all; only after his mother died in 1555 (a year before him) did he become unarguably King of Castile. Johnbod (talk) 14:26, 11 December 2024 (UTC)Reply
It may not be appropriate for many articles, but it surely is for some. For example, when I told her that England had had kings whose first language was German, someone asked me the other day how many. It would be good to have a quick way of looking up the 18th century Georges to find out. Phil Bridger (talk) 21:20, 9 December 2024 (UTC)Reply
I think the problem is that people might make assumptions. I would check before saying that George I and George II spoke German as their first language and not French. Languages spoken is probably more useful than birth language, but the list might be incomplete. There is also competing information about George I, and he is an English King, so he has been better researched and documented compared to other historical figures.
I agree that this is important when language is the basis of community identity, such as in Belgian. Tinynanorobots (talk) 10:38, 10 December 2024 (UTC)Reply
  • Ummmm… no. People I disagree with™️ use “infobox bloat” as a boogeyman in arguments about infoboxes. But this is infobox bloat. Even those celebrity/anime character things that tell you shoe size, pinky length and blood type wouldn’t include this. Dronebogus (talk) 18:16, 11 December 2024 (UTC)Reply
I don't think there needs to be any central policy on this. It could be relevant to include this information for someone, perhaps... maybe... However, infoboxes work best when they contain uncontroversial at-a-glance facts that don't need a bunch of nuance and context to understand. For the example of Charles V, maybe his first language is significant, but putting it in the infobox (where the accompanying story cannot fit) would be a confusing unexplained factoid. Like, maybe once upon a time there was a notable person whose life turned on the fact that they were left-handed. That could be a great bit of content for the main article, but putting handedness in the infobox would be odd. Barnards.tar.gz (talk) 14:33, 12 December 2024 (UTC)Reply
{{Infobox baseball biography}} includes handedness, and nobody finds that odd content for an infobox.
{{infobox royalty}} includes the option for up to five native languages, though the OP says it seems to be unused in practice. {{Infobox writer}} has a |language= parameter, and it would be surprising if this were unused. WhatamIdoing (talk) 19:36, 12 December 2024 (UTC)Reply
Baseball seems to be a good example of where handedness is routinely covered, and easily consumable at a glance without needing further explanation. The scenario where I don't think handedness (or first language) makes sense is when it is a uniquely interesting aspect of that individual's life, because almost by definition there's a story there which the infobox can't tell. Barnards.tar.gz (talk) 10:23, 13 December 2024 (UTC)Reply
I don't think L1 can be determined for most historical figures without a hefty dose of OR. If you look at my Babel boxes, you'll see that I, as a living human being with all the information about my own life, could not tell you what my own "L1" is. The historical figures for whom this would be relevant mostly spoke many more languages than I do, and without a time machine it would be nigh impossible to say which language they learned first. This isn't even clear for the Qing emperors – I am fairly certain that they all spoke (Mandarin) Chinese very well, and our article never says what language they spoke. Puyi even states that he never spoke Manchu. Adding this parameter would also inflame existing debates across the encyclopedia about ethnonationalism (e.g. Nicola Tesla) and infobox bloat. Toadspike [Talk] 21:21, 12 December 2024 (UTC)Reply
As with every bit of information in every infobox, if it cannot be reliably sourced it does not go in, regardless of how important it is or isn't. There are plenty of examples of people whose first language is reported in reliable sources, I just did an internal source for "first language was" and on the first page of results found sourced mentions of first language at Danny Driver, Cleopatra, Ruthanne Lum McCunn, Nina Fedoroff, Jason Derulo, Henry Taube and Tom Segev, and an unsourced but plausible mention at Dean Martin. The article strongly suggests that her first language is an important part of Cleopatra's biography such that putting it in the infobox would be justifiable. I am not familiar enough with any of the others to have an opinion on whether it merits an infobox mention there, I'm simply reporting that there are many articles where first language is reliably sourced and a mention is deemed DUE. Thryduulf (talk) 22:08, 12 December 2024 (UTC)Reply
I have been wondering since this conversation opened how far back the concept of an L1 language, or perhaps the most colloquial first language, can be pushed. Our article doesn't have anything on the history of the concept. CMD (talk) 11:31, 13 December 2024 (UTC)Reply
I suspect the concept is pretty ancient, I certainly wouldn't be surprised to learn it arose around the same time as diplomacy between groups of people with different first languages. The note about it at Cleopatra certainly suggests it was already a well-established concept in her era (1st century BCE). Thryduulf (talk) 13:23, 13 December 2024 (UTC)Reply
The concept of different social strata speaking different languages is old, but I'm not sure whether they viewed learning languages the same way we do. It's certainly possible, and perhaps it happened in some areas at some times, but I hesitate to assume it's the case for every historical person with an infobox. CMD (talk) 16:05, 13 December 2024 (UTC)Reply
It's certainly not going to be appropriate for the infobox of every historical person, as is true for (nearly?) every parameter. The questions here are whether it is appropriate in any cases, and if so in enough cases to justify having it as a parameter (how many is enough? I'd say a few dozen at minimum, ideally more). I think the answer the first question is "yes". The second question hasn't been answered yet, and I don't think we have enough information here yet to answer it. Thryduulf (talk) 21:54, 13 December 2024 (UTC)Reply
The question is not whether it is appropriate in any cases; the question is whether it is worth the trouble. I guarantee that this would lead to many vicious debates, despite being in most cases an irrelevant and unverifiable factoid based on inappropriate ABOUTSELF. This is the same reason we have MOS:ETHNICITY/NATIONALITY. Toadspike [Talk] 07:29, 16 December 2024 (UTC)Reply
Nah. If this were "a very basic and useful piece of information" then we would already be deploying it site wide, so it obviously is not. In the vast majority of cases, it would involve intolerable WP:OR or even just guessing masquerading as facts. We do not know for certain that someone born in France had French as their first/native/home language. I have close relatives in the US, in a largely English-speaking part of the US, whose first language is Spanish. For historical figures it would get even more ridiculous, since even our conceptions of languages today as, e.g., "German" and "French" and "Spanish" and "Japanese", is a bit fictive and is certainly not historically accurate, because multiple languages were (and still are, actually) spoken in these places. We would have no way to ascertain which was used originally or most natively for the average historical figure. Beyond a certain comparatively recent point, most linguistics is reconstruction (i.e. educated guesswork; if there's not a substantial corpus of surviving written material we cannot be sure. That matters a lot for figures like Genghis Khan and King Bridei I of the Picts. Finally, it really is just trivia in the vast majority of cases. What a biographical figure's first/primary/home/most-fluent/most-frequently-used language (and some of those might not be the same since all of them can change over time other than "first") is something that could be included when certain from RS, but it's not lead- or infobox-worthy in most cases, unless it pertains directly the subject's notability (e.g. as a writer) and also isn't already implicit from other details like nationality.  — SMcCandlish ¢ 😼  03:42, 23 December 2024 (UTC)Reply

Restrict new users from crosswiki uploading files to Commons

edit

I created this Phabricator ticket (phab:T370598) in July of this year, figuring that consensus to restrict non-confirmed users from crosswiki uploading files to Commons is implied. Well, consensus already agreed at Commons in response to the WMF study on crosswiki uploading. I created an attempted Wish at Meta-wiki, which was then rejected, i.e. "archived", as policy-related and requir[ing] alignment across various wikis to implement such a policy. Now I'm starting this thread, thinking that the consensus here would already or implicitly support such restriction, but I can stand corrected about the outcome here. George Ho (talk) 06:34, 9 December 2024 (UTC); corrected, 08:10, 9 December 2024 (UTC)Reply

  • Support. I am not sure why this relies on alignment across wikis, those on Commons are best placed to know what is making it to Commons. The change would have little to no impact on en.wiki. If there is an impact, it would presumably be less cleaning up of presumably fair use files migrated to Commons that need to be fixed here. That said, if there needs to be consensus, then obviously support. We shouldn't need months of bureaucracy for this. CMD (talk) 06:41, 9 December 2024 (UTC)Reply
  • Support, I don't know that my input really counts as new consensus because I said this at the time, but the problem is much worse than what the study suggests as we are still finding spam, copyvios, unusable selfies and other speedy-deletable uploads from the timespan audited.
Gnomingstuff (talk) 02:14, 10 December 2024 (UTC)Reply

Two questions from a deletion review

edit

Here are two mostly unrelated questions that came up in the course of a Deletion Review. The DRV is ready for closure, because the appellant has been blocked for advertising, but I think that the questions should be asked, and maybe answered. Robert McClenon (talk) 20:44, 11 December 2024 (UTC)Reply

Requests for copies of G11 material

edit

At DRV, there are sometimes requests to restore to draft space or user space material from pages that were deleted as G11, purely promotional material. DRV is sometimes the second or third stop for the originator, with the earlier stops being the deleting administrator and Requests for Undeletion.

Requests for Undeletion has a list of speedy deletion codes for which deleted material is not restored, including G11. (They say that they do not restore attack pages or copyright violation. They also do not restore vandalism and spam.) Sometimes the originator says that they are trying to rewrite the article to be neutral. My question is whether DRV should consider such requests on a case-by-case basis, as is requested by the originators, or whether DRV should deny the requests categorically, just as they are denied at Requests for Undeletion. I personally have no sympathy for an editor who lost all of their work on a page because it was deleted and they didn't back it up. My own opinion is that they should have kept a copy on their hard drive (or solid-state device), but that is my opinion.

We know that the decision that a page should be speedily deleted as G11 may properly be appealed to Deletion Review. My question is about requests to restore a draft that was properly deleted as G11 so that the originator can work to make it neutral.

I am also not asking about requests for assistance in telling an author what parts of a deleted page were problematic. In those cases, the author is asking the Wikipedia community to write their promotional article for them, and we should not do that. But should we consider a request to restore the deleted material so that the originator can make it neutral? Robert McClenon (talk) 20:44, 11 December 2024 (UTC)Reply

When we delete an article we should always answer reasonable questions asked in good faith about why we deleted it, and that includes explaining why we regard an article as promotional. We want neutral encyclopaedic content on every notable subject, if someone wants to write about that subject we should encourage and teach them to write neutral encyclopaedic prose about the subject rather than telling them to go away and stop trying because they didn't get it right first time. This will encourage them to become productive Wikipedians, which benefits the project far more than the alternatives, which include them trying to sneak promotional content onto Wikipedia and/or paying someone else to do that. So, to answer your question, DRV absolutely should restore (to draft or userspace) articles about notable subjects speedily deleted per G11. Thryduulf (talk) 21:09, 11 December 2024 (UTC)Reply
If the material is truly unambiguous advertising, then there's no point in restoring it. Unambiguous advertising could look like this:
"Blue-green widgets are the most amazing widgets in the history of the universe, and they're on sale during the holiday season for the amazingly low, low prices of just $5.99 each. Buy some from the internet's premier distributor of widgets today!"
If it's really unambiguous advertising to this level, then you don't need a REFUND. (You might ask an admin to see if there were any independent sources they could share with you, though.)
When it's not quite so blatant, then a REFUND might be useful. Wikipedia:Identifying blatant advertising gives some not-so-blatant, not-so-unambiguous examples of suspicious wording, such as:
  • It refers to the company or organization in the first-person ("We are a company based out of Chicago", "Our products are electronics and medical supplies").
This kind of thing makes me suspect WP:PAID editing, but it's not irredeemable, especially if it's occasional, or that's the worst of it. But in that case, it shouldn't have been deleted as G11. WhatamIdoing (talk) 21:37, 12 December 2024 (UTC)Reply
Blanket permission to restore every G11 to userspace or draftspace might make sense if you're, say, an admin who's mentioned G11 only once in his delete logs over the past ten years. Admins who actually deal with this stuff are going to have a better feel for how many are deleted from userspace or draftspace to begin with (just short of 92% in 2024) and how likely a new user who writes a page espousing how "This technical expertise allows him to focus on the intricate details of design and construction, ensuring the highest standards of quality in every watch he creates" is to ever become a productive Wikipedian (never that I've seen). If it wasn't entirely unsalvageable, it wasn't a good G11 to begin with. —Cryptic 14:05, 13 December 2024 (UTC)Reply

Some administrators have semi-protected their talk pages due to abuse by unregistered editors. An appellant at DRV complained that they were unable to request the deleting administrator about a G11 because the talk page was semi-protected, and because WP:AN was semi-protected. An editor said that this raised Administrator Accountability issues. My question is whether they were correct about administrator accountability issues. My own thought is that administrator accountability is satisfied if the administrator has incoming email enabled, but the question was raised by an experienced editor, and I thought it should be asked. Robert McClenon (talk) 20:44, 11 December 2024 (UTC)Reply

Administrators need to be reasonably contactable. Administrators explicitly are not required to have email enabled, and we do not require other editors to have email enabled either (a pre-requisite to sending an email through Wikipedia), and several processes require leaving talk pages messages for administrators (e.g. ANI). Additionally, Sending an email via the Special:EmailUser system will disclose your email address, so we cannot compel any editor to use email. Putting this all together, it seems clear to me that accepting email does not automatically satisfy administrator accountability. Protecting talk pages to deal with abuse should only be done where absolutely necessary (in the case of a single editor doing the harassing, that editor should be (partially) blocked instead for example) and for the shortest amount of time necessary, and should explicitly give other on-wiki options for those who cannot edit the page but need to leave the editor a message. Those alternatives could be to leave a message on a different page, to use pings, or some other method. Where no such alternatives are given I would argue that the editor should use {{help me}} on their own talk page, asking someone else to copy a message to the admin's talk page. Thryduulf (talk) 21:22, 11 December 2024 (UTC)Reply
I think this is usually done in response to persistent LTA targeting the admin. I agree it should be kept short. We've also seen PC being used to discourage LTA recently, perhaps that could be an option in these cases. Just Step Sideways from this world ..... today 21:29, 11 December 2024 (UTC)Reply
You can't use PC on talk pages. See Wikipedia:Pending changes#Frequently asked questions, item 3. CambridgeBayWeather (solidly non-human), Uqaqtuq (talk), Huliva 00:30, 12 December 2024 (UTC)Reply
Very few admins protect their talk pages, and for the ones I'm aware of, it's for very good reasons. Admins do not need to risk long-term harassment just because someone else might want to talk to them.
It would make sense for us to suggest an alternative route. That could be to post on your own talk page and ping them, or it could be to post at (e.g.,) WP:AN for any admin. The latter has the advantage of working even when the admin is inactive/no longer an admin. WhatamIdoing (talk) 21:42, 12 December 2024 (UTC)Reply
It's covered at Wikipedia:Protection policy#User talk pages. CambridgeBayWeather (solidly non-human), Uqaqtuq (talk), Huliva 22:54, 12 December 2024 (UTC)Reply
That says "Users whose talk pages are protected may wish to have an unprotected user talk subpage linked conspicuously from their main talk page to allow good-faith comments from users that the protection restricts editing from."
And if they "don't wish", because those pages turn into harassment pages, then what? WhatamIdoing (talk) 19:33, 13 December 2024 (UTC)Reply
Then it can be dealt with. But an admin shouldn't be uncommunicative. CambridgeBayWeather (solidly non-human), Uqaqtuq (talk), Huliva 19:52, 13 December 2024 (UTC)Reply
Would there be value is changing that to requiring users whose talk page is protected to conspicuously state an alternative on-wiki method of contacting them, giving an unprotected talk subpage as one example method? Thryduulf (talk) 21:40, 13 December 2024 (UTC)Reply
For admins yes. But for regular editors it could depend on the problem. CambridgeBayWeather (solidly non-human), Uqaqtuq (talk), Huliva 23:01, 13 December 2024 (UTC)Reply
In general user talk pages shouldn't be protected, but there may be instances when that is needed. However ADMINACCT only requires that admins respond to community concerns, it doesn't require that the talk pages of an admin is always available. There are other methods of communicating, as others have mentioned. There's nothing in ADMINACCT that says a protected user talk page is an accountability issue. -- LCU ActivelyDisinterested «@» °∆t° 22:33, 12 December 2024 (UTC)Reply

Question(s) stemming from undiscussed move

edit

"AIM-174 air-to-air missile" was moved without discussion to "AIM-174B." Consensus was reached RE: the removal of "air-to-air missile," but no consensus was reached regarding the addition or removal of the "B." After a no-consensus RM close (which should have brought us back to the original title, sans agreed-upon unneeded additional disambiguator, in my opinion), I requested the discussion be re-opened, per pre-MRV policy. (TO BE CLEAR; I should have, at this time, requested immediate reversion. However, I did not want to be impolite or pushy) The original closer -- Asukite (who found for "no consensus") was concerned they had become "too involved" in the process and requested another closer. Said closer immediately found consensus for "AIM-174B." I pressed-on to a MRV, where an additional "no consensus" (to overturn) finding was issued. As Bobby Cohn pointed-out during the move review, "I take issue with the participating mover's interpretation of policy 'Unfortunately for you, a no consensus decision will result in this article staying here' in the RM, and would instead endorse your idea that aligns with policy, that a no consensus would take us back the original title, sans extra disambiguatotr."

The issues, as I see them, are as-follows:

WP:RMUM: The move from “AIM-174 air-to-air missile” to “AIM-174B” was conducted without discussion, and I maintain all post-move discussions have achieved "no consensus."

Burden of Proof: The onus should be on the mover of the undiscussed title to justify their change, not on others to defend the original title. I refrained from reverting prior to initiating the RM process out of politeness, which should not shift the burden of proof onto me.

Precedent: I am concerned with the precedent. Undiscussed moves may be brute-forced into acceptance even if "no consensus" or a very slim consensus (WP:NOTAVOTE) is found?

Argument in-favor of "AIM-174:" See the aforementioned RM for arguments in-favor and against. However, I would like to make it clear that I was the only person arguing WP. Those in-favor of "174B" were seemingly disagreeing with my WP arguments, but not offering their own in-support of the inclusion of "B." That said, my primary WP-based argument is likely WP:CONSISTENT; ALL U.S. air-to-air-missiles use the base model as their article title. See: AIM-4 Falcon, AIM-26 Falcon, AIM-47 Falcon, AIM-9 Sidewinder, AIM-7 Sparrow, AIM-54 Phoenix, AIM-68 Big Q, AIM-82, AIM-95 Agile, AIM-97 Seekbat, AIM-120 AMRAAM, AIM-132, AIM-152 AAAM, AIM-260. 174"B" is unnecessary while violating consistency.

Do my policy contentions hold any weight? Or am I mad? Do I have any path forward, here?

TO BE CLEAR, I am not alleging bad faith on behalf of anyone, and I am extremely grateful to all those who have been involved, particularly the RM closer that I mentioned, as well as the MRV closer, ModernDayTrilobite. I would like to make it clear that this isn't simply a case of a MRV 'not going my way.' Again, I am concerned w/ the precedent and with the onus having been shifted to me for months. I also apologize for the delay in getting this here; I originally stopped-over at the DRN but Robert McClenon kindly suggested I instead post here.MWFwiki (talk) 00:08, 12 December 2024 (UTC)Reply

Are you familiar with Wikipedia:Article titles#Considering changes? Do you think you understand why that rule exists? WhatamIdoing (talk) 23:31, 12 December 2024 (UTC)Reply
I am quite familiar with it. It seemingly supports my argument(s), so...? Is there a particular reason you're speaking in quasi-riddles? MWFwiki (talk) 01:11, 13 December 2024 (UTC)Reply
If yours is the title favored by the policy, then none of this explanation makes any difference. You just demand that it be put back to the title favored by the policy, and editors will usually go along with it. (It sometimes requires spelling out the policy in detail, but ultimately, most people want to comply with the policy.)
If yours is not the title favored by the policy, then the people on the other 'side' are going to stand on policy when you ask to move it, so you'd probably have to get the policy changed to 'win'. If you want to pursue that, you will need to understand why the rule is set this way, so that you have a chance of making a convincing argument. WhatamIdoing (talk) 05:24, 13 December 2024 (UTC)Reply
I think several individuals involved in this process have agreed that the default title is the favored title, at least as far as WP:TITLECHANGES, as you say.
(The only reason I listed any further ‘litigation’ here is to show what was being discussed in-general for convenience’s sake, not necessarily to re-litigate)
However, at least two individuals involved have expressed to me that they felt their hands were tied by the RM/MRV process. Otherwise, as I mentioned (well, as Bobby_Cohn mentioned) the train of thought seemed to be “well, I don’t want the title to be changed,” and this was seemingly enough to override policy. Or, at best, it was seemingly a “well, it would be easier to just leave it as-is” sort of decision.

And again, I, 100%, should have been more forceful; The title anhould have been reverted per the initial “no consensus” RM-closure and I will certainly bear your advice in-mind in the future. That said, I suppose what I am asking is would it be inappropriate to ask the original RM-closer to revert the article at this point, given how much time is past?

MWFwiki (talk) 06:29, 13 December 2024 (UTC)Reply
Given what was written in Talk:AIM-174B#Requested move 20 September 2024 six weeks ago, I think that none of this is relevant. "Consensus to keep current name" does not mean that you get to invoke rules about what happens when there is no consensus. I suggest that you give up for now, wait a long time (a year? There is no set time, but it needs to be a l-o-n-g time), and maybe start a new Wikipedia:Requested moves (e.g., in 2026). WhatamIdoing (talk) 19:41, 13 December 2024 (UTC)Reply
Thanks! MWFwiki (talk) 05:09, 14 December 2024 (UTC)Reply
Everything ModernDayTrilobite advised you of is correct. Vpab15 closed the RM and determined that consensus was reached. Nothing since then has overturned or otherwise superseded Vpab15's closure. Therefore that closure remains in force. You already challenged the validity of Vpab15's closure at move review, and you have no avenue for challenging it again. Your best bet is to wait a tactful amount of time (several months) before starting another RM. And in that RM, none of this procedural stuff will matter, and you will be free to focus just on making the clearest, simplest case for why AIM-174 is the best title. Adumbrativus (talk) 06:10, 13 December 2024 (UTC)Reply
I suppose my issue is better summed-up by my above discussion with WhatamIdoing; The MRV shouldn’t have been required. That burden should never have been on me. The title should have been reverted at the initial “no consensus” per WP:TITLECHANGES. Otherwise, undiscussed moves — when challenged — may now be upheld by either consensus or no consensus? This is not what WP:TITLECHANGES says, obviously. That said I take full responsibility for not being clearer with this argument, and instead focusing on arguing for a ‘different’ title, when I should have been arguing for the default title per TITLECHANGES. MWFwiki (talk) 06:33, 13 December 2024 (UTC)Reply
You've repeatedly pointed to the initial self-reverted closure as if it's somehow significant. It isn't. Asukite voluntarily decided to close the discussion, and voluntarily self-reverted their decision to close. It doesn't matter whether you asked for it or someone else asked or no one asked. They had the right to self-revert then, for any reason or no reason. The net result is the same as if Asukite had never closed it at all. Only Vpab15's closure, which was 100% on Vpab15's own authority and 0% on the supposed authority of the annulled earlier closure, is binding. Adumbrativus (talk) 09:22, 13 December 2024 (UTC)Reply
I don't disagree with your latter statement, but why would an initial finding of no-consensus not matter? It should have brought us back to the default title, not simply been reverted. Because that policy wasn't followed, I'm here now, is my point. Regardless, I understand; Thank you for your advice! Well, I appreciate your time and consideration! :-) MWFwiki (talk) 05:08, 14 December 2024 (UTC)Reply
(Involved at the MRV) Seeing as I've been tagged in this multiple times and quoted, I'll give my thoughts on this. I don't want to accuse MWFwiki of selectively quoting me but I do think that my quote above was, when taken into account with the following discussion, more about meta-conversation about the correct policy to implement in the event the MRV went the other way. I explicitly said in the immediately following message the view that the close was not outside the scope of WP:RMCI is reasonable and good faith interpretation. I do think this close was within bounds, and the following MRV appropriately closed and summarised.
Yes, had the original close of no consensus stood, then it could have been reverted wholecloth. It was self-reverted and therefore plays no role in the consideration of the subsequent closure. We're always going to take the most recent finding of consensus to be what holds. It seems to have been said in the above that had the no consensus closure held and the appropriate WP:RMNCREV policy been applied, then the appellant here would have gotten their preferred outcome. But to continue to argue this in the face of the subsequent developments is where this enters wikilawyering territory. I think that since then, the appellant has continued to make policy arguments that would be better suited for a subsequent and focused RM on the actual title rather than wikilawyer about a previous close that was self-reverted and continuing to argue policy.
There's nothing for this venue to really change in regards to that AT and the discussion to change the AT would need to be had at the articles talk page. My sincere advice to appellant is to wait a reasonable amount of time and make strong policy based arguments about the preferred title (don't just quote policy, we editors are good at clicking links and reading it for ourselves—quoting nothing but policy back at us makes us feel like you've taken us for fools; instead provide facts and sources that support the relevant policies and link those). Spend some time at WP:RMC and see what well-argued and successful RMs typically look like. Bobby Cohn (talk) 17:38, 17 December 2024 (UTC)Reply

CSD A12. Substantially written using a large language model, with hallucinated information or fictitious references

edit

The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.


When fixing up new articles, I have encountered articles that appear to have been substantially generated by AI, containing hallucinated information. While these articles may not meet other criteria for speedy deletion, as the subjects themselves are sometimes real and notable, waiting for seven days to PROD the articles is inefficient. I recommend designating WP:A12 for the speedy deletion of these articles. I have created a template (User:Svampesky/Template:Db-a12) if it is successful. A recent example is the article on the Boston University Investment Office, where the author explicitly disclosed that it was created using a large language model and contains references to sources don't exist. I initially G11'd it, as it seemed the most appropriate, but was declined, and the article was subsequently PRODed. Svampesky (talk) 21:13, 12 December 2024 (UTC)Reply

CSD are generally limited to things that are unambiguously obvious. I image the number of cases in which it's unabiguously obvious that the entire page was generated by an LLM (as opposed to the editor jut using the LLM to generate references, for example) are small enough that it doesn't warrant a speedy deletion criterion. --Ahecht (TALK
PAGE
)
21:29, 12 December 2024 (UTC)Reply
I like this idea but agree that it's better not as a CSD but perhaps its own policy page. Andre🚐 21:33, 12 December 2024 (UTC)Reply
I don't think it even merits a policy page. The number of cases where the LLM use is objectively unambiguous, and the article content sufficiently problematic that deletion is the only appropriate course of action and it cannot be (speedily) deleted under existing policy is going to be vanishingly small. Even the OP's examples were handled by existing processes (PROD) sufficiently. Thryduulf (talk) 22:11, 12 December 2024 (UTC)Reply
@Svampesky, when you say that Wikipedia:Proposed deletion is "inefficient", do you mean that you don't want to wait a week before the article gets deleted? WhatamIdoing (talk) 23:32, 12 December 2024 (UTC)Reply
My view is that Wikipedia:Proposed deletion inefficient for articles that clearly contain hallucinated LLM-generated content and fictitious references (which almost certainly will be deleted) in the mainspace for longer than necessary. Svampesky (talk) 00:03, 13 December 2024 (UTC)Reply
Efficiency usually compares the amount of effort something takes, not the length of time it takes. "Paint it and leave it alone for 10 minutes to dry" is the same amount of hands-on work as "Paint it and leave it alone for 10 days to dry", so they're equally efficient processes. It sounds like you want a process that isn't less hands-on work/more efficient, but instead a process that is faster.
Also, if the subject qualifies for an article, then deletion isn't necessarily the right solution. Blanking bad content and bad sources is officially preferred (though more work) so that there is only verifiable content with one or more real sources left on the page – even if that content is only a single sentence.
Efficiency and speed is something that many editors like. However, there has to be a balance. We're WP:HERE to build an encyclopedia, which sometimes means that rapidly removing imperfect content is only the second or third most important thing we do. WhatamIdoing (talk) 00:43, 13 December 2024 (UTC)Reply
  • This part as the subjects themselves are sometimes real and notable is literally an inherent argument against using CSD (or PROD for that matter). WP:TNT the article to a sentence if necessary, but admitting that you're trying to delete an article you know is notable just means you're admitting to vandalism. SilverserenC 00:07, 13 December 2024 (UTC)Reply
    The categorization of my proposal as admitting to vandalism is incorrect. WP:G11, the speedy deletion criterion I initially used for the article, specifies deleting articles that would need to be fundamentally rewritten to serve as encyclopedia articles. Articles that have been generated using large language models, with hallucinated information or fictitious references, would need to be fundamentally rewritten to serve as encyclopedia articles. Svampesky (talk) 00:42, 13 December 2024 (UTC)Reply
    Yes, but G11 is looking for blatant advertising ("Buy widgets now at www.widgets.com! Blue-green widgets in stock today!") It's not looking for anything and everything that needs to be fundamentally re-written. WhatamIdoing (talk) 00:45, 13 December 2024 (UTC)Reply
    (Edit Conflict) How does G11 even apply here? Being written via LLM does not make an article "promotional". Furthermore, even that CSD criteria states If a subject is notable and the content could plausibly be replaced with text written from a neutral point of view, this is preferable to deletion. I.e. TNT it to a single sentence and problem solved. SilverserenC 00:46, 13 December 2024 (UTC)Reply
  • The venue for proposing new criteria is at Wikipedia talk:Criteria for speedy deletion. So please make sure that you don't just edit in a new criterion without an RFC approving it, else it will be quickly reverted. Graeme Bartlett (talk) 00:20, 13 December 2024 (UTC)Reply
    Since we are talking about BLPs… the harm of hallucinated information does need to be taken very seriously. I would say the first step is to stubbify.
    However, Deletion can be held off as a potential second step, pending a proper BEFORE check. Blueboar (talk) 01:06, 13 December 2024 (UTC)Reply
    If the hallucination is sufficiently dramatic ("Joe Film is a superhero action figure", when it ought to say that he's an actor who once had a part in a superhero movie), then you might be able to make a good case for {{db-hoax}}. WhatamIdoing (talk) 05:26, 13 December 2024 (UTC)Reply
    I have deleted an AI generated article with fake content and references as a hoax. So that may well be possible. Graeme Bartlett (talk) 12:23, 13 December 2024 (UTC)Reply
Isn't this covered by WP:DRAFTREASON? Gnomingstuff (talk) 20:34, 13 December 2024 (UTC)Reply
The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

AFD clarification

edit

The Articles for deletion article states that: If a redirection is controversial, however, AfD may be an appropriate venue for discussing the change in addition to the article's talk page.

Does this mean that an AFD can be started by someone with the intent of redirecting instead of deleting? Plasticwonder (talk) 04:06, 13 December 2024 (UTC)Reply

Yes. If there is a contested redirect, the article is restored and it is brought to AfD. voorts (talk/contributions) 04:34, 13 December 2024 (UTC)Reply
I think the ideal process is:
  • Have an ordinary discussion on the talk page about redirecting the page.
  • If (and only if) that discussion fails to reach consensus, try again at AFD.
I dislike starting with AFD. It isn't usually necessary, and it sometimes has a feel of the nom trying to get rid of it through any means possible ("I'll suggest a WP:BLAR, but maybe I'll be lucky and they'll delete it completely"). WhatamIdoing (talk) 05:31, 13 December 2024 (UTC)Reply
Would need some stats on the it isn't usually necessary claim, my intuition based on experience is that if a BLAR is contested it's either dropped or ends up at AfD. CMD (talk) 05:48, 13 December 2024 (UTC)Reply
I agree with that. From what I have seen at least, if redirecting is contested, it then is usually discussed at AFD, but that's just me. Plasticwonder (talk) 08:42, 13 December 2024 (UTC)Reply
It depends how active the respective talk pages are (redirected article and target), but certainly for ones that are quiet AfD is going to be the most common. Thryduulf (talk) 09:33, 13 December 2024 (UTC)Reply
It will also depend on whether you advertise the discussion, e.g., at an active WikiProject. WhatamIdoing (talk) 19:44, 13 December 2024 (UTC)Reply
I usually just go straight to AfD. I've found that editors contesting redirects usually !vote keep and discussing on talk just prolongs the inevitable AfD. voorts (talk/contributions) 14:58, 13 December 2024 (UTC)Reply
Gotcha. Plasticwonder (talk) 15:29, 13 December 2024 (UTC)Reply
Looking at the above comments: What is it about the Wikipedia:Proposed article mergers process that isn't working for you all? If you redirect an article and it gets reverted, why aren't you starting a PM? WhatamIdoing (talk) 21:37, 16 December 2024 (UTC)Reply
For me, it's lack of participation, no tool to list something at PAM, and no relisting option so proposed merges just sit for a very long time before being closed. voorts (talk/contributions) 23:21, 16 December 2024 (UTC)Reply
What voorts said. Multiple times now I've floated the idea of making PAM more like RM, one of these years I should really get around to doing something more than that. I won't have time before the new year though. Thryduulf (talk) 23:45, 16 December 2024 (UTC)Reply
I think PAM should be merged into AfD, since both generally involve discussions of notability. voorts (talk/contributions) 00:00, 17 December 2024 (UTC)Reply
Merging often involves questions of overlap and topical distinction rather than just notability, although this also ends up discussed at AfD. I do wonder if this would leave proposals to split out in the cold though, as much like merge discussions they just sit there. CMD (talk) 04:00, 17 December 2024 (UTC)Reply
The most important tool is Twinkle > Tag > Merge. I personally prefer its "Merge to" option, but there's a plain "Merge" if you don't know exactly which page should be the target.
All merges get bot-listed in Wikipedia:Article alerts. Wikipedia:Proposed article mergers is another place to advertise it, and I'd bet that Twinkle could post those automatically with relatively little work (an optional button, similar to notifying the creator of deletion plans).
I dislike "relisting"; things should just stay open as long as they need to, without adding decorative comments about the discussion not happening fast enough. In my experience, merge proposals stay open because everyone's agreed on the outcome but nobody wants to do the work. WhatamIdoing (talk) 06:46, 17 December 2024 (UTC)Reply
In this context isn't redirection a *type* of deletion (specifically delete while leaving a redirect)? Horse Eye's Back (talk) 07:05, 17 December 2024 (UTC)Reply
I would think so. Plasticwonder (talk) 07:33, 17 December 2024 (UTC)Reply
It's only a deletion if an admin pushes the delete button. Blanking and redirecting – even blanking, redirecting, and full-protecting the redirect so nobody can un-redirect it – is not deletion. WhatamIdoing (talk) 07:34, 18 December 2024 (UTC)Reply
That might be clear to you (and the other admins) but almost nobody in the general community understands that (to the point where I would say its just wrong, deletion is broader than that in practice). Horse Eye's Back (talk) 16:23, 18 December 2024 (UTC)Reply
Well, it has always been clear to me, and I am not, and have never wished to be, an admin. But, then again, I am a bit strange in that I expect things to be as people say that they will be. Phil Bridger (talk) 18:34, 18 December 2024 (UTC)Reply
Contested redirects going to AfD makes sense. Articles are redirected for the same reasons they're deleted and redirecting is probably the most common ATD. I've opened plenty of AfDs where my nom recommends a redirect instead of deletion, including when I've BLARed an article and had the BLAR reverted. voorts (talk/contributions) 18:38, 18 December 2024 (UTC)Reply
If a redirect has already been discussed or attempted, and consensus can't be reached easily, then I've got no problem with AFD. What I don't want to see is no discussion, no bold redirects, nobody's even hinted about a merge, and now it's at AFD, when the problem could have been resolved through a less intense method. WhatamIdoing (talk) 19:07, 18 December 2024 (UTC)Reply

Dispute and conflict in the Autism article, would not let add "unbalanced" tag

edit
The following discussion is closed. Please do not modify it. Subsequent comments should be made in a new section. A summary of the conclusions reached follows.
This is now on multiple different pages.[7][8][9][10] Please discuss at Talk:Autism. WhatamIdoing (talk) 20:09, 13 December 2024 (UTC)Reply

Hello all, I can only occasionally attend Wikipedia to edit or respond. I recently went through the current version of the Wikipedia article on Autism , and I found that this article is NOT representing the reality or encyclopedic wholeness. The huge, verbose, highly technical article is biased towards medical model of disability, medical genetics, and nearly zero information regarding the anthropology, evolution, neurodiversity, accommodation, accessibility, Augmentative and alternative communications, and all that actually helps wellbeing of Autistic people. The page boldly focuses on controversial methods such as ABA, such as EIBI (Early intensive behavioral interventions), DTT (discrete trial training) etc. without any mention of the concerns or criticisms against them. I entered the talk page, but it has been turned literally into a warzone, where any dissenting viewpoint is being silenced in name of "global and unanimous scientific consensus" which is simply wrong. It is mostly a view held by biomedical and pharmaceutical majority. But outside of that, opposing viewpoints do exist in actual Autistic populations (who have the lived experience), anthropology, sociology, psychology, etc. I added an "unbalanced" tag for reader information (I did not speak for complete erasure of controversial viewpoints, just needed the reader to know that there are other views), however the "unbalanced" tag was soon reverted.

It is not possible for me to daily attend and post arguments and counter-arguments. I have to acknowledge that, if this kind of silencing continues, this time Wikipedia literally failed as an encyclopedia, as well it failed at public health and education welfare perspective.

I feel like this needs editors' attention. Autism is NOT a well-understood condition by majority, Lived experience play the ultimate role on how a person feel about their life situation, and Nothing about us without us is an important ethics rule in disability cultures.


It worth mentioning, each disabilities are unique, and their lived experiences are different. There are generally 2 paradigms:

  • (1) As if there is a fixed, "normal", gold standard "healthy people", a deviation from that is a pathology, and the society is flawless and 'just'; and any outliers must be assimilated or conformed into the mainstream, or eradicated. It externally defines what is a good life.
  • (2) The second paradigm says, a disability (better said disablement, or dis-abled as a verb) is that the human bodies and minds are inherently diverse, varying, and evolving, with no single fixed "one size fits all" baseline. Also a same person can vary in multiple dimensions (such as seen in Twice exceptional), and the value of a person shouldn't depend on productivity, coincidence of wants is a fallacy; society is NOT just, it needs to be accommodated.

It seems most disabilities fall between a spectrum between a medical impairment and a social incompatibility, rather than purely one end. However, Autism, being mostly a social and communication difference, falls mostly in the second type, and seem to be addressed better with the second (inside out) approach.


If we keep arguing from a narrow perspective of medical biology, we would never know the entire scenario. RIT RAJARSHI (talk) 06:26, 13 December 2024 (UTC)Reply

Without commenting on the actual topic, I would say this sounds like just a content dispute localised on one article, and should be undertaken at the talk page rather than here. If there are reliable relevant sources that are in scope then this topic could be added to the article, but it is your responsibility to find those sources and to defend them if questioned. BugGhost 🦗👻 11:35, 13 December 2024 (UTC)Reply
Thank you, but the dispute is too intense. Also some policies like "nothing about us without us" should be in Wikipedia policy, especially about when a majority voice can suppress a marginalized voice. Esp. information those affect minority groups. Or the voices not well represented, and therefore needs amplification. RIT RAJARSHI (talk) 12:39, 13 December 2024 (UTC)Reply
I've just had a look at the talk page, and I don't think it is by any means too intense. You said your view, with minimal sources, and @Димитрий Улянов Иванов replied cordially to you addressing your concerns. Your reply was to say "stop name calling" (I couldn't see any evidence of name calling) and not much else. Again: I'm not commenting on the actual substance of your point of view - just that, as it stands, the talk page is the right place for this discussion, and you should engage with it in earnest with sources to back your view up. (I support any editor who wants to hat this section.) BugGhost 🦗👻 14:52, 13 December 2024 (UTC)Reply
The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

RfC: Voluntary RfA after resignation

edit

Should Wikipedia:Administrators#Restoration of admin tools be amended to:

  • Option 1 – Require former administrators to request restoration of their tools at the bureaucrats' noticeboard (BN) if they are eligible to do so (i.e., they do not fit into any of the exceptions).
  • Option 2 – Clarify