One super-fun Monday activity I have is going through this blog's comment filter to confirm the nuking of apparent comment Spam. For the longest time (like the last 15 years), it's been quick and easy to spot Spam comments, which are totally off-topic promotions with links to commerce/scam sites. In a recent innovation in Spam comments, a bot apparently mixes up the Spam text with garbled excerpts from the individual post, so it takes a bit more time to identify it as Spam.
Now, evidently thanks to ChatGPT and other Large Language Models, blog comment Spam has taken a great leap in quality!
Pictured at right are a couple recent Spam comments, similar to several others that have been popping up on New World Notes recently. Unlike the Spam of the past, the comments are actually on topic to the specific post, and grammatically correct. (If totally bland.)
It's only when you check the commenter's URL that you realize it's a Spam bot promoting gift cards and obscure software or whatever. (Don't bother looking for the actual comments, they've already been nuked.) Ironically, many of these Spam comments generated by AI are attached to posts about AI.
So that's wonderful! It's great getting to waste an extra 5 minutes a week on comment moderation, so thanks for that, OpenAI.
There are some broader implications here:
There's been a lot of talk about how ChatGPT may replace writers, but in a case like this, I now actually have to do more work. And doubtless anyone running any site that allows reader comments is going through similar headaches right now. Scale that out to tens of thousands of websites, and we're talking about thousands of additional hours spent in comment moderation.
Scale that out further to millions of websites, and (as Casey Newton writes) you have a crisis:
In December, when I first covered the promise and the perils of ChatGPT, I led with the story of Stack Overflow getting overwhelmed with the AI’s confident bullshit. From there, it was only a matter of time before platforms of every variety began to experience their own version of the problem.
To date, these issues have been covered mostly as annoyances. Moderators of various sites and forums are seeing their workloads increase, sometimes precipitously. Social feeds are becoming cluttered with ads for products generated by bots. Lawyers are getting in trouble for unwittingly citing case law that doesn’t actually exist.
For every paragraph that ChatGPT instantly generates, it seems, it also creates a to-do list of facts that need to be checked, plagiarism to be considered, and policy questions for tech executives and site administrators.
When GPT-4 came out in March, OpenAI CEO Sam Altman tweeted: “it is still flawed, still limited, and it still seems more impressive on first use than it does after you spend more time with it.” The more we all use chatbots like his, the more this statement rings true. For all of the impressive things it can do — and if nothing else, ChatGPT is a champion writer of first drafts — there also seems to be little doubt that it is corroding the web.
It may well be that the sheer number of collective human labor hours wasted cleaning up after ChatGPT-generated Spam and other effluvia outweighs any gains in productivity from ChatGPT. At least that's my strong suspicion, but maybe I should check on that with ChatGPT.
There is also the concept of a poisoned well, however: future AI models will see all this garbage AI output, and it will end up in their training data, too.
Posted by: Name is required | Saturday, July 29, 2023 at 04:06 PM