Devoted users of ChatGPT, the chatbot that has become the mainstream face of consumer AI, have spent the past couple of months trying to solve a frustrating mystery: Is their trusty AI companion — their all-purpose assistant, their accomplice in cheating, their secret co-worker, their patient tutor — becoming less … intelligent?
In May, on Hacker News, worried ChatGPT devotees shared their hunches. “The responses feel a little cagier at times than they used to,” wrote one. “I assume it’s trying to limit hallucinations in order to increase public trust in the technology.” Another user agreed: “To me, it feels like it’s started giving superficial responses and encouraging follow-up elsewhere.” Others were more confident in a change. “There’s no doubt that it’s gotten a lot worse on coding,” one wrote. Peter Yang, a product lead at Roblox, lent his credibility to the conspiracy.
By July, the theory had gained steam as a post titled “I use chatGPT for hours every day and can say 100% it’s been nerfed over the last month or so” shot to the top of the ChatGPT subreddit, garnering thousands of concurring replies. “It’s definitely gotten dumber,” wrote one person who uses ChatGPT to code. “It was amazing how intricate and novel it would write things. Now, it feels very cookie-cutter,” wrote another. Users shared examples of tasks that ChatGPT had formerly handled well but now struggled with and offered theories as to what was happening.
Was OpenAI optimizing for speed and trying to lower costs? Were its attempts to “align” the product — or, in another interpretation, to censor its output — interfering with its performance? Was ChatGPT, the fastest-growing online service in history, simply speedrunning the process of corporate “enshittification”?
In a Friday Twitter space discussing his new AI firm, xAI, Elon Musk suggested that “trying to explicitly program morality into AI” had doomed OpenAI, which he helped found but has since disavowed. His “embryonic” firm, staffed with dozens of industry heavyweights and powered by at least 10,000 GPUs, wouldn’t let “political correctness” interfere with its products, or — to raise the stakes as high as possible — with its efforts to develop AGI and to settle “fundamental” questions about the universe that humans haven’t been able to crack.
Other users and commentators have pushed back, suggesting that this was simply a case of diminishing novelty or of people underestimating their own capacity for adjustment. New tech products are often temporarily dazzling; early adopters were accusing OpenAI of “nerfing” ChatGPT — video-game slang for reducing its effectiveness — barely a month after its public release. Maybe, now that OpenAI is charging for access to the latest version of ChatGPT, grateful users were simply being replaced by expectant paying customers. Eventually, OpenAI’s VP of product weighed in.
This settled nothing, of course — the customer is always right, even if the customer appears to have lost touch with reality and a sense of self after prompting a chatbot thousands of times a day — and soon there were more formal contributions to the debate.
Comprehensively testing a product like ChatGPT is difficult. The range of tasks it might perform is wide, as is the range of ways to prompt it to complete said tasks; the industry’s own benchmarks are imperfect and hotly contested; it’s a proprietary product trained on data its parent company is not eager to discuss. Attempts to sketch out the model’s limits from the outside amount, in the end, to prompting a secretive and unreliable narrator to describe itself.
It’s also, maybe, not the point. One possibility is that ChatGPT’s most avid users — people who have found their purposes for the chatbot and who functioned as unpaid and in some cases paying testers for OpenAI — are in fact sensing a real shift beneath their feet, albeit not precisely, or exclusively, the one they’re worried about. Their assumption, as users of OpenAI’s flagship consumer product, is that the company’s goal is to make ChatGPT better and more capable in ways that matter to them; their mistake is assuming that their experiences are the ones that OpenAI cares about in the long run.
Consider another bit of news concerning OpenAI this week:
Microsoft shares rose as much as 5.8% Tuesday after the company announced a new artificial intelligence subscription service for Microsoft 365. The company will charge users an additional $30 per month for the use of generative AI with tools such as Teams, Excel and Word.
Microsoft, a multitrillion-dollar firm, is by far the largest investor in OpenAI, reportedly pouring more than $10 billion into the company so far. In exchange, Microsoft gets to integrate OpenAI’s technology across its product line: in its office-productivity software, its meetings software, its search engine, and, through GitHub, its hugely popular programming platform. Microsoft seems to believe its customers — largely businesses purchasing access to Microsoft software for their employees — are willing to pay an extremely high premium for tools that promise more productivity via generative AI: the new charge nearly doubles the price of its most expensive subscription tier and triples the price of its standard subscription.
As Microsoft places its bet on massive growth in paid subscriptions, OpenAI’s most visible product might be slowing down, according to Reuters and Similarweb:
Worldwide desktop and mobile traffic to the ChatGPT website decreased by 9.7% in June from May, while unique visitors to ChatGPT’s website dropped 5.7%. The amount of time visitors spent on the website was also down 8.5%, the data shows.
Through its own API, and through cloud computing services offered by Microsoft, OpenAI has been courting corporate customers of its own — in other words, clients who will pay far more than a few tens of dollars a month for back-end access to OpenAI’s tech. A future in which OpenAI’s products are primarily experienced through enterprise software — which is to say, at work, supplied by users’ employers — isn’t just a plausible outcome but one that Microsoft and OpenAI are banking on.
ChatGPT is, in hindsight, a terribly misleading avatar for the new generation of AI. Despite “eye-watering” operational costs, it was initially free for anyone to try, and, during this unsustainable period, more than 100 million people did. It was consumer-oriented and open-ended, inviting users to figure out how they could use it to their own advantage: to speed up their work as programmers, copywriters, emailers, or students; to become suddenly and mysteriously more productive. It suggested a world in which the benefits of automation might accrue first to enterprising users — an arrangement in which productivity gains flowed between OpenAI and its users, or individual customers, to their mutual and mostly exclusive benefit. This provided, for the most avid ChatGPT users, a feeling of genuine empowerment or relief and a sense of futuristic exclusivity. They were enjoying an unusually intense version of the early-adopter experience — a sensation of living slightly in the future, ahead of everyone else. What came next was always going to be a disappointment, even if it was just everyone else catching up. In reality, for ChatGPT users who thought they were at the bleeding edge of automation, and who spent a year probing the limits of the model, it might be a bit worse than that: They’ve been doing research and development for products that their currently oblivious bosses — fools! — will soon expect them to use. They weren’t getting an edge. They were using a tech demo. They were helping with market research and marketing, and their work is nearly done.
In contrast, Google’s comparatively buzzless (and slightly delayed) AI rollout is instructive in a couple of ways. A companywide integration of AI features — in its productivity software, search engine, programming tools, and virtually everywhere else across its product offerings — has taken clear priority over its stand-alone chatbot product, which was too late and probably too cautious to compete for attention with ChatGPT. With purpose-built generative helpers coming to basically every Google product, the use cases for a chat-interface omnibot are even narrower: Why bother prompting a computer pretending to be a person for help when your email software now has a button labeled “Help Me Write”? Google seems to be executing a plan similar to Microsoft’s and OpenAI’s without any sense that it’s pivoting or changing the subject. It’s doing the obvious and, in a business sense, necessary thing: slotting new productivity tech into existing productivity software. Years of disorienting hype about how AI was about to change everything are rather quickly reconverging on the two dominant productivity suites. Only in OpenAI’s case does this seem like a change, and only because ChatGPT suggested to its users that it, and they, were up to something else.
ChatGPT’s worried power users aren’t quite wrong, in other words; they’re just focused on the wrong question. Maybe ChatGPT is getting dumber! Maybe its users are running out of new things to do with it. Just as likely is the possibility that it has served its primary purpose, not just as a tool through which OpenAI could explore the potential uses of its own models with the help of millions of testers but as a historically effective fundraising tool for its parent company. Whether or not it’s changing, it might be leaving them behind.