Anthropic has pulled the plug on all-you-can-eat LLM code generation. Where will developers turn next?
AI coding assistants are a frequent topic for InfoWorld and probably most tech publications. However, the dirty secret is that this landscape changes daily. Not long ago Claude Code practically obliterated the competition. How? Well, it wasn't that the CLI was amazing (or Aider would have run the board a year ago). It was Anthropic's new Claude Max pricing. $200 got you seemingly all you could eat, and not just Sonnet but Opus 4, the company's higher-end, smarter reasoning model, available through the Claude Code CLI.
The party is over next month. Anthropic just announced new weekly rate limits on top of its already de facto shrunken limits. This means that, for all of the people who have been using Claude to automatically generate code, it suddenly won't be as good a deal.
Gemini and Qwen
Meanwhile, Google launched an open-source alternative, Gemini CLI. Google's plan seemed more generous at first, but the free tier throttles you from gemini-2.5-pro down to gemini-2.5-flash rather quickly. Flash is kinda dumb, so if you were coding on pro and the tool switches to flash, you're not going to like the result. The other option (at least at launch) was to give Google your API key, which combines a much more generous free tier with surprise billing (!). Google gonna Google. The service throws HTTP 500 errors at random, and the CLI is designed to ask flash whether pro should keep talking, sending your whole conversation multiple times. Because you don't mind the token burn and latency, do you?
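To see why that design burns tokens, here's a back-of-the-envelope sketch (my numbers, not Google's): if a continuation check resends the full history to flash on every turn, the tokens sent grow quadratically with conversation length.

```python
# Hypothetical illustration (not Gemini CLI's actual code): the cost of a
# "should the model keep talking?" check that resends the entire
# conversation to a second model on every turn.

def tokens_sent_to_checker(turn_sizes):
    """Total tokens sent to the checker model when each turn
    resends the whole conversation accumulated so far."""
    total = 0
    history = 0
    for size in turn_sizes:
        history += size   # conversation grows by this turn's tokens
        total += history  # full history is resent for the check
    return total

# Ten turns of ~500 tokens each is only ~5,000 tokens of conversation,
# but the continuation checks alone resend 27,500 tokens.
print(tokens_sent_to_checker([500] * 10))
```

The conversation itself is linear in the number of turns, but the checker traffic is the running sum of the history, which is why a long session quietly multiplies your bill.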
I forked Gemini CLI to support other providers, other models, and local models, and to let developers choose things like whether to fall back to flash or not. I wasn't the only one thinking, "great idea, but do I have to use Gemini?" The Qwen folks also forked Gemini CLI as Qwen Code. Weirdly, that tool still calls gemini-2.5-flash both for continuation checks and for tools like web-search and web-fetch. One has to wonder how it works in China.
However, Qwen3-Coder (not to be confused with Qwen Code, though it is also from the Qwen team) is the first open-source model I've been able to accept patches from. It isn't by any means a Claude killer, but it feels like Claude 3.7 Sonnet, maybe even better. That is a huge step up from any previous open-source model. Those models are always hyped and then forgotten a week later. This is the first one to show there is a future for both open-source and made-in-China models.
Supposedly GPT-5 and an improved o3 (or possibly some combination of the two) will arrive soon. The question is whether OpenAI plans to let Anthropic keep the developer market, or whether it will make its $200 ChatGPT plan a good deal for Codex. With OpenAI's designs on Windsurf having fallen apart, this is the opportunity if it wants to stay in the coding game.
Google vs. Google
Google clearly wants to win over developers, and their rumored Kingfall model could give them a chance. The trouble with Google is that they are the vendor you pick because they're cheaper, but they will most likely pull the rug out from under you. The model you were testing may disappear without notice. The service may throw HTTP 500s for an hour without the outage ever appearing on the incident page. This is why AWS remains the king of the cloud, and why Azure is second.
However, if you're generating video or something visual, Google certainly has an interesting edge here. Could they make a reliable model and service to match the cheaper price? Do they have to do surprise billing and rug pulls? Do they have to, well, Google? If not, maybe they're a contender.
Either way, all-you-can-eat Claude Opus is over. People running 32 parallel instances in a swarm all night will have to buy a lot more accounts or pay something like 10x more by the token. While it has always been about whose investors will subsidize us the most, this change gives the other vendors an opening: not to win hearts and minds, but for the chance to let us developers come in and clean out your refrigerator, or at least burn your GPUs for a while.
As the patriot once said, "Give me subsidized LLM code generation or I'll wait for China."


