
Price is unchanged from Gemini 3 Pro: $2/M input, $12/M output. https://ai.google.dev/gemini-api/docs/pricing

Knowledge cutoff is unchanged at Jan 2025. Gemini 3.1 Pro supports "medium" thinking where Gemini 3 did not: https://ai.google.dev/gemini-api/docs/gemini-3

Compare to Opus 4.6's $5/M input, $25/M output. If Gemini 3.1 Pro does indeed have similar performance, the price difference is notable.
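A quick sketch of what those list prices mean per request, using the numbers quoted above. The token counts are hypothetical, chosen only for illustration:

```python
# Rough per-request cost comparison at the quoted list prices.
# Prices are (input $/M tokens, output $/M tokens); token counts are made up.

PRICES = {
    "gemini-3.1-pro": (2.00, 12.00),
    "opus-4.6": (5.00, 25.00),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request at the quoted list prices."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

# Example: a request with 50k input tokens and 5k output tokens.
for model in PRICES:
    print(f"{model}: ${request_cost(model, 50_000, 5_000):.3f}")
```

At equal token counts the gap is roughly 2.3x, though (as discussed below) real token counts differ by model.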



Now compare the monthly plans for business users who want the CLI agent but who don’t want the models trained on their data.

OpenAI: no big deal — sign up, pick your number of seats, and you’re all set.

Anthropic: also no big deal but there’s an obnoxious minimum purchase.

Google: first you have to try to figure out what the product is called. Then you need to figure out how to set the correct IAM rules. Then you have to sign up and pay for it. Maybe you succeed. Maybe you give up after an hour or two of cursing. Gemini is, of course, completely unable to help. (OpenAI clearly has not trained their models on how to operate their tools. Google’s models hallucinate Google’s product offerings so outrageously that I’m not sure I could tell. I haven’t asked Claude about Claude Code.)

At least the monthly pricing is similar once you get over the hurdles.


Well, some are using Anthropic via AWS Bedrock, which is a bit more like the Google paragraph. Perhaps it's a good thing that Nova models aren't competitive (and many here are asking "what's a Nova model?"). And remember, many businesses aren't flinching at IAM controls; they're asking for data privacy contracts.


Well some are masochists.


There's a reason Google model usage on OpenRouter is so high - it's easier to pay the OpenRouter tax than it is to figure out how to pay Gemini directly.


Google is a cloud provider, so API usage is funneled through GCP. It's the same for Microsoft and Amazon.


By that logic, G Suite should be funneled through GCP.

Also, are you sure you meant to mention Microsoft? Microsoft has this Copilot thing that they will gladly sell you, with generally inoffensive commercial terms, through more channels than you can shake a stick at. Got a $4 GitHub for Teams subscription? Add $20 or so and you will be swimming in Copilot outputs, and all you have to do is check the checkbox.


Got a free Gmail account? Add $20 or so and you'll be swimming in Gemini outputs. Yet both companies also have a cumbersome onboarding process if all you want to do is get an API token. So yeah, quite similar!

I can confirm the product-naming bit; I tried to use Gemini to help with G Suite admin.


If we don't see a huge gain in long-horizon thinking reflected in Vendor-Bench 2, I'm not going to switch away from CC. Until Google can beat Anthropic on that front, Claude Code paired with the top long-horizon models will continue to pull ahead with full-stack optimizations at every layer.


You cannot just directly compare prices like this. It's like comparing share prices: it doesn't mean much unless you also know how many tokens the models use.

For example, GPT-5.2 is even cheaper than Gemini, but in real-world usage it ends up costing similar amounts to Opus 4.6 because it uses a lot more tokens.
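A toy illustration of that point: effective cost is price times tokens actually consumed, so a cheaper-per-token model that thinks in many more tokens can cost the same per task. All numbers below are made up for illustration, not real prices or benchmarks:

```python
# Effective cost = output price per million tokens * output tokens used.
# Illustrative numbers only; not real list prices or measured token counts.

def effective_cost(out_price_per_m: float, output_tokens: int) -> float:
    """Dollar cost of the output tokens for one task."""
    return out_price_per_m * output_tokens / 1_000_000

# A "cheaper" model that emits 2.5x the tokens ends up costing the same:
cheap_verbose = effective_cost(10.00, 30_000)  # $0.30
pricey_terse = effective_cost(25.00, 12_000)   # $0.30
print(cheap_verbose, pricey_terse)
```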


The only thing I don't like about Gemini models (in the Gemini CLI) is that there's no transparency about which model I'm using. I can start with Pro and get downgraded, sometimes even to Gemini 2.5 Flash Lite.


still no minimal reasoning in G3.1P :(

(this is why Opus 4.6 is worth the price -- turning off thinking makes it 3x-5x faster but it loses only a small amount of intelligence. nobody else has figured that out yet)


You can turn off thinking in Gemini pro models by using completion mode.

Essentially, append a message with role=model and a minimal text part, such as a simple "A", at the end of the "contents" array. The model will try to complete that message without using any thought tokens.

You can also set the model message to start with "think" or something along those lines and watch it think out loud (or melt down from over-thinking and stop after hitting the maximum output tokens).

```
[
  { "parts": [{"text": "hello"}], "role": "user" },
  { "parts": [{"text": "*think"}], "role": "model" }
]
```
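A minimal sketch of building that prefill payload in Python. The body shape follows the comment above; the target endpoint and model name are assumptions to verify against the current Gemini API docs:

```python
# Builds the "prefill" request from the comment above: ending the contents
# with a stub model turn so the model completes it without thought tokens.
import json

def prefill_request(user_text: str, prefill: str = "A") -> dict:
    """generateContent body ending in a role=model turn for the model to complete."""
    return {
        "contents": [
            {"role": "user", "parts": [{"text": user_text}]},
            {"role": "model", "parts": [{"text": prefill}]},
        ]
    }

body = prefill_request("hello")
print(json.dumps(body, indent=2))
# POST this body to the models/<model>:generateContent endpoint with your API key.
```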


TIL gemini still supports completion mode, that's super useful!

Thinking has always been just tacked on for Anthropic's models, so leaving it off actually produces better results every time.


What about for analysis/planning? Honestly I've been using thinking, but if I don't have to with Opus 4.6 I'm totally keen to turn it off. Faster is better.


I've always just used "Plan mode" in Claude Code; I don't know if it uses thinking. I also have "MAX_THINKING_TOKENS" in my settings.json set to "0". I didn't notice a drop in performance, and I find it better because it doesn't overthink ("wait, let me try..."). It likely depends on the case (as so often with AI). For me, it's better without thinking.
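For reference, here is a sketch of what that settings.json might look like. This assumes the env-variable mechanism works as the comment describes; check the Claude Code settings docs for the current schema:

```
{
  "env": {
    "MAX_THINKING_TOKENS": "0"
  }
}
```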


> Knowledge cutoff is unchanged at Jan 2025.

Isn't that a bit old?


Old relative to its competitors, but the Search tool can compensate for it.


It could in practice. Just get ready for some very interesting thinking tokens, akin to a psychotic break, once it interacts with the "simulated reality" and "the user‘s fabrication of a nonexistent timeline within the hypothetical future".

Gemini 3.0 was convinced that my dependency versions pinned in package.json were hallucinated by an AI, because they "shouldn't yet exist". I just hope this kind of behavior is gone.


Looks like it's cheaper than Codex? This might be interesting, then.


It's not trained for agentic coding, I don't think.


Sounds like the update is mostly system prompt + changes to orchestration / tool use around the core model, if the knowledge cutoff is unchanged


The knowledge cutoff staying the same likely means they didn't do a new pre-train. We already knew DeepMind planned to integrate new RL changes in the post-training of the weights. https://x.com/ankesh_anand/status/2002017859443233017


This keeps getting repeated for all kinds of model releases, but isn’t necessarily true. It’s possible to make all kinds of changes without updating the pretraining data set. You can’t judge a model’s newness based on what it knows about.



