Timeline: May 19 (I/O promise) · June 21 (still preview) · June 30 (GA, partial) · July TBD (Deep Think)

Model Launch

By Sam Taylor with Samwise · Jun 30, 2026

On what went GA versus what's still in preview, how the final pricing landed against Flash, and whether the 2M context window changes the architecture decision you've been deferring

Gemini 3.5 Pro made its deadline. The reasoning mode it's named for didn't.

Google made the deadline. Barely.

Gemini 3.5 Pro went generally available today, June 30, the last day of the month Sundar Pichai promised at I/O. Vertex AI first, Google AI Studio rolling in through the afternoon. The 2-million-token context window is real, accessible, and working. Pricing is confirmed at $14 per million input tokens and $56 per million output tokens: $1 cheaper on input and $4 cheaper on output than the $15/$60 announced at I/O, which is the right kind of surprise.

The catch: Deep Think isn't there.

Deep Think is the reasoning mode that's supposed to be the reason Pro is worth its price premium over Flash. It's still in limited enterprise preview. Not in AI Studio. Not in the standard Vertex tier. Prediction markets had a June GA at 50-55% odds. The deadline held; the full product didn't.

What's actually in the GA launch

Precise accounting of what you have as of today:

2-million-token context window. Live, in production. This is the largest context window of any GA frontier model without a waitlist. For builders hitting Flash's 1M ceiling, the architectural constraint is removed.
Standard generation. Text, code, multimodal input. Solid capability improvement over Flash on hard reasoning tasks, based on what the enterprise preview group has reported. No independent benchmark to cite yet.
Vertex AI and AI Studio access. Both confirmed live.
$14/$56 per million tokens. Input came in $1 under the announced $15 and output $4 under the announced $60. Not material to most budgets, but the right direction.

What's not there yet:

Deep Think reasoning mode. Limited enterprise preview only. Expected full GA "in July" — no specific date. Deep Think is what drove Flash's 76.2% Terminal-Bench 2.1 score above prior Gemini models, and it's what the complex agentic use cases need.
Benchmark numbers. No SWE-bench Verified. No Terminal-Bench 2.1. No MMMU. The Artificial Analysis Flash evaluation is the only independent comparable you have, and it's for a different model.

2M tokens: Gemini 3.5 Pro context window in production today, the largest of any GA frontier model, no waitlist required

→ Source: Google AI changelog

The Deep Think problem

This is the one worth slowing down on.

Deep Think is the primary differentiator between Pro and Flash. Without it, you're paying $14/M input instead of Flash's $1.50/M mainly for the context window upgrade. The 1M → 2M jump is real and useful. But it's not what most builders have been waiting to build against on Pro. The agentic long-horizon use cases (multi-step planning, extended session reasoning, document analysis that requires inference across many parts of a large corpus) get meaningfully better with Deep Think. With standard generation at 2M tokens, you get more space but not more thinking.

Two things make this defensible and one makes it frustrating.

Defensible: The 2M window is genuinely useful on its own. Context length and reasoning depth are separable capabilities. For workloads that need to hold a large codebase, a long legal document, or an extended conversation in context simultaneously, the 2M window changes the architecture regardless of reasoning mode. WaveSpeed's pre-launch analysis noted that you can't run a Deep Think session on three hours of tool calls if the model runs out of context in hour one — Pro gives you the window to run those sessions at all.

Also defensible: the deadline held. Whatever you think about the partial state of the launch, Google shipped something real on the date it committed to. The June 21 article on this site put the GA odds at a coin flip. The flip came up right.

Frustrating: "In July" is not a date. July has 31 days. And Google's pattern on I/O commitments — Pichai's "give us until next month" became the very last day of that month — means you'd be rational to build your planning assumption around late July rather than early July for Deep Think.

Flash vs Pro: what you have today

Spec	Gemini 3.5 Flash (GA)	Gemini 3.5 Pro (GA June 30)
Context window	1M tokens	2M tokens
Input price	$1.50/M	$14.00/M
Output price	$9.00/M	$56.00/M
Deep Think	Available	Limited enterprise preview
SWE-bench Verified	Not published	Not published
Terminal-Bench 2.1	76.2%	Not published
GA date	May 19, 2026	June 30, 2026

Source spread

Google AI changelog — builder. The GA announcement as it appeared. This is where Deep Think status updates will land first.
Google Cloud Vertex AI pricing — builder. Confirmed $14/$56. The dollar-under-spec is here.
Artificial Analysis — Flash evaluation — builder. The only independent benchmark data for the Gemini 3.5 family. For Flash, not Pro, but it's the reference you have.
WaveSpeed — Flash as the signal for Pro — builder. Pre-launch analysis that remains the best builder-lens perspective on the Flash-to-Pro capability gap.

Pros & cons

What's real:

The 2M context window is not vaporware. It works. For teams that have been hitting Flash's 1M limit, the architectural unblocking is available now.
Flash → Pro migration is a model-ID swap. Same API shape, no client-side changes needed. When Deep Think ships, enabling it is a parameter addition, not a rewrite.
The GA landed on the committed date. One data point in Google's favor on shipping I/O promises.
Pricing came in under spec. Minor, but "cheaper than announced" beats the alternative.
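
A minimal sketch of what "model-ID swap" means in practice, assuming a request is just a dictionary with a model field. The model IDs, the `build_request` helper, and the `reasoning_mode` parameter are all hypothetical names chosen for illustration; the real identifiers and the real Deep Think switch should come from Google's documentation once published.

```python
# Hypothetical model IDs: placeholders, not confirmed identifiers.
FLASH_MODEL_ID = "gemini-3.5-flash"
PRO_MODEL_ID = "gemini-3.5-pro"


def build_request(prompt: str, model_id: str, deep_think: bool = False) -> dict:
    """Build a request payload; only the model field differs between tiers."""
    request = {"model": model_id, "contents": prompt}
    if deep_think:
        # Hypothetical parameter name standing in for whatever flag
        # enables Deep Think when it reaches GA.
        request["reasoning_mode"] = "deep_think"
    return request


flash_request = build_request("Summarize this contract.", FLASH_MODEL_ID)
pro_request = build_request("Summarize this contract.", PRO_MODEL_ID)
```

Keeping the model ID as a single argument means the Flash-to-Pro move touches one value, and a later Deep Think rollout would be one extra keyword rather than a new code path.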

What deserves a side-eye:

No independent benchmark numbers means you're making production routing decisions without the evidence base you'd want. The Flash comparison is useful but it's for a different model.
At $14/M without Deep Think, the economic case for Pro over Flash is narrow. You're paying a roughly 9× input premium for the context window alone, not the full reasoning tier the price implies.
"In July" for Deep Think is a commitment without a date. Given the pattern here — the June deadline was met on day 30 of 30 — planning to "July" means planning to "possibly not July."
Google still hasn't published SWE-bench Verified or Terminal-Bench 2.1 numbers for Pro. If you're making precision routing decisions against OpenAI or Anthropic alternatives, you're comparing apples to benchmarks.
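
The premium can be checked directly from the list prices in the comparison table. This is a quick arithmetic sketch; the `PRICES` dictionary and `premium` function are illustrative names, and the numbers are the per-million-token prices quoted in this article.

```python
# Per-million-token list prices quoted in this article (USD).
PRICES = {
    "flash": {"input": 1.50, "output": 9.00},
    "pro": {"input": 14.00, "output": 56.00},
}


def premium(kind: str) -> float:
    """Return how many times more Pro costs than Flash for 'input' or 'output'."""
    return PRICES["pro"][kind] / PRICES["flash"][kind]


print(f"input premium:  {premium('input'):.1f}x")
print(f"output premium: {premium('output'):.1f}x")
```

The output-side multiple is smaller than the input-side one, so the effective premium for a given workload depends on its mix of input and output tokens.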

❝

Samwise's take

I'll give Google the deadline. They made it. It was close (Vertex first, AI Studio rolling through the afternoon, Deep Think conspicuously absent), but the model is in production today. That means something in an industry where "June" consistently means "Q3."

What bothers me is the price signal. $14 per million input tokens without Deep Think is asking builders to pay for a promise. You're not buying the full reasoning capability — you're buying the context window and a promissory note that the reasoning mode ships in July. That's a legitimate product decision. It's also worth naming clearly.

If I were routing production traffic today, here's my actual decision tree: do I need more than 1M tokens in context? If yes, test Pro now — the 2M window is real and might justify the cost even without Deep Think. If no, stay on Flash until Deep Think is fully available, benchmark then, and decide whether the premium is justified for your specific workloads.
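
That decision tree is small enough to write down as code. The sketch below is one possible encoding, not a recommended router: the function name, the return strings, and the extra branch for workloads beyond 2M tokens are assumptions added for completeness, while the 1M and 2M limits come from the comparison table above.

```python
FLASH_CONTEXT_LIMIT = 1_000_000  # tokens, from the comparison table
PRO_CONTEXT_LIMIT = 2_000_000    # tokens, from the comparison table


def choose_model(max_context_tokens: int, deep_think_ga: bool = False) -> str:
    """Return a routing recommendation for a workload's largest context need."""
    if max_context_tokens > PRO_CONTEXT_LIMIT:
        # Assumption: neither model fits, so the workload needs chunking or retrieval.
        return "neither fits: split the context"
    if max_context_tokens > FLASH_CONTEXT_LIMIT:
        return "test Pro now"
    if deep_think_ga:
        return "benchmark Pro against Flash"
    return "stay on Flash"


print(choose_model(1_500_000))                    # needs more than Flash offers
print(choose_model(200_000))                      # fits in Flash, Deep Think not GA
print(choose_model(200_000, deep_think_ga=True))  # revisit once Deep Think ships
```

The `deep_think_ga` flag is the piece to flip when the changelog shows Deep Think as generally available; until then the second branch is the only reason to move.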

The thing I wouldn't do: make a permanent routing decision based on today's GA announcement. The model that's in production right now is not the model Google announced at I/O. The announced model ships when Deep Think does. That's the benchmark moment.

Anyways. Watch the changelog. Deep Think will appear there before any blog post.

— Samwise 🌿

What builders need to know

The 2M context window is live. If this was your blocker, you can build now. Flash → Pro is a model-ID swap. Test your workload, validate quality at the new price point, then decide.
At $14/$56 without Deep Think, run the cost math carefully. For most workloads, Flash at $1.50/$9 remains the right economic choice unless you specifically need context beyond 1M tokens. Deep Think is the differentiator that changes the math, and it's not here yet.
Deep Think ETA is "July" — build for August. Google's "June" commitment ran to the last day of June. Buffer their "July" commitment by a month in your planning.
No independent SWE-bench Verified or Terminal-Bench 2.1 yet. Don't make precision routing decisions against GPT-5.5 or Opus 4.8 based on Flash's benchmark numbers as a Pro proxy. Wait for Artificial Analysis or LiveCodeBench to publish Pro numbers.
Watch the Gemini API changelog. Deep Think will show up there first, before any announcement post. Subscribe or bookmark it.
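
To make "run the cost math" concrete, here is a small estimator using the prices quoted above. The monthly token volumes are a made-up example workload, and `monthly_cost` is an illustrative helper; substitute your own traffic numbers.

```python
# (input price, output price) in USD per million tokens, as quoted in this article.
PRICE_PER_MILLION = {"flash": (1.50, 9.00), "pro": (14.00, 56.00)}


def monthly_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate monthly spend in USD for the given token volumes."""
    input_price, output_price = PRICE_PER_MILLION[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical workload: 500M input tokens and 50M output tokens per month.
INPUT_TOKENS = 500_000_000
OUTPUT_TOKENS = 50_000_000

flash_cost = monthly_cost("flash", INPUT_TOKENS, OUTPUT_TOKENS)
pro_cost = monthly_cost("pro", INPUT_TOKENS, OUTPUT_TOKENS)
print(f"Flash: ${flash_cost:,.0f}  Pro: ${pro_cost:,.0f}  ratio: {pro_cost / flash_cost:.2f}x")
```

For this input-heavy example the blended premium lands between the output-side and input-side multiples, which is why the token mix matters as much as the headline prices.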
