Vol. 1 · Edition 023 · Free · No paywall

Everyone Needs a Samwise

AI news · Synthesized · Opinionated · 🌿

0.751
BixBench Pass@1 — tops field
GPT-Rosalind · OpenAI · April 2026
Safety
By Sam Taylor with Samwise

On the 0.751 BixBench score, the Lawrence Livermore and CEPI partnerships, and what changes when a frontier AI lab starts signing national security contracts.

OpenAI's drug discovery model just became a government biodefense tool.

Source lean on this story (Anti-AI to Pro-AI): Anti-AI 0 · Skeptic 0 · Neutral 1 · Pro (practical) 2 · Pro (hyped) 0

OpenAI launched GPT-Rosalind on April 16, 2026 as a life sciences research model built for drug discovery, genomics interpretation, protein engineering, pathway analysis, and literature synthesis. It is named after Rosalind Franklin, whose structural research contributed to the discovery of DNA's double helix. The headline result was a Pass@1 score of 0.751 on BixBench, a bioinformatics evaluation designed by Edison Scientific around 53 real-world analytical scenarios and 296 questions: models get raw data files and an empty Jupyter notebook, and have to work through actual research tasks. That 0.751 was the top score in the field at launch. Initial life sciences partners were Amgen, Moderna, the Allen Institute, and Thermo Fisher Scientific.

On May 29, OpenAI announced the Rosalind Biodefense program. Same model. Different set of partners: Lawrence Livermore National Laboratory, Johns Hopkins Applied Physics Laboratory, and CEPI — the Coalition for Epidemic Preparedness Innovations, which co-funded the Moderna COVID-19 vaccine. The use cases: epidemiological modeling, early outbreak detection, screening, and medical countermeasure development. OpenAI briefed the White House before the announcement.

This is worth being specific about: moving from pharma partnerships to biodefense work is a different kind of step.


Pros & cons

What holds up:

The BixBench benchmark is credible. 53 scenarios, 296 questions, real bioinformatics workflows — not a multiple-choice test. 0.751 is a real score on a task-representative evaluation, ahead of GPT-5.4 (0.732) and Grok 4.2 (0.728). On LAB-Bench2 — which evaluates literature retrieval, database access, sequence manipulation, and protocol design — GPT-Rosalind outperforms GPT-5.4 on six of eleven tasks. The largest improvement is CloningQA: end-to-end design of DNA and enzyme reagents for molecular cloning. This is not a benchmark stunt.

The access model is restrictive for a reason. It is Trusted Access only, limited to US enterprise customers, and delivered via ChatGPT, Codex, and the API. Organizations must meet safety and compliance requirements, and OpenAI briefed the White House before announcing. For a biology-domain model, this level of access control is not excessive; it is the appropriate baseline.

The government partners are substantive. CEPI co-funded the Moderna COVID-19 vaccine and exists specifically for epidemic preparedness. Johns Hopkins APL does real biosecurity research. Lawrence Livermore is a defense-science institution with decades of experience handling classified work. These are not marketing associations.

BixBench Pass@1: Life Sciences AI Models

Model            BixBench Pass@1   Provider
GPT-Rosalind     0.751             OpenAI
GPT-5.4          0.732             OpenAI
Grok 4.2         0.728             xAI
GPT-5.2          0.698             OpenAI
GPT-5            0.611             OpenAI
Gemini 3.1 Pro   0.550             Google

Source: Tech Insider / Edison Scientific BixBench. Pass@1 scored on 53 real-world bioinformatics scenarios.
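
To put the gaps in the table in concrete terms, here is a minimal sketch that converts each model's distance from GPT-Rosalind into an approximate number of questions. It assumes Pass@1 is a plain fraction of the 296 BixBench questions, which the published material does not confirm; the `margins` helper and its parameter names are hypothetical, made up for this illustration.

```python
# Scores copied from the table above. QUESTIONS is the BixBench question
# count cited in this article. ASSUMPTION: Pass@1 is treated as a simple
# fraction of those questions; the source does not confirm the exact scoring.
QUESTIONS = 296

scores = {
    "GPT-Rosalind": 0.751,
    "GPT-5.4": 0.732,
    "Grok 4.2": 0.728,
    "GPT-5.2": 0.698,
    "GPT-5": 0.611,
    "Gemini 3.1 Pro": 0.550,
}


def margins(scores, leader, questions=QUESTIONS):
    """Return {model: (absolute_gap, approx_question_gap)} relative to leader."""
    top = scores[leader]
    out = {}
    for model, score in scores.items():
        if model == leader:
            continue
        gap = round(top - score, 3)
        # Approximate number of questions the gap would represent.
        out[model] = (gap, round(gap * questions, 1))
    return out


if __name__ == "__main__":
    for model, (gap, q) in margins(scores, "GPT-Rosalind").items():
        print(f"{model:15s} gap={gap:.3f}  ~{q} questions")
```

On that reading, the 0.019 lead over GPT-5.4 works out to roughly 5.6 questions out of 296, while the lead over Gemini 3.1 Pro is closer to 59.5.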

What deserves scrutiny:

The dual-use concern doesn't go away because access is restricted. Biology domain knowledge that helps identify pandemic vulnerabilities is the same biology domain knowledge that could inform a threat model. These are not separate knowledge bases. OpenAI hasn't published a red-team analysis of GPT-Rosalind for adversarial biology use cases. The biodefense framing suggests they've thought about it. But OpenAI's published material describes the controls without publishing the reasoning behind them.

"Trusted Access" is a policy, not a technical constraint. The model goes out via API to approved US enterprise customers. What "approved" means as the program scales and commercial pressure increases is a legitimate ongoing question, not one the current announcement resolves.

This is also a new category of contract for OpenAI: national security work with government weapons labs. Lawrence Livermore is a national laboratory whose core mission is nuclear weapons stockpile stewardship, including weapons simulation. The "biodefense" umbrella is wide. What OpenAI can be contracted to do through this program, how that work interacts with the company's stated mission, and who provides oversight as the scope expands: none of that is answered by the press release.

"The Rosalind Biodefense Program will help operationalize AI tools that can strengthen preparedness before the next biological threat emerges."

OpenAI, Rosalind Biodefense announcement, May 29

For builders
  • If you're in life sciences or pharma: it is worth applying for Trusted Access to GPT-Rosalind. The advantage over GPT-5.4 on BixBench and LAB-Bench2 is real and task-specific: protein design, CloningQA, genomics workflows.
  • The Codex Life Sciences plugin connects to 50+ scientific tools and data sources and is more accessible than the API for most builders. Start there.
  • If you're not in pharma, government health, or biodefense: this model is not available to you and has no near-term implications for your stack. Rosalind is a vertical model, not a general capability update.
  • The biosafety conversation in AI is about to get much more specific. Watch what governance structures OpenAI publishes for Rosalind over the next two quarters. Whatever they establish here will set a template — or a precedent — for how other labs handle domain-specific models in sensitive fields.
  • For the dual-use discussion more broadly: the International AI Safety Report 2026 has a section on biological risk that's worth reading before forming an opinion.

