2026-06-26 · 4 min read

What changed in Anthropic's RSP v2.0

The October 2024 restructure moved the framework from model-version commitments to capability thresholds. It also added two named threshold criteria and a required governance role.

TL;DR

  • Anthropic published v2.0 of its Responsible Scaling Policy on October 15, 2024.
  • The restructure replaced model-version commitments with capability-threshold commitments. The policy now applies regardless of which model ships next.
  • v2.0 introduced explicit ASL-3 and ASL-4 deployment-and-training commitments.
  • Two specific threshold criteria entered the framework: bioweapons capability and autonomy-replication.
  • A Responsible Scaling Officer role became a required governance function.

From model versions to capability thresholds

The pre-v2.0 RSP keyed Anthropic's deployment and training obligations to specific model versions. v2.0 keys them to capability levels. The shift matters because foundation models ship faster than an annual policy revision cycle can track. Tying commitments to capability thresholds keeps the policy in force as model versions iterate, without requiring a fresh policy review for each release.

For counsel reading the document, this is the structural change to internalize first. The first question is no longer whether a commitment applies to a named model, but whether that model meets a given capability threshold. Anthropic's evaluations answer the threshold question. The framework answers what happens next.
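The lookup order described above can be sketched in a few lines of Python. This is an illustrative sketch only, not Anthropic's implementation: the `EvaluationResult` type, the `applicable_commitments` function, and the placeholder commitment strings are all hypothetical, and the real commitments are the ones written in the published policy.

```python
from dataclasses import dataclass

# Hypothetical placeholder text. The actual commitments live in the
# published policy, not in this table.
COMMITMENTS_BY_LEVEL = {
    "ASL-3": "ASL-3 deployment-and-training commitments (placeholder)",
    "ASL-4": "ASL-4 deployment-and-training commitments (placeholder)",
}


@dataclass(frozen=True)
class EvaluationResult:
    """Outcome of a capability evaluation (hypothetical shape)."""
    model_name: str  # any label; deliberately unused in the lookup
    level: str       # capability level the evaluation assigned


def applicable_commitments(result: EvaluationResult) -> str:
    """Return commitments keyed by capability level, never by model name."""
    return COMMITMENTS_BY_LEVEL[result.level]


# Two differently named (made-up) models at the same level
# resolve to the same commitments.
first = EvaluationResult(model_name="model-a", level="ASL-3")
second = EvaluationResult(model_name="model-b", level="ASL-3")
same_commitments = applicable_commitments(first) == applicable_commitments(second)
```

The point of the sketch is the key of the table: swapping in a new model name changes nothing, while a different evaluated level changes the answer.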

Explicit ASL-3 and ASL-4 commitments

v2.0 introduced explicit ASL-3 and ASL-4 commitments. Prior versions referenced these levels but did not attach deployment-and-training commitments to them in comparable detail. The current version specifies what Anthropic commits to do (and not do) when a model evaluates at each level.

The specifics of what each commitment requires are in the v2.0 PDF. Counsel reviewing a partnership or commercial arrangement with Anthropic should read those sections of the published policy directly. Anteroom's role is to flag that they exist and to anchor the version they appeared in.

Two new threshold criteria

Two specific threshold criteria entered v2.0:

  1. Bioweapons capability. Models with sufficient bioweapons-relevant capability trigger specific deployment restrictions and training mitigations.
  2. Autonomy-replication. Models capable of replicating themselves or persisting beyond developer control trigger threshold-specific commitments.

Both are categories evaluators must specifically check before a model proceeds. Neither was named in this way in prior versions.

The Responsible Scaling Officer role

v2.0 made a Responsible Scaling Officer a required role inside Anthropic. The role is a governance commitment as much as a technical one. It assigns a named accountable function for the policy's execution.

For counsel reviewing partnerships with Anthropic, the existence of this role matters because it identifies who internally is answerable for the safety commitments a partnership relies on. Vendor diligence questionnaires asking who at Anthropic is responsible for safety policy execution now have a specific answer.

What this means for AI counsel and safety policy leads

The shift from model-version to capability-threshold commitments is a bigger story than any specific threshold. Safety counsel reviewing a frontier model's deployment readiness should start from this mental model: whatever Anthropic ships next, the same evaluation gates apply at the same capability levels. That is a different review posture from per-release legal sign-off.

For counsel reviewing partnerships or commercial deals where Anthropic is a vendor, RSP version matters because the deployment commitments made to a partner sit on top of whichever framework version is in force at deal time. The active version is v2.3, effective March 31, 2025. Anteroom's verification of the v2.3-specific changes against the public RSP updates page is in progress. The substantive v2.0 commitments above remain the structural baseline.
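For a diligence checklist, "the version in force at deal time" reduces to a date lookup. The sketch below is a minimal illustration under stated assumptions: the `VERSIONS` table and `version_in_force` function are hypothetical, the table lists only the two versions named in this article (any intermediate versions are omitted), and it treats the v2.0 publication date as its start date. A real table should be filled in from the public RSP updates page.

```python
from datetime import date
from typing import Optional

# Assumption: only the two versions named in this article, with the dates
# given here. Intermediate versions are intentionally left out.
VERSIONS = [
    (date(2024, 10, 15), "v2.0"),  # publication date, used as start date
    (date(2025, 3, 31), "v2.3"),   # effective date as stated in this article
]


def version_in_force(deal_date: date) -> Optional[str]:
    """Return the latest listed version whose start date is on or before deal_date."""
    current = None
    for start, label in sorted(VERSIONS):
        if start <= deal_date:
            current = label
    return current


# A deal signed between the two listed dates maps to the earlier version.
example = version_in_force(date(2024, 12, 1))
```

A deal dated before the first listed version returns `None`, which flags that the table does not cover that period.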

The compute-backed strategic investment pattern treats RSP version as one of the load-bearing data points partner counsel should pull during due diligence on partnerships where Anthropic is the funded entity. The current framework version sits in the Anthropic system cards entry.

Sources

  1. Announcing our updated Responsible Scaling Policy (Anthropic, October 15, 2024), retrieved 2026-06-26
  2. Responsible Scaling Policy v2.0 PDF (Anthropic), retrieved 2026-06-26
  3. Anthropic RSP updates page, retrieved 2026-06-26
  4. Anteroom system cards: Anthropic, retrieved 2026-06-26