Claude 1M Context Window Goes GA at Standard Pricing

Claude's 1 million token context window is now generally available at standard pricing, enabling analysis of entire codebases and lengthy documents.

Anthropic has eliminated the long-context premium. Starting today, Claude Opus 4.6 and Sonnet 4.6 offer their full 1 million token context window at standard pricing. A 900K-token request now costs the same per-token as a 9K one. For indie developers and bootstrapped startups, this is not just a price cut—it is a fundamental shift in what is economically possible to build with AI.

The Economics Just Changed

For months, the AI industry has treated long context as a premium feature. Processing a 500-page PDF or analyzing an entire codebase cost several times more per token than a standard chat query, a pricing structure that effectively gated serious long-context work behind enterprise budgets.

Anthropic just removed that gate entirely.

The new pricing is straightforward:

• Opus 4.6: $5 input / $25 output per million tokens (same rate regardless of context length)

• Sonnet 4.6: $3 input / $15 output per million tokens

Previously, long-context requests carried a pricing multiplier that made them substantially more expensive per token. That multiplier is now gone.
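Using the published rates above, the flat pricing is easy to sanity-check. The sketch below is illustrative only; the `request_cost` helper is not part of any Anthropic SDK.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_rate: float, output_rate: float) -> float:
    """Cost in dollars at flat per-million-token rates."""
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000

# Opus 4.6 flat rates: $5 input / $25 output per million tokens.
small = request_cost(9_000, 2_000, 5, 25)    # short chat query
large = request_cost(900_000, 2_000, 5, 25)  # near-full context window

# The per-token input price is identical; only the volume differs.
print(f"9K-input request:   ${small:.3f}")   # $0.095
print(f"900K-input request: ${large:.3f}")   # $4.550
```

At flat rates, the 900K request costs exactly 100x more input than the 9K one because it uses 100x the tokens, with no additional surcharge.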

What 1M Tokens Actually Means

To understand the significance, you need to grasp what fits in a million-token context window:

• Entire codebases: Most startup codebases fit comfortably within 1M tokens

• Lengthy documents: A 500-page PDF is roughly 150K-200K tokens

• Hours of conversation: Complex multi-turn agent workflows with tool calls, observations, and reasoning

• Multiple research papers: Cross-reference findings across dozens of sources in a single session
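For a quick feasibility check before sending a request, a rough rule of thumb of about four characters per English token is often used. The helpers below are illustrative assumptions, not an official tokenizer; accurate counts require the model's actual tokenizer.

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate: ~4 characters per token for English text.
    Real counts require the model's tokenizer; treat this as a ballpark."""
    return max(1, len(text) // 4)

def fits_in_context(texts: list[str], window: int = 1_000_000,
                    reserve: int = 50_000) -> bool:
    """Check whether a set of documents fits, reserving room for the reply."""
    return sum(estimate_tokens(t) for t in texts) + reserve <= window

# A 500-page PDF at ~1,500 characters of extracted text per page comes to
# roughly 188K tokens by this heuristic, in line with the 150K-200K figure.
pages = ["x" * 1_500] * 500
print(fits_in_context(pages))  # True
```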

But raw capacity only matters if the model can actually use it. Anthropic reports that Opus 4.6 scores 78.3% on MRCR v2—the highest recall performance of any frontier model at this context length.

Real-World Impact: What Developers Are Saying

Code Review at Scale

Cognition's Devin Review agent previously had to chunk large diffs due to 200K context limits, losing cross-file dependencies. With 1M context, they feed the full diff and get higher-quality reviews from a simpler architecture.

Debugging Without Compaction

Software engineer Anton Biryukov describes the pain of context compaction: "Claude Code can burn 100K+ tokens searching Datadog, Braintrust, databases, and source code. Then compaction kicks in. Details vanish. You are debugging in circles. With 1M context, I search, re-search, aggregate edge cases, and propose fixes—all in one window."

Legal Document Review

Bardia Pourvakil notes: "An in-house lawyer can bring five turns of a 100-page partnership agreement into one session and finally see the full arc of a negotiation. No more toggling between versions."

Incident Response

Mayank Agarwal explains: "Large-scale production systems have endless context. With Claude's 1M context window, we keep every entity, signal, and working theory in view from first alert to remediation."

Expanded Media Limits

Anthropic also increased media processing limits from 100 to 600 images or PDF pages per request—a 6x expansion that enables analyzing entire slide decks or document collections without splitting requests.
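For collections that still exceed the per-request limit, a simple batching step keeps each request within bounds. This is a generic sketch; `batch_pages` and `MAX_PAGES_PER_REQUEST` are illustrative names, with the limit taken from the announcement above.

```python
from typing import Iterator

MAX_PAGES_PER_REQUEST = 600  # new media limit per the announcement

def batch_pages(pages: list, limit: int = MAX_PAGES_PER_REQUEST) -> Iterator[list]:
    """Split a page collection into request-sized batches."""
    for start in range(0, len(pages), limit):
        yield pages[start:start + limit]

# A 1,400-page document collection now needs 3 requests; at the old
# 100-page limit it would have needed 14.
batches = list(batch_pages(list(range(1_400))))
print([len(b) for b in batches])  # [600, 600, 200]
```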

Strategic Implications

This pricing change signals a broader shift in the AI market. Context windows are becoming commoditized. For developers, the takeaway is clear: architecture decisions that required complex RAG systems or chunking strategies six months ago can now be solved with a single API call.

FAQ

How much does 1M context actually cost?

At Opus 4.6 rates ($5/$25 per million tokens), a full 1M token request costs approximately $5 for input. Processing a typical 300-page PDF (around 100K tokens) costs roughly $0.50 in input tokens.

Is there a performance difference at longer context lengths?

Opus 4.6 maintains strong performance across the full 1M window, scoring 78.3% on MRCR v2. However, some developers note that coherence can vary past 200K tokens, suggesting effective context management remains important.

Can I use 1M context with Claude Code?

Yes—Claude Code Max, Team, and Enterprise users with Opus 4.6 automatically get 1M context.

Should I still use RAG with 1M context?

For datasets fitting within 1M tokens, direct context loading is now economically viable and architecturally simpler. For larger datasets or persistent knowledge across sessions, RAG still makes sense.
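That decision can be captured as a simple rule of thumb. The function below is a sketch under assumed thresholds (a 1M-token window with some headroom reserved), not a definitive architecture guide; `choose_strategy` is a hypothetical helper.

```python
def choose_strategy(corpus_tokens: int, persistent: bool,
                    window: int = 1_000_000, reserve: int = 100_000) -> str:
    """Pick an architecture given corpus size and persistence needs.

    persistent: True if knowledge must survive across many sessions,
    where re-sending the full corpus every time would be wasteful.
    """
    if persistent or corpus_tokens > window - reserve:
        return "rag"           # retrieve relevant chunks per query
    return "direct_context"    # load the whole corpus into the prompt

print(choose_strategy(400_000, persistent=False))    # direct_context
print(choose_strategy(5_000_000, persistent=False))  # rag
```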