Claude API pricing
Official model token pricing table.
Compute College
Learn why output tokens usually cost more than input tokens and how generation cost affects model serving economics, AI agents, and inference spend.
One concept connected to AI compute market decisions.
A practical introduction designed to be completed in one sitting.
Useful for developers, founders, procurement teams, and analysts tracking model-serving economics.
Plain-English definition
Output tokens often carry a higher listed API price than input tokens because generation is performed sequentially during the response, tying up serving resources as each next token is produced; actual pricing is set by each provider.
Why it matters
Output-heavy coding, reasoning, and agent workflows can create much larger serving bills than short responses, so token mix matters for model adoption and AI compute demand.
Simple example
Anthropic lists Claude Opus 4.7 at $5 per million input tokens and $25 per million output tokens. At those listed rates, an illustrative request with 100,000 input tokens and 20,000 output tokens costs $0.50 for input and $0.50 for output: much fewer output tokens contribute the same spend.
Example figures are illustrative calculations, not current quoted market prices.
Current example
Anthropic’s official pricing documentation lists Claude Opus 4.7 at $5 per million base input tokens and $25 per million output tokens. Last checked: May 24, 2026.
Official model token pricing table.
Pricing is current-source information and should be checked again before making a procurement decision.
Market signal
If buyers adopt agents or long-form reasoning workflows that produce more output, model-serving demand and spend can grow even when request count or input volume appears stable.
Market read: this metric becomes an AI compute signal only when it changes serving volume, effective workload cost, or the capacity buyers require.
Common mistake
Do not estimate a workload from input tokens alone, and do not assume every provider or model has the same input-to-output pricing relationship.
Practical takeaway
Track input and output token volumes separately, use current official pricing, and model how answer length, reasoning, and agent steps change cost per completed task.
Decision check: identify the capability measured, the serving cost driver it affects, and the buyer behavior that would make capacity demand change.
Helpful memory trick
Input loads the workbench; output keeps the generator working token by token.
Compute College
Follow model releases as AI compute market signals in the ComputeTape Morning Brief.
Compute College track
Continue this Compute College lesson path
Previous lesson
Continue the Model Costs track.
Next lesson
Continue the Model Costs track.