Previous lesson
Benchmark score vs production cost
Continue the Model Costs track.
Compute College
Learn how to estimate the full cost of an AI task, including input tokens, output tokens, retries, tool calls, latency, and model selection.
One concept connected to AI compute market decisions.
A practical introduction designed to be completed in one sitting.
Useful for developers, founders, procurement teams, and analysts tracking model-serving economics.
Plain-English definition
Cost per completed AI task is the total model-serving spend required to finish one useful workflow, including failed attempts and intermediate calls, rather than only the price of the first prompt.
Why it matters
This unit turns provider prices into buyer math. Coding agents, research agents, and verification workflows often invoke a model several times before one result is usable.
Simple example
An illustrative task uses 20,000 input tokens at $5 per million and 4,000 output tokens at $25 per million: $0.10 plus $0.10, or $0.20 per attempt. If only four of five attempts succeed and each failed attempt is retried at the same cost, the expected cost per completed task is $0.20 / 0.80 = $0.25, before tool fees or overhead.
Example figures are illustrative calculations, not current quoted market prices.
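The arithmetic in the simple example can be reproduced with a short Python sketch. The function names are hypothetical, the prices are the illustrative figures from the example rather than quoted rates, and dividing by the success rate assumes failed attempts are retried independently at the same cost.

```python
def attempt_cost(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    """Token cost of one attempt, with prices given in dollars per million tokens."""
    input_cost = input_tokens / 1_000_000 * input_price_per_m
    output_cost = output_tokens / 1_000_000 * output_price_per_m
    return input_cost + output_cost


def cost_per_completed_task(cost_per_attempt, success_rate):
    """Expected spend per acceptable result, assuming independent retries."""
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return cost_per_attempt / success_rate


# Illustrative figures from the example, not current market prices.
per_attempt = attempt_cost(20_000, 4_000, 5.0, 25.0)
per_completed = cost_per_completed_task(per_attempt, 4 / 5)

print(f"per attempt:   ${per_attempt:.2f}")
print(f"per completed: ${per_completed:.2f}")
```

The gap between the two printed figures is the cost of failed attempts, which a per-prompt price never shows.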
Market signal
A more capable model can lower cost per successful task and raise demand at the same time: a higher completion rate makes more workflows economically viable even when posted token rates do not change.
Market read: this metric becomes an AI compute signal only when it changes serving volume, effective workload cost, or the capacity buyers require.
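One way to see the effect of completion rate on viability is the sketch below. It holds the per-attempt cost at the illustrative $0.20 and compares two success rates. The workflow names, the dollar value assigned to each finished job, and the 50% baseline success rate are all hypothetical assumptions made for illustration.

```python
def cost_per_completed_task(cost_per_attempt, success_rate):
    """Expected spend per acceptable result, assuming independent retries."""
    return cost_per_attempt / success_rate


def viable_workflows(workflow_values, cost_per_attempt, success_rate):
    """Return workflows whose value per finished job exceeds the cost per finished job."""
    cost = cost_per_completed_task(cost_per_attempt, success_rate)
    return [name for name, value in workflow_values.items() if value > cost]


# Hypothetical value of one finished job, in dollars.
workflow_values = {"ticket triage": 0.30, "code review": 0.60, "report draft": 1.50}

# Same $0.20 per attempt, different completion rates.
older = viable_workflows(workflow_values, 0.20, 0.50)  # $0.40 per finished job
newer = viable_workflows(workflow_values, 0.20, 0.80)  # $0.25 per finished job

print("viable at 50% success:", older)
print("viable at 80% success:", newer)
```

Under these assumptions the lowest-value workflow clears the bar only at the higher completion rate, so serving volume can grow without any change in token prices.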
Common mistake
Do not price only the first response when the real workflow routinely includes retries, verification, context refreshes, and tool rounds.
Practical takeaway
Estimate the cost of one full attempt (input cost plus output cost plus tool or orchestration cost, summed over every model call the attempt makes), then divide by the probability that an attempt produces an acceptable completion.
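A minimal sketch of this estimate for a multi-call attempt follows. The `ModelCall` class, the function name, the second call's token counts, and the $0.01 tool fee are hypothetical; prices are the illustrative figures from the simple example, and the division assumes failed attempts are retried independently at the same cost.

```python
from dataclasses import dataclass


@dataclass
class ModelCall:
    """Token usage for one model invocation within an attempt."""
    input_tokens: int
    output_tokens: int


def expected_cost_per_completion(calls, input_price_per_m, output_price_per_m,
                                 tool_cost_per_attempt, success_rate):
    """Per-attempt cost across all calls, divided by the success probability."""
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    token_cost = sum(
        c.input_tokens / 1_000_000 * input_price_per_m
        + c.output_tokens / 1_000_000 * output_price_per_m
        for c in calls
    )
    return (token_cost + tool_cost_per_attempt) / success_rate


# One attempt: a main call plus a hypothetical verification call.
calls = [ModelCall(20_000, 4_000), ModelCall(6_000, 1_000)]
estimate = expected_cost_per_completion(
    calls,
    input_price_per_m=5.0,
    output_price_per_m=25.0,
    tool_cost_per_attempt=0.01,  # hypothetical tool or orchestration fee
    success_rate=0.80,
)
print(f"expected cost per finished job: ${estimate:.4f}")
```

Keeping the per-attempt cost and the success rate as separate inputs avoids counting retries twice, once in the attempt total and again in the division.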
Decision check: identify the capability measured, the serving cost driver it affects, and the buyer behavior that would make capacity demand change.
Helpful memory trick
The useful unit is not cost per prompt. It is cost per finished job.