Compute College

What is Cost per Million Tokens?

By ComputeTape Editorial

Cost per million tokens is the unit in which hosted AI APIs price inference, usually with separate rates for input tokens and output tokens.

It is the unit hosted APIs bill in, so it maps usage straight to spend.
Input and output tokens are usually priced differently, and output often dominates the bill.
It bridges API buyers, who think in tokens, and infrastructure buyers, who think in GPU-hours.

Divide each token count by one million and multiply by the matching per-million rate, keeping input and output separate, then add the two amounts.
Output usually drives the bill, so estimate expected output length carefully.
Treat per-token rates as illustrative unless taken from a current provider quote.
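The arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not a billing implementation: the function name `request_cost` and both rates are hypothetical placeholders, not figures from any provider.

```python
def request_cost(input_tokens, output_tokens, input_rate_per_m, output_rate_per_m):
    """Return the dollar cost of one request, pricing input and output separately."""
    input_cost = input_tokens / 1_000_000 * input_rate_per_m
    output_cost = output_tokens / 1_000_000 * output_rate_per_m
    return input_cost + output_cost


# Illustrative rates only (hypothetical, not a provider quote).
INPUT_RATE = 2.00   # dollars per million input tokens
OUTPUT_RATE = 8.00  # dollars per million output tokens

# A request with 1,500 input tokens and 500 output tokens.
cost = request_cost(1_500, 500, INPUT_RATE, OUTPUT_RATE)
print(f"${cost:.4f} per request")
```

With these placeholder numbers the 500 output tokens cost more than the 1,500 input tokens, which is why the output-length estimate deserves the most care.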

Example figures are illustrative calculations, not current quoted market prices.

Separate a rate cut from a usage change before calling a shift in spend a cost trend.
Reasoning and long-answer features show up in output-token pricing, because they produce more output tokens per request.
Per-token competition is a proxy for inference supply and efficiency gains.
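One way to separate a rate cut from a usage change is a simple two-part decomposition, sketched below. The function name `split_cost_change` and every figure are hypothetical, and this is only one of several reasonable ways to attribute the change. Run it on input and output tokens separately.

```python
def split_cost_change(tokens_before, rate_before, tokens_after, rate_after):
    """Split a spend change into a rate effect and a usage effect, in dollars.

    Rates are dollars per million tokens. The rate effect holds usage at the
    earlier level; the usage effect prices the token change at the later rate,
    so the two parts sum exactly to the total change in spend.
    """
    rate_effect = (rate_after - rate_before) * tokens_before / 1_000_000
    usage_effect = (tokens_after - tokens_before) * rate_after / 1_000_000
    return rate_effect, usage_effect


# Hypothetical month-over-month figures for output tokens only:
# usage grows from 40M to 60M tokens while the rate falls from 8.00 to 6.00.
rate_effect, usage_effect = split_cost_change(40_000_000, 8.00, 60_000_000, 6.00)
net_change = rate_effect + usage_effect
print(f"rate effect {rate_effect:+.2f}, usage effect {usage_effect:+.2f}, net {net_change:+.2f}")
```

In this made-up case spend rises even though the rate fell, because the usage effect outweighs the rate effect. Reading the higher bill as a price increase would be the wrong call.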

Market read: cost per million tokens is the inference-demand unit that maps model usage to spend.
Evidence discipline: before comparing per-token prices, record the model, the date, and whether each rate is for input or output, and keep illustrative rates separate from quotes.
Figures here are illustrative unless explicitly sourced and dated.

Buyers: estimate input and output tokens per request separately and apply each rate.
Founders and analysts: track cost per completed task, not just per token, since one task can take many tokens.
Compare against a self-hosted GPU-hour estimate to decide between an API and self-hosting.
Replace illustrative per-token rates with a current, dated provider quote before committing a budget.
Keep observed rates separate from modeled per-request costs.
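The sketch below turns two of these steps into numbers: cost per completed task, and a rough self-hosted equivalent in dollars per million tokens. All names and figures are hypothetical. The self-hosted conversion assumes a single sustained throughput figure and a utilization fraction, which is a simplification; real deployments process input and output tokens at different speeds.

```python
def cost_per_task(requests, input_rate_per_m, output_rate_per_m):
    """Sum API cost over every request a task needs.

    `requests` is a list of (input_tokens, output_tokens) pairs.
    """
    return sum(
        i / 1_000_000 * input_rate_per_m + o / 1_000_000 * output_rate_per_m
        for i, o in requests
    )


def self_hosted_cost_per_m(gpu_hour_cost, tokens_per_second, utilization):
    """Convert a GPU-hour price into dollars per million tokens served."""
    tokens_per_hour = tokens_per_second * 3600 * utilization
    return gpu_hour_cost / tokens_per_hour * 1_000_000


# Hypothetical task that takes three API calls to complete.
task_requests = [(2_000, 400), (3_000, 600), (1_000, 1_000)]
task_cost = cost_per_task(task_requests, input_rate_per_m=2.00, output_rate_per_m=8.00)

# Hypothetical self-hosted setup: $3.00 per GPU-hour, 1,000 tokens/s, 50% utilized.
hosted_rate = self_hosted_cost_per_m(gpu_hour_cost=3.00, tokens_per_second=1_000, utilization=0.5)

print(f"API cost per task: ${task_cost:.4f}")
print(f"Self-hosted estimate: ${hosted_rate:.2f} per million tokens")
```

Keeping the two functions separate mirrors the advice above: one works from observed or quoted rates, the other is a modeled estimate that depends on throughput and utilization assumptions.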

Decision check: compare providers on input and output rates together with your expected token mix, not on a single headline number.
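A small sketch of that check follows, assuming the token mix is summarized as the share of total tokens that are output. The provider names, rates, and the helpers `blended_rate` and `cheapest_provider` are hypothetical.

```python
def blended_rate(input_rate_per_m, output_rate_per_m, output_share):
    """Dollars per million total tokens for a workload with the given output share."""
    return input_rate_per_m * (1 - output_share) + output_rate_per_m * output_share


def cheapest_provider(providers, output_share):
    """Return the provider name with the lowest blended rate for this token mix."""
    return min(providers, key=lambda name: blended_rate(*providers[name], output_share))


# Hypothetical (input rate, output rate) pairs in dollars per million tokens.
providers = {
    "Provider A": (1.00, 10.00),  # cheap input, expensive output
    "Provider B": (3.00, 6.00),   # pricier input, cheaper output
}

for share in (0.1, 0.5):
    name = cheapest_provider(providers, share)
    rate = blended_rate(*providers[name], share)
    print(f"output share {share:.0%}: {name} at ${rate:.2f} per million tokens")
```

With these placeholder rates the cheaper provider flips as the output share grows, which is why a single headline number can mislead.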


Compute College track

Model Costs

Step 3 of 7: What is cost per million tokens

What is Cost per Million Tokens?

Plain-English definition

Why it matters

Simple example

How to read the market signal

Common mistake

What you can do with this

Turn the lesson into a number


Related lessons

Model Serving Cost Calculator

Why output tokens cost more than input tokens

Why reasoning models cost more to serve

What is frontier model serving cost?