API vs Self-Hosted Calculator

Compare API serving cost to a self-hosted GPU cluster.

Enter your monthly token volume and API list prices, then size a GPU cluster against them. The calculator returns both monthly estimates, the break-even output volume, and how utilization changes the self-hosted cost per million useful tokens.

Interactive calculator

API vs self-hosted calculator

Monthly input tokens

Monthly output tokens

API price (input), $/M

API price (output), $/M

GPU type

Hourly rate per GPU

Provisioned GPUs

Throughput per GPU, tok/s

Utilization

Overhead

API at $1,050/mo vs self-hosted at $6,912/mo → API is cheaper this month.

API monthly estimate: $1,050

Self-hosted monthly estimate: $6,912

Self-hosted effective capacity: 181M output tok/mo

Self-hosted cost per useful 1M output: $38.10

Break-even monthly output volume: 441M output tokens

Utilization sensitivity: 50% → $53.33/M · 75% → $35.56/M · 100% → $26.67/M

Starting values are illustrative defaults you can edit — not live ComputeTape benchmark prices. Replace them with a real quote.

What the numbers mean

API monthly estimate

(Input tokens × input price) + (output tokens × output price), with token volumes in millions and prices in $/M. Uses list rates, not negotiated ones.
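The API estimate can be sketched in a few lines of Python. The function name and the input values below are hypothetical: the page does not state its defaults, so these were picked only because they reproduce the $1,050 example figure.

```python
def api_monthly_cost(input_tokens_m, output_tokens_m,
                     input_price_per_m, output_price_per_m):
    """API spend in dollars; token volumes in millions, prices in $/M."""
    return (input_tokens_m * input_price_per_m
            + output_tokens_m * output_price_per_m)


# Hypothetical inputs (assumed, not taken from the page):
# 100M input tokens at $3/M and 50M output tokens at $15/M.
print(api_monthly_cost(100, 50, 3.0, 15.0))  # 1050.0
```

Swap in the list prices from a real quote; negotiated discounts would lower the result.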

Self-hosted monthly estimate

Provisioned GPUs × hourly rate × 720 hours per month × (1 + overhead). You pay for the cluster whether it is busy or not.
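As a rough sketch, the same formula in Python. The cluster shape here (4 GPUs at $2.00/hour with 20% overhead) is an assumption chosen because it yields the $6,912 example figure; other combinations give the same total.

```python
HOURS_PER_MONTH = 720  # 30 days x 24 hours, as used in the formula above


def self_hosted_monthly_cost(gpus, hourly_rate, overhead):
    """Cluster spend in dollars; overhead is a fraction (0.20 means 20%)."""
    return gpus * hourly_rate * HOURS_PER_MONTH * (1 + overhead)


# Hypothetical cluster (assumed, not taken from the page).
cost = self_hosted_monthly_cost(gpus=4, hourly_rate=2.00, overhead=0.20)
print(round(cost, 2))
```

Utilization does not appear in this function, which is the point: the bill is the same whether the GPUs are busy or idle.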

Effective capacity

Output tokens the cluster can actually serve in a month at the throughput and utilization you entered. If demand exceeds capacity, self-hosted needs more GPUs before the comparison is honest.
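One plausible way to compute effective capacity, and the cost per useful million tokens that follows from it, is sketched below. The formula (GPUs × throughput × utilization × seconds per month) and the inputs (4 GPUs at 25 tok/s each, 70% utilization) are assumptions; they are consistent with the example figures on this page but are not confirmed as the calculator's internals.

```python
HOURS_PER_MONTH = 720
SECONDS_PER_MONTH = HOURS_PER_MONTH * 3600


def effective_capacity_m(gpus, tok_per_s_per_gpu, utilization):
    """Output tokens served per month, in millions."""
    return gpus * tok_per_s_per_gpu * utilization * SECONDS_PER_MONTH / 1e6


def cost_per_useful_m(monthly_cost, capacity_m):
    """Dollars per million output tokens actually served."""
    return monthly_cost / capacity_m


monthly_cost = 6912.0  # self-hosted estimate from the example results
capacity = effective_capacity_m(4, 25.0, 0.70)  # hypothetical cluster
print(round(capacity, 2), round(cost_per_useful_m(monthly_cost, capacity), 2))

# Utilization sensitivity: same bill, different useful output.
for u in (0.50, 0.75, 1.00):
    cap = effective_capacity_m(4, 25.0, u)
    print(f"{u:.0%} -> ${cost_per_useful_m(monthly_cost, cap):.2f}/M")
```

Because the monthly bill is fixed, halving utilization doubles the cost of each useful token.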

Break-even output volume

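Monthly output tokens at which the API cost equals the self-hosted cost, holding input volume and prices constant. Above the break-even, self-hosted is cheaper; below it, API wins. This holds only if the cluster's effective capacity covers that volume; otherwise the self-hosted side needs more GPUs first.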
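A minimal sketch of the break-even calculation: set the API cost equal to the self-hosted cost and solve for output volume. The function name and the API inputs (100M input tokens at $3/M, $15/M output) are hypothetical, chosen because they land near the 441M example figure.

```python
def break_even_output_m(self_hosted_cost, input_tokens_m,
                        input_price_per_m, output_price_per_m):
    """Output volume (millions of tokens) where API cost equals self-hosted cost."""
    api_input_cost = input_tokens_m * input_price_per_m
    # Solve: api_input_cost + output * output_price = self_hosted_cost
    output_m = (self_hosted_cost - api_input_cost) / output_price_per_m
    # If input spend alone already exceeds the cluster bill, break-even is zero.
    return max(0.0, output_m)


# Hypothetical API inputs (assumed, not taken from the page).
print(round(break_even_output_m(6912.0, 100, 3.0, 15.0), 1))  # 440.8
```

In the example results, the break-even of about 441M output tokens is above the 181M effective capacity, so the cluster would need more GPUs, and a higher bill, before it could serve that volume.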

A half-busy GPU is a fully paid GPU

What is GPU utilization?

How utilization is measured and why paid capacity costs more when it sits idle.

Frontier model serving cost

How tokens per second, latency, and batch size translate into recurring spend.

Next decisions

Model Serving Cost Calculator

AI GPU Provider Directory

How to compare GPU cloud quotes