AI compute market signals and learning
Compute College

What is Model FLOPs Utilization (MFU)?

Model FLOPs Utilization (MFU) measures how much of a GPU's theoretical compute a job actually uses; goodput counts useful work delivered.

Learning path: Compute & Pricing Lessons

One concept connected to AI compute market decisions.

Read time: 5-8 minutes

A practical introduction designed to be completed in one sitting.

Tags: Utilization / Efficiency / Training

Useful for ML engineers, operators, and analysts.

Plain-English definition

Model FLOPs Utilization (MFU) is the ratio of the floating-point operations per second that a training or inference job actually spends on useful model math to the hardware's theoretical peak. Goodput is the related idea of useful work delivered, as opposed to raw throughput, which also counts work that is later repeated or thrown away. Both reveal how efficiently expensive accelerators are really being used.
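The definition above can be turned into a small calculation. The sketch below is one common way to estimate MFU for a dense transformer training run, not the only one. It assumes the widely used rule of thumb of about 6 FLOPs per parameter per token (forward plus backward pass), which ignores attention-specific work. The function names and every input value are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
def training_flops_per_token(n_params: float) -> float:
    """Approximate training FLOPs per token for a dense transformer.

    Assumption: the ~6 * N rule of thumb (forward plus backward pass).
    It ignores attention FLOPs, so it slightly undercounts long-context runs.
    """
    return 6.0 * n_params


def mfu(tokens_per_second: float, n_params: float,
        n_gpus: int, peak_flops_per_gpu: float) -> float:
    """Return MFU as a fraction: achieved model FLOP/s over peak FLOP/s."""
    achieved = tokens_per_second * training_flops_per_token(n_params)
    peak = n_gpus * peak_flops_per_gpu
    return achieved / peak


# Hypothetical inputs, not a measured run.
example = mfu(
    tokens_per_second=320_000,  # observed training throughput
    n_params=7e9,               # 7B-parameter model
    n_gpus=64,
    peak_flops_per_gpu=1e15,    # round-number stand-in for a vendor peak figure
)
print(f"MFU: {example:.1%}")
```

The peak figure should match the numeric precision the job actually runs in, since vendors quote different peaks for different precisions.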

Why it matters

A GPU's headline performance is a ceiling that real jobs rarely reach. Low MFU means you are paying for compute you are not using: software overhead, waits on memory or networking, and pipeline stalls all eat into it. Improving MFU or goodput lowers the real cost per result without buying more hardware.

  • MFU compares useful model math to a GPU's theoretical peak; it is usually well below 100%.
  • Goodput counts useful work delivered, not raw throughput.
  • Higher MFU and goodput lower cost per result on the same hardware.
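Goodput can be estimated in a similar spirit. The sketch below treats goodput as the share of billed GPU time that produced work the run kept. That is one reasonable working definition, not a standard one, and the `RunLog` fields and the hour figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RunLog:
    """Hypothetical accounting of where a job's reserved GPU time went."""
    wall_clock_hours: float   # total time the GPUs were reserved and billed
    restart_hours: float      # time spent restarting after failures
    recomputed_hours: float   # steps redone after losing progress since a checkpoint
    idle_hours: float         # GPUs reserved but waiting on data or scheduling


def goodput_fraction(log: RunLog) -> float:
    """Share of billed time that produced work the run kept."""
    if log.wall_clock_hours <= 0:
        raise ValueError("wall_clock_hours must be positive")
    lost = log.restart_hours + log.recomputed_hours + log.idle_hours
    return (log.wall_clock_hours - lost) / log.wall_clock_hours


# Illustrative numbers only, not a measured run.
log = RunLog(wall_clock_hours=100.0, restart_hours=4.0,
             recomputed_hours=6.0, idle_hours=10.0)
fraction = goodput_fraction(log)
print(f"Goodput: {fraction:.0%} of billed GPU time")
```

A job can show high raw throughput while it is running and still have poor goodput if failures and restarts throw away much of that work.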

Simple example

Suppose a training run achieves an illustrative 40% MFU. Improving software, batching, or interconnect to reach an illustrative 50% means the same hardware does about 25% more useful work (50 / 40 = 1.25), which cuts cost per result by about 20% (1 / 1.25 = 0.8) without adding a single GPU.

  • A move from 40% to 50% MFU is roughly 25% more useful work on the same hardware.
  • Bottlenecks are often memory, networking, or pipeline stalls, not raw compute.
  • Treat any MFU percentage as workload- and setup-specific.

Example figures are illustrative calculations, not current quoted market prices.
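The arithmetic behind the 40% to 50% example can be checked with a few lines. This is a minimal sketch that assumes the price per GPU-hour stays fixed, so cost per result scales inversely with MFU. The function names are hypothetical.

```python
def useful_work_gain(old_mfu: float, new_mfu: float) -> float:
    """Fractional increase in useful work on the same hardware."""
    return new_mfu / old_mfu - 1.0


def cost_per_result_change(old_mfu: float, new_mfu: float) -> float:
    """Fractional change in cost per result at a fixed GPU-hour price.

    A negative value means cost per result went down.
    """
    return old_mfu / new_mfu - 1.0


# Illustrative percentages from the example above, not measured values.
gain = useful_work_gain(0.40, 0.50)
cost_change = cost_per_result_change(0.40, 0.50)
print(f"Useful work: {gain:+.0%}, cost per result: {cost_change:+.0%}")
```

Note the asymmetry: 25% more useful work corresponds to a 20% lower cost per result, not 25% lower.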

Market signal

How to read the market signal

Rising MFU and goodput across the industry signal that more output is being squeezed from existing GPUs, which can ease effective demand growth. Watch efficiency gains alongside raw capacity: better utilization can lower cost per result even when chip supply is flat.

  • Efficiency gains can lower cost per result without new hardware.
  • Persistent low MFU points to memory, networking, or software bottlenecks.
  • Utilization metrics separate real efficiency from raw chip counts.

Market read: MFU and goodput reveal how much of paid compute is actually useful.

Evidence discipline: state the workload and setup behind any utilization figure, and keep illustrative percentages separate from measured runs.

Common mistake

The common mistake is reading a GPU's peak performance as what you will get. Real jobs reach only a fraction of theoretical peak FLOPs; ignoring MFU and goodput overstates effective capacity and understates the true cost per result.

Practical takeaway

What you can do with this

Track MFU or goodput, not just GPU count, to find where paid compute is wasted and to lower cost per result.

  • Operators and ML engineers: measure MFU and remove memory, networking, and pipeline bottlenecks before buying more GPUs.
  • Analysts: read efficiency gains as a lever on effective compute supply and cost.
  • Compare cost per useful result, not per raw GPU-hour.
  • Treat utilization percentages as workload- and setup-specific.
  • Keep measured runs separate from illustrative efficiency targets.

Decision check: before adding GPUs, check whether higher MFU or goodput on existing hardware would meet the need more cheaply.
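The decision check can be framed as two numbers to compare: how many GPUs the target would need at today's MFU, and what MFU the existing fleet would need to reach the same target. The sketch below assumes useful throughput is simply GPU count times peak FLOP/s times MFU, which ignores scaling losses as a cluster grows. All names and inputs are hypothetical.

```python
import math


def gpus_needed(target_useful_flops: float,
                peak_flops_per_gpu: float, mfu: float) -> int:
    """GPUs required to deliver the target useful FLOP/s at a given MFU."""
    if not 0 < mfu <= 1:
        raise ValueError("mfu must be in (0, 1]")
    # Round first so float noise does not add a phantom GPU.
    return math.ceil(round(target_useful_flops / (peak_flops_per_gpu * mfu), 9))


def mfu_needed(target_useful_flops: float,
               peak_flops_per_gpu: float, n_gpus: int) -> float:
    """MFU the existing fleet would need to deliver the same target."""
    return target_useful_flops / (peak_flops_per_gpu * n_gpus)


# Illustrative scenario: 100 GPUs at 40% MFU, and a target 25% above
# what they deliver today. Not measured values.
peak = 1e15                      # round-number stand-in for a vendor peak
target = 1.25 * 100 * peak * 0.40

option_buy = gpus_needed(target, peak, mfu=0.40)    # grow the fleet
option_tune = mfu_needed(target, peak, n_gpus=100)  # tune the existing one
print(f"Buy: {option_buy} GPUs at 40% MFU; tune: {option_tune:.0%} MFU on 100 GPUs")
```

Whether tuning is actually cheaper depends on the engineering effort needed to reach the higher MFU, which this sketch does not price.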

Helpful memory trick

Peak FLOPs is the speedometer's top number; MFU is how fast you are actually going.
