Visualizing the process, price, and physical limitations of AI compute

Tracking AI Compute Bottlenecks

Memory bandwidth, interconnect, packaging, and power — where physics gates AI compute.

An NVIDIA H100 peaks at 1,979 TFLOPS of dense FP8 compute. During decode at batch = 1, a 70B model uses roughly 3% of that; the other 97% of the time, the tensor cores sit idle, waiting on memory. This site is about that gap.
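The gap falls out of a back-of-envelope roofline calculation: batch-1 decode must stream every weight from HBM once per token, so bandwidth, not compute, sets the token rate. The sketch below uses H100-class figures (1,979 dense FP8 TFLOPS, ~3.35 TB/s HBM3) and FP8 weights; these are illustrative assumptions, and the exact utilization number shifts with precision, bandwidth, and overlap assumptions.

```python
# Back-of-envelope batch-1 decode utilization.
# All hardware figures are assumed H100-class numbers, not measurements.

PEAK_TFLOPS = 1979        # assumed dense FP8 peak, TFLOPS
HBM_TBPS = 3.35           # assumed HBM3 bandwidth, TB/s
PARAMS = 70e9             # 70B-parameter model
BYTES_PER_PARAM = 1       # FP8 weights

# At batch = 1, every weight is read once per generated token,
# so memory bandwidth caps tokens per second.
bytes_per_token = PARAMS * BYTES_PER_PARAM
tokens_per_s = HBM_TBPS * 1e12 / bytes_per_token

# Each token costs roughly 2 FLOPs per parameter (multiply + add).
flops_per_token = 2 * PARAMS
achieved_tflops = tokens_per_s * flops_per_token / 1e12

utilization = achieved_tflops / PEAK_TFLOPS
print(f"{tokens_per_s:.0f} tok/s, {achieved_tflops:.1f} TFLOPS achieved, "
      f"{utilization:.2%} of peak")
```

Under these assumptions, achieved throughput lands at a few TFLOPS out of nearly two thousand: the arithmetic intensity of streaming 1 byte per 2 FLOPs is far below what the tensor cores need to stay busy.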

Choose your path