The physics of AI compute

Tracking the
AI Compute
Bottlenecks

Memory bandwidth, interconnect, packaging,
and power — where physics gates AI compute.

A modern GPU peaks at 1,979 TFLOPS in FP8. In decode at batch = 1, a 70B model uses ~3% of that. The other 97% is tensor cores waiting for memory. This site is about that gap.

start exploring
HBM3HBM3COMPUTE DIE · 814 mm²NVIDIA H100 SXM5 PACKAGE
explore the physics