# Most AI Inference Time Goes to Memory Transfer, Not Math

- Date: 2026-03-23
- Category: Artificial Intelligence

Running a mixture-of-experts model in production has a quiet bottleneck that benchmark papers rarely discuss: the CPU-GPU transfer.

---