AI coding agents can now train robots overnight, no human required
NVIDIA's ENPIRE training harness lets frontier coding models iterate on robot control code until it works, turning physical manipulation training into a software debugging problem.
The bottleneck for getting robots to handle new physical tasks is no longer the robot. It is the code that tells the robot what to do, and a new NVIDIA-led project shows that frontier AI coding agents can write and rewrite that code overnight, with no human in the loop.
Researchers at NVIDIA's GEAR lab, working with Carnegie Mellon University and UC Berkeley, built a four-module harness called ENPIRE that lets coding agents autonomously train real robot arms to manipulate objects, including inserting pins, cutting zip ties, and installing GPUs on motherboards (NVIDIA GEAR project page). The system ran continuous training cycles for hours, reading logs, rewriting the policy, and rebooting the robot, mirroring the way a software engineer iterates on a buggy program.
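The cycle described above (run the policy, read the logs, rewrite the code, restart) can be sketched as a simple loop. This is a toy simulation, not ENPIRE's code: the names `run_trials`, `revise_policy`, and `train`, the single `policy_quality` number standing in for control code, and the fixed improvement rule are all assumptions made for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class TrialLog:
    """Hypothetical summary of one batch of robot attempts."""
    successes: int
    attempts: int

    @property
    def success_rate(self) -> float:
        return self.successes / self.attempts


def run_trials(policy_quality: float, attempts: int, rng: random.Random) -> TrialLog:
    # Stand-in for executing the policy on hardware: each attempt is a
    # weighted coin flip with success probability `policy_quality`.
    successes = sum(rng.random() < policy_quality for _ in range(attempts))
    return TrialLog(successes, attempts)


def revise_policy(policy_quality: float, log: TrialLog) -> float:
    # Stand-in for the agent reading logs and rewriting the policy.
    # Assumption: each rewrite closes half of the remaining gap to a
    # perfect policy. A real agent would inspect `log` to decide what to fix.
    return policy_quality + 0.5 * (1.0 - policy_quality)


def train(target: float = 0.99, max_cycles: int = 50,
          attempts: int = 100, seed: int = 0) -> list[float]:
    """Iterate run -> read -> rewrite until the target rate or the cycle cap."""
    rng = random.Random(seed)
    quality = 0.2  # assumed starting point for the first draft policy
    history: list[float] = []
    for _ in range(max_cycles):
        log = run_trials(quality, attempts, rng)
        history.append(log.success_rate)
        if log.success_rate >= target:
            break  # good enough: stop iterating
        quality = revise_policy(quality, log)
    return history


if __name__ == "__main__":
    for cycle, rate in enumerate(train(), start=1):
        print(f"cycle {cycle}: success rate {rate:.2f}")
```

The point of the sketch is the control flow: nothing in the loop requires a person, only a measurable success signal and a stopping rule.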
ENPIRE was tested with three frontier coding agents: OpenAI's Codex model based on GPT-5.5, Anthropic's Claude Code using Opus 4.7, and Moonshot AI's Kimi Code built on Kimi K2.6. Coding-agent teams reached roughly 99% success on four ENPIRE-built tasks: a Push-T block manipulation test, pin insertion and organization, zip-tie cutting, and GPU insertion and removal (Ars Technica). On Push-T, an eight-agent team hit 99% in about two hours, compared with roughly three hours for a four-agent team and five hours for a single agent.
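The Push-T timings above imply diminishing returns from adding agents. The arithmetic below is a rough sketch that treats the approximate reported times (five, three, and two hours) as exact and uses agent-hours as a crude proxy for cost; the function name `scaling_table` and that cost proxy are assumptions, not figures from the report.

```python
def scaling_table(hours_by_team_size: dict[int, float]) -> dict[int, dict[str, float]]:
    """Compute speedup, per-agent efficiency, and agent-hours per team size.

    Speedup is measured against the single-agent time, so the input
    must include an entry for a team of one.
    """
    baseline = hours_by_team_size[1]
    table = {}
    for agents, hours in sorted(hours_by_team_size.items()):
        speedup = baseline / hours
        table[agents] = {
            "speedup": speedup,               # how many times faster than one agent
            "efficiency": speedup / agents,   # 1.0 would be perfect linear scaling
            "agent_hours": agents * hours,    # crude cost proxy (assumption)
        }
    return table


if __name__ == "__main__":
    # Approximate wall-clock hours to reach 99% on Push-T, per the article.
    push_t = {1: 5.0, 4: 3.0, 8: 2.0}
    for agents, row in scaling_table(push_t).items():
        print(f"{agents} agent(s): speedup {row['speedup']:.2f}x, "
              f"efficiency {row['efficiency']:.2f}, "
              f"agent-hours {row['agent_hours']:.0f}")
```

On these rounded inputs, eight agents finish 2.5 times faster than one but consume 16 agent-hours against 5, which is the shape of tradeoff between wall-clock time and spend that the timings suggest.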
Those headline tasks are narrow. The robots are lab-bench single arms with auto-reset hardware, not factory-floor systems, and the 99% figure is pass@8 across the four ENPIRE tasks, not a generalized manipulation score. The speed numbers are wall-clock research time, not production throughput. The structural shift is still real: closing the sim-to-real gap for a manipulation task now reads as a code-iteration problem rather than a human-labor problem.
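To see why a pass@8 number is not the same as a single-attempt success rate, consider a back-of-envelope conversion. It assumes pass@8 is read in the usual sense (at least one success in eight attempts) and that attempts are independent with the same success probability; neither assumption is confirmed by the source, and the function names are illustrative.

```python
def pass_at_k(per_attempt_rate: float, k: int) -> float:
    """Probability of at least one success in k independent attempts."""
    return 1.0 - (1.0 - per_attempt_rate) ** k


def per_attempt_rate(pass_at_k_value: float, k: int) -> float:
    """Invert pass_at_k: the per-attempt rate that yields the given pass@k."""
    return 1.0 - (1.0 - pass_at_k_value) ** (1.0 / k)


if __name__ == "__main__":
    p = per_attempt_rate(0.99, 8)
    print(f"per-attempt rate implied by 99% pass@8: {p:.3f}")
    print(f"round trip back to pass@8: {pass_at_k(p, 8):.3f}")
```

Under those assumptions, a 99% pass@8 is consistent with a per-attempt success rate of roughly 44%, so the headline figure should not be read as "the robot succeeds 99 times out of 100."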
That framing has costs the Ars Technica report flags directly. Robots sat idle while agents read logs or waited on the language-model backbone. Larger teams spent more time cross-summarizing rather than acting. Agents underused the parallel compute available to them. Bigger teams finished faster on Push-T but burned more tokens, exposing a tradeoff that is already reshaping how agent SDKs are billed (Ars Technica). Anthropic recently paused billing on its Claude Agent SDK, citing exactly this kind of cost exposure as frontier coding agents scaled.
Jim Fan, NVIDIA's director of AI and a leader of the GEAR group, said the system "self-improves tirelessly overnight" and that the team plans to open-source ENPIRE so researchers can run self-directed robot training at home (Jim Fan LinkedIn post). On the pin insertion task, ENPIRE's coding-agent teams matched the results of a prior human-in-the-loop method, described in a November 2025 paper, and got there faster than the original researchers did.
The paper went up on June 16, 2026. What to watch next is whether token-cost efficiency, not raw capability, becomes the binding constraint on autonomous robot training, and whether open-sourcing ENPIRE moves the bottleneck from "can the robot learn" to "what does it cost to run the agents that teach it."