While Chinese humanoid labs chase generality, Beijing-based XBOT scoped to coffee, deployed 1,000 robots, served 4 million cups, and passed RMB 100M in 2025 revenue.
A four-year-old Beijing company called XBOT has deployed more than 1,000 coffee robots across 100 cities, served over four million cups, and crossed 100 million yuan in 2025 revenue on hardware no one mistakes for a person.
XBOT's edge is not a cleverer brain but a smaller one. Its XOS 3.0 operating system, as founder Tang Mu tells 36kr 硬氪, layers a ZhiWei food-service language model trained on roughly four million real coffee runs above a "cerebellum" that converts tasks into arm motion in under ten milliseconds. The system reaches for one vision-language-action model (the general-purpose AI paradigm most humanoid robotics companies are betting on) only as a fallback for exceptions, not as the primary control loop.
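The layering described above can be pictured as a narrow primary path with a general model held in reserve. The following is a minimal sketch of that routing idea only, not XBOT's implementation: every name in it (`KNOWN_SKILLS`, `narrow_planner`, `motion_controller`, `vla_fallback`, `handle`) is hypothetical, and the skill list is invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Task:
    name: str


# Hypothetical in-scope skills, standing in for what a narrow
# food-service model would recognize. Not taken from XBOT.
KNOWN_SKILLS = {"pull_espresso", "steam_milk", "pour_latte"}


def narrow_planner(request: str) -> Optional[Task]:
    """Return a Task when the request is in scope, otherwise None."""
    return Task(request) if request in KNOWN_SKILLS else None


def motion_controller(task: Task) -> str:
    """Stand-in for the fast 'cerebellum' layer that turns a task into arm motion."""
    return f"motion:{task.name}"


def vla_fallback(request: str) -> str:
    """Stand-in for the general vision-language-action model, used only for exceptions."""
    return f"vla:{request}"


def handle(request: str) -> str:
    """Route in-scope requests through the narrow path; everything else falls back."""
    task = narrow_planner(request)
    if task is not None:
        return motion_controller(task)
    return vla_fallback(request)
```

In this sketch `handle("steam_milk")` stays on the narrow path, while an unrecognized request such as `handle("cup_knocked_over")` is the only case that reaches the general model, which is the design choice Tang describes.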
Scoping the problem is the architectural choice. Vision-language-action models generalize poorly when asked to fold laundry, plate dishes and pour espresso inside the same model, Tang argues in the same 36kr exclusive and a long Sspai interview. Hold the scope tight, keep the compute and the failure rate low, and the unit economics start to work.
At a Yiwu mall pilot, XBOT's Lite-series unit, priced in the low six figures of yuan, produced about 200 cups a day at roughly 20 yuan each, generating about 60,000 yuan a month in revenue and clearing more than 30,000 yuan in net income. Tang pegs the payback period at six to eight months on a five-year design life. Those numbers line up with the deployment and revenue figures PConline collected at CES 2026, where the dual-arm coffee debut was independently called the most commercially grounded embodied-AI product on the show floor. The unit-economics proof point is real, but it rests on a single deployment.
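The payback claim can be checked with simple division. The sketch below assumes a flat monthly net income of exactly 30,000 yuan (the article says "more than" that) and ignores financing, downtime and maintenance; the function names are illustrative, and the unit prices it prints are implied by the stated figures rather than reported by XBOT.

```python
def payback_months(unit_price_yuan: float, monthly_net_yuan: float) -> float:
    """Months needed for cumulative net income to cover the purchase price."""
    if monthly_net_yuan <= 0:
        raise ValueError("monthly net income must be positive")
    return unit_price_yuan / monthly_net_yuan


def implied_unit_price(months: float, monthly_net_yuan: float) -> float:
    """Purchase price implied by a given payback period and monthly net income."""
    return months * monthly_net_yuan


MONTHLY_NET_YUAN = 30_000      # assumption: the article's floor, held constant
DESIGN_LIFE_MONTHS = 5 * 12    # five-year design life from the article

low = implied_unit_price(6, MONTHLY_NET_YUAN)
high = implied_unit_price(8, MONTHLY_NET_YUAN)
print(f"Implied unit price: {low:,.0f} to {high:,.0f} yuan")

# Months of operation left after an eight-month payback.
print(DESIGN_LIFE_MONTHS - payback_months(high, MONTHLY_NET_YUAN))
```

Under these assumptions a six-to-eight-month payback implies a unit price of 180,000 to 240,000 yuan, which is consistent with "low six figures," and leaves roughly 52 months of design life after the hardware has paid for itself.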
The other differentiator is paperwork. In July 2024, Beijing's E-Town district issued China's first Food Operation Licence to a hot-food robotics operator, under the State Council's 2020 framework. XBOT holds a national full-category version, which lets the company control the bean-to-cup supply chain, bill operators per consumable, and collect an "Agent Token" fee for its Aibao digital-human manager. That licence is what makes a Robot-as-a-Service model, where operators buy the hardware and pay monthly for software, supplies and maintenance, workable as a closed loop at all.
XBOT is not pretending the humanoid bet is wrong. The company showed an X1 humanoid at CES with dual seven-axis arms, sub-millimeter dual-arm coordination and a single 500-TOPS domestic chip, scheduled for late-2026 production. But X1 is the research bet. The coffee arm is the business.
The capital wave is split. XBOT's new rounds put Series A at 200 million yuan from Hong Kong-based GPTX Capital (JianKun) and Series B at a reported 300 to 500 million yuan from a syndicate of government guidance funds, dollar funds and strategic industrial investors. Inside the broader Chinese embodied-AI sector, Robotsj.cn's June 2026 tracker counted roughly 30 deals totaling about 17.8 billion yuan, with most dollars flowing toward generalist humanoid plays. X Square Robot's four-round climb to a US$2.8 billion valuation is the clearest expression of that bet.
XBOT targets 3,000 units in 2026, with orders the founder pegs at 300 to 500 million yuan. Four years in, the revenue line is real and small. The test for the vertical-first thesis is whether 1,000 specialized robots in 100 cities can compound into a defensible data flywheel while the rest of the industry's dollars keep funding the harder product. The payback figure came from a single store in Yiwu. The next one has to come from a fleet.