From PokéStops to foundation models: the dataset Pokémon Go players never met
The scans were optional inside the game. The foundation model they trained is now licensed outward, and the consent question is the part still catching up.
A decade ago, millions of people held up their phones at statues, storefronts, and park fountains to earn digital rewards in Pokémon Go. Few of them imagined those short videos would outlive the game's hype cycle, get folded into a spun-out mapping company, and help train AI systems now licensed to navigate delivery robots and, possibly, military drones.
That is the throughline in a new Ars Technica report on the dataset Pokémon Go players quietly built. The scans were "optional" in the game's interface, but they sat inside a gamified reward loop: players captured short clips of real-world landmarks to advance the AR gameplay, contributing what the report characterizes as billions of real-world images from millions of players.
The inheritor of that data is Niantic Spatial, which Niantic spun out in May 2025, shortly after Niantic sold its licensed games, including Pokémon Go, to the Saudi-backed publisher Scopely. The spinout and the game sale are separate corporate events, and both matter. Niantic Spatial inherited the geolocated scan pipeline and the stated ambition to turn it into a "large geospatial model," a 3D foundation model of the physical world that competing AI systems can query the way language models are queried for text.
Niantic has been explicit that those ground scans are part of the training set. The company has said that "ground scans were one component to help train Niantic Spatial's real-world foundation model," per the Ars Technica report on the spinout and dataset lineage. That phrasing matters. The scans were not an incidental side effect of gameplay. They were the raw material.
The downstream market the report flags is dual-use. The closer, consumer-visible application is visual positioning for delivery robots, confirmed in a partnership with Coco Robotics, whose fleet of sidewalk delivery robots operates in Los Angeles, Chicago, Jersey City, Miami, and Helsinki. The further application, confirmed through a partnership with Vantor (formerly Maxar Intelligence), is GPS-denied positioning for both flying drones and ground vehicles. Vantor has multiple US government contracts with the National Geospatial-Intelligence Agency, various branches of the US military, and the Department of Homeland Security. During the Defence Geospatial Intelligence conference in February 2026, Niantic Spatial's director of product management described early testing of the integrated positioning system as achieving a 70 percent reduction in positioning error with accuracy to within 1.5 meters in many scenarios.
What makes the case worth examining is not the novelty of any single claim. It is the lineage. A consumer AR game built a scan pipeline, the company packaged the result as a foundation model, the foundation model was licensed outward, and the licensing lands in a market where commercial navigation and defense-adjacent navigation use much of the same substrate. The pattern is generalizable: any app that gamifies real-world capture is, structurally, a data pipeline for geospatial AI, whether or not anyone calls it that.
The consent question does not resolve easily. Niantic disclosed the scan feature in its 2019 privacy policy, and the scans were technically opt-in inside the game. Disclosure and opt-in are not the same as informed consent to a downstream use that did not yet exist when the scans were captured, especially when the use is now framed around defense-adjacent autonomy. Players who tapped through a 2019 terms update to catch a Snorlax were not consenting to a foundation-model training set in 2025. They were consenting to a game.
The case is also not a one-off. Scaniverse, Niantic's standalone scanning app, runs the same playbook with a smaller user base. The next wave of AR glasses is likely to run a similar pipeline at a larger scale, with less obvious opt-in friction. The Pokémon Go lineage is the most legible version of the pattern, not the only one.
What to watch next is whether Niantic Spatial publishes its current scan-data retention and licensing terms in plain language, and whether the company discloses named commercial and government customers beyond those already named. The pipeline is already built. The accountability layer is the part that still has to catch up.