NVIDIA DGX Spark Update Links Four AI Workstations to Share Memory
A single AI workstation now has enough memory to run a 120-billion-parameter model — but the real constraint isn't the model size. It's the gap between what fits in memory and what a production workload actually needs.
NVIDIA's DGX Spark, a $4,699 AI workstation that ships with an open-source agent stack called NemoClaw, received a software update on June 2 that links up to four of the machines into a cluster, doubling the prior two-machine limit. The clustering feature is the part that did not exist yesterday. Cluster Assistant, the software included in the update, lets up to three machines talk to each other over standard Ethernet cables, with a fourth machine supported if a network switch is available [NVIDIA Developer Forums]. Each node has its own 121-gigabyte memory pool, shared between its CPU and GPU, so workloads that outgrow one machine can spill into the next.
Whether the update changes much in practice depends on a constraint the software does not fix. Frank Denneman, an independent AI infrastructure architect who has benchmarked the hardware directly, measured what the pool actually holds: the flagship model takes 94 gigabytes, leaving roughly 27 gigabytes for everything else, including the agent loop, tooling, and context windows [Frank Denneman]. "On a unified memory system, the model either fits based on total parameters or it does not," Denneman wrote. "Everything that matters operationally depends on what memory remains after that." A single long-running agent task with a large context window, such as a 128,000-token document analysis with multiple tool calls per iteration, can exhaust that headroom before the task completes. The June 2 update adds nodes; it does not add memory to any one of them.
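The headroom arithmetic can be sketched in a few lines of Python. The 121-gigabyte pool and 94-gigabyte model footprint come from the figures above; everything passed to `kv_cache_gb` (layer count, key-value heads, head dimension, 2-byte values) is a hypothetical set of model dimensions chosen for illustration, not the flagship model's published architecture. The cache formula itself is the standard one for transformer attention: one key and one value vector per layer, per key-value head, per token.

```python
GB = 1e9  # decimal gigabytes, matching the article's round figures


def headroom_gb(pool_gb: float, model_gb: float) -> float:
    """Memory left in the unified pool once the model weights are loaded."""
    return pool_gb - model_gb


def kv_cache_gb(tokens: int, layers: int, kv_heads: int, head_dim: int,
                bytes_per_value: int = 2) -> float:
    """Approximate key-value cache size for one sequence, in gigabytes."""
    # The leading 2 accounts for storing both keys and values.
    per_token_bytes = 2 * layers * kv_heads * head_dim * bytes_per_value
    return per_token_bytes * tokens / GB


# Figures reported in the article.
free = headroom_gb(pool_gb=121, model_gb=94)

# Hypothetical model dimensions, assumed for illustration only.
cache = kv_cache_gb(tokens=128_000, layers=48, kv_heads=8, head_dim=128)

print(f"headroom: {free:.1f} GB")
print(f"KV cache at 128,000 tokens: {cache:.1f} GB")
print(f"left for agent loop and tooling: {free - cache:.1f} GB")
```

Under these assumed dimensions, a single 128,000-token context consumes about 25 of the 27 gigabytes of headroom, which is the squeeze Denneman describes. Real figures depend on the model's actual architecture and on any cache quantization the runtime applies.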
Getting a sandboxed autonomous agent running on owned hardware used to require a data center budget. NVIDIA compressed that to one command and one box sitting on a desk. NemoClaw ships with Qwen3.6 as the default model, which NVIDIA says brings performance improvements over earlier releases [NVIDIA Developer Blog].
The economics for smaller or latency-tolerant workloads are now concrete. At current cloud API pricing, GPT-5.5 runs roughly $52 to $158 per month at 50,000 tokens per day, depending on input-output token ratios, while Anthropic's Claude 3.5 Sonnet and Google's Gemini 1.5 Pro fall in a comparable range. For workloads that fit in memory, that per-token bill disappears. The DGX Spark's $4,699 hardware cost, amortized over three years, comes to roughly $130 per month before electricity; whether that beats cloud pricing depends on workload characteristics and usage patterns that independent benchmarks do not yet publicly document. Cloud providers have built recurring revenue on exactly this class of workload. If local hardware becomes the default for a meaningful slice of that demand, especially in regulated industries where data residency is a compliance requirement, analysts expect the per-token revenue stream to thin for whatever category of workload DGX Spark can absorb. Whether local hardware makes that competition real for the first time depends on whether the memory headroom holds for real production agent workloads, not just benchmarks.
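The amortization comparison is simple enough to check by hand, and a short Python sketch makes the assumptions explicit. The $4,699 price, the three-year window, and the $52 to $158 monthly cloud range come from the figures above; the function names are illustrative, and electricity, maintenance, and resale value are left out, as they are in the article's estimate.

```python
def monthly_amortized(price: float, months: int) -> float:
    """Straight-line hardware cost per month over the amortization window."""
    return price / months


def breakeven_months(price: float, cloud_monthly: float) -> float:
    """Months of avoided cloud spend needed to equal the hardware price."""
    return price / cloud_monthly


HARDWARE_PRICE = 4699.0               # DGX Spark list price cited above
CLOUD_LOW, CLOUD_HIGH = 52.0, 158.0   # monthly cloud API range cited above

local = monthly_amortized(HARDWARE_PRICE, months=36)
print(f"local hardware: ${local:.2f} per month before electricity")

for cloud in (CLOUD_LOW, CLOUD_HIGH):
    verdict = "cheaper" if local < cloud else "more expensive"
    months = breakeven_months(HARDWARE_PRICE, cloud)
    print(f"cloud at ${cloud:.0f}/month: local is {verdict}, "
          f"break-even after {months:.1f} months")
```

On these numbers alone, the hardware pays for itself within the three-year window only near the top of the cloud price range, which is why the comparison hinges on usage patterns that have not yet been benchmarked.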
Cloud retains advantages that the local argument does not fully erase. Model providers continuously upgrade their models without any action from the user. Cloud also offers elastic scaling: burst to 10,000 tokens per second during a traffic spike, then pay nothing during quiet periods. And there is no hardware to maintain, replace, or depreciate. What local offers is cost predictability and data isolation — which matters in regulated industries, and not at all for a hobby project.
Dell sells a preconfigured Pro Max with GB10 starting at $4,756.84 [Constellation Research]. The NemoClaw bundle packages open models (Nemotron), the OpenShell secure execution environment, and an agent harness into a single installable stack [NVIDIA Developer Blog].
NVIDIA describes OpenShell as a "secure, sandboxed execution environment" with "access controls, privacy protections, and operational guardrails" [NVIDIA Developer Blog]. Jensen Huang called OpenClaw, the framework OpenShell wraps, "one of the more important software developments in recent years" [Constellation Research]. No independent security audit of OpenShell's containment architecture with published results has surfaced in the sources reviewed. Regulated buyers in finance, healthcare, and defense face specific procurement barriers if OpenShell remains unaudited: enterprise SaaS procurement typically requires SOC 2 Type II certification, federal AI procurement commonly requires FedRAMP Moderate authorization, and sector-specific frameworks govern healthcare and defense AI systems. Without a published third-party audit, those gates stay closed as a practical matter. If OpenShell audit demand materializes, compliance consultancies and security audit firms are positioned to address the gap.
What to watch next: independent benchmarks comparing DGX Spark agent costs against equivalent cloud API runs on real workloads, and whether a third-party security firm publishes an OpenShell containment audit. The economics are now calculable. The security question is still open.