Intel and SambaNova want to split AI inference across three chips instead of one. The architecture they announced this week divvies up the work like this: GPUs handle the initial processing of prompts, SambaNova's custom RDU chips take over when generating responses, and Intel's Xeon 6 processors manage the agentic layer — the tool calls, memory lookups, and planning steps that let AI systems act without human intervention. Enterprise availability is targeted for the second half of 2026, according to Intel's newsroom.
The division of labor reflects a bet that AI inference is becoming heterogeneous by default. Rather than running everything on one accelerator, the blueprint assumes different silicon does different work better. GPUs are well-suited to the math-heavy initial pass over a prompt (the prefill stage). Reconfigurable dataflow units (RDUs), which map neural-network computation directly onto chip hardware rather than running it as a sequence of operations, excel at the memory-bound decode stage: generating tokens without the repeated memory round-trips that slow response generation on GPUs. CPUs handle the orchestration overhead that doesn't need a GPU at all.
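To make the division concrete, here is a minimal Python sketch of how a single request could flow through the three tiers. Everything in it — the Request class, the tier functions, the loop — is a hypothetical illustration of the prefill/decode/orchestration split, not Intel's or SambaNova's actual software.

```python
from dataclasses import dataclass, field

# Hypothetical sketch of the three-tier split described above.
# None of these names come from Intel or SambaNova; they only
# illustrate which tier owns which step of a request.

@dataclass
class Request:
    prompt: str
    kv_cache: list = field(default_factory=list)      # filled by the GPU tier
    tokens: list = field(default_factory=list)        # filled by the RDU tier
    tool_results: list = field(default_factory=list)  # filled by the CPU tier

def gpu_prefill(req: Request) -> Request:
    """GPU tier: compute-heavy pass over the whole prompt at once."""
    req.kv_cache = [f"kv({tok})" for tok in req.prompt.split()]
    return req

def rdu_decode(req: Request, max_tokens: int = 4) -> Request:
    """RDU tier: memory-bound, token-by-token generation from the cache."""
    for i in range(max_tokens):
        req.tokens.append(f"tok{i}")
    return req

def cpu_agentic_step(req: Request) -> bool:
    """CPU tier: tool calls, memory lookups, planning.
    Returns True if the plan needs another model pass."""
    if "search" in req.prompt and not req.tool_results:
        req.tool_results.append("search_result")
        return True  # new context -> route back through prefill
    return False

def run(req: Request) -> Request:
    """Orchestration loop: CPU plans, GPU prefills, RDU decodes."""
    while True:
        req = rdu_decode(gpu_prefill(req))
        if not cpu_agentic_step(req):
            return req

print(run(Request("search for flight prices")).tokens)
```

The loop is the point: each agentic step can send the request back through prefill with fresh context from a tool call, which is why the CPU tier sits at the center of the design rather than at its edge.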
SambaNova, the Palo Alto-based chip startup chaired by Intel CEO Lip-Bu Tan, introduced its SN50 RDU alongside the blueprint. The chip uses a tiered memory design combining high-bandwidth memory (HBM) and static RAM (SRAM), allowing models to swap in and out in milliseconds — a capability the company calls agentic caching. That hot-swapping matters for agentic workloads, in which a system switches among dozens of specialized models to complete a single task. A GPU-based inference server cannot host dozens of models simultaneously; the memory demands are too high. The SN50 can, the company says, because its architecture dedicates more chip area to memory pathways than to compute cores.
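The claim is easier to see with a toy version of the cache. The sketch below assumes a fixed fast-memory budget and a least-recently-used eviction policy; both are illustrative assumptions, since SambaNova has not published the SN50's actual caching policy.

```python
from collections import OrderedDict

class TieredModelCache:
    """Toy illustration of 'agentic caching': keep as many model
    weights resident in fast memory (HBM) as the budget allows,
    and evict the least-recently-used model when a new one arrives.
    The LRU policy and all sizes are assumptions, not SN50 specs."""

    def __init__(self, hbm_budget_gb: float):
        self.hbm_budget_gb = hbm_budget_gb
        self.resident = OrderedDict()  # model name -> size in GB

    def request(self, model: str, size_gb: float) -> str:
        if model in self.resident:
            self.resident.move_to_end(model)  # mark as recently used
            return f"{model}: hit (already in fast memory)"
        # Evict least-recently-used models until the new one fits.
        while sum(self.resident.values()) + size_gb > self.hbm_budget_gb:
            self.resident.popitem(last=False)  # drop the coldest model
        self.resident[model] = size_gb
        return f"{model}: miss (swapped in)"

cache = TieredModelCache(hbm_budget_gb=48)
for name, size in [("planner-8b", 16), ("coder-8b", 16), ("judge-8b", 16),
                   ("planner-8b", 16), ("retriever-8b", 16)]:
    print(cache.request(name, size))
```

Under this policy a model that was just consulted stays resident, so a planner invoked between every tool call never pays the swap cost; that is roughly the behavior SambaNova's agentic-caching pitch describes, with the swap itself taking milliseconds rather than the seconds a full reload would cost.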
The benchmarks SambaNova cites are favorable. The company says the SN50 delivers five times the peak speed and more than three times the throughput for agentic inference on Meta's Llama 3.3 70B model, compared to Nvidia's Blackwell B200 GPU. The SambaRack SN50 system averages 20 kilowatts of power and fits in existing air-cooled data centers. Those numbers are from SambaNova's own testing.
Tan has a personal stake in the outcome. He has chaired SambaNova's board since 2017 and his venture funds are among the company's biggest backers; they could lose millions if the company fails. Intel already owns 8.2 percent of SambaNova after a $35 million investment in February, and a planned additional $15 million investment would raise that to 9 percent. Late last year, Intel and SambaNova signed a non-binding term sheet for an acquisition that did not close. SambaNova laid off 77 people in California in April 2025 and explored a fundraise or sale at a lower valuation around that time, Reuters reported. The company told Reuters that 2025 was its strongest year and that it has shifted focus to inference.
One day before the SambaNova announcement, Intel announced a separate partnership with SpaceX, xAI, and Tesla on silicon fabrication technology called Terafab. Same company, two plays in one week.
The governance overlap is not trivial. Tan's simultaneous role as Intel's CEO and SambaNova's board chair means Intel's investment decisions involve conflicts of interest that the company's board has acknowledged require oversight. Two corporate governance experts told Reuters in December that such dealmaking raises red flags. Intel told Reuters it maintains rigorous conflict-of-interest policies and that it was already a shareholder in three of the four startups it has financed alongside Tan.
For enterprise buyers, the more relevant question is whether the heterogeneous inference model can displace CUDA — the Nvidia software ecosystem, now nearly two decades old, that most developers already know. The SambaNova-Intel blueprint requires customers to run a three-tier software stack that Intel describes as a "consistent software foundation" but that has no developer community, no pretrained-model hub, and no established tooling. Faster chips alone will not break CUDA's lock-in.
The bet Intel and SambaNova are making is that agentic AI is the workload that finally creates a reason to switch. The workload is new enough that the ecosystem question is genuinely open. Whether enterprises will accept the switching costs to find out is the real test.
The SN50 ships in the second half of 2026. The Xeon 6-based agentic tier arrives in the same window. The software ecosystem, if it materializes, will take longer.