Standard neural quantum states have a geometry problem. It is baked into the architecture.
Neural quantum states (NQS) built on recurrent neural networks represent quantum many-body systems by factorizing the wave function into a chain of conditional probabilities, one spin at a time. The setup enables efficient, exact autoregressive sampling without the autocorrelation issues that plague Markov-chain methods, and it scales linearly with system size. Mohamed Hibat-Allah's group at Waterloo, Perimeter, and the Vector Institute has been pushing this approach hard, including a 3D simulation at 1,000 qubits that type0 covered last week.
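The autoregressive idea can be sketched in a few lines. This is a toy illustration with random, untrained weights and a plain tanh cell, not the group's actual architecture: each spin is drawn from a conditional distribution produced by the hidden state, and the exact log-probability of the sample falls out of the same pass.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy RNN "wave function": p(s) = prod_i p(s_i | s_<i).
# Weights are random placeholders, not a trained model.
H, N = 8, 10                      # hidden size, number of spins
W_h = rng.normal(0, 0.5, (H, H))  # hidden -> hidden
W_s = rng.normal(0, 0.5, (H, 2))  # one-hot spin -> hidden
W_o = rng.normal(0, 0.5, (2, H))  # hidden -> logits over {down, up}

def sample_chain():
    """Draw one spin configuration autoregressively, no Markov chain."""
    h = np.zeros(H)
    s_prev = np.zeros(2)          # start token
    spins, log_p = [], 0.0
    for _ in range(N):
        h = np.tanh(W_h @ h + W_s @ s_prev)
        logits = W_o @ h
        p = np.exp(logits - logits.max())
        p /= p.sum()              # conditional p(s_i | s_<i)
        s = rng.choice(2, p=p)
        log_p += np.log(p[s])
        s_prev = np.eye(2)[s]
        spins.append(s)
    return spins, log_p           # exact log-probability comes for free

spins, log_p = sample_chain()
```

Because every sample is independent and carries its own normalized probability, there is no burn-in and no autocorrelation time, which is the practical advantage over Markov-chain Monte Carlo.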
But standard RNNs have a built-in short-range bias. Information propagates one step at a time down the chain, which means correlations between distant spins decay exponentially with distance. This is not a training failure. It is a geometric constraint. No amount of retraining will teach a vanilla RNN to see across a large gap the way physics often demands.
In a new paper submitted April 9, Hibat-Allah, Asif Bin Ayub, and Amine M. Aboussalah introduce dilated connections. The fix: stack the recurrence so that layer l reaches back 2^(l-1) sites instead of one. A ten-layer network connects sites 512 steps apart in a single hop. The shortest path between two distant sites collapses from O(N) to O(log N), while the forward pass scales as O(N log N), compared with O(N^2) for transformers.
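A minimal sketch of the dilation pattern, under loose assumptions (simple tanh cells, random untrained weights; the paper's cells and gating may differ): layer l looks back 2^l steps rather than one, so with L layers the stacked skips span roughly 2^L - 1 sites and information crosses the chain in O(log N) hops.

```python
import numpy as np

rng = np.random.default_rng(1)
N, H, L = 16, 4, 4                 # sites, hidden size, layers

W_in = rng.normal(0, 0.5, (H,))    # scalar spin -> hidden, layer-0 input
W = rng.normal(0, 0.5, (L, H, H))  # per-layer recurrent weights
U = rng.normal(0, 0.5, (L, H, H))  # per-layer input (from layer below)

def dilated_forward(spins):
    """Stacked recurrence where layer l looks back 2**l steps."""
    x = np.outer(spins, W_in)              # (N, H) inputs to layer 0
    for l in range(L):
        d = 2 ** l                         # dilation of this layer
        h = np.zeros((N, H))
        for t in range(N):
            prev = h[t - d] if t >= d else np.zeros(H)
            h[t] = np.tanh(W[l] @ prev + U[l] @ x[t])
        x = h                              # feed next layer
    return x                               # top-layer hidden states

out = dilated_forward(rng.choice([-1.0, 1.0], size=N))
```

The total work is L recurrences over N sites, hence O(N log N) when L grows like log N, and a correlation between sites 2^L apart no longer has to survive 2^L consecutive tanh squashings.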
The key numerical results, per the paper: dilated RNNs reproduce the expected power-law correlations of the critical 1D transverse-field Ising model, where standard RNNs show exponential decay. More pointedly, the dilated architecture approximates the 1D Cluster state, a canonical example of long-range conditional correlations that Yang et al. explicitly reported as challenging for RNN-based wave functions in 2024. The same year, Döschl and Bohrdt showed that even simple quantum states can require high-order correlations internally — a limitation the dilated approach addresses at the architectural level rather than through brute-force scale.
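The diagnostic behind these benchmarks is the connected two-point correlator C(r) = ⟨s_0 s_r⟩ - ⟨s_0⟩⟨s_r⟩, estimated from sampled configurations: exponential decay is a straight line in log|C| vs r, power-law decay in log|C| vs log r. A sketch of the estimator, run here on synthetic short-range-correlated spins rather than NQS samples:

```python
import numpy as np

rng = np.random.default_rng(2)
N, M = 64, 5000
noise = rng.normal(size=(M, N + 4))
# A moving average induces short-range correlations in a synthetic field;
# a real benchmark would use configurations sampled from the NQS.
field = sum(noise[:, k:k + N] for k in range(5)) / 5
spins = np.sign(field)             # Ising-like +/-1 variables

def connected_correlator(s, r):
    """Estimate C(r) = <s_0 s_r> - <s_0><s_r> from samples s of shape (M, N)."""
    a, b = s[:, :-r], s[:, r:]
    return (a * b).mean() - a.mean() * b.mean()

C = np.array([connected_correlator(spins, r) for r in range(1, 16)])
# Plotting log|C| against r (or log r) distinguishes exponential
# decay from the power law expected at criticality.
```

For the synthetic field above, C(r) vanishes beyond the averaging window, the signature a standard RNN would show at large r even where the physics demands a power law.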
The trade-off is explicit in the paper: transformers can learn long-range correlations more flexibly; dilated RNNs bake the geometry in. This is a bet on structure over generality. Whether it pays off depends on the problem. For systems where the relevant correlations are dictated by proximity rules — the 1D Ising chain, the Cluster state — the architecture is well-matched. For frustrated magnets or out-of-equilibrium dynamics where correlation patterns are messier, the baked-in bias may be a liability.
The 16-page paper validates the approach on two 1D systems. Generalization to 2D, 3D, or fermionic systems remains unproven. But the architectural logic is sound, and the scaling advantage over attention-based methods is real. Hibat-Allah's group is systematically addressing the failure modes of RNN-based NQS — the 3D scaling result handled size; this handles structure. Together they make a more complete case that neural quantum states can be competitive with tensor networks for physically relevant problems, provided you get the architecture right.
For VCs and founders working in quantum simulation or ML-for-science: the question this paper poses is whether your problem's correlation structure is something you know in advance. If it is, dilated RNNs give you transformer-scale performance at RNN-scale cost. If it is not — if you're exploring unknown territory — you probably still need the flexibility of attention. This is not a universal win. It is a precise tool for a specific class of problem. That precision, honestly described, is what makes the paper worth reading.