Memory as a Living Graph: How FluxMem Rewires Agent Recall on the Fly
A new preprint proposes modeling LLM agent memory as a continuously evolving connectivity graph, and reports consistent state-of-the-art results on LoCoMo, Mind2Web, and GAIA.
Most LLM agents today remember the world the way a filing cabinet does: open a drawer, pull out a folder, hope it's the right one. A new preprint called FluxMem argues the cabinet is the wrong metaphor entirely. Memory, the authors write, should be treated as a continuously evolving connectivity graph that rewires itself as the agent acts — and on three long-horizon agent benchmarks, they report consistent state-of-the-art results.
The paper, titled "Rethinking Memory as Continuously Evolving Connectivity" and posted to arXiv as 2605.28773, takes aim at a known weakness in current agent stacks: the brittle, fixed retrieval pipeline. Whether an agent leans on vector search, a sliding-window context, or a hand-tuned summary store, the memory substrate tends to be static. Connections between memories are fixed at write time, and the system has no graceful way to repair, prune, or reorganize them as the agent encounters new tasks.
FluxMem's answer is a heterogeneous graph memory organized around a three-stage pipeline: initial connection formation, feedback-driven refinement, and long-term consolidation. At execution time, the framework repairs missing links, prunes interference between overlapping memories, and aligns abstraction granularity across the graph. It also distills recurrent successful trajectories into reusable procedural circuits. According to the authors, the process is guided by a single metric designed to capture both memory generalizability and evolutionary maturity.
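To make the three stages concrete, here is a minimal Python sketch of what a self-rewiring memory graph could look like. It is an illustration written for this explainer, not the authors' implementation: the class name GraphMemory, its methods, the edge-weight scheme, and every threshold are hypothetical, and the paper's heterogeneous node types and guiding metric are not modeled.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class GraphMemory:
    """Toy memory graph (hypothetical): nodes are memory ids, edges carry a weight in [0, 1]."""
    edges: dict = field(default_factory=lambda: defaultdict(dict))
    path_successes: dict = field(default_factory=lambda: defaultdict(int))
    procedures: list = field(default_factory=list)

    def connect(self, a, b, weight=0.5):
        # Stage 1 (formation): link two memories symmetrically at write time.
        self.edges[a][b] = weight
        self.edges[b][a] = weight

    def feedback(self, path, success, step=0.2):
        # Stage 2 (refinement): strengthen edges along successful paths and
        # weaken them along failed ones. A missing link on a successful path
        # is created ("repaired"); a missing link on a failed path is ignored.
        delta = step if success else -step
        for a, b in zip(path, path[1:]):
            current = self.edges[a].get(b)
            if current is None:
                if not success:
                    continue
                current = 0.5  # assumed starting weight for a repaired link
            new_weight = min(1.0, max(0.0, current + delta))
            self.edges[a][b] = new_weight
            self.edges[b][a] = new_weight
        if success:
            self.path_successes[tuple(path)] += 1

    def consolidate(self, prune_below=0.2, distill_after=2):
        # Stage 3 (consolidation): drop weak edges, then promote paths that
        # succeeded at least `distill_after` times to reusable procedures.
        for a in list(self.edges):
            for b in list(self.edges[a]):
                if self.edges[a][b] < prune_below:
                    del self.edges[a][b]
        for path, count in self.path_successes.items():
            if count >= distill_after and path not in self.procedures:
                self.procedures.append(path)


# Usage: two successful runs and two failed runs, then one consolidation pass.
mem = GraphMemory()
mem.connect("login_page", "enter_password")
mem.connect("login_page", "ad_banner")
for _ in range(2):
    mem.feedback(["login_page", "enter_password", "dashboard"], success=True)
    mem.feedback(["login_page", "ad_banner"], success=False)
mem.consolidate()
print(mem.procedures)
```

In this toy run, the link to "dashboard" did not exist at write time and is created by feedback, the repeatedly failing "ad_banner" link falls below the pruning threshold and is removed, and the twice-successful path is stored as a procedure. A real system would need far more careful scoring; the point is only that the connections change after write time.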
The authors evaluate the framework on three benchmarks commonly used to stress long-horizon agent reasoning: LoCoMo, Mind2Web, and GAIA. Across all three, the paper claims consistent state-of-the-art performance. Because this is a preprint rather than a peer-reviewed result, those SOTA claims should be read as author-attributed, not settled. The abstract does not provide per-benchmark figures, and the full PDF was not pulled for this explainer; readers who need absolute scores against prior baselines should consult the paper directly.
The team behind the work is large: 15 authors are listed on the arXiv page, consistent with a multi-institution collaboration. Specific affiliation breakdowns were not verified for this draft and should be confirmed before publication.
The authors say the code will be open-sourced, and the abstract references a GitHub link pointing to https://github.com/zjunlp/LightMem. That link was truncated in the source material received, and the repo's public availability at draft time was not independently verified, so treat the release as a forward-looking commitment. Whether FluxMem and the LightMem repo are the same project, related projects, or unrelated was not confirmed with the authors, and the two should not be conflated in copy.
The broader lesson of FluxMem is conceptual as much as empirical. Agent memory is usually framed as a retrieval problem — store more, retrieve better. The paper reframes it as a topology problem: not just what to remember, but how the remembered pieces connect, and how those connections should change. If that framing holds up under scrutiny, the next generation of long-horizon agents may look less like search engines and more like nervous systems.