Full execution traces boost failure pinpointing in AI agents, but still only 30% accurate | type0