New attention method routes tokens through probabilistic clusters to cut long-context memory cost — type0 | type0