Testing 6 top AI models in 13,590 scenarios finds manipulation doesn't transfer across tasks | type0