He stopped trusting AI benchmarks. He built 240 tests of his own. | type0