Even OpenAI's best model solves fewer than 3 in 10 real science tasks end-to-end. — type0 | type0