
# The AI Benchmark Gap: 77% on Computing Tasks, 39% on Scientific Reasoning

- Date: 2026-04-16
- Category: Artificial Intelligence

The Stanford HAI AI Index shows AI agents at 77.3% on real-world computing tasks, but the benchmark testing genuine scientific reasoning puts the same systems at 38.78% against an 83.5% PhD expert baseline. That leaves the systems roughly 45 points behind the experts, a gap nobody is reporting.
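
The figures quoted above allow two different subtractions, so it helps to spell out which one produces the 45-point number. The sketch below uses only the three percentages from the summary; the variable and function names are illustrative and not taken from either benchmark.

```python
# Scores quoted in the summary, in percent.
computing_tasks = 77.3        # AI agents on real-world computing tasks
scientific_reasoning = 38.78  # same systems on the scientific reasoning benchmark
phd_baseline = 83.5           # PhD expert baseline on that benchmark


def point_gap(higher: float, lower: float) -> float:
    """Difference in percentage points, rounded to two decimals."""
    return round(higher - lower, 2)


# Gap between the experts and the AI systems on scientific reasoning.
gap_to_experts = point_gap(phd_baseline, scientific_reasoning)

# Gap between the two benchmark scores for the AI systems themselves.
gap_between_benchmarks = point_gap(computing_tasks, scientific_reasoning)

print(f"Expert baseline vs. AI on scientific reasoning: {gap_to_experts} points")
print(f"Computing tasks vs. scientific reasoning: {gap_between_benchmarks} points")
```

The first difference (about 44.7 points) is the one that rounds to 45; the second (about 38.5 points) is the spread between the two headline scores in the title.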

---