OpenAI’s search bot is starting to matter more than its training bot
Publishers who thought blocking OpenAI's training bot settled their AI exposure problem may need to revisit that assumption. New Botify data, first reported by Search Engine Journal, suggests OpenAI's search crawler is now busier than its training crawler. That matters because OpenAI says sites that block OAI-SearchBot, the crawler it uses to include pages in ChatGPT search answers, can disappear from those answers.
That turns a familiar robots.txt choice into a distribution decision. In Botify's dataset, OAI-SearchBot activity grew 3.5-fold after GPT-5 launched, while GPTBot, the crawler OpenAI uses to gather public web data for model training, grew 2.9-fold. That flipped the ratio of search crawling to training crawling from 0.95 before GPT-5 to 1.14 after it. If that pattern holds beyond Botify's customers, visibility inside ChatGPT's search product is starting to matter more than whether OpenAI can train on your pages.
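As a rough sanity check on those figures, the sketch below recomputes the post-GPT-5 ratio from the pre-GPT-5 ratio and the two growth figures. It assumes each growth figure is a plain multiplier on prior activity and that the ratio is search crawling divided by training crawling; the function name is illustrative and not something Botify publishes.

```python
def ratio_after(ratio_before: float, search_growth: float, training_growth: float) -> float:
    """Scale a search-to-training ratio by each crawler's growth multiplier.

    Assumption: both growth figures are multipliers applied to prior activity.
    """
    return ratio_before * search_growth / training_growth


# Figures as reported in Botify's dataset (rounded in the source).
implied = ratio_after(ratio_before=0.95, search_growth=3.5, training_growth=2.9)
print(f"implied ratio after GPT-5: {implied:.2f}")
```

Under those assumptions the implied ratio comes out at about 1.15, close to the reported 1.14. The small gap is what one would expect if the published multipliers were rounded.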
The source needs some skepticism. Botify, an enterprise SEO company, says the analysis draws from roughly seven billion log events across a broader pool of more than 250 billion log files from November 2024 through March 2026. That is large enough to be interesting, but it is still a commercial dataset shaped by the kinds of large sites Botify serves, not a census of the public web.
Still, the directional signal lines up with OpenAI's own product split. OpenAI's bot documentation says OAI-SearchBot is used for search inclusion, GPTBot is used to make foundation models more useful and safe, and ChatGPT-User covers user-triggered visits rather than automatic crawling. That separation matters because publishers have often treated AI crawling as a training-consent question. OpenAI's docs make clear it is also a search-discovery question.
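To make that split concrete, here is a minimal sketch of a robots.txt policy that blocks the training crawler while leaving the search crawler alone, checked with Python's standard-library parser. The rules and the example.com URL are illustrative assumptions, not a recommendation from OpenAI or Botify, and the standard-library parser only approximates how any given crawler interprets the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: opt out of training crawls, stay eligible for search.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: *
Allow: /
"""


def can_fetch(agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given user agent may fetch the URL under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    page = "https://example.com/news/story"  # placeholder URL
    for agent in ("GPTBot", "OAI-SearchBot", "ChatGPT-User"):
        print(agent, can_fetch(agent, page))
```

In this sketch GPTBot is refused, OAI-SearchBot is allowed, and ChatGPT-User falls through to the wildcard rule. A publisher who instead blocks every OpenAI agent with one blanket rule is making the search-visibility decision along with the training one.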
Botify's sector breakdown suggests that pressure is not spread evenly. The company says healthcare sites saw OAI-SearchBot activity jump 740.94 percent after GPT-5 and media and publisher sites saw a 701.91 percent increase. Those are exactly the categories where being visible in AI-generated answers could matter commercially, and where blocking crawlers has become part of a larger fight over traffic, licensing, and control.
The weirdest number in the dataset may be the one moving the other way. Botify says ChatGPT-User events fell 28 percent between Dec. 1, 2025 and March 14, 2026, compared with the prior period. Separately, SISTRIX reported in March that ChatGPT had plateaued at about six billion monthly visits while Gemini traffic tripled in the second half of 2025. Those are different datasets measuring different things, but together they hint at a product shift: more automated retrieval by crawlers, fewer user-triggered page visits.
That does not mean OpenAI suddenly controls web distribution. Botify's numbers do not show how often a crawl becomes an answer, how much referral traffic publishers get back, or whether the trend is the same outside enterprise-heavy sites. Cloudflare reported separately that GPTBot's share of AI crawler traffic more than doubled year over year in 2025, which supports the broader idea that OpenAI's web appetite is growing, but not Botify's narrower claim that search crawling has overtaken training crawling everywhere.
What changed here is simpler than the vendor framing. The web's AI access debate used to center on model training. OpenAI's own documentation, and now Botify's logs, suggest the more immediate pressure point may be search visibility. For publishers, that means the next fight is not only whether AI companies may learn from their pages. It is whether refusing that relationship also means becoming harder to find inside the interfaces readers are starting to use.