The teams that spent the most time tuning prompts for older GPT models just learned that effort is now a liability. GPT-5.5 does not want your careful instructions. It wants to figure things out itself. OpenAI published a prompting guide on April 25, two days after launch, telling developers the opposite of what the marketing promised: carry nothing forward from your existing prompt stack.
The guide is specific about the mechanism. Legacy prompts over-specify reasoning steps, enumerating every edge case and documenting every exception. On older models that hand-holding produced reliable results. On GPT-5.5 it adds noise and narrows the space the model searches, producing answers that sound mechanical. The fix is not to adjust the stack. The fix is to delete it and begin again with the smallest prompt that captures the product goal.
The guide contrasts two prompts for a customer service task. The negative example walks through every step in order: inspect A, then B, then compare every field, then evaluate exceptions, then decide which tool to call. The positive example defines only the goal and what a resolved query looks like. The model is supposed to find the path itself.
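The contrast is easier to see side by side. The sketch below is an illustrative paraphrase of the two styles, not the guide's verbatim text; the variable names and the specific steps are ours:

```python
# Paraphrase of the guide's contrast. OVER_SPECIFIED walks the model
# through a fixed procedure; GOAL_ORIENTED states the goal and the
# definition of done, leaving the path to the model.

OVER_SPECIFIED = """\
You are a customer service agent.
1. Inspect the account record (A).
2. Inspect the order record (B).
3. Compare every field between A and B.
4. Evaluate each exception rule, in order.
5. Based on steps 1-4, decide which tool to call.
"""

GOAL_ORIENTED = """\
You are a customer service agent.
Goal: resolve the customer's query.
A query is resolved when the customer has a correct answer
and any required account change has been applied.
Use the available tools however you judge best.
"""
```

The second prompt is shorter not because detail was trimmed for its own sake, but because everything procedural was removed: it specifies what a resolved query looks like and nothing about how to get there.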
The reasoning effort default also shifted. Previous GPT models defaulted to high reasoning effort. GPT-5.5 defaults to medium, a change that affects cost and latency calculations teams built around the higher setting.
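Teams that sized budgets around the old default can pin the setting per request instead of inheriting the new one. A minimal sketch; the payload shape follows OpenAI's Responses API convention of passing `reasoning={"effort": ...}`, and the `"gpt-5.5"` model name is assumed:

```python
# Sketch: pin reasoning effort explicitly rather than relying on the
# model default, which changed from high to medium in GPT-5.5.

def build_request(prompt: str, effort: str = "medium") -> dict:
    """Build a request payload with an explicit reasoning-effort setting."""
    if effort not in {"low", "medium", "high"}:
        raise ValueError(f"unknown reasoning effort: {effort}")
    return {
        "model": "gpt-5.5",  # assumed model identifier
        "input": prompt,
        "reasoning": {"effort": effort},
    }

# A team that built cost and latency numbers around the old high
# default can restore that behavior explicitly:
legacy_equivalent = build_request("Resolve the customer's query.", effort="high")
```

Pinning the value also makes the cost assumption visible in code review, rather than buried in a default that can shift between model generations.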
Role definitions illustrate the expertise inversion at work. The prompting community had largely written off role definitions (opening a prompt with "you are a customer service agent") as a GPT-3 artifact that stopped helping in later models. Some researchers had called the practice counterproductive. GPT-5.5's guide puts role definitions back at the top of its recommended structure. The community had moved on. OpenAI moved it back.
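In practice that means the role line leads the prompt again. A minimal assembler, assuming the guide's ordering of role first, then goal; the section layout here is our illustration, not the guide's exact template:

```python
# Sketch of a prompt assembler that puts the role definition back at
# the top, per the recommended structure, followed by the goal and any
# optional context. Ordering is the point; the sections are examples.

def assemble_prompt(role: str, goal: str, context: str = "") -> str:
    """Assemble a prompt with the role definition first."""
    sections = [f"You are {role}.", f"Goal: {goal}."]
    if context:
        sections.append(context)
    return "\n\n".join(sections)

prompt = assemble_prompt(
    role="a customer service agent",
    goal="resolve the customer's query",
)
```

Note what is absent: no numbered procedure, no exception catalog. The role and goal are the whole skeleton.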
Simon Willison, an independent AI practitioner who writes extensively about model behavior, called the shift notable. "OpenAI is recommending starting from scratch rather than trusting that existing prompts optimized for previous models will continue to work effectively with GPT-5.5," he wrote on his blog. One developer blogger was blunter: every instruction you added to paper over a quirk in the previous model is now debt.
The pressure falls unevenly. Teams that built careful prompt libraries, ran systematic ablations, and accumulated institutional knowledge about how their models behave are now carrying debt against the new architecture. They followed the best practices that OpenAI's own documentation codified for years. Investment in the previous generation became a liability for the next one. Teams using GPT-5.5 more naively may be closer to the new baseline by accident.
The guide dropped before most teams finished migrating. By April 25, a wave of teams had moved GPT-5.5 into production with legacy prompts baked in, running on a model that actively works against the instructions they gave it. The fix is not incremental.
This is specific to GPT-5.5. The behavioral change reflects genuine architectural differences in how the model processes instruction density. Previous transitions did not require this kind of break. The model that was supposed to be the upgrade broke the compact the industry upgraded on: the expectation that newer models would do the same things, only better. OpenAI's own documentation admits as much. The benchmarks still show real improvement. The story is that progress now comes with a condition: your old work has to go.