Disagreeable AI Agents Sabotage Negotiations but Ship Code Just Fine


When you give a team of AI agents a hostile personality, the team's bargaining and open-ended research can collapse. Their code still ships. That split is the durable finding from a new Arizona State University preprint, and it gives builders a more useful rule of thumb than "adversarial agents are bad" or "adversarial agents are fine."

The paper, posted Monday as arXiv:2606.27443 under the title "When Does Personality Composition Matter for Multi-Agent LLM Teams?", tested teams of large language model agents (LLMs, the AI systems behind chatbots and coding assistants) prompted with varied levels of two Big Five personality traits. The traits in play were agreeableness, ranging from cooperative to adversarial, and openness, ranging from exploratory to rigid. The three task types were structured coding, open-ended research collaboration, and competitive bargaining.

In the research and bargaining conditions, low-agreeableness agents produced hostile communication, broke consensus, and measurably dragged down outcomes. In the structured coding condition, the same hostile prompts barely moved milestone completion. The asymmetry held even when the agents' words were openly adversarial, which is the part of the result that actually stings. The problem was not just the tone. It was what tone did to a multi-step task that depended on agreement.

The mechanism is what makes the finding worth a builder's attention. "Personality" in these agents is context-window conditioning: a prompt, not a weight change. Researchers steered each agent's behavior with an OCEAN-style profile, the standard Big Five personality model (Openness, Conscientiousness, Extraversion, Agreeableness, Neuroticism), layered into the system prompt. That means the design knob is cheap. There is no fine-tuning bill, no retraining run, no model release cycle. A team operator can change an agent's agreeableness score between runs and watch the team's communication change with it. The code result is not reassurance that hostile agents are safe. It is evidence that the cost of an adversarial personality is task-shaped, not task-uniform.
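To make "a prompt, not a weight change" concrete, here is a minimal Python sketch of how a trait profile could be layered into a system prompt. Everything in it is an assumption made for illustration: the `OceanProfile` class, the 0-to-1 score scale, the one-third and two-thirds cutoffs, and the prompt wording are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OceanProfile:
    # Hypothetical 0.0-1.0 scores; the paper's actual scale is not stated here.
    openness: float
    agreeableness: float


def trait_phrase(score: float, low: str, high: str) -> str:
    """Map a score to a plain-language descriptor for the prompt."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be in [0, 1], got {score}")
    if score < 1 / 3:
        return low
    if score > 2 / 3:
        return high
    return f"balanced between {low} and {high}"


def build_system_prompt(role: str, profile: OceanProfile) -> str:
    """Layer a personality description on top of a role description."""
    openness = trait_phrase(profile.openness, "rigid", "exploratory")
    agreeableness = trait_phrase(
        profile.agreeableness, "adversarial", "cooperative"
    )
    return (
        f"You are the {role} on a multi-agent team.\n"
        f"Personality: you are {openness} toward new ideas "
        f"and {agreeableness} toward teammates."
    )


# Same role, same model, different agreeableness score between runs.
hostile = build_system_prompt(
    "code reviewer", OceanProfile(openness=0.5, agreeableness=0.1)
)
friendly = build_system_prompt(
    "code reviewer", OceanProfile(openness=0.5, agreeableness=0.9)
)
```

The only thing that differs between `hostile` and `friendly` is one number in the profile, which is why the knob is cheap to turn: nothing about the model itself changes.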

That shape is the decision rule. Builders running structured coding pipelines against a fixed spec (a test suite, an interface contract, an output schema) can largely treat personality as decoration, at least as far as milestone completion goes. Builders running open-ended research or any flavor of competitive bargaining are paying for personality composition whether they have thought about it or not. The bargaining and research failures in the paper are not a matter of personality preference. They are a communication cost that compounds as the team gets larger and the task gets more open.
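One way to encode that rule in a pipeline is a small lookup that flags which task types deserve a personality audit. This is a sketch under stated assumptions: the `TaskType` enum, the `personality_risk` function, and the "low" / "high" / "unknown" labels are hypothetical names chosen here, and the mapping only restates the three task types the article describes.

```python
from enum import Enum
from typing import Optional


class TaskType(Enum):
    STRUCTURED_CODING = "structured_coding"
    OPEN_ENDED_RESEARCH = "open_ended_research"
    COMPETITIVE_BARGAINING = "competitive_bargaining"


# Task types where hostile prompts were reported to drag down outcomes.
PERSONALITY_SENSITIVE = frozenset(
    {TaskType.OPEN_ENDED_RESEARCH, TaskType.COMPETITIVE_BARGAINING}
)


def personality_risk(task: Optional[TaskType]) -> str:
    """Return a coarse risk label for personality composition.

    Pass None for any task type outside the three that were tested;
    the result makes no claim about those, so the label is "unknown".
    """
    if task is None:
        return "unknown"
    if task in PERSONALITY_SENSITIVE:
        return "high"
    return "low"


coding_risk = personality_risk(TaskType.STRUCTURED_CODING)
bargaining_risk = personality_risk(TaskType.COMPETITIVE_BARGAINING)
untested_risk = personality_risk(None)
```

Returning "unknown" instead of defaulting to "low" is a deliberate choice in this sketch: a task type that was never tested should trigger a check, not inherit the coding result.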

Two limits belong next to this finding. First, the paper is an arXiv preprint, not peer-reviewed, so the magnitudes and per-task metrics should be treated as provisional until the experimental details are confirmed in the HTML version and the paper has been through peer review. Second, the result covers only the three task types the authors tested. "Structured coding" is not a stand-in for all structured work. "Negotiation" is not a stand-in for every adversarial business process. The split is real. The borders of the split are still being drawn.

The next signal to watch is whether the same task-conditional pattern survives a more capable model. Personality is a prompt, and prompt sensitivity changes as models do. The builder question this paper actually answers is narrower and more useful than the one most readers will first hear. Not "should my agents be nice?" but "for which tasks does the cost of a hostile prompt show up in the output, and for which tasks does it stay in the chat?"

