Behavior

Research papers, repositories, and articles about behavior

Verbalizing LLMs' assumptions to explain and control sycophancy

The authors get models to spell out their hidden assumptions before answering, which makes it easier to spot sycophantic (flattery-driven) answers and dial them down.

Myra Cheng, Isabel Sieh

Effects of personality steering on cooperative behavior in Large Language Model agents

The authors test how adding human-like personality traits changes cooperation among AI agents in repeated Prisoner’s Dilemma games. They find that agreeableness boosts cooperation but can also make agents easier to exploit, and they warn that persona dials act as soft biases, not hard controls.

Mizuki Sakai, Mizuki Yokoyama
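
As a rough illustration of the "soft bias" point in the summary above, here is a toy repeated Prisoner's Dilemma. Everything in it is an assumption made for illustration and not the authors' setup: the textbook 5/3/1/0 payoffs, the `cooperate_prob` parameter standing in for an agreeableness dial, and the hypothetical helpers `persona_agent`, `always_defect`, and `play`. No language model is involved.

```python
import random

# Textbook Prisoner's Dilemma payoffs (assumed, not taken from the paper):
# (my_move, their_move) -> (my_payoff, their_payoff)
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def persona_agent(cooperate_prob, rng):
    """Hypothetical stand-in for a persona dial: cooperate with a fixed probability.

    The dial only shifts the odds of cooperating (a soft bias); unless it is
    set to exactly 0.0 or 1.0 it does not guarantee any particular move.
    """
    def act(history):
        return "C" if rng.random() < cooperate_prob else "D"
    return act


def always_defect(history):
    """Exploitative opponent that defects every round."""
    return "D"


def play(agent_a, agent_b, rounds=200):
    """Play a repeated game; return (score_a, score_b, cooperation rate of A)."""
    history = []  # list of (move_a, move_b), from A's point of view
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = agent_a(history)
        # B sees the same history with the roles swapped.
        move_b = agent_b([(b, a) for a, b in history])
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        history.append((move_a, move_b))
    coop_rate = sum(1 for a, _ in history if a == "C") / rounds
    return score_a, score_b, coop_rate


if __name__ == "__main__":
    rng = random.Random(0)
    for dial in (0.2, 0.8, 1.0):
        print(dial, play(persona_agent(dial, rng), always_defect))
```

Under these assumed payoffs, an agent that always cooperates against `always_defect` scores 0 per round while its opponent collects 5, which is the exploitability side of the trade-off. A dial set anywhere below 1.0 still yields some defections, which is what "soft bias, not hard control" means in this toy setting.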