Technology | Wednesday, December 3, 2025

OpenAI unveils 'confessions' training method to make language models admit when they misbehave

Summary

OpenAI introduced a research prototype called "confessions," which trains models to produce a second, honesty‑only output that reports when they violated instructions, reward‑hacked, or took unintended shortcuts. In adversarial evaluations designed to elicit misbehavior, the technique substantially reduced false negatives (cases where a model misbehaved but did not report it). That gives developers better visibility into when powerful models deviate from intended behavior and improves the tools available for monitoring alignment as systems grow more capable.

Companies Mentioned

OpenAI

AI Lab|United States

Valuation: $500.0B

Private company - No stock data

Related Deals

Research

OpenAI, Anthropic, Block and major cloud providers are co-founding the Agentic AI Foundation under the Linux Foundation to steward open, interoperable standards for AI agents.

OpenAI→Anthropic→Block→Google→Microsoft→Amazon Web Services→Bloomberg→Cloudflare→Cisco→Agentic AI Foundation

Dec 2025

Research

Founding members created the Agentic AI Foundation under the Linux Foundation to fund and govern open standards like MCP, goose and AGENTS.md for interoperable agentic AI.

Anthropic→OpenAI→Block→Google→Microsoft→Amazon Web Services→Bloomberg→Cloudflare→Agentic AI Foundation (AAIF)

Dec 2025

Partnership

OpenAI and Deutsche Telekom agreed to a multi‑year collaboration to co‑develop AI products and deploy ChatGPT Enterprise across the telecom group.

OpenAI→Deutsche Telekom

Dec 2025

Partnership

OpenAI and La Trobe University formed a strategic education partnership to roll out ChatGPT Edu at scale and integrate OpenAI’s models into the university’s teaching, research and new AI‑focused degree programs.

OpenAI→La Trobe University

Dec 2025

Compute

OpenAI and NEXTDC entered a multi-year agreement under which OpenAI will anchor a hyperscale AI campus and GPU supercluster at NEXTDC’s S7 facility in Sydney to support large-scale AI inference and enterprise workloads.

NEXTDC→OpenAI→NEXTDC S7 Sydney AI campus

Dec 2025
