Be ahead of the curve
Research papers, repositories, and articles about frontier models
This report compares seven frontier language and vision models across a range of safety tests, from basic benchmarks to adversarial red-teaming. It finds GPT-5.2 clearly the safest overall, while the other models show uneven safety across languages, modalities, and threat models.