On April 3, 2026, Arcee AI released Trinity-Large-Thinking, an Apache 2.0–licensed 400B-parameter sparse Mixture-of-Experts reasoning model that activates 13B parameters per token. The model scores 91.9 on PinchBench, within two points of Anthropic’s Claude Opus 4.6, while Arcee prices output at $0.90 per million tokens, roughly 96% cheaper than Opus. Trinity-Large-Thinking is available via OpenRouter and DigitalOcean’s Agentic Inference Cloud, with downloadable weights on Hugging Face.
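Taking the article’s figures at face value, the claimed discount can be sanity-checked with quick arithmetic. Note that the implied Opus output price below is derived from the stated numbers, not an official quote:

```python
# Back-of-envelope check of the pricing claim, using only figures
# stated above. The implied Opus price is an inference, not a quote.

trinity_output_price = 0.90   # USD per million output tokens (stated)
claimed_discount = 0.96       # "roughly 96% cheaper" (stated)

# If Trinity is 96% cheaper, the implied Opus output price is:
implied_opus_price = trinity_output_price / (1 - claimed_discount)
print(f"Implied Opus output price: ${implied_opus_price:.2f}/M tokens")

# Cost comparison for a hypothetical 10M-output-token agent workload:
tokens_millions = 10
print(f"Trinity cost:        ${trinity_output_price * tokens_millions:.2f}")
print(f"Opus cost (implied): ${implied_opus_price * tokens_millions:.2f}")
```

At that ratio, a workload that costs $9 on Trinity would run roughly $225 on the implied Opus rate, which is the scale of gap the analysis below turns on.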
This article aggregates reporting from three news sources. The TL;DR is AI-generated from the original reporting. Race to AGI's analysis provides editorial context on the implications for AGI development.
Trinity-Large-Thinking is the most serious open-weight challenger we’ve seen to proprietary frontier models since the early Llama era. Arcee has essentially recreated a Claude-class reasoning system with fully downloadable weights, Apache 2.0 licensing, and sub‑$1 per million token pricing, all built by a 30‑person lab that spent about $20 million on a single 33‑day training run. That combination of capability, openness, and cost radically lowers the barrier for anyone who wants to run long‑horizon agents on their own infrastructure rather than renting closed models at premium prices.
Strategically, Trinity lands just as Chinese labs like Qwen, MiniMax and Z.ai retreat from open weights and Western hyperscalers focus on tightly controlled APIs. It gives the “American open weights” camp a concrete flagship: a 400B sparse MoE model that is competitive on agentic benchmarks like PinchBench and Tau2 while remaining economical enough to use at scale. That matters because open models have historically lagged a full generation behind their closed counterparts; Arcee is closing that gap on the dimension that matters most for AGI-relevant work: multi-step reasoning.
The competitive implication is twofold. First, it pressures big labs to justify their pricing when an open alternative is within a few percentage points on key agent benchmarks. Second, it gives enterprises and research groups a credible foundation for sovereign AI stacks and bespoke distillations. If Trinity proves robust in real deployments, it could tilt the ecosystem toward a future where frontier‑ish capabilities are widely self‑hosted rather than monopolized by a handful of APIs.