On May 7, reporting from India and Europe confirmed that Google DeepMind, Microsoft and xAI have signed new agreements with the US Commerce Department’s Center for AI Standards and Innovation (CAISI), allowing the agency to test advanced models before and after deployment. The deals extend earlier arrangements with OpenAI and Anthropic, giving the US government structured access to the systems of nearly all major US‑based frontier AI labs.
By bringing Google DeepMind, Microsoft and xAI under the same evaluation umbrella as OpenAI and Anthropic, CAISI is quietly building a de facto US standard for pre‑deployment testing of frontier models. That doesn’t yet amount to hard regulation, but it does create a coordinated chokepoint where government experts can see what’s coming before it hits the market. In practice, this is how many complex technologies—from aviation to telecom—evolve: voluntary regimes harden into expectations, and expectations harden into rules.
For the AGI race, the agreements cut both ways. On one hand, a common testing pipeline can surface systemic risks—autonomous cyber capabilities, model‑assisted bio threats, cascading misalignment failures—before labs deploy their most capable systems. On the other, the willingness of every major US lab to participate suggests they don’t yet view federal scrutiny as a material brake on deployment velocity. CAISI’s influence will depend on how blunt its findings are and on whether they ever translate into “no‑go” recommendations that industry or the White House feels obliged to follow.