Foundation Models & Reasoning

Steady | 40%

Core model architectures, training methods, chain-of-thought reasoning, and test-time compute scaling. The backbone of modern AI capabilities.

transformers, scaling laws, chain-of-thought, o1, reasoning, test-time compute, world models

Papers: 220

Milestones: 146

Funding: $97.2B

Key Benchmarks

GPQA Diamond

Graduate-level science questions requiring PhD-level expertise

95.5% | Human: 70%

Leader: Fugu | high saturation

MMLU-Pro

Massive Multitask Language Understanding - Pro version with 10 answer choices and harder reasoning

89.6% | Human: 89%

Leader: Qwen3.7 Max | high saturation

HLE (Humanity's Last Exam)

2,500 questions at the frontier of human knowledge across 100+ subjects

46.44% | Human: 95%

Leader: Gemini 3.1 Pro (thinking high) | low saturation

Recent Milestones

Grok 4.5 launches as ‘Opus‑class’ rival

On July 8, 2026 Elon Musk said SpaceXAI’s Grok 4.5 model will be made available to the public on July 9 after internal beta testing at SpaceX and Tesla. Musk described Grok 4.5 as an “Opus-class” model that is faster, more token‑efficient and lower cost than rivals.

Jul 8, 2026 | release | Impact: 70/100

GPT‑5.6 clears US review for broad release

On July 8, 2026, OpenAI confirmed that its GPT‑5.6 model family—flagship Sol, midrange Terra and low‑cost Luna—will be released to the public on Thursday after weeks of limited access during a US government security review. Reports say the Trump administration allowed a broader launch following technical testing, while the White House stressed it does not formally “approve” private AI model releases.

Jul 8, 2026 | release | Impact: 90/100

OpenAI rolls out GPT‑5.6 Sol, Terra, Luna

OpenAI said on July 8, 2026 that its GPT-5.6 Sol, Terra and Luna models will become publicly available on July 9 after weeks of restricted testing. The broader release follows a U.S. government safety review that temporarily limited access to a small group of partners.

Jul 8, 2026 | release | Impact: 90/100

Microsoft routes Copilot to in‑house MAI models

On July 8, 2026 Egypt’s Youm7, citing Bloomberg, reported that Microsoft is increasingly routing Copilot and Office workloads to its in‑house MAI models instead of OpenAI’s ChatGPT and Anthropic’s Claude. The move aims to cut rising inference costs as AI usage grows across products like Excel, Outlook and GitHub Copilot.

Jul 8, 2026 | release | Impact: 70/100

OpenAI Publicly Launches GPT‑5.6 Sol, Terra, Luna

On July 8, 2026, OpenAI said its GPT‑5.6 model family—flagship Sol, mid‑range Terra and low‑cost Luna—will launch publicly on July 9 after a US‑only preview period. The company confirmed the news in statements cited by outlets including The Star in Malaysia and HuffPost Spain.

Jul 8, 2026 | release | Impact: 90/100

US Clears GPT‑5.6 for Wide Release

The US Department of Commerce has approved a broad release of OpenAI’s GPT‑5.6 model after additional safety testing, Axios reported on July 8, 2026. OpenAI is expected to roll out GPT‑5.6 more widely later this week following a staggered preview limited to vetted partners.

Jul 8, 2026 | release | Impact: 90/100

Google Launches Gemini 3 With 1M Context

Google quietly rolled out its Gemini 3 model in early July 2026, adding multimodal capabilities, a one‑million‑token context window and a Deep Think reasoning mode. The model is now available across the Gemini app, AI Studio, Vertex AI and popular developer tools, with Google claiming top scores on reasoning and coding benchmarks.

Jul 5, 2026 | release | Impact: 90/100

MGX seals $49B sovereign AI mega‑fund

Abu Dhabi–based MGX announced on July 3, 2026 that it has closed its first AI fund at $49 billion, exceeding a $45 billion target, with capital from Mubadala, G42 and global institutional investors. The fund has already backed Anthropic, OpenAI, xAI, Binance, Together AI and a $40 billion Aligned Data Centres acquisition. ([easternherald.com](https://easternherald.com/2026/07/03/mgx-abu-dhabi-ai-fund-49-billion-mubadala/))

Jul 3, 2026 | funding | Impact: 90/100

China’s GLM‑5.2: open 1M‑context frontier model

On July 3, 2026, Euronews’ Turkish edition reported on GLM‑5.2, a large open‑weight AI model from Chinese firm Zhipu AI positioned as a cheaper competitor to Anthropic and OpenAI. The piece highlights vendor-reported benchmarks where GLM‑5.2 leads other open models and closes much of the gap with top U.S. proprietary systems at lower cost.

Jul 3, 2026 | release | Impact: 80/100

AI Soaks Up 70% of Record $510B Startup Funding

Crunchbase reports that global startup investment reached a record $510 billion in the first half of 2026, with more than 70% of Q2 funding going to AI-focused companies. Sixteen startups raised billion‑dollar rounds in Q2, including several frontier labs in the US, China and the UK.

Jul 2, 2026 | funding | Impact: 80/100

China’s GLM-5.2 Undercuts US Frontier Models

Reuters reports that Beijing-based startup Z.ai’s GLM-5.2 model, launched in June 2026, is drawing strong Western developer interest with agentic and coding performance close to leading US models at roughly one‑sixth the cost. The model has climbed OpenRouter usage charts above Anthropic’s models and is sparking debate over whether Chinese labs are closing the AI gap with the US.

Jul 2, 2026 | release | Impact: 90/100

ChatGPT falls below 50% as rivals surge

Chile’s TVN reported on June 30, 2026 that ChatGPT’s global market share among AI assistants has dropped to 46.4%, according to Sensor Tower’s State of AI Report 2026. Google’s Gemini now holds 27.7% share and Anthropic’s Claude 10.3%, with other assistants like Grok, Perplexity, DeepSeek and Meta AI each below 5%.

Jun 30, 2026 | benchmark | Impact: 70/100

Grok 4.5: 1.5T‑param dev‑data giant hits beta

On June 28, 2026, Elon Musk said on X that xAI’s Grok 4.5, built on a 1.5‑trillion‑parameter V9 foundation model trained with Cursor coding data, has entered closed beta testing inside SpaceX and Tesla. Follow‑up coverage reports Musk claims Grok 4.5 approaches or may surpass Anthropic’s Claude Opus/Mythos family on internal benchmarks.

Jun 28, 2026 | release | Impact: 80/100

GLM‑5.2: 1M‑Token Open Coding Model Nears Frontier

On June 28, 2026, AI News Blitz highlighted China’s Zhipu AI (Z.ai) GLM‑5.2 model, an open‑weight large language model released earlier in June under the MIT license with a 1M‑token context and strong coding performance, now drawing interest from Silicon Valley developers. The model’s weights are available via Z.ai and Hugging Face, positioning it as a serious open alternative to closed US frontier systems. ([ainewsblitz.com](https://www.ainewsblitz.com/brief/OZY4iM8puhrh))

Jun 28, 2026 | release | Impact: 90/100

GPT‑5.6 launches under tight US controls

On June 28, 2026, LLM Rumors published an in‑depth analysis of OpenAI’s new GPT‑5.6 family, launched June 26 in a limited preview with three tiers: Sol, Terra and Luna. The piece details pricing and capabilities, and notes that US government-requested restrictions mean only a small set of vetted partners can use the flagship Sol model for now. ([llmrumors.com](https://www.llmrumors.com/news/openai-gpt56-sol-terra-luna-government-preview))

Jun 28, 2026 | release | Impact: 90/100

GPT‑5.6 Sol/Terra/Luna enter restricted preview

On June 27, 2026, OpenAI’s GPT-5.6 model family—Sol, Terra and Luna—became available in a limited preview to about 20 government‑approved partner organizations via API and Codex. Media reports and developer blogs confirm the rollout remains tightly restricted at the request of the US government, with broader access promised in the coming weeks.

Jun 27, 2026 | release | Impact: 90/100

OpenAI previews GPT‑5.6 Sol under US limits

On June 26, 2026, OpenAI began a limited preview of its GPT‑5.6 model family—Sol, Terra and Luna—while delaying a full public rollout at the request of the US government. Access is initially restricted to about 20 “trusted partners” whose participation was cleared by federal officials, with broader availability promised in the coming weeks.

Jun 27, 2026 | release | Impact: 90/100

Anthropic passes OpenAI in $8T AI unicorn boom

On June 26, 2026, Business Standard reported that the number of global unicorns hit a record 1,603 with a combined valuation of $8 trillion, driven largely by artificial intelligence startups. Citing the Hurun Global Unicorn Index 2026, the piece notes that Anthropic’s valuation has surged to $965 billion, overtaking OpenAI at $852 billion as the world’s most valuable unicorn. AI unicorns now represent 36% of total unicorn value, despite having a similar count to fintech firms.

Jun 26, 2026 | funding | Impact: 90/100

Reasoning LLMs Waste Tokens on Wrong Answers

On June 28, 2026, 24 AI summarized a new arXiv paper (2606.26502) by Han‑yu Wang showing that large reasoning models expend more tokens on tasks they ultimately get wrong than on those they solve, in sharp contrast to human behavior on the same benchmarks. The study measures this effect across multiple models on the H‑ARC benchmark, finding large effect sizes (Cohen’s d 1.47–3.13). ([24-ai.news](https://24-ai.news/en/?utm_source=openai))

Jun 25, 2026 | paper | Impact: 70/100
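
For readers unfamiliar with the effect-size figures quoted in the entry above, here is a minimal sketch of how Cohen's d is commonly computed for two independent groups. It assumes the pooled-standard-deviation form; the summary does not say which variant the paper uses. The function name `cohens_d` and the token counts are hypothetical and chosen only for illustration; they are not data from the paper.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d for two independent samples, using the pooled sample SD.

    Assumption: pooled-SD variant; other variants (e.g. Hedges' g) differ slightly.
    """
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two observations")
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    if pooled_var == 0:
        raise ValueError("pooled variance is zero; effect size is undefined")
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


if __name__ == "__main__":
    # Hypothetical reasoning-token counts per task (not from the paper).
    tokens_on_failed_tasks = [5200, 6100, 5800, 6400]
    tokens_on_solved_tasks = [3100, 3600, 2900, 3400]
    d = cohens_d(tokens_on_failed_tasks, tokens_on_solved_tasks)
    print(f"Cohen's d = {d:.2f}")
```

By convention, values of d around 0.8 are already described as large, so the reported range of 1.47 to 3.13 indicates that the token-count distributions for failed and solved tasks are well separated.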

Menlo raises $3B on Anthropic’s AI boom

Menlo Ventures announced on June 23, 2026 that it has raised $3 billion across new funds, the largest capital raise in the firm’s 50-year history. TechCrunch reports the haul is driven heavily by Menlo’s early and concentrated bet on Anthropic; Menlo’s stake in the company is now valued at roughly $14 billion, according to sources cited by Bloomberg.

Jun 23, 2026 | funding | Impact: 70/100

Leading Organizations

OpenAI

DeepMind

Anthropic

ArXiv Categories

cs.LG, cs.AI, cs.CL

Related Frontiers

Memory, Multimodal

Recent Papers

From SRA to Self-Flow: Data Augmentation or Self-Supervision?

Discrete Diffusion Language Models for Interactive Radiology Report Drafting

Program-as-Weights: A Programming Paradigm for Fuzzy Functions

Multi-Resolution Flow Matching: Training-Free Diffusion Acceleration via Staged Sampling

Valdi: Value Diffusion World Models

The State-Prediction Separation Hypothesis

AutoTrainess: Teaching Language Models to Improve Language Models Autonomously

IsoSci: A Benchmark of Isomorphic Cross-Domain Science Problems for Evaluating Reasoning versus Knowledge Retrieval in LLMs

CausalMix: Data Mixture as Causal Inference for Language Model Training

AI-Model Network: Concept, Current State and Future
