Foundation Models & Reasoning
Core model architectures, training methods, chain-of-thought reasoning, and test-time compute scaling. The backbone of modern AI capabilities.
Recent Papers
Fast-weight Product Key Memory
Tianyu Zhao, Llion Jones
Improving Multi-step RAG with Hypergraph-based Memory for Long-Context Complex Relational Modeling
Chulun Zhou, Chunkang Zhang, Guoxin Yu +4 more
Deep Delta Learning
Yifan Zhang, Yifeng Liu, Mengdi Wang +1 more
Generative Adversarial Reasoner: Enhancing LLM Reasoning with Adversarial Reinforcement Learning
Qihao Liu, Luoxin Ye, Wufei Ma +2 more
Constructive Circuit Amplification: Improving Math Reasoning in LLMs via Targeted Sub-Network Updates
Nikhil Prakash, Donghao Ren, Dominik Moritz +1 more
Exploration v.s. Exploitation: Rethinking RLVR through Clipping, Entropy, and Spurious Reward
Peter Chen, Xiaopeng Li, Ziniu Li +3 more
How Good is Post-Hoc Watermarking With Language Model Rephrasing?
Pierre Fernandez, Tom Sander, Hady Elsahar +6 more
Next-Embedding Prediction Makes Strong Vision Learners
Sihan Xu, Ziqiao Ma, Wenhao Chai +5 more
Differences That Matter: Auditing Models for Capability Gap Discovery and Rectification
Qihao Liu, Chengzhi Mao, Yaojie Liu +2 more
Recent Milestones
MiniMax raises $619M in blockbuster Hong Kong AI IPO
Chinese generative AI startup MiniMax Group surged as much as 81% and closed up around 78% on its Hong Kong trading debut on January 9, 2026, after raising about HK$4.8 billion (~$619 million) in an IPO priced at the top of its range. Cornerstone investors including Alibaba and Abu Dhabi Investment Authority bought roughly 56% of the deal, underscoring strong demand for China’s first wave of public LLM companies.
Emory Debuts Unified Framework for AI Training
On January 4, 2026, Emory University researchers unveiled a unified mathematical framework for multimodal AI systems, comparing it to a “periodic table” that organizes successful methods. The Variational Multivariate Information Bottleneck Framework reframes many loss functions as instances of a single information‑compression tradeoff and was detailed in a Journal of Machine Learning Research paper.
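For orientation, the classical (single-variable) information bottleneck objective that this family of frameworks generalizes can be written as below. This is the textbook form, given here as an assumption about the general shape of the tradeoff; it is not the multivariate variational objective from the Emory paper. Here X is the input, Y the target, Z the learned representation, and β a tradeoff weight.

```latex
\min_{p(z \mid x)} \; I(X; Z) \;-\; \beta \, I(Z; Y)
```

The first term penalizes information retained about the input (compression), while the second rewards information kept about the target (prediction); larger β favors prediction over compression.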
Asia markets ride AI; China eyes $70B chip push
Moneycontrol, citing Bloomberg, reported on Jan. 4, 2026 at 7:22 a.m. IST that Asian equities have started 2026 strongly but face risks from a potential AI bubble and diverging monetary policies. The piece highlights China’s consideration of up to $70 billion in semiconductor incentives and strong AI‑driven rallies in markets like Korea and Taiwan. ([moneycontrol.com](https://www.moneycontrol.com/news/business/markets/ai-bubble-fears-and-policy-splits-loom-over-asia-stocks-in-2026-13755495.html?utm_source=openai))
LeCun quits Meta to build AMI world-model AI
Meta’s longtime chief AI scientist Yann LeCun is leaving the company to co‑found Advanced Machine Intelligence Labs, a new startup pursuing alternatives to large language models. In a January 3, 2026 interview, he alleged Meta’s Llama 4 benchmarks were “fudged” and criticized the company’s LLM‑centric strategy while detailing his new world‑model‑based research agenda.
DeepSeek mHC signals V4-scale model is ready
Chinese AI lab DeepSeek released a new architecture, “mHC: Manifold-Constrained Hyper-Connections,” that stabilizes training for very large models while boosting reasoning and reading performance. Chinese coverage and a new summary from Sina on January 3 say internal experiments and wording in the paper strongly suggest DeepSeek’s next flagship model, DeepSeek V4, has already finished training and could launch around Lunar New Year 2026. ([arxiv.org](https://arxiv.org/abs/2512.24880?utm_source=openai))
Baidu Chip Arm Kunlunxin Targets Hong Kong IPO
On January 2, 2026, Baidu disclosed that its AI chip subsidiary Kunlunxin has confidentially filed for an initial public offering on the Hong Kong Stock Exchange. The planned spin-off will keep Kunlunxin as a Baidu-controlled subsidiary while giving the chip unit direct access to capital markets.
DeepSeek’s mHC aims to cut big-model training costs
On January 2, 2026, Computerworld reported that Chinese AI firm DeepSeek has introduced Manifold-Constrained Hyper-Connections (mHC), an evolution of the Hyper-Connections technique originally developed at ByteDance. DeepSeek claims mHC enables more stable, scalable training of large language models up to 27 billion parameters without increasing compute cost.
Biren’s $717M IPO turbocharges China’s AI GPU push
Shanghai Biren Technology raised about US$717 million in its Hong Kong IPO and saw its shares jump around 76% on the first day of trading on January 2, 2026. The GPU-focused AI chipmaker became the first listing of 2026 on the Hong Kong exchange, with the retail tranche reportedly oversubscribed more than 2,300 times.
Baidu’s Kunlunxin moves toward Hong Kong AI chip IPO
Baidu said on January 2, 2026 that its AI semiconductor unit Kunlunxin filed a confidential application to list on the Hong Kong Stock Exchange on January 1. The filing was made under a non-public procedure and deal size has not yet been disclosed.
Taylor-based matrix exponential speeds up AI flows
Researchers from Universitat Politècnica de València and collaborators published a refined Taylor-based algorithm for computing matrix exponentials that outperforms classical methods like Paterson–Stockmeyer. The work claims higher accuracy and lower computational cost, with explicit applications to speeding up training and inference in flow-based generative models that rely on matrix exponentials.
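As background for what a Taylor-based matrix exponential computes, here is a minimal sketch that combines scaling-and-squaring with a truncated Taylor series. It is not the researchers' algorithm: the function name `expm_taylor`, the fixed `order=18`, and the 0.5 norm threshold are illustrative assumptions, and the polynomial is evaluated with a plain Horner scheme rather than Paterson–Stockmeyer or the refined method described in the paper.

```python
import numpy as np


def expm_taylor(A, order=18):
    """Approximate exp(A) via scaling-and-squaring and a truncated Taylor series.

    Hypothetical helper for illustration only; `order` and the 0.5 norm
    threshold are assumed values, not taken from the cited work.
    """
    A = np.asarray(A, dtype=float)
    n = A.shape[0]

    # Scale A by 2**-s so the 1-norm of the scaled matrix is at most 0.5,
    # which keeps the truncated series accurate.
    norm = np.linalg.norm(A, 1)
    s = int(np.ceil(np.log2(norm / 0.5))) if norm > 0.5 else 0
    B = A / (2 ** s)

    # Horner evaluation of I + B + B^2/2! + ... + B^order/order!
    I = np.eye(n)
    E = I.copy()
    for k in range(order, 0, -1):
        E = I + (B @ E) / k

    # Undo the scaling: exp(A) = (exp(A / 2**s)) ** (2**s)
    for _ in range(s):
        E = E @ E
    return E


if __name__ == "__main__":
    # For this nilpotent matrix the series terminates: exp(N) = I + N.
    N = np.array([[0.0, 1.0], [0.0, 0.0]])
    print(expm_taylor(N))
```

Horner evaluation uses one matrix product per Taylor term; schemes such as Paterson–Stockmeyer evaluate the same polynomial with fewer products, which is the kind of cost the reported work targets.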
UAE bets its AI strategy on K2 Think reasoning model
A January 2 Arabic-language article, drawing on UAE sources and KPMG analysis, highlights the country’s proactive AI strategy built around initiatives like the open-source K2 Think reasoning model from MBZUAI and G42. The piece notes that over half of UAE CEOs see AI integration as a top strategic priority, with 74% expecting returns on AI investments within one to three years.
Biren $717m IPO fuels China GPU race
Chinese GPU maker Biren Technology debuted on the Hong Kong Stock Exchange on January 2, raising about HK$5.58 billion (roughly $717 million) in the city’s first listing of 2026. Shares opened more than 80% above the IPO price, making Biren the first dedicated GPU stock in Hong Kong and underscoring investor demand for domestic AI compute. ([prnewswire.com](https://www.prnewswire.com/ae/news-releases/domestic-gpu-leader-biren-technology-listed-on-hong-kong-stock-exchange-302651667.html?utm_source=openai))
SoftBank drops $41B on 11% stake in OpenAI
SoftBank completed a $41 billion multi‑tranche investment in OpenAI, giving it roughly an 11% stake, according to a January 1, 2026 report by Italian outlet FIRSTonline that cites SoftBank and Reuters. The final $22–22.5 billion tranche closed at the end of December 2025, valuing OpenAI at around $260 billion pre‑money.
Baidu spins out Kunlunxin for HK chip IPO
Baidu announced that its AI chip subsidiary Kunlunxin has confidentially filed for a spin-off listing on the Hong Kong Stock Exchange main board. Hong Kong and U.S. Baidu shares jumped roughly 9–12% on January 2 after the plan was disclosed, as Chinese and global outlets detailed the move and its estimated multibillion-dollar valuation. ([prnewswire.com](https://www.prnewswire.com/news-releases/baidu-announces-proposed-spin-off-and-separate-listing-of-kunlunxin-302651578.html?utm_source=openai))
New theory sharpens LLM data–compute tradeoffs
On December 31, 2025, Quantum Zeitgeist covered new theoretical work from Sun Yat-sen University that models transformer learning dynamics as a continuous system. The research derives scaling laws and an upper bound on excess risk, showing how generalization error transitions from exponential to power-law decay as data and compute increase.
AI Futures delays full coding automation to 2031
On December 31, 2025, the AI Futures Project released a major update to its quantitative AI timelines and takeoff model. The new model shifts the median forecast for fully automated coding from roughly 2027–2028 to around 2031, while still projecting rapid capability growth and a superintelligence median in the mid-2030s.
DeepSeek, MCP, orbital training mark 2025 AI leap
The Indian Express published a year-end feature on December 31, 2025 highlighting five major AI breakthroughs of 2025, including China’s DeepSeek-R1 reasoning model and Anthropic’s Model Context Protocol moving under the Linux Foundation. The list also cites OpenAI’s ChatGPT image generation craze, gold‑medal‑level math models from OpenAI and Google DeepMind, and Starcloud’s first generative model trained in orbit. ([indianexpress.com](https://indianexpress.com/article/technology/artificial-intelligence/deepseek-ghibli-art-5-biggest-ai-breakthroughs-2025-10448067/))
Tianshu Zhixin IPO Boosts China AI GPU Push
Chinese GPU maker Tianshu Zhixin (9903.HK) began its Hong Kong IPO book‑build, aiming to raise about HK$3.7 billion at an implied valuation of roughly HK$35.4 billion. The company positions itself as a leading domestic general‑purpose GPU vendor serving over 450 AI models across cloud, finance, healthcare and other sectors. ([acnnewswire.com](https://www.acnnewswire.com/press-release/simplifiedchinese/104420/%E5%A4%A9%E6%95%B0%E6%99%BA%E8%8A%AF%E4%BB%8A%E8%B5%B7%E6%AD%A3%E5%BC%8F%E6%8B%9B%E8%82%A1-%E4%BB%A5%E7%A1%AC%E5%AE%9E%E5%8A%9B%E7%AD%91%E7%89%A2%E5%9B%BD%E4%BA%A7ai%E7%AE%97%E5%8A%9B%E6%A0%87%E6%9D%86))
SoftBank’s $41B Bet on OpenAI
SoftBank Group announced on December 31, 2025 that it has completed an additional $22.5 billion investment in OpenAI, bringing its total commitment to $41 billion. The deal raises SoftBank’s aggregate ownership in OpenAI to about 11%, with the rest of the round filled by $11 billion from third‑party co‑investors. ([group.softbank](https://group.softbank/en/news/press/20251231?utm_source=openai))
Nvidia Moves to Buy AI21 in $3B Talent Grab
On December 30–31, 2025, multiple outlets reported that Nvidia is in advanced negotiations to acquire Israeli generative AI startup AI21 Labs for between $2 billion and $3 billion, citing Calcalist. Reuters said the talks focus heavily on acquiring AI21’s roughly 200-strong large language model talent base, while Times of India and Chinese financial media echoed the reported valuation range. ([reuters.com](https://www.reuters.com/business/nvidia-advanced-talks-buy-israels-ai21-labs-up-3-billion-report-says-2025-12-30/))