Scaling

Research papers, repositories, and articles about scaling


Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling

InternVL 2.5 is an open-source multimodal model that matches or beats leading proprietary systems on demanding benchmarks. The paper examines how model size, data quality, and test-time scaling each contribute to that performance.

Zhe Chen, Weiyun Wang

NVIDIA Blackwell Leads on First Agentic AI Infrastructure Benchmark

AgentPerf, billed as the first infrastructure benchmark for agentic AI workloads, shows NVIDIA's Blackwell platform running many more agents per megawatt than older GPUs. It measures agent performance in terms of energy efficiency and density, not just raw tokens per second.

NVIDIA Blog

MoEBlaze: Breaking the Memory Wall for Efficient MoE Training on Modern GPUs

MoEBlaze redesigns mixture-of-experts training to cut activation memory and data movement on GPUs. It claims over 4× speedups and 50% memory savings versus existing frameworks, gains that matter most when training larger sparse models.

Jiyuan Zhang, Yining Liu