Scaling
Research papers, repositories, and articles about scaling
Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling
The InternVL 2.5 work pushes an open-source multimodal model to match or beat top proprietary systems on tough benchmarks. It examines how model size, data curation, and test-time scaling together move the performance frontier.
NVIDIA Blackwell Leads on First Agentic AI Infrastructure Benchmark
AgentPerf, the first benchmark for agent workloads, shows NVIDIA’s Blackwell platform running many more agents per megawatt than older GPUs. It frames agent performance as a matter of energy efficiency and density, not just raw tokens per second.
MoEBlaze: Breaking the Memory Wall for Efficient MoE Training on Modern GPUs
MoEBlaze redesigns mixture-of-experts training to cut activation memory and data movement on GPUs. It claims speedups of over 4× and 50% memory savings versus existing frameworks, which matters directly for anyone training larger sparse models.