ArXiv Paper

MaxCode: A Max-Reward Reinforcement Learning Framework for Automated Code Optimization

Jiefu Ou, Sapana Chaudhary, Kaj Bostrom, and 4 others

January 12, 2026

Summary

MaxCode frames code optimization as a reinforcement learning search over code edits, guided by runtime feedback. Natural-language critiques and a reward model steer generation, and the system outperforms prior systems at speeding up CUDA and C++ kernels.
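The search loop described in the summary can be illustrated with a toy sketch. This is not the paper's implementation: the "program" is just an integer tile size, and `measure_runtime`, `critique`, `propose_edits`, and `predict_reward` are hypothetical stand-ins for real kernel timing, a natural-language critique, an LLM edit generator, and a learned reward model. The one idea it aims to show is the max-reward objective, where the best candidate seen so far is kept even when later steps regress.

```python
from typing import Callable, List, Tuple


def max_reward_search(
    initial: int,
    propose_edits: Callable[[int, str], List[int]],
    measure_runtime: Callable[[int], float],
    critique: Callable[[int, float], str],
    predict_reward: Callable[[int], float],
    rounds: int = 5,
    top_k: int = 2,
) -> Tuple[int, float]:
    """Return (best_candidate, best_speedup) seen across all rounds."""
    baseline = measure_runtime(initial)
    best, best_speedup = initial, 1.0
    current = initial
    for _ in range(rounds):
        # Turn the measured runtime into a textual hint for the generator.
        feedback = critique(current, measure_runtime(current))
        candidates = propose_edits(current, feedback)
        # The reward model ranks candidates so only top_k are actually timed.
        ranked = sorted(candidates, key=predict_reward, reverse=True)[:top_k]
        if not ranked:
            break
        scored = [(baseline / measure_runtime(c), c) for c in ranked]
        speedup, current = max(scored)
        # Max-reward objective: remember the best candidate ever seen.
        if speedup > best_speedup:
            best, best_speedup = current, speedup
    return best, best_speedup


# --- Toy stand-ins (all hypothetical) ---
def measure_runtime(tile: int) -> float:
    # Pretend the kernel is fastest at tile size 32.
    return 1.0 + abs(tile - 32) / 8


def critique(tile: int, runtime: float) -> str:
    return "increase tile size" if tile < 32 else "decrease tile size"


def propose_edits(tile: int, feedback: str) -> List[int]:
    if feedback.startswith("increase"):
        return [tile * 2, tile + 1]
    return [tile // 2, tile - 1]


def predict_reward(tile: int) -> float:
    # Deliberately imperfect proxy for the true runtime.
    return -abs(tile - 30)


best_tile, best_speedup = max_reward_search(
    4, propose_edits, measure_runtime, critique, predict_reward
)
print(best_tile, best_speedup)
```

In this toy run the search reaches tile size 32 in the third round, then wanders to 31 and back. The returned result is still the best candidate seen, which is the behavior a max-reward objective is meant to capture.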

Topics

coding, rl, optimization
