Post Training

Research papers, repositories, and articles about post-training

Co-Evolving Policy Distillation

Unifies two popular post-training styles and shows why naively merging many expert policies can cause the merged model to lose capabilities. Proposes a bidirectional distillation loop in which the student and the experts improve together. If you juggle multiple specialist models, this offers a more stable way than naive merging to fold them into one. ([huggingface.co](https://huggingface.co/papers/2604.27083))

Naibin Gu, Chenxu Yang