Small Models

Research papers, repositories, and articles about small models

Stop Rewarding Hallucinated Steps: Faithfulness-Aware Step-Level Reinforcement Learning for Small Reasoning Models

This paper trains small reasoning models with step-level rewards that check whether each intermediate step actually follows from the steps before it. This reduces reward hacking, where the model produces long but logically broken chains of thought.

Shuo Nie, Hexuan Deng
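
To make the idea in the summary concrete, here is a minimal sketch of step-level reward assignment. It is not the paper's implementation: the function names, the reward values, and the toy arithmetic checker are all assumptions made for illustration. A real system would presumably use a much stronger faithfulness check in place of `toy_is_faithful`.

```python
import operator
import re
from typing import Callable, List

# Hypothetical interface: a checker receives the earlier context and the
# current step, and says whether the step is supported by that context.
FaithfulnessCheck = Callable[[List[str], str], bool]

_OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}
_EQUATION = re.compile(r"(\d+)\s*([+\-*])\s*(\d+)\s*=\s*(\d+)")


def toy_is_faithful(context: List[str], step: str) -> bool:
    """Toy stand-in for a faithfulness check (an assumption, not the paper's).

    A step passes only if every equation "a op b = c" in it uses operands
    that already appear in the earlier text and the arithmetic is correct.
    Steps with no equation pass by default.
    """
    known_numbers = set(re.findall(r"\d+", " ".join(context)))
    for a, op, b, c in _EQUATION.findall(step):
        if a not in known_numbers or b not in known_numbers:
            return False  # operand came from nowhere
        if _OPS[op](int(a), int(b)) != int(c):
            return False  # conclusion does not follow
    return True


def step_level_rewards(
    question: str,
    steps: List[str],
    is_faithful: FaithfulnessCheck,
    bonus: float = 1.0,     # assumed value, chosen for illustration
    penalty: float = -1.0,  # assumed value, chosen for illustration
) -> List[float]:
    """Score each step on its own instead of scoring the whole chain once."""
    rewards = []
    context = [question]
    for step in steps:
        rewards.append(bonus if is_faithful(context, step) else penalty)
        # Simplification: every step joins the context, faithful or not.
        context.append(step)
    return rewards


question = "Tom has 3 apples, buys 2 more, then doubles the total."
steps = ["3 + 2 = 5", "5 * 2 = 11", "7 + 1 = 8"]
rewards = step_level_rewards(question, steps, toy_is_faithful)
print(rewards)
```

Because each step gets its own score, a long chain earns nothing extra for length alone: the second step is penalized for wrong arithmetic and the third for using numbers that never appeared earlier.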