Developmental Approach Reveals the Statistical Learning of Neural Language Models
Trains transformers on a synthetic grammar and snapshots them over time to see how they internalize patterns. Finds that they first pick up global statistics, then refine local rules. If you care about curriculum design or "how models learn," this gives concrete evidence, not just anecdotes. ([arxiv.org](https://arxiv.org/list/cs.CL/new))
Wang Bojun, Holly Jenkins