MoE
Research papers, repositories, and articles about mixture-of-experts (MoE) models
MoEBlaze: Breaking the Memory Wall for Efficient MoE Training on Modern GPUs
MoEBlaze redesigns mixture-of-experts training to cut activation memory and data movement on GPUs. The authors claim over 4× speedups and 50% memory savings versus existing frameworks, which matters to anyone training larger sparse models.
Jiyuan Zhang, Yining Liu
The Expert Strikes Back: Interpreting Mixture-of-Experts Language Models at Expert Level
Studies how mixture-of-experts language models actually route work between experts. Offers tools to inspect which expert fires and why, instead of treating MoE as a black box.
Jeremy Herbst, Jae Hee Lee