Training

Research papers, repositories, and articles about training

Stronger Normalization-Free Transformers

Introduces Derf, a simple point-wise activation that replaces normalization layers like LayerNorm and RMSNorm while improving generalization across vision, speech, DNA sequence modeling, and GPT-style language models. The authors systematically study properties of point-wise functions, run a large-scale search, and show Derf outperforms prior normalization-free approaches (e.g., Dynamic Tanh) with similar or better stability. ([arxiv.org](https://arxiv.org/abs/2512.10938))

Mingzhi Chen, Taiming Lu

Stronger Normalization-Free Transformers

HF highlights this work for showing that a carefully designed point-wise activation (Derf) can fully replace normalization layers in Transformers and still improve performance across multiple domains. For practitioners, it points toward simpler, potentially faster architectures that skip the per-token mean and variance reductions LayerNorm performs at every layer. ([huggingface.co](https://huggingface.co/papers/2512.10938))

Mingzhi Chen, Taiming Lu
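
The paper defines Derf's exact functional form; as a rough sketch of the general pattern, the snippet below shows a learnable point-wise layer in the spirit of Dynamic Tanh (the prior work mentioned above) that drops in where nn.LayerNorm would go. The erf-based variant and the PointwiseAct name are illustrative assumptions, not the paper's definition.

```python
# Illustrative sketch only: a learnable point-wise layer that replaces
# nn.LayerNorm without computing any mean/variance statistics.
# The erf variant is an assumption suggested by the name "Derf";
# the paper's actual parameterization may differ.
import torch
import torch.nn as nn

class PointwiseAct(nn.Module):
    def __init__(self, dim: int, alpha: float = 1.0, kind: str = "erf"):
        super().__init__()
        self.alpha = nn.Parameter(torch.full((dim,), alpha))  # per-channel slope
        self.gamma = nn.Parameter(torch.ones(dim))             # affine scale
        self.beta = nn.Parameter(torch.zeros(dim))             # affine shift
        self.fn = torch.erf if kind == "erf" else torch.tanh   # tanh = Dynamic Tanh

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # purely element-wise: no reductions over tokens, channels, or batch
        return self.gamma * self.fn(self.alpha * x) + self.beta

# Usage: swap `nn.LayerNorm(d_model)` in a Transformer block for
# `PointwiseAct(d_model)` and train as usual.
```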

Reflective Preference Optimization (RPO): Enhancing On-Policy Alignment via Hint-Guided Reflection

Builds on Direct Preference Optimization but tackles its weak learning signal when both preferred and rejected responses share similar flaws. RPO adds a hint-guided reflection step that encourages the model to produce more contrastive, informative preference pairs before optimizing them. The result is a more stable and data-efficient on-policy alignment pipeline that still avoids full RLHF/RLAIF complexity.

Zihui Zhao, Zechang Li
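
RPO's hint-guided reflection and final objective are specified in the paper; the sketch below only shows the vanilla DPO loss the method builds on, to make clear where a sharper, reflection-generated (chosen, rejected) pair would plug in. The tensors are toy values for a quick shape check.

```python
# Baseline DPO loss that RPO builds on; the reflection step would supply
# log-probs of a hint-guided, regenerated chosen response as `logp_w`.
import torch
import torch.nn.functional as F

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """Maximize the policy's (chosen - rejected) log-prob margin relative
    to the frozen reference model."""
    margin = (logp_w - ref_logp_w) - (logp_l - ref_logp_l)
    return -F.logsigmoid(beta * margin).mean()

# Toy per-sequence log-probabilities for a batch of two preference pairs.
logp_w     = torch.tensor([-10.0, -12.0])   # policy,    chosen
logp_l     = torch.tensor([-11.0, -12.5])   # policy,    rejected
ref_logp_w = torch.tensor([-10.5, -12.2])   # reference, chosen
ref_logp_l = torch.tensor([-10.8, -12.4])   # reference, rejected
print(dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l))
```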

QwenLong-L1.5: Post-Training Recipe for Long-Context Reasoning and Memory Management

Describes the QwenLong-L1.5 post-training recipe for extending LLM context windows while keeping reasoning quality intact. The work focuses not just on positional encodings but also on memory management strategies and training curricula that keep long-context performance from collapsing. This is highly relevant for anyone trying to turn a baseline LLM into a stable long-context model without re‑training from scratch.

Weizhou Shen, Ziyi Yang
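
The QwenLong-L1.5 recipe itself is best read from the paper; as general background on the positional-encoding side of long-context post-training, the sketch below shows RoPE position interpolation, a common way to stretch a pretrained context window before long-sequence fine-tuning. It illustrates the problem space, not the paper's method.

```python
# General illustration, not the QwenLong-L1.5 recipe: RoPE position
# interpolation compresses positions so a model can be post-trained on
# sequences longer than its original context window.
import torch

def rope_angles(seq_len: int, head_dim: int, base: float = 10000.0,
                scale: float = 1.0) -> torch.Tensor:
    """Rotary-embedding angles; scale > 1 maps long sequences back into the
    angle range the model saw during pretraining."""
    inv_freq = 1.0 / (base ** (torch.arange(0, head_dim, 2).float() / head_dim))
    pos = torch.arange(seq_len).float() / scale      # position interpolation
    return torch.outer(pos, inv_freq)                # (seq_len, head_dim // 2)

# e.g. stretch a 32k-context model toward 128k by interpolating positions 4x,
# then post-train on long documents so reasoning quality does not collapse.
angles = rope_angles(seq_len=131072, head_dim=128, scale=4.0)
```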

nanoGPT

Karpathy’s minimalist GPT training repo continues to trend, reflecting ongoing interest in from-scratch pretraining and fine-tuning for medium-sized LLMs. Still one of the best learning references if you want to understand the guts of GPT-style models. ([github.com](https://github.com/trending?since=daily))

51,051
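
nanoGPT's model.py is the reference implementation; the snippet below is an independent, minimal sketch of the causal self-attention at the core of GPT-style blocks, written against current PyTorch (the fused scaled_dot_product_attention call assumes PyTorch 2.x).

```python
# Independent minimal sketch of causal self-attention, the core of a
# GPT-style block; see nanoGPT's model.py for the full reference.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalSelfAttention(nn.Module):
    def __init__(self, d_model: int, n_head: int):
        super().__init__()
        assert d_model % n_head == 0
        self.n_head = n_head
        self.qkv = nn.Linear(d_model, 3 * d_model)   # fused Q, K, V projection
        self.proj = nn.Linear(d_model, d_model)      # output projection

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, T, C = x.shape
        q, k, v = self.qkv(x).split(C, dim=2)
        # reshape to (B, n_head, T, head_dim)
        q, k, v = (t.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
                   for t in (q, k, v))
        # fused attention with a causal mask (each token attends to its past)
        y = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        return self.proj(y.transpose(1, 2).reshape(B, T, C))

# y = CausalSelfAttention(d_model=256, n_head=8)(torch.randn(2, 16, 256))
```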

GPT-SoVITS

GPT-SoVITS is a hugely popular WebUI and pipeline for few-shot TTS and voice conversion, enabling convincing voice cloning from as little as a 5-second reference clip (zero-shot) or about 1 minute of training audio (few-shot), plus dataset prep tools (separation, ASR, labeling) and multi-lingual support (EN/JA/KO/ZH/Cantonese). If you’re experimenting with custom voices, VTuber-style content, or rapid TTS prototyping on consumer GPUs, this is effectively the community standard toolkit. ([github.com](https://github.com/RVC-Boss/GPT-SoVITS?utm_source=openai))

53,110

GeoDM: Geometry-aware Distribution Matching for Dataset Distillation

Proposes GeoDM, a dataset distillation framework that performs distribution matching in a product space of Euclidean, hyperbolic, and spherical manifolds, with learnable curvature and weights. This geometry-aware approach yields lower generalization error bounds and consistently outperforms prior distillation methods by better aligning synthetic and real-data manifolds. ([arxiv.org](https://arxiv.org/abs/2512.08317?utm_source=openai))

Xuhui Li, Zhengquan Luo
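
GeoDM's actual manifold projections, curvature learning, and matching objective are defined in the paper; the sketch below only illustrates the general shape of distribution matching with learnable weights over several geometric components, using simple feature-mean matching as a stand-in.

```python
# Rough illustration of weighted distribution matching across multiple
# geometric components; GeoDM's real projections, learnable curvature, and
# loss are specified in the paper, not here.
import torch
import torch.nn as nn

class WeightedDistributionMatching(nn.Module):
    def __init__(self, n_spaces: int = 3):
        super().__init__()
        # learnable mixing weights over Euclidean / hyperbolic / spherical terms
        self.logits = nn.Parameter(torch.zeros(n_spaces))

    def forward(self, real_feats, syn_feats):
        """real_feats, syn_feats: lists of (batch, dim) features, one per
        geometric component. Match per-component feature means."""
        w = torch.softmax(self.logits, dim=0)
        losses = torch.stack([
            (r.mean(0) - s.mean(0)).pow(2).sum()
            for r, s in zip(real_feats, syn_feats)
        ])
        return (w * losses).sum()
```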

tinker-cookbook

tinker-cookbook provides practical, end‑to‑end examples of post‑training LLMs using Tinker, a managed fine‑tuning API from Thinking Machines Lab that handles distributed training while you control the algorithms and data. The repo includes recipes for instruction tuning, math reasoning, RLHF-style preference learning, tool use, prompt distillation, and multi-agent setups, making it a strong starting point if you want to fine‑tune open-weight models like Llama or Qwen without building your own training stack. ([github.com](https://github.com/thinking-machines-lab/tinker-cookbook?utm_source=openai))

2,434

geoai

geoai is a Python package from the opengeos ecosystem that integrates deep-learning frameworks (PyTorch, Transformers, segmentation models) with geospatial tooling to handle everything from remote-sensing data download and tiling to training, inference, and interactive map visualization. It’s aimed at practitioners who want a higher-level, batteries-included stack for tasks like land-cover classification, building footprint extraction, and change detection, without reinventing all the GIS + ML plumbing. ([github.com](https://github.com/opengeos/geoai?utm_source=openai))

2,116