Finetuning
Research papers, repositories, and articles about finetuning
T-pro 2.0: An Efficient Russian Hybrid-Reasoning Model and Playground
T-pro 2.0 is an open-weight Russian large language model focused on hybrid reasoning: it can answer directly or emit explicit reasoning traces, and it’s optimized for low-latency inference via speculative decoding. Alongside the model, the authors release a Russian instruction corpus, a math benchmark, and an EAGLE-based inference stack, making it a practical foundation for Russian-language reasoning applications.
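A minimal usage sketch of the two modes, assuming T-pro 2.0 ships a Qwen3-style chat template whose `enable_thinking` flag toggles reasoning traces on and off (the model id and flag name are assumptions; check the model card for the actual interface):

```python
# Hedged sketch: toggling hybrid reasoning via the chat template, assuming a
# Qwen3-style template with an `enable_thinking` switch. Model id is assumed.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "t-tech/T-pro-it-2.0"  # assumed Hugging Face id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "Сколько будет 17 * 23?"}]

# enable_thinking=True requests an explicit reasoning trace before the answer;
# False requests a direct answer (the low-latency path).
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:]))
```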
An Introduction to Large Language Models: Prompt Engineering and P-Tuning
This introductory post explains what LLMs are and why they’re powerful, then walks through practical prompt‑engineering patterns (zero‑shot, few‑shot, chain‑of‑thought) and P‑tuning as a lightweight way to specialize models for particular tasks. Developers new to LLMs get concrete examples of how to structure prompts and when to switch from prompting to parameter‑efficient tuning, along with intuition about the trade‑offs in scale and data. ([developer.nvidia.com](https://developer.nvidia.com/blog/an-introduction-to-large-language-models-prompt-engineering-and-p-tuning/))
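As a quick illustration of the prompting patterns the post covers, here is a minimal sketch of zero-shot, few-shot, and chain-of-thought prompt construction (the task and wording are illustrative, not the post's own examples):

```python
# Illustrative prompt patterns: zero-shot, few-shot, and chain-of-thought.
# Plain string templates -- send them through whichever LLM client you use.
def zero_shot(review: str) -> str:
    return (
        "Classify the sentiment of this review as positive or negative.\n"
        f"Review: {review}\nSentiment:"
    )

def few_shot(review: str) -> str:
    # A few labeled examples show the model the expected format and labels.
    examples = (
        "Review: The battery lasts all day.\nSentiment: positive\n"
        "Review: It broke after a week.\nSentiment: negative\n"
    )
    return f"Classify the sentiment of each review.\n{examples}Review: {review}\nSentiment:"

def chain_of_thought(question: str) -> str:
    # Asking for intermediate steps often improves multi-step reasoning.
    return f"{question}\nLet's think step by step."

print(few_shot("Setup was painless and support was great."))
```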
tinker-cookbook
tinker-cookbook provides practical, end‑to‑end examples of post‑training LLMs using Tinker, a managed fine‑tuning API from Thinking Machines Lab that handles distributed training while you control the algorithms and data. The repo includes recipes for instruction tuning, math reasoning, RLHF-style preference learning, tool use, prompt distillation, and multi-agent setups, making it a strong starting point if you want to fine‑tune open-weight models like Llama or Qwen without building your own training stack. ([github.com](https://github.com/thinking-machines-lab/tinker-cookbook))
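A loudly hedged sketch of what a supervised fine-tuning loop looks like against Tinker's publicly described primitives (`forward_backward`, `optim_step`, `sample`, `save_state`); the client names, signatures, and data helper below are all assumptions, so consult the cookbook for the real recipes:

```python
# Hedged sketch of a supervised fine-tuning step via Tinker. Every name and
# signature below is an assumption based on the described primitives
# (forward_backward, optim_step, sample, save_state) -- see tinker-cookbook.
import tinker  # Tinker's Python client; training itself runs remotely

def make_batches():
    # Hypothetical stand-in for the cookbook's dataset builders.
    yield [{"prompt": "2+2=", "completion": "4"}]

service = tinker.ServiceClient()
trainer = service.create_lora_training_client(base_model="Qwen/Qwen3-8B")  # assumed id

for batch in make_batches():
    trainer.forward_backward(batch, loss_fn="cross_entropy")  # accumulate gradients
    trainer.optim_step()  # one optimizer update; distribution handled by the service
```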
Omni-Attribute: Open-vocabulary Attribute Encoder for Visual Concept Personalization
Omni-Attribute is an open-vocabulary attribute encoder that learns to isolate specific visual factors—like style, lighting, or expression—rather than entangling everything into a single holistic embedding. Using curated positive/negative pairs and a dual generative/contrastive objective, it produces attribute-specific embeddings that are better for retrieval, personalization, and compositional image generation.
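The contrastive half of such a dual objective is easy to picture; a minimal sketch, assuming in-batch negatives and an InfoNCE-style loss over attribute-specific embeddings (shapes and loss form are illustrative, not the paper's exact formulation):

```python
# Minimal sketch of the contrastive objective: pull together embeddings of
# image pairs that share an attribute (e.g. the same lighting) and push apart
# pairs that differ. Illustrative, not the paper's exact loss.
import torch
import torch.nn.functional as F

def attribute_info_nce(anchor: torch.Tensor, positive: torch.Tensor,
                       temperature: float = 0.07) -> torch.Tensor:
    """anchor, positive: (B, D) attribute embeddings. Row i of `positive`
    shares the target attribute with row i of `anchor`; all other rows in
    the batch act as negatives."""
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    logits = anchor @ positive.T / temperature                  # (B, B) similarities
    labels = torch.arange(anchor.size(0), device=anchor.device)  # positives on diagonal
    return F.cross_entropy(logits, labels)

loss = attribute_info_nce(torch.randn(8, 256), torch.randn(8, 256))
```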
ReViSE: Towards Reason-Informed Video Editing in Unified Models with Self-Reflective Learning
ReViSE defines a new Reason-Informed Video Editing task and benchmark, then introduces a unified video model that edits while continuously self-evaluating its own reasoning. A built-in VLM judges whether the edited video logically satisfies the instruction, providing self-reflective feedback that tightens the link between "understanding" and actual visual edits.
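Conceptually, the self-reflective loop looks like the sketch below; `edit_video` and `vlm_judge` are hypothetical placeholders standing in for the unified model and its built-in judge, not ReViSE's actual interface:

```python
# Sketch of a self-reflective edit loop: edit, have a VLM judge whether the
# result logically satisfies the instruction, and retry with its critique.
def edit_video(video, instruction, feedback):
    """Hypothetical editor call; stands in for the unified video model."""
    raise NotImplementedError

def vlm_judge(edited_video, instruction):
    """Hypothetical judge; returns ('satisfied' | 'unsatisfied', critique)."""
    raise NotImplementedError

def self_reflective_edit(video, instruction, max_rounds=3):
    feedback, edited = "", video
    for _ in range(max_rounds):
        edited = edit_video(video, instruction, feedback)
        verdict, critique = vlm_judge(edited, instruction)
        if verdict == "satisfied":
            return edited
        feedback = critique  # the judge's reasoning steers the next edit
    return edited
```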
RO-ViT: Region-aware pre-training for open-vocabulary object detection with vision transformers
RO‑ViT proposes a region-aware pretraining scheme for vision transformers that uses cropped positional embeddings and focal loss to better align image–text pretraining with region-level object detection. Developers building open‑vocabulary detectors can reuse these ideas—plus the released code—to boost novel‑class detection without changing model capacity, especially when fine‑tuning ViT backbones on detection datasets. ([ai.googleblog.com](https://ai.googleblog.com/2023/08/ro-vit-region-aware-pre-training-for.html))
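The core trick, cropped positional embeddings, is simple to sketch: upsample the ViT's positional table to a larger grid, take a random crop, and resize it back so each whole pretraining image is treated like a region crop of a bigger scene. A minimal PyTorch sketch with illustrative grid sizes (not the paper's released code):

```python
# Sketch of RO-ViT-style cropped positional embeddings (CPE). Grid sizes and
# crop ranges are illustrative choices, not the paper's exact settings.
import torch
import torch.nn.functional as F

def cropped_pos_embed(pos: torch.Tensor, full: int = 64) -> torch.Tensor:
    """pos: (1, H*W, D) positional embeddings for a square HxW token grid."""
    n, d = pos.shape[1], pos.shape[2]
    side = int(n ** 0.5)
    grid = pos.reshape(1, side, side, d).permute(0, 3, 1, 2)     # (1, D, H, W)
    grid = F.interpolate(grid, size=(full, full), mode="bilinear")
    # Random crop of the upsampled grid, no smaller than the original.
    ch = torch.randint(side, full + 1, (1,)).item()
    cw = torch.randint(side, full + 1, (1,)).item()
    top = torch.randint(0, full - ch + 1, (1,)).item()
    left = torch.randint(0, full - cw + 1, (1,)).item()
    crop = grid[:, :, top:top + ch, left:left + cw]
    crop = F.interpolate(crop, size=(side, side), mode="bilinear")
    return crop.permute(0, 2, 3, 1).reshape(1, n, d)

new_pos = cropped_pos_embed(torch.randn(1, 14 * 14, 768))
```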