
Finetuning

Research papers, repositories, and articles about finetuning


T-pro 2.0: An Efficient Russian Hybrid-Reasoning Model and Playground

T-pro 2.0 is an open-weight Russian large language model focused on hybrid reasoning: it can answer directly or emit explicit reasoning traces, and it’s optimized for low-latency inference via speculative decoding. Alongside the model, the authors release a Russian instruction corpus, a math benchmark, and an EAGLE-based inference stack, making it a practical foundation for Russian-language reasoning applications.

Dmitrii Stoianov, Danil Taranets
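
A minimal sketch of driving the hybrid-reasoning switch from Python, assuming a Qwen3-style chat template with an `enable_thinking` flag; the model ID and flag name are assumptions for illustration, not confirmed details of the release:

```python
# Hedged sketch: the model ID and the enable_thinking flag are assumptions
# (a Qwen3-style template), not confirmed details of the T-pro 2.0 release.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "t-tech/T-pro-it-2.0"  # assumed Hugging Face ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "Сколько будет 17 * 23?"}]

# Direct-answer mode: suppress the explicit reasoning trace.
direct = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True,
    enable_thinking=False,  # assumed Qwen3-style switch
)
# Reasoning mode: allow a <think>...</think> trace before the answer.
reasoned = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True,
    enable_thinking=True,
)

inputs = tokenizer(direct, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```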

An Introduction to Large Language Models: Prompt Engineering and P-Tuning

This introductory post explains what LLMs are and why they’re powerful, then walks through practical prompt‑engineering patterns (zero‑shot, few‑shot, chain‑of‑thought) and P‑tuning as a lightweight way to specialize models for particular tasks. Developers new to LLMs get concrete examples of how to structure prompts and when to switch from prompting to parameter‑efficient tuning, along with intuition about the trade‑offs in scale and data. ([developer.nvidia.com](https://developer.nvidia.com/blog/an-introduction-to-large-language-models-prompt-engineering-and-p-tuning/))

NVIDIA AI
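
The three prompting patterns the post covers are easiest to see side by side. A minimal illustration; the commented-out `llm_complete` call is a hypothetical stand-in for whatever completion API you use:

```python
# Zero-shot: state the task directly, with no examples.
zero_shot = """Classify the sentiment of this review as positive or negative.
Review: The battery dies in an hour.
Sentiment:"""

# Few-shot: prepend a handful of solved examples so the model infers the format.
few_shot = """Classify the sentiment of each review as positive or negative.
Review: Great screen, fast shipping. -> positive
Review: Arrived broken and support ignored me. -> negative
Review: The battery dies in an hour. ->"""

# Chain-of-thought: show intermediate reasoning in the exemplar and elicit it.
chain_of_thought = """Q: A store sells pens in packs of 12. I need 30 pens. How many packs must I buy?
A: Let's think step by step. 30 / 12 = 2.5, and packs are whole, so I need 3 packs. The answer is 3.
Q: Tickets cost $7 each. How much do 9 tickets cost?
A: Let's think step by step."""

# for prompt in (zero_shot, few_shot, chain_of_thought):
#     print(llm_complete(prompt))  # hypothetical completion call
```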

tinker-cookbook

tinker-cookbook provides practical, end‑to‑end examples of post‑training LLMs using Tinker, a managed fine‑tuning API from Thinking Machines Lab that handles distributed training while you control the algorithms and data. The repo includes recipes for instruction tuning, math reasoning, RLHF-style preference learning, tool use, prompt distillation, and multi-agent setups, making it a strong starting point if you want to fine‑tune open-weight models like Llama or Qwen without building your own training stack. ([github.com](https://github.com/thinking-machines-lab/tinker-cookbook))

2,434 stars
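
For contrast with the managed approach, here is the kind of minimal self-hosted instruction-tuning recipe the cookbook's examples replace. This is not the Tinker API; it uses Hugging Face TRL instead, and the model and dataset choices are illustrative:

```python
# Minimal supervised fine-tuning (SFT) with Hugging Face TRL -- the
# self-hosted counterpart to a managed post-training recipe.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Example chat-format dataset used in TRL's own documentation.
dataset = load_dataset("trl-lib/Capybara", split="train")

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-0.5B-Instruct",  # small open-weight model for illustration
    train_dataset=dataset,
    args=SFTConfig(output_dir="qwen-sft"),
)
trainer.train()
```

With a managed API like Tinker, the distributed-training half of this loop is handled for you; the cookbook's recipes focus on the data and algorithm choices instead.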

Omni-Attribute: Open-vocabulary Attribute Encoder for Visual Concept Personalization

Omni-Attribute is an open-vocabulary attribute encoder that learns to isolate specific visual factors—like style, lighting, or expression—rather than entangling everything into a single holistic embedding. Using curated positive/negative pairs and a dual generative/contrastive objective, it produces attribute-specific embeddings that are better for retrieval, personalization, and compositional image generation.

Tsai-Shien Chen, Aliaksandr Siarohin
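
The contrastive half of the dual objective can be sketched generically as an InfoNCE loss over attribute-specific embeddings from curated positive pairs. This is a generic sketch, not the paper's exact loss; shapes and the temperature are assumptions:

```python
# Generic InfoNCE-style contrastive loss over attribute embeddings:
# matched pairs share an attribute (e.g. lighting), other batch items
# serve as negatives. Illustrative only, not the paper's formulation.
import torch
import torch.nn.functional as F

def attribute_contrastive_loss(anchor, positive, temperature=0.07):
    """anchor, positive: (batch, dim) embeddings of the SAME attribute
    extracted from two different images."""
    a = F.normalize(anchor, dim=-1)
    p = F.normalize(positive, dim=-1)
    logits = a @ p.t() / temperature           # (batch, batch) similarity matrix
    targets = torch.arange(a.size(0), device=a.device)
    return F.cross_entropy(logits, targets)    # pull matched pairs together

# Example: 8 pairs of 512-d "lighting" embeddings.
loss = attribute_contrastive_loss(torch.randn(8, 512), torch.randn(8, 512))
```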

ReViSE: Towards Reason-Informed Video Editing in Unified Models with Self-Reflective Learning

ReViSE defines a new Reason-Informed Video Editing task and benchmark, then introduces a unified video model that edits while continuously self-evaluating its own reasoning. A built-in VLM judges whether the edited video logically satisfies the instruction, providing self-reflective feedback that tightens the link between "understanding" and actual visual edits.

Xinyu Liu, Hangjie Yuan
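
A schematic of the self-reflective edit-judge loop the summary describes; the `Verdict` type and the editor/judge callables are hypothetical placeholders, not the paper's actual interfaces:

```python
# Schematic only: names and signatures are illustrative assumptions.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Verdict:
    satisfied: bool  # judge's decision: does the edit logically satisfy the instruction?
    critique: str    # natural-language feedback for the next attempt

def reason_informed_edit(
    video,
    instruction: str,
    edit: Callable,   # (video, instruction, feedback) -> edited video
    judge: Callable,  # (edited, instruction) -> Verdict (a VLM acting as critic)
    max_rounds: int = 3,
):
    edited = edit(video, instruction, None)
    for _ in range(max_rounds):
        verdict = judge(edited, instruction)
        if verdict.satisfied:
            break
        # Self-reflection: condition the next edit on the judge's critique.
        edited = edit(video, instruction, verdict.critique)
    return edited
```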

RO-ViT: Region-aware pre-training for open-vocabulary object detection with vision transformers

RO‑ViT proposes a region-aware pretraining scheme for vision transformers that uses cropped positional embeddings and focal loss to better align image–text pretraining with region-level object detection. Developers building open‑vocabulary detectors can reuse these ideas—plus the released code—to boost novel‑class detection without changing model capacity, especially when fine‑tuning ViT backbones on detection datasets. ([ai.googleblog.com](https://ai.googleblog.com/2023/08/ro-vit-region-aware-pre-training-for.html))

Google AI Blog
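
The cropped positional embedding (CPE) idea can be sketched in a few lines: during image-text pretraining, substitute a randomly cropped and resized region of the positional-embedding grid, so whole images are treated like region crops of larger scenes. The grid size and crop range below are illustrative assumptions:

```python
# Sketch of RO-ViT-style cropped positional embeddings (CPE); sizes and
# sampling range are illustrative, not the paper's exact settings.
import torch
import torch.nn.functional as F

def cropped_pos_embed(pos_embed, grid=14, min_scale=0.1):
    """pos_embed: (1, grid*grid, dim) learned positional embeddings."""
    dim = pos_embed.shape[-1]
    grid2d = pos_embed.reshape(1, grid, grid, dim).permute(0, 3, 1, 2)  # (1, dim, H, W)
    # Sample a random crop box on the embedding grid.
    scale = torch.empty(1).uniform_(min_scale, 1.0).item()
    size = max(1, int(grid * scale))
    y = torch.randint(0, grid - size + 1, (1,)).item()
    x = torch.randint(0, grid - size + 1, (1,)).item()
    crop = grid2d[:, :, y:y + size, x:x + size]
    # Resize the crop back to the full grid and flatten for the ViT.
    up = F.interpolate(crop, size=(grid, grid), mode="bilinear", align_corners=False)
    return up.permute(0, 2, 3, 1).reshape(1, grid * grid, dim)

pe = torch.randn(1, 14 * 14, 768)
pe_cropped = cropped_pos_embed(pe)  # substitute during image-text pretraining
```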