Synthetic Data
Research papers, repositories, and articles about synthetic data
Synthetic Computers at Scale for Long-Horizon Productivity Simulation
Builds thousands of synthetic "computers" with realistic files and calendars to simulate month-long knowledge work for AI agents. Each run spans 8+ hours and ~2,000 steps, yielding dense signals for training long-horizon productivity agents. If you are designing office copilots or agent training curricula, this setup is a template for generating rich experience data cheaply. ([arxiv.org](https://arxiv.org/abs/2604.28181))
Revisiting Vehicle Color Recognition in Long-Tailed Surveillance Scenarios
This work studies vehicle color recognition when rare colors are heavily underrepresented in surveillance data. It combines synthetic images from diffusion models and commercial image editors with tailored training techniques to lift macro accuracy by over eight points on a real-world dataset.
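To see why macro accuracy is the metric that matters in a long-tailed setting, here is a minimal sketch. It assumes "macro accuracy" means the unweighted mean of per-class accuracy, which is the common definition but is not confirmed by the blurb above. The function names and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def overall_accuracy(y_true, y_pred):
    """Fraction of all samples predicted correctly (dominated by common classes)."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_accuracy(y_true, y_pred):
    """Unweighted mean of per-class accuracy: each class counts equally, however rare.

    Assumption: this is the usual definition of macro accuracy; the paper may
    define it differently.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    return sum(correct[c] / total[c] for c in total) / len(total)


# Hypothetical toy data: 8 white vehicles (common) and 2 purple vehicles (rare).
y_true = ["white"] * 8 + ["purple"] * 2
# A classifier that ignores the rare color and always predicts "white".
y_pred = ["white"] * 10

overall = overall_accuracy(y_true, y_pred)
macro = macro_accuracy(y_true, y_pred)
print(f"overall={overall:.2f} macro={macro:.2f}")
```

In this toy case the classifier never recognizes the rare color, yet overall accuracy still looks respectable, while macro accuracy drops sharply. That gap is why adding synthetic examples of rare colors tends to show up as a macro accuracy gain.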