Image
Research papers, repositories, and articles about image generation, editing, and upscaling
Image Diffusion Preview with Consistency Solver
From DeepMind, this work uses consistency-based solvers to let users preview diffusion model outputs far faster than running a full sampling schedule. The idea is to generate rough but faithful previews that guide prompt iteration and editing, then refine on demand. It’s another example of how inference-side techniques, not just bigger models, are improving the practical usability of image generation.
MAOAM: Unified Object and Material Selection with Vision-Language Models
MAOAM uses a single vision-language model to select both objects and materials in an image, driven by either a text prompt or a click. That enables more precise, flexible photo and video editing than simple mask-based selection allows. If you build creative tools, this points to where AI-powered selection is heading.
NanoFLUX: Distillation-Driven Compression of Large Text-to-Image Generation Models for Mobile Devices
NanoFLUX distills large text-to-image generators into much smaller models that still follow prompts well when running on phones. Its distillation losses are designed to preserve visual quality while sharply reducing memory and compute.
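For readers new to distillation, here is a minimal toy sketch of the general idea: a student is trained to match a frozen teacher's outputs. It is not NanoFLUX's method, and the summary above does not specify its losses. The `teacher` function, the two-parameter student, and the squared-error objective are all illustrative assumptions.

```python
import random


def teacher(x: float) -> float:
    """Hypothetical stand-in for a large, frozen model (here a fixed linear map)."""
    return 3.0 * x + 1.0


def distill(steps: int = 2000, lr: float = 0.05, seed: int = 0) -> tuple[float, float]:
    """Fit a tiny student (w * x + b) to the teacher's outputs with SGD."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0  # student parameters, initialised at zero
    for _ in range(steps):
        x = rng.uniform(-1.0, 1.0)       # sample an input
        err = (w * x + b) - teacher(x)   # student output minus teacher output
        # Gradient step on the output-matching loss 0.5 * err**2
        w -= lr * err * x
        b -= lr * err
    return w, b


if __name__ == "__main__":
    w, b = distill()
    print(f"student: w={w:.4f}, b={b:.4f}")
```

In a real compression setting the student has far less capacity than the teacher, so the choice of loss (what to match, and where) largely determines how much quality survives.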
HairPort: In-context 3D-aware Hair Import and Transfer for Images
HairPort separates hair removal from transfer using a LoRA-adapted image model that first creates realistic bald faces. It then reconstructs the source hairstyle in 3D and re-renders it from the target viewpoint, enabling robust hair transfer under large pose and scale changes.
upscayl/upscayl
A cross‑platform image upscaler that uses open models to sharpen low-res photos. It’s a simple way to add high-quality upscaling to creative pipelines.