Image

Research papers, repositories, and articles about image generation and editing

Image Diffusion Preview with Consistency Solver

From DeepMind, this work uses consistency-based solvers to let users preview diffusion model outputs much faster than running a full sampling schedule. The idea is to generate rough but faithful previews that guide prompt iteration and editing, then refine on demand. It’s another example of how inference-side techniques, not just bigger models, are improving the practical usability of image generation.

Fu-Yun Wang, Hao Zhou

MAOAM: Unified Object and Material Selection with Vision-Language Models

MAOAM uses a single vision-language model to select both objects and materials in images from text or clicks. That enables more precise, flexible photo and video editing than today’s simple masks. If you build creative tools, this points to where AI-powered selection is heading.

Jaden Park, Valentin Deschaintre

NanoFLUX: Distillation-Driven Compression of Large Text-to-Image Generation Models for Mobile Devices

NanoFLUX distills large text-to-image generators into much smaller models that still follow prompts well on mobile devices. It uses tailored loss functions during distillation to preserve visual quality while cutting memory and compute.

Ruchika Chavhan, Malcolm Chadwick

HairPort: In-context 3D-aware Hair Import and Transfer for Images

HairPort separates hair removal from transfer using a LoRA-adapted image model that first creates realistic bald faces. It then reconstructs the source hairstyle in 3D and re-renders it from the target viewpoint, enabling robust hair transfer under large pose and scale changes.

Alireza Heidari, Amirhossein Alimohammadi

upscayl/upscayl

A cross-platform image upscaler that uses open models to sharpen low-resolution photos. It’s a simple way to add high-quality upscaling to creative pipelines.

43,133 stars