Inference
Research papers, repositories, and articles about inference
stable-diffusion-webui
stable-diffusion-webui by AUTOMATIC1111 is the de facto standard local web interface for Stable Diffusion. It runs on consumer GPUs and covers txt2img, img2img, inpainting/outpainting, upscaling, LoRA/embeddings support, and training utilities, with a large extension ecosystem on top. If you’re doing any kind of image generation or fine-tuning with Stable Diffusion in a local or lab environment, this is usually the first tool people reach for and the one most community workflows target. ([github.com](https://github.com/AUTOMATIC1111/stable-diffusion-webui?utm_source=openai))
T-pro 2.0: An Efficient Russian Hybrid-Reasoning Model and Playground
T-pro 2.0 is an open-weight Russian large language model focused on hybrid reasoning: it can answer directly or emit explicit reasoning traces, and it’s optimized for low-latency inference via speculative decoding. Alongside the model, the authors release a Russian instruction corpus, a math benchmark, and an EAGLE-based inference stack, making it a practical foundation for Russian-language reasoning applications.
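To make the speculative-decoding idea concrete, here is a minimal sketch of the greedy variant: a cheap draft model proposes a few tokens, the target model checks them, and the longest agreeing prefix is kept. This is not T-pro 2.0's EAGLE-based stack. The `target` and `draft` functions are hypothetical toy stand-ins for real models, and a real system would verify all proposed positions in one batched forward pass and typically use a sampling-based acceptance rule.

```python
from typing import Callable, List

NextToken = Callable[[List[int]], int]  # greedy next-token function (assumed interface)


def target(ctx: List[int]) -> int:
    """Toy 'large' model: a fixed arithmetic rule on the last token."""
    return (ctx[-1] * 3 + 1) % 7


def draft(ctx: List[int]) -> int:
    """Toy 'small' model: agrees with target except after token 5."""
    return 0 if ctx[-1] == 5 else target(ctx)


def greedy(model: NextToken, prompt: List[int], max_new: int) -> List[int]:
    """Plain one-token-at-a-time greedy decoding, used as the reference."""
    out = list(prompt)
    for _ in range(max_new):
        out.append(model(out))
    return out


def speculative_greedy(target: NextToken, draft: NextToken,
                       prompt: List[int], max_new: int, k: int = 4) -> List[int]:
    """Draft proposes up to k tokens; target keeps the longest matching prefix."""
    out = list(prompt)
    goal = len(prompt) + max_new
    while len(out) < goal:
        # 1. The draft model proposes a short continuation.
        proposal: List[int] = []
        for _ in range(min(k, goal - len(out))):
            proposal.append(draft(out + proposal))
        # 2. The target model checks each proposed position in order.
        for i, tok in enumerate(proposal):
            expected = target(out + proposal[:i])
            if expected != tok:
                # Keep the agreeing prefix, then the target's own token.
                out.extend(proposal[:i])
                out.append(expected)
                break
        else:
            # Every proposed token matched, so accept the whole proposal.
            out.extend(proposal)
    return out
```

With greedy verification the output is identical to decoding with the target model alone; the draft model only changes how many target steps can be checked together, which is where the latency saving comes from.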
ReFusion: A Diffusion Large Language Model with Parallel Autoregressive Decoding
ReFusion is a masked diffusion model for text that decodes in parallel over contiguous ‘slots’ instead of individual tokens. By combining diffusion-based planning with autoregressive infilling, it recovers much of the quality of strong autoregressive LLMs while substantially speeding up generation and allowing KV-cache reuse. This is one of the more serious attempts to rethink LLM decoding beyond the usual left-to-right paradigm.
The new ChatGPT Images is here
OpenAI announces a new image generation model powering ChatGPT’s ‘Images’ experience, with a focus on more precise edits, better consistency across parts of an image, and finer control over style. The post walks through examples like detailed object editing and iterative refinement inside the chat UI, positioning images as a first-class modality alongside text and code. For developers, it signals that OpenAI’s flagship image stack is now accessible through a polished, user-facing product interface.
Image Diffusion Preview with Consistency Solver
From DeepMind, this work uses consistency-based solvers to let users preview diffusion model outputs much more quickly than running a full sampling schedule. The idea is to generate rough-but-faithful previews that can guide prompt iteration and editing, then refine on demand. It’s another example of how inference-side tricks—not just bigger models—are improving practical usability of image generation.
daytona
Daytona is a secure, elastic runtime for executing AI-generated code and agent workflows in isolated sandboxes, with Python and TypeScript SDKs that spin up environments in under 100 ms and run arbitrary code, processes, or dev tools. It’s increasingly used as an “agent runtime” layer by teams that need safe, persistent, and massively parallel sandboxes (including LangChain’s open-source coding agent) instead of gluing together ad hoc Docker or VM setups. ([github.com](https://github.com/daytonaio/daytona?utm_source=openai))
GPT-SoVITS
GPT-SoVITS is a widely used WebUI and pipeline for few-shot TTS and voice conversion. It enables convincing voice cloning from as little as 5 seconds to 1 minute of audio, and includes dataset prep tools (separation, ASR, labeling) and multilingual support (EN/JA/KO/ZH/Cantonese). If you’re experimenting with custom voices, VTuber-style content, or rapid TTS prototyping on consumer GPUs, this is effectively the community standard toolkit. ([github.com](https://github.com/RVC-Boss/GPT-SoVITS?utm_source=openai))
BEAVER: An Efficient Deterministic LLM Verifier
BEAVER is a deterministic verifier for large language models that computes tight, provably sound bounds on the probability that a model satisfies a given semantic constraint. Instead of sampling and hoping for the best, it systematically explores the token space with specialized data structures, yielding much sharper risk estimates for correctness, privacy, and security-critical applications.
geoai
geoai is a Python package from the opengeos ecosystem that integrates deep-learning frameworks (PyTorch, Transformers, segmentation models) with geospatial tooling to handle everything from remote-sensing data download and tiling to training, inference, and interactive map visualization. It’s aimed at practitioners who want a higher-level, batteries-included stack for tasks like land-cover classification, building footprint extraction, and change detection, without reinventing all the GIS + ML plumbing. ([github.com](https://github.com/opengeos/geoai?utm_source=openai))