On Device

Research papers, repositories, and articles about on-device AI


ggml-org/llama.cpp

llama.cpp keeps pushing local LLM performance on CPUs and small GPUs. It remains the reference implementation for running large models on modest hardware. If you care about running AI cheaply or on-device, track every major change here.

115,330

ggml-org/whisper.cpp

A fast C/C++ port of OpenAI’s Whisper that runs on laptops, phones, and edge devices. It’s the go-to option when you need offline speech transcription.

46,559

NanoFLUX: Distillation-Driven Compression of Large Text-to-Image Generation Models for Mobile Devices

NanoFLUX distills large text-to-image generators into much smaller models that still follow prompts well on phones. It relies on distillation-driven training objectives to preserve visual quality while cutting memory and compute.

Ruchika Chavhan, Malcolm Chadwick