Running the AI

Research papers, repositories, and articles about running the AI


huggingface/transformers

The standard library for state-of-the-art models across text, vision, audio, and multimodal tasks. If you build with open models, you almost certainly depend on it already.

156,240

Make Your LVLM KV Cache More Lightweight

Targets the KV-cache memory blow-up that vision tokens cause when running large vision–language models (LVLMs). LightKV, a prompt-aware method, merges redundant vision tokens before decoding. If you ship LVLMs, this is a concrete way to cut GPU memory and serving cost without a large drop in quality. ([arxiv.org](https://arxiv.org/list/cs.CV/pastweek?show=100))

Anonymous (ICLR and TMLR drafts; arXiv metadata lists named authors)
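
To make the memory argument in the entry above concrete, here is a rough back-of-the-envelope sketch of how KV-cache size scales with the number of cached tokens. It is not LightKV itself and does not reproduce any figure from the paper. The model configuration (32 layers, 32 KV heads, head dimension 128, fp16) and the token counts before and after merging are hypothetical values chosen only for illustration.

```python
def kv_cache_bytes(num_tokens, num_layers, num_kv_heads, head_dim, bytes_per_value=2):
    """Bytes needed to cache keys and values for num_tokens in one sequence."""
    # Factor of 2: each layer stores one key tensor and one value tensor.
    return 2 * num_layers * num_kv_heads * head_dim * bytes_per_value * num_tokens


# Hypothetical decoder configuration (an assumption, not taken from the paper).
CONFIG = dict(num_layers=32, num_kv_heads=32, head_dim=128, bytes_per_value=2)

vision_tokens = 2048  # hypothetical number of vision tokens before merging
merged_tokens = 1024  # hypothetical number left after merging redundant tokens

before = kv_cache_bytes(vision_tokens, **CONFIG)
after = kv_cache_bytes(merged_tokens, **CONFIG)

print(f"before: {before / 2**20:.0f} MiB, after: {after / 2**20:.0f} MiB")
# prints "before: 1024 MiB, after: 512 MiB"
```

Because the cache grows linearly with token count, any fraction of vision tokens merged away before decoding removes the same fraction of their share of the KV cache, per sequence in the batch.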