Running the AI
Research papers, repositories, and articles about running the AI
huggingface/transformers
The standard library for state-of-the-art text, vision, audio, and multimodal models. If you build with open models, you almost certainly depend on it already.
156,240
Make Your LVLM KV Cache More Lightweight
Targets the KV-cache memory blow-up that vision tokens cause when you run large vision-language models (LVLMs). Its prompt-aware method, LightKV, merges redundant vision tokens before decoding. If you ship LVLMs, this is a concrete way to cut GPU memory and cost without a large hit to output quality. ([arxiv.org](https://arxiv.org/list/cs.CV/pastweek?show=100))
Anonymous (ICLR and TMLR drafts; arXiv metadata lists named authors)
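To make the idea of prompt-aware token merging concrete, here is a minimal sketch in plain Python. It is not LightKV's algorithm, which this summary does not spell out. It assumes a simple stand-in scheme: keep the vision tokens most similar to a prompt embedding, then average each remaining token into its nearest kept token. The names `merge_vision_tokens`, `prompt_vec`, and `keep` are hypothetical and exist only for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def merge_vision_tokens(vision_tokens, prompt_vec, keep):
    """Hypothetical illustration: shrink a list of vision tokens to `keep` tokens.

    Assumed scheme (not taken from the paper):
      1. Rank tokens by cosine similarity to the prompt embedding.
      2. Keep the top `keep` tokens as anchors, in their original order.
      3. Average every other token into its most similar anchor.
    """
    if not 0 < keep <= len(vision_tokens):
        raise ValueError("keep must be between 1 and the number of tokens")

    ranked = sorted(
        range(len(vision_tokens)),
        key=lambda i: cosine(vision_tokens[i], prompt_vec),
        reverse=True,
    )
    anchor_ids = sorted(ranked[:keep])
    anchor_set = set(anchor_ids)

    # Running sums and counts so each anchor becomes the mean of its group.
    sums = {i: list(vision_tokens[i]) for i in anchor_ids}
    counts = {i: 1 for i in anchor_ids}

    for j, token in enumerate(vision_tokens):
        if j in anchor_set:
            continue
        best = max(anchor_ids, key=lambda i: cosine(token, vision_tokens[i]))
        sums[best] = [s + t for s, t in zip(sums[best], token)]
        counts[best] += 1

    return [[s / counts[i] for s in sums[i]] for i in anchor_ids]


# Toy example: 4 two-dimensional "vision tokens" reduced to 2.
tokens = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
prompt = [1.0, 1.0]
merged = merge_vision_tokens(tokens, prompt, keep=2)
print(len(tokens), "->", len(merged))
```

In a real LVLM the merge would apply to the vision-token embeddings (or their cached keys and values) before decoding, so the KV cache holds fewer entries per image. How tokens are scored and grouped is exactly where a method like LightKV would differ from this sketch.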