KV Cache
Research papers, repositories, and articles about KV cache
Information-Aware KV Cache Compression for Long Reasoning
InfoKV combines attention scores with an information-theoretic signal that estimates how much each token affects future predictions. This lets the model evict uninformative tokens while keeping rare but important ones, improving long-context reasoning under tight memory budgets. For anyone fighting KV cache growth, this suggests a smarter eviction policy. ([huggingface.co](https://huggingface.co/papers/2606.26875))
Jushi Kai, Zhuiri Xiao
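To make the eviction idea in the InfoKV summary concrete, here is a minimal sketch of a score-mixing policy. It is not the paper's method: the function name `select_tokens_to_keep`, the mixing weight `alpha`, the min-max normalization, and the assumption that a per-token informativeness score is already available are all hypothetical choices made for illustration.

```python
import numpy as np


def select_tokens_to_keep(attn_scores, info_scores, budget, alpha=0.5):
    """Return sorted indices of cached tokens to keep under a fixed budget.

    attn_scores: accumulated attention each cached token has received.
    info_scores: hypothetical per-token informativeness signal (assumed given).
    budget:      number of tokens the cache may retain.
    alpha:       hypothetical mixing weight (1.0 = attention only).
    """
    attn = np.asarray(attn_scores, dtype=float)
    info = np.asarray(info_scores, dtype=float)
    if attn.shape != info.shape:
        raise ValueError("attn_scores and info_scores must have the same shape")
    if attn.size == 0 or budget <= 0:
        return np.array([], dtype=int)

    def normalize(x):
        # Min-max scale to [0, 1] so the two signals are comparable.
        span = x.max() - x.min()
        return np.zeros_like(x) if span == 0 else (x - x.min()) / span

    combined = alpha * normalize(attn) + (1.0 - alpha) * normalize(info)
    budget = min(budget, combined.size)
    # Keep the highest-scoring tokens, then restore positional order.
    keep = np.argsort(-combined, kind="stable")[:budget]
    return np.sort(keep)


# Token 1 receives little attention but has a high informativeness score.
attn = [0.9, 0.05, 0.6, 0.02]
info = [0.1, 0.9, 0.2, 0.05]
mixed = select_tokens_to_keep(attn, info, budget=2, alpha=0.5)
attention_only = select_tokens_to_keep(attn, info, budget=2, alpha=1.0)
```

In this toy input, the mixed policy retains token 1 while the attention-only policy evicts it, which is the "rare but important" case the summary describes.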
KV-CoRE: Benchmarking Data-Dependent Low-Rank Compressibility of KV-Caches in LLMs
KV-CoRE systematically measures how well key-value caches can be compressed with low-rank methods across layers and tasks. It helps engineers identify where cache compression will save memory without degrading accuracy.
Jian Chen, Zhuoran Wang
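One common way to gauge low-rank compressibility, sketched below, is to ask how many singular values are needed to retain a given fraction of a matrix's squared spectrum. This is an illustrative proxy and not necessarily the metric KV-CoRE uses; the function name `rank_for_energy`, the `energy` threshold, and the treatment of a layer's cached keys as a single (tokens × head_dim) matrix are assumptions.

```python
import numpy as np


def rank_for_energy(cache_matrix, energy=0.95):
    """Smallest rank whose top singular values hold `energy` of the squared spectrum.

    cache_matrix: 2-D array, e.g. cached keys of shape (tokens, head_dim) (assumed layout).
    energy:       fraction of total squared singular values to retain.
    """
    s = np.linalg.svd(np.asarray(cache_matrix, dtype=float), compute_uv=False)
    total = np.sum(s ** 2)
    if total == 0:
        return 0
    cumulative = np.cumsum(s ** 2) / total
    # First index where the cumulative energy reaches the threshold.
    rank = int(np.searchsorted(cumulative, energy)) + 1
    return min(rank, s.size)


rng = np.random.default_rng(0)
# A synthetic "cache" with rank at most 2: highly compressible.
low_rank = rng.standard_normal((64, 2)) @ rng.standard_normal((2, 16))
# An identity matrix has a flat spectrum: not compressible.
flat = np.eye(8)

low_rank_result = rank_for_energy(low_rank, energy=0.999)
flat_result = rank_for_energy(flat, energy=0.95)
```

A small result relative to the matrix dimensions suggests that layer's cache tolerates low-rank compression; a result close to the full dimension suggests it does not.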