Infrastructure

Research papers, repositories, and articles about infrastructure

HuggingFace's Transformers: State-of-the-art Natural Language Processing

This 2019 paper introduced the Transformers library, which wraps many transformer architectures and pretrained checkpoints in one consistent API. It turned cutting-edge NLP into a reusable software layer that underpins much of today's open-source LLM work.

Thomas Wolf, Lysandre Debut

exo-explore/exo

Exo turns a pile of Macs or PCs into one AI cluster so you can run huge models at home. It auto-discovers devices, shards models across them, and uses high-speed links like Thunderbolt to get near data-center performance. ([github.com](https://github.com/trending))

35,600
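
The sharding step in the exo entry can be illustrated with a minimal sketch. It assumes contiguous layer ranges are handed out in proportion to each device's memory; the function name `partition_layers`, the device names, and the memory figures are all hypothetical and are not taken from exo's code.

```python
from typing import Dict, List, Tuple


def partition_layers(num_layers: int, memory_gb: Dict[str, float]) -> List[Tuple[str, int, int]]:
    """Assign contiguous layer ranges [start, end) to devices in proportion to memory."""
    total = sum(memory_gb.values())
    shards: List[Tuple[str, int, int]] = []
    start = 0
    cumulative = 0.0
    for name, mem in memory_gb.items():
        cumulative += mem
        # The end index follows the cumulative memory share, so the ranges tile all layers.
        end = round(num_layers * cumulative / total)
        shards.append((name, start, end))
        start = end
    return shards


if __name__ == "__main__":
    # Hypothetical home cluster: names and sizes are made up for illustration.
    devices = {"macbook": 16.0, "mac-studio": 64.0, "desktop-pc": 48.0}
    for name, start, end in partition_layers(32, devices):
        print(f"{name}: layers {start}..{end - 1}")
```

With these made-up numbers, the 16 GB machine gets 4 of the 32 layers, the 64 GB machine gets 16, and the 48 GB machine gets 12. A real scheduler would also have to weigh link bandwidth and activation sizes, which this sketch ignores.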

Confucius Code Agent: An Open-sourced AI Software Engineer at Industrial Scale

The paper pitches Confucius Code Agent as an industrial-strength open coding agent with hierarchical working memory, persistent notes, and a meta-agent that continuously refines configurations. If you care about reproducible, extensible coding agents rather than opaque SaaS tools, this is a substantial systems paper. ([huggingface.co](https://huggingface.co/papers/2512.10398))

Zhaodong Wang, Zhenting Qi

HP Inc. launches Frontier strategic partnership with OpenAI

HP is rolling out OpenAI Frontier across the company after pilots proved value in real workflows. Frontier becomes a "connective layer" tying together tools, data, and long-running agents. If you're in enterprise IT, this is a signal that agent platforms are moving from experiments to an operating model. ([openai.com](https://openai.com/index/hp-frontier-partnership/?utm_source=openai))

OpenAI

ObjectGraph: From Document Injection to Knowledge Traversal — A Native File Format for the Agentic Era

Proposes a new file format that treats documents as typed graphs instead of long strings dumped into context windows. Agents query and traverse nodes, cutting token usage by up to about 95% while keeping task accuracy. If your agents still paste whole PDFs into prompts, this hints at a cleaner architecture layer. ([arxiv.org](https://arxiv.org/abs/2604.27820))

Mohit Dubey, Open Gigantic
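
To make the "traverse nodes instead of pasting the whole document" idea concrete, here is a minimal sketch. It is not the ObjectGraph format: the `Node` fields, the node kinds, and the demo document are hypothetical, and character counts stand in for tokens.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    node_id: str
    kind: str  # hypothetical node types, e.g. "section", "table", "paragraph"
    text: str
    children: List[str] = field(default_factory=list)


def find_nodes(graph: Dict[str, Node], root: str, kind: str, keyword: str) -> List[Node]:
    """Depth-first walk from root, returning nodes of one kind that mention keyword."""
    matches: List[Node] = []
    stack, seen = [root], set()
    while stack:
        node = graph[stack.pop()]
        if node.node_id in seen:
            continue
        seen.add(node.node_id)
        if node.kind == kind and keyword.lower() in node.text.lower():
            matches.append(node)
        # Push children in reverse so they are visited in document order.
        stack.extend(reversed(node.children))
    return matches


def build_demo_graph() -> Dict[str, Node]:
    """A made-up report with two sections, two long paragraphs, and one table."""
    nodes = [
        Node("doc", "document", "Annual report", ["s1", "s2"]),
        Node("s1", "section", "Overview", ["p1"]),
        Node("p1", "paragraph", "General background text. " * 20),
        Node("s2", "section", "Financials", ["p2", "t1"]),
        Node("p2", "paragraph", "Narrative commentary on the year. " * 20),
        Node("t1", "table", "Revenue by quarter: Q1 10, Q2 12, Q3 15, Q4 18"),
    ]
    return {n.node_id: n for n in nodes}


if __name__ == "__main__":
    graph = build_demo_graph()
    hits = find_nodes(graph, "doc", "table", "revenue")
    full_size = sum(len(n.text) for n in graph.values())
    sent_size = sum(len(n.text) for n in hits)
    print(f"characters sent: {sent_size} of {full_size}")
```

Only the matching table node reaches the prompt, so the context cost scales with the answer rather than with the document. The savings in this toy example say nothing about the paper's measured results.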

AI-Model Network: Concept, Current State and Future

Proposes an "AI-ModelNet" that connects many smaller models into a network that can share skills, route requests, and collaborate like an internet of AIs. Useful if you're thinking beyond one giant model and toward fleets of specialized models that talk to each other. ([arxiv.org](https://arxiv.org/list/cs.AI/new))

Li Zhetao, Zeng Xiyu

RyanCodrai/turbovec

Turbovec is a vector index built on TurboQuant with Rust internals and Python bindings. It targets high-speed similarity search for embeddings. Drop it into your stack if your current vector store is the bottleneck.

7,194

daytonaio/daytona

Daytona provides secure, elastic environments for running AI-generated code. If you're worried about letting agents touch prod, study this isolation model.

38,700

mindsdb/mindsdb

MindsDB markets itself as a "federated query engine for AI" and "the only MCP server you’ll ever need," exposing AI models and tools through a unified interface. Useful if you’re standardizing on MCP and want a batteries-included orchestration backend. ([github.com](https://github.com/trending?since=daily))

37,856

opencv/opencv

A classic computer vision library that is still climbing the charts. It powers everything from simple image filters to production vision systems. If you touch images or video at all, you should know what OpenCV can do today alongside modern deep models.

88,091

Efficient Training on Multiple Consumer GPUs with RoundPipe

Introduces a new pipeline schedule that avoids tight weight sharing constraints across stages when customizing large models. Targets setups with several consumer GPUs and slow interconnects, squeezing more throughput from cheap hardware. If your lab or startup runs on gamer cards, this is immediately actionable. ([huggingface.co](https://huggingface.co/papers/2604.27085))

Yibin Luo, Shiwei Gao

Back to Bytes: Revisiting Tokenization Through UTF-8

The authors propose UTF8Tokenizer, which maps bytes directly to token IDs and encodes control signals using ASCII control bytes. This keeps embedding tables tiny, speeds up tokenization, and can be bolted onto existing models to improve convergence without changing how you run them.

Amit Moryossef, Clara Meister
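
The byte-to-ID mapping described in the Back to Bytes entry can be sketched in a few lines. This is an illustration and not the authors' UTF8Tokenizer: the choice of ASCII STX (0x02) and ETX (0x03) as start and end markers is an assumption, as are the function names.

```python
from typing import List

# Assumed control bytes for this sketch: ASCII STX and ETX mark sequence start and end.
BOS = 0x02
EOS = 0x03


def encode(text: str) -> List[int]:
    """Token IDs are the raw UTF-8 byte values, wrapped in the two control bytes."""
    return [BOS, *text.encode("utf-8"), EOS]


def decode(ids: List[int]) -> str:
    """Check and strip the wrapping control bytes, then decode the payload as UTF-8."""
    if len(ids) < 2 or ids[0] != BOS or ids[-1] != EOS:
        raise ValueError("sequence is not wrapped in BOS/EOS control bytes")
    return bytes(ids[1:-1]).decode("utf-8")


if __name__ == "__main__":
    ids = encode("héllo")
    print(ids)          # every ID is a byte value, so the vocabulary has 256 entries
    print(decode(ids))
```

Because every ID is a byte value, the embedding table needs only 256 rows regardless of language. The trade-off is longer sequences, since a non-ASCII character such as "é" takes two IDs.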

The Impact of Hyperparameters on Large Language Model Inference Performance: An Evaluation of vLLM and HuggingFace Pipelines

This paper systematically measures how settings like batch size and max tokens affect throughput for common LLM engines. It shows that smart hyperparameter tuning can beat naive defaults by double-digit percentages, even when hardware stays the same.

Matias Martinez

cocoindex-io/cocoindex

A high-performance data transformation engine built for AI pipelines. It focuses on incremental processing, so you can keep large feature stores and training datasets in sync cheaply. ([github.com](https://github.com/trending))

4,395
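
Incremental processing, as mentioned in the cocoindex entry, can be sketched with content hashes. This is not cocoindex's API: the class `IncrementalTransformer` and its `sync` method are hypothetical names, and the sketch assumes that rows are keyed strings and that the transform is deterministic.

```python
import hashlib
from typing import Callable, Dict


class IncrementalTransformer:
    """Caches outputs by content hash so unchanged inputs are not recomputed."""

    def __init__(self, transform: Callable[[str], str]) -> None:
        self.transform = transform
        self.hashes: Dict[str, str] = {}   # key -> hash of the last seen input
        self.outputs: Dict[str, str] = {}  # key -> cached transform output
        self.recomputed = 0                # number of transform calls made so far

    def sync(self, source: Dict[str, str]) -> Dict[str, str]:
        # Drop outputs whose source rows have disappeared.
        for key in list(self.outputs):
            if key not in source:
                del self.outputs[key]
                del self.hashes[key]
        # Recompute only rows that are new or whose content changed.
        for key, value in source.items():
            digest = hashlib.sha256(value.encode("utf-8")).hexdigest()
            if self.hashes.get(key) != digest:
                self.outputs[key] = self.transform(value)
                self.hashes[key] = digest
                self.recomputed += 1
        return dict(self.outputs)


if __name__ == "__main__":
    pipeline = IncrementalTransformer(str.upper)
    pipeline.sync({"a": "first row", "b": "second row"})
    pipeline.sync({"a": "first row", "b": "second row, edited"})
    print(f"transform calls: {pipeline.recomputed}")
```

The second `sync` call re-runs the transform only for the edited row, which is where the cost saving comes from when the transform is an expensive step such as embedding or parsing.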

labring/sealos

An "AI-native" cloud OS on Kubernetes that lets you spin up full stacks for modern AI apps. It targets teams that want their own mini-cloud for models and data.

16,930

dinoki-ai/osaurus

Osaurus is a native macOS server for local and cloud LLMs with OpenAI- and Anthropic-style APIs. Mac developers can swap providers without rewriting code.

2,200

nautechsystems/nautilus_trader

A high-performance trading backtester and live engine used in many ML-driven strategies. It’s battle-tested infrastructure if you want to train and deploy quant agents.

18,828