ArXiv Paper

GeoDM: Geometry-aware Distribution Matching for Dataset Distillation

Xuhui Li, Zhengquan Luo, Zihui Cui +1

December 9, 2025

Summary

Proposes GeoDM, a dataset distillation framework that performs distribution matching in a product space of Euclidean, hyperbolic, and spherical manifolds, with learnable curvature and weights. This geometry-aware approach yields lower generalization error bounds and consistently outperforms prior distillation methods by better aligning synthetic and real-data manifolds. ([arxiv.org](https://arxiv.org/abs/2512.08317))
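To make the idea of a product space concrete, here is a minimal sketch of a distance between two points that each have a Euclidean, a hyperbolic, and a spherical component. It is not the authors' implementation. The function names, the choice of the Poincaré ball as the hyperbolic model, and the weighted sum-of-squares combination are all assumptions made for illustration, and the curvature and weight values that GeoDM learns are plain fixed floats here.

```python
import math


def euclidean_dist(x, y):
    """Straight-line distance in flat space."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def poincare_dist(x, y, c=1.0):
    """Geodesic distance in the Poincare ball of curvature -c (c > 0).

    Assumes both points lie strictly inside the ball: c * ||x||^2 < 1.
    """
    sq_norm = lambda v: sum(a * a for a in v)
    diff2 = sum((a - b) ** 2 for a, b in zip(x, y))
    denom = (1.0 - c * sq_norm(x)) * (1.0 - c * sq_norm(y))
    return math.acosh(1.0 + 2.0 * c * diff2 / denom) / math.sqrt(c)


def sphere_dist(x, y, k=1.0):
    """Great-circle distance on the sphere of curvature k (radius 1/sqrt(k)).

    Assumes both points lie on the sphere: ||x||^2 = 1/k.
    """
    dot = sum(a * b for a, b in zip(x, y))
    cos_angle = max(-1.0, min(1.0, k * dot))  # clamp against rounding error
    return math.acos(cos_angle) / math.sqrt(k)


def product_dist(p, q, c=1.0, k=1.0, weights=(1.0, 1.0, 1.0)):
    """Weighted distance in the product space (illustrative assumption).

    p and q are (euclidean, hyperbolic, spherical) triples of coordinates.
    The squared component distances are combined with per-geometry weights.
    """
    dists = (
        euclidean_dist(p[0], q[0]),
        poincare_dist(p[1], q[1], c),
        sphere_dist(p[2], q[2], k),
    )
    return math.sqrt(sum(w * d ** 2 for w, d in zip(weights, dists)))


# Example points: one component per geometry.
p = ((0.0, 0.0), (0.0, 0.0), (1.0, 0.0))
q = ((3.0, 4.0), (0.5, 0.0), (0.0, 1.0))
print(product_dist(p, q))
```

In a trainable version, `c`, `k`, and `weights` would be parameters updated by gradient descent alongside the synthetic data, so that the mix of geometries can adapt to the real data.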

Topics

dataset-distillation, geometry, theory, training

Related Content

huggingface/transformers

The standard library for state-of-the-art models in text, vision, audio, and combined formats. If you build with open models, you almost certainly depend on this already.

HuggingFace's Transformers: State-of-the-art Natural Language Processing

This 2019 paper launched the Transformers library, giving a clean API around many transformer models and pretrained checkpoints. It turned cutting-edge NLP into a reusable software layer that underpins most open-source LLM work today.

SpatialClaw: Rethinking Action Interface for Agentic Spatial Reasoning

The authors build SpatialClaw, a code-driven agent that uses a stateful Python kernel plus vision tools to solve 3D and 4D spatial puzzles. It beats prior spatial agents across 20 benchmarks and six vision-language backbones, showing that the action interface design can unlock much stronger spatial reasoning.

ENPIRE: Agentic Robot Policy Self-Improvement in the Real World

Wraps real robots in a closed-loop system where coding agents iteratively reset scenes, run policies, check results, and improve code. If you’re serious about autonomous robot labs, this is basically a blueprint.