ArXiv Paper

Reasoning for Mobile User Experience with Multimodal LLMs: Task, Benchmark, and Approach

Ruichao Mao, Zhou Fang, Teng Guo, and 9 others
June 12, 2026

Summary

The authors introduce a benchmark in which multimodal models must judge mobile app UX directly from full UI screenshots. They also propose a baseline model that reasons over layout, text, and visual cues, highlighting how current systems miss many usability issues that humans spot instantly.
