On June 28, 2026, Sina Finance reported that Google has moved its "computer use" tool from a separate Gemini 2.5 model into Gemini 3.5 Flash as a built‑in capability, allowing the model to see screens and operate browsers, mobile apps and desktop software. The feature is in public preview and targets long‑running enterprise automation workloads, with tight integration into Google Cloud.
This article aggregates reporting from 5 news sources. The TL;DR is AI-generated from original reporting. Race to AGI's analysis provides editorial context on implications for AGI development.
Computer use inside Gemini 3.5 Flash is a concrete step from “chatbot” toward full‑fledged agent. Instead of just generating text or code, the model can now perceive a live UI, click, type and scroll across browser, mobile and desktop surfaces, with built‑in safeguards against irreversible actions and prompt injection. That matters because a lot of real‑world productivity work is glue work between tools: moving data between CRM and spreadsheets, navigating internal dashboards, or triaging tickets. Embedding that ability into Google’s fastest, cheapest Gemini tier makes agentic workflows far more accessible to mainstream developers. ([blog.google](https://blog.google/innovation-and-ai/models-and-research/gemini-models/introducing-computer-use-gemini-3-5-flash/?utm_source=openai))
Strategically, this is Google trying to seize the “AI OS” slot before OpenAI or Anthropic. By tying computer use tightly to Google Cloud and Workspace, the company is betting that enterprises will want a single vendor that provides the model, the orchestration stack and the underlying productivity apps. The Sino coverage also highlights how investors are already treating this as a cloud story—cross‑sell and attach rates—rather than a pure AI demo. For other labs, the bar for being considered frontier is shifting: it’s no longer enough to have a strong model; you need an agent platform that can safely act in the world.


